From af34948523c88a099f06cdb8410037b107b462ce Mon Sep 17 00:00:00 2001
From: Leo Vivier
Date: Tue, 1 Nov 2022 20:06:18 +0100
Subject: Push audio mastering workflow

---
 2022/organizers-notebook.md | 119 +++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 117 insertions(+), 2 deletions(-)
 (limited to '2022/organizers-notebook.md')

diff --git a/2022/organizers-notebook.md b/2022/organizers-notebook.md
index 3ff1dfe3..88c13656 100644
--- a/2022/organizers-notebook.md
+++ b/2022/organizers-notebook.md
@@ -2213,6 +2213,121 @@ before the conference!
 Sacha Chua
+### Mastering the prerec’s audio-track
+
+Mastering is the process of preparing an audio-track for a purpose. For
+us, the purpose is quite simple: maximize the intelligibility of the
+speaker and minimize the noise.
+
+We can get great results with Audacity for the vast majority of
+audio-tracks. Sometimes, however, some audio-tracks have intractable
+noise-profile that require the use of model-based denoising filters that
+can applied with ffmpeg.
+
+We’ll start with the average Audacity workflow, and we’ll move on to the
+model-based filters after.
+
+
+#### Audacity workflow
+
+When we process a prerec, we extract the audio of the original upload
+and add it to the backstage. You should be able to find it under the
+name –original.$audio\_format. If it’s not there, it’s easy to extract
+the audio from the original video, but we’d prefer if you warned
+core-organizers about it because it’s not normal.
+
+We’ve simplified the process down to these steps:
+
+1. Open the audio file in Audacity.
+
+   You might want to increase the size of the waveform by pulling on the
+   bottom of the bottom of the track.
+
+
+
+1. Find a moment of quiet in the video, and select it.
+
+   We ask our speakers to include 5 seconds of quiet at the beginning or
+   end of their prerecs, but even if they don’t, it’s relatively.
+
+2. Effects → Noise Reduction → Get Noise Profile
+
+3. Select → All
+
+4. Effects → Noise Reduction → OK
+
+   You can select a spoken portion of the track before applying the
+   effect and preview it to test your settings. The default are usually
+   enough (Noise reduction (dB): 12, Sensitivity: 6.00, Frequency smoothing
+   (bands): 3).
+
+5. Tools → Apply Macro → Alpha
+
+   Before you can apply the Alpha macro, you need to save its content to
+   disk and import it via Tools → Macro Manager → Import.
+
+       Reverb:Delay="20" DryGain="5" HfDamping="99" Reverberance="15" RoomSize="70" StereoWidth="25" ToneHigh="0" ToneLow="100" WetGain="-13" WetOnly="0"
+       Amplify:Ratio="1"
+       FilterCurve:f0="79.621641" f1="101.02321" FilterLength="8191" InterpolateLin="0" InterpolationMethod="B-spline" v0="5.9148936" v1="0.042552948"
+       Normalize:ApplyGain="1" PeakLevel="-3" RemoveDcOffset="1" StereoIndependent="1"
+       Compressor:AttackTime="0.1" NoiseFloor="-50" Normalize="1" Ratio="2" ReleaseTime="1" Threshold="-30" UsePeak="0"
+
+1. Save the file to disk with libopus (.opus format)
+   Use the following settings:
+
+> Bit Rate: 64 kbps
+> VBR Mode: On
+> Compression: 10
+> Application: Audio
+> Frame Duration: 20 ms
+> Cutoff: Disabled
+
+![img](audacity-export-settings.png)
+
+
+#### Model-based denoising filter
+
+If you can’t manage to get a good result with Audacity, chances are it’s
+because there’s too much noise in the video, even after profile-based
+denoising. This usually happens when the noise-pattern of an
+audio-track evolves over the video, or if has an aperiodic quality. For
+those, we’re going to need a bigger boat.
+
+Model-based denoising means using an AI-generated model to remove the
+audio frequencies that are usually associated to noise and preserve
+those that aren’t. A different context (e.g. noisy room with statics,
+noisy room with people chatting, etc.) means a different model; for us,
+this means a model that minimizes background noise and maximizes clear
+voices (the speakers’).
+
+This is the model we’ve been using:
+Source: [rnnoise-models](https://github.com/GregorR/rnnoise-models), Model: [marathon-prescription](https://raw.githubusercontent.com/GregorR/rnnoise-models/master/marathon-prescription-2018-08-29/mp.rnnn)
+
+You should always apply the filter on the original’s audio, as opposed
+to an Audacity-processed audio. This is to ensure that we have the most
+information about the signal, which means we can have gather the most
+information about the noise-profile.
+
+Following is the ffmpeg incantation to use to apply the filter-model.
+Make sure to modify the `DENOISER` variable and adapt input/output.
+
+    DENOISER="/path/to/audio-denoiser-model-mp.rnnn"
+    input="original.opus"
+    output="denoised.opus"
+    ffmpeg -i "$input" -af "$DENOISER" "$output"
+
+There’s no need to customize the libopus export information; the default
+is more than enough for human-speech.
+
+When you’re done with this step, you can then process the outputted
+audio-track with Audacity, skipping the denoising steps (1 to 5).
+
+
+#### Questions?
+
+If you’ve got any question on the process, you canget in touch with me (zaeph)!
+
+
 ## When a talk is captioned
@@ -2717,7 +2832,7 @@ Probably focus on grabbing the audio first and seeing what’s worth keeping
 Make a table of the form
-
+
@@ -3685,7 +3800,7 @@ Where:
 Nice if there’s an Ansible playbook
 sachac’s notes:
-
+
 - probably good to set it up on front
 It’s now on front.
-- 
cgit v1.2.3

From c4ab006e27a59f544d8559ac1ef5c2e612b99dea Mon Sep 17 00:00:00 2001
From: Leo Vivier
Date: Tue, 1 Nov 2022 20:31:01 +0100
Subject: Fix count

---
 2022/organizers-notebook.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
 (limited to '2022/organizers-notebook.md')

diff --git a/2022/organizers-notebook.md b/2022/organizers-notebook.md
index 88c13656..a7935493 100644
--- a/2022/organizers-notebook.md
+++ b/2022/organizers-notebook.md
@@ -2242,26 +2242,26 @@ We’ve simplified the process down to these steps:
    You might want to increase the size of the waveform by pulling on the
    bottom of the bottom of the track.
+
+
-
-
-1. Find a moment of quiet in the video, and select it.
+2. Find a moment of quiet in the video, and select it.
 
    We ask our speakers to include 5 seconds of quiet at the beginning or
    end of their prerecs, but even if they don’t, it’s relatively.
 
-2. Effects → Noise Reduction → Get Noise Profile
+3. Effects → Noise Reduction → Get Noise Profile
 
-3. Select → All
+4. Select → All
 
-4. Effects → Noise Reduction → OK
+5. Effects → Noise Reduction → OK
 
    You can select a spoken portion of the track before applying the
    effect and preview it to test your settings. The default are usually
    enough (Noise reduction (dB): 12, Sensitivity: 6.00, Frequency smoothing
    (bands): 3).
 
-5. Tools → Apply Macro → Alpha
+6. Tools → Apply Macro → Alpha
 
    Before you can apply the Alpha macro, you need to save its content to
    disk and import it via Tools → Macro Manager → Import.
@@ -2832,7 +2832,7 @@ Probably focus on grabbing the audio first and seeing what’s worth keeping
 Make a table of the form
-
+
-- 
cgit v1.2.3

From 334808c1d14590a527bb8d431888965f46a87975 Mon Sep 17 00:00:00 2001
From: Leo Vivier
Date: Tue, 1 Nov 2022 20:31:57 +0100
Subject: Fix script

---
 2022/organizers-notebook.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
 (limited to '2022/organizers-notebook.md')

diff --git a/2022/organizers-notebook.md b/2022/organizers-notebook.md
index a7935493..5da0d4c2 100644
--- a/2022/organizers-notebook.md
+++ b/2022/organizers-notebook.md
@@ -2314,7 +2314,7 @@ Make sure to modify the `DENOISER` variable and adapt input/output.
     DENOISER="/path/to/audio-denoiser-model-mp.rnnn"
     input="original.opus"
     output="denoised.opus"
-    ffmpeg -i "$input" -af "$DENOISER" "$output"
+    ffmpeg -i "$input" -af "arnndn=m=$DENOISER" "$output"
 
 There’s no need to customize the libopus export information; the default
 is more than enough for human-speech.
@@ -2832,7 +2832,7 @@ Probably focus on grabbing the audio first and seeing what’s worth keeping
 Make a table of the form
-
+
-- 
cgit v1.2.3

From 5cf248246dd035a97395b7fff2a53846f954b6ad Mon Sep 17 00:00:00 2001
From: Leo Vivier
Date: Wed, 2 Nov 2022 06:07:31 +0100
Subject: Add links to media and slight rewords

---
 2022/organizers-notebook.md | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)
 (limited to '2022/organizers-notebook.md')

diff --git a/2022/organizers-notebook.md b/2022/organizers-notebook.md
index 5da0d4c2..668ebcfb 100644
--- a/2022/organizers-notebook.md
+++ b/2022/organizers-notebook.md
@@ -2243,7 +2243,7 @@ We’ve simplified the process down to these steps:
    You might want to increase the size of the waveform by pulling on the
    bottom of the bottom of the track.
-
+   [audacity-demo-noise-reduction.webm](https://media.emacsconf.org/misc/audacity-demo-noise-reduction.webm)
 
 2. Find a moment of quiet in the video, and select it.
@@ -2260,6 +2260,8 @@ We’ve simplified the process down to these steps:
    effect and preview it to test your settings. The default are usually
    enough (Noise reduction (dB): 12, Sensitivity: 6.00, Frequency smoothing
    (bands): 3).
+
+   [audacity-demo-noise-reduction.webm](https://media.emacsconf.org/misc/audacity-demo-noise-reduction.webm)
 
 6. Tools → Apply Macro → Alpha
@@ -2272,7 +2274,7 @@ We’ve simplified the process down to these steps:
        Normalize:ApplyGain="1" PeakLevel="-3" RemoveDcOffset="1" StereoIndependent="1"
        Compressor:AttackTime="0.1" NoiseFloor="-50" Normalize="1" Ratio="2" ReleaseTime="1" Threshold="-30" UsePeak="0"
 
-1. Save the file to disk with libopus (.opus format)
+1. Export → Export Audio… → Opus Files (.opus format)
    Use the following settings:
 
 > Bit Rate: 64 kbps
@@ -2282,7 +2284,7 @@ We’ve simplified the process down to these steps:
 > Frame Duration: 20 ms
 > Cutoff: Disabled
 
-![img](audacity-export-settings.png)
+[audacity-export-settings.png](https://media.emacsconf.org/misc/audacity-export-settings.png)
 
 
 #### Model-based denoising filter
@@ -2832,7 +2834,7 @@ Probably focus on grabbing the audio first and seeing what’s worth keeping
 Make a table of the form
-
+
-- 
cgit v1.2.3

From 02efc6edd346ef865856f7411c56cba7c243f9f0 Mon Sep 17 00:00:00 2001
From: Leo Vivier
Date: Wed, 2 Nov 2022 06:09:09 +0100
Subject: Indent better

---
 2022/organizers-notebook.md | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)
 (limited to '2022/organizers-notebook.md')

diff --git a/2022/organizers-notebook.md b/2022/organizers-notebook.md
index 668ebcfb..1efa8c6e 100644
--- a/2022/organizers-notebook.md
+++ b/2022/organizers-notebook.md
@@ -2275,16 +2275,17 @@ We’ve simplified the process down to these steps:
        Compressor:AttackTime="0.1" NoiseFloor="-50" Normalize="1" Ratio="2" ReleaseTime="1" Threshold="-30" UsePeak="0"
 
 1. Export → Export Audio… → Opus Files (.opus format)
+
    Use the following settings:
-
-> Bit Rate: 64 kbps
-> VBR Mode: On
-> Compression: 10
-> Application: Audio
-> Frame Duration: 20 ms
-> Cutoff: Disabled
-
-[audacity-export-settings.png](https://media.emacsconf.org/misc/audacity-export-settings.png)
+
+   [audacity-export-settings.png](https://media.emacsconf.org/misc/audacity-export-settings.png)
+
+   > Bit Rate: 64 kbps
+   > VBR Mode: On
+   > Compression: 10
+   > Application: Audio
+   > Frame Duration: 20 ms
+   > Cutoff: Disabled
 
 
 #### Model-based denoising filter
@@ -2834,7 +2835,7 @@ Probably focus on grabbing the audio first and seeing what’s worth keeping
 Make a table of the form
-
+
-- 
cgit v1.2.3

From ab79bbee173f08a82a211ff8e1325af42597e8f0 Mon Sep 17 00:00:00 2001
From: Leo Vivier
Date: Wed, 2 Nov 2022 06:11:01 +0100
Subject: Add link to model

---
 2022/organizers-notebook.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
 (limited to '2022/organizers-notebook.md')

diff --git a/2022/organizers-notebook.md b/2022/organizers-notebook.md
index 1efa8c6e..32849311 100644
--- a/2022/organizers-notebook.md
+++ b/2022/organizers-notebook.md
@@ -2304,6 +2304,9 @@ this means a model that minimizes background noise and maximizes clear
 voices (the speakers’).
 
 This is the model we’ve been using:
+
+[audio-denoiser-model-mp.rnnn](https://media.emacsconf.org/misc/audio-denoiser-model-mp.rnnn) (download link)
+
 Source: [rnnoise-models](https://github.com/GregorR/rnnoise-models), Model: [marathon-prescription](https://raw.githubusercontent.com/GregorR/rnnoise-models/master/marathon-prescription-2018-08-29/mp.rnnn)
 
 You should always apply the filter on the original’s audio, as opposed
@@ -2835,7 +2838,7 @@ Probably focus on grabbing the audio first and seeing what’s worth keeping
 Make a table of the form
-
+
-- cgit v1.2.3
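
Note on the ffmpeg step from the "Fix script" patch: the corrected command passes the model to the `arnndn` filter as `arnndn=m=$DENOISER`. A minimal sketch of how that command could be wrapped so it can be checked before it runs is given here. The helper names `arnndn_filter` and `denoise` and the `DRY_RUN` switch are hypothetical and not part of the notebook; only the ffmpeg options come from the patch.

```shell
#!/bin/sh
# Sketch only: arnndn_filter, denoise and DRY_RUN are hypothetical names,
# not part of the notebook. The ffmpeg options (-i, -af, arnndn=m=...)
# are the ones used in the "Fix script" patch.

# Build the arnndn filter argument from a model path.
# Assumption: the path is plain. A path containing ':' or other
# filtergraph special characters would need escaping first.
arnndn_filter() {
    printf 'arnndn=m=%s' "$1"
}

# Usage: denoise MODEL INPUT OUTPUT
# With DRY_RUN=1 the command is printed instead of executed.
denoise() {
    model=$1
    input=$2
    output=$3
    filter=$(arnndn_filter "$model")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'ffmpeg -i %s -af %s %s\n' "$input" "$filter" "$output"
    else
        ffmpeg -i "$input" -af "$filter" "$output"
    fi
}
```

For example, setting `DRY_RUN=1` and calling `denoise /path/to/audio-denoiser-model-mp.rnnn original.opus denoised.opus` prints the command without touching any file, which makes it easy to confirm that the filter name is present before processing a real prerec.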