author    Leo Vivier <zaeph@zaeph.net>  2022-11-01 20:06:18 +0100
committer Leo Vivier <zaeph@zaeph.net>  2022-11-01 20:06:53 +0100
commit    af34948523c88a099f06cdb8410037b107b462ce (patch)
tree      1f6b05fd351288d7c5a5107e0b32ce403c9a91b5 /2022
parent    b2456352bb6f8046c61c352ecf388b169f21336c (diff)
Push audio mastering workflow
Diffstat (limited to '2022')
-rw-r--r--  2022/organizers-notebook.md         119
-rw-r--r--  2022/organizers-notebook/index.org  113
2 files changed, 230 insertions(+), 2 deletions(-)
diff --git a/2022/organizers-notebook.md b/2022/organizers-notebook.md
index 3ff1dfe3..88c13656 100644
--- a/2022/organizers-notebook.md
+++ b/2022/organizers-notebook.md
@@ -2213,6 +2213,121 @@ before the conference!
Sacha Chua
+### Mastering the prerec’s audio-track
+
+Mastering is the process of preparing an audio-track for a purpose. For
+us, the purpose is quite simple: maximize the intelligibility of the
+speaker and minimize the noise.
+
+We can get great results with Audacity for the vast majority of
+audio-tracks. Sometimes, however, an audio-track has an intractable
+noise-profile that requires the use of model-based denoising filters,
+which can be applied with ffmpeg.
+
+We’ll start with the standard Audacity workflow, and move on to the
+model-based filters afterwards.
+
+
+#### Audacity workflow
+
+When we process a prerec, we extract the audio of the original upload
+and add it to the backstage. You should be able to find it under the
+name --original.$audio\_format. If it’s not there, it’s easy to extract
+the audio from the original video, but we’d rather you warned the
+core-organizers about it, since it shouldn’t be missing.
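+
+If you do need to extract it yourself, here is a minimal sketch with
+ffmpeg (the file names are examples; pick an output extension that
+matches the source audio codec):
+
+    # Drop the video stream and copy the audio stream without re-encoding.
+    ffmpeg -i original.webm -vn -c:a copy talk--original.opus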
+
+We’ve simplified the process down to these steps:
+
+1. Open the audio file in Audacity.
+
+    You might want to increase the size of the waveform by pulling on
+    the bottom edge of the track.
+
+<audacity-demo-resize.webm>
+
+2. Find a moment of quiet in the video, and select it.
+
+    We ask our speakers to include 5 seconds of quiet at the beginning
+    or end of their prerecs, but even if they don’t, it’s relatively
+    easy to find one.
+
+3. Effects → Noise Reduction → Get Noise Profile
+
+4. Select → All
+
+5. Effects → Noise Reduction → OK
+
+    You can select a spoken portion of the track before applying the
+    effect and preview it to test your settings. The defaults are
+    usually enough (Noise reduction (dB): 12, Sensitivity: 6.00,
+    Frequency smoothing (bands): 3).
+
+6. Tools → Apply Macro → Alpha
+
+ Before you can apply the Alpha macro, you need to save its content to
+ disk and import it via Tools → Macro Manager → Import.
+
+ Reverb:Delay="20" DryGain="5" HfDamping="99" Reverberance="15" RoomSize="70" StereoWidth="25" ToneHigh="0" ToneLow="100" WetGain="-13" WetOnly="0"
+ Amplify:Ratio="1"
+ FilterCurve:f0="79.621641" f1="101.02321" FilterLength="8191" InterpolateLin="0" InterpolationMethod="B-spline" v0="5.9148936" v1="0.042552948"
+ Normalize:ApplyGain="1" PeakLevel="-3" RemoveDcOffset="1" StereoIndependent="1"
+ Compressor:AttackTime="0.1" NoiseFloor="-50" Normalize="1" Ratio="2" ReleaseTime="1" Threshold="-30" UsePeak="0"
+
+7. Save the file to disk with libopus (.opus format).
+    Use the following settings:
+
+> Bit Rate: 64 kbps
+> VBR Mode: On
+> Compression: 10
+> Application: Audio
+> Frame Duration: 20 ms
+> Cutoff: Disabled
+
+![img](audacity-export-settings.png)
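+
+If you’d rather script this export than click through the dialog, the
+settings above map onto ffmpeg’s libopus encoder roughly as follows (a
+sketch; `mastered.wav` stands for the track you exported from
+Audacity):
+
+    # Encode to Opus with settings matching the Audacity dialog above.
+    # Cutoff is left at libopus's default, which is disabled.
+    ffmpeg -i mastered.wav -c:a libopus -b:a 64k -vbr on \
+           -compression_level 10 -application audio \
+           -frame_duration 20 talk--processed.opus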
+
+
+#### Model-based denoising filter
+
+If you can’t manage to get a good result with Audacity, chances are it’s
+because there’s too much noise in the video, even after profile-based
+denoising. This usually happens when the noise-pattern of an
+audio-track evolves over the video, or if it has an aperiodic quality.
+For those, we’re going to need a bigger boat.
+
+Model-based denoising means using an AI-generated model to remove the
+audio frequencies that are usually associated with noise and preserve
+those that aren’t. A different context (e.g. noisy room with static,
+noisy room with people chatting, etc.) means a different model; for us,
+this means a model that minimizes background noise and maximizes clear
+voices (the speakers’).
+
+This is the model we’ve been using:
+Source: [rnnoise-models](https://github.com/GregorR/rnnoise-models), Model: [marathon-prescription](https://raw.githubusercontent.com/GregorR/rnnoise-models/master/marathon-prescription-2018-08-29/mp.rnnn)
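+
+For instance, you can fetch it with (the target path is up to you):
+
+    # Download the rnnoise model used below.
+    wget -O ~/audio-denoiser-model-mp.rnnn \
+         https://raw.githubusercontent.com/GregorR/rnnoise-models/master/marathon-prescription-2018-08-29/mp.rnnn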
+
+You should always apply the filter to the original’s audio, as opposed
+to an Audacity-processed audio-track. This ensures that we have the
+most information about the signal, which means we can gather the most
+information about the noise-profile.
+
+Following is the ffmpeg incantation for applying the filter-model; it
+uses ffmpeg’s `arnndn` filter, which takes the model file as its `m`
+parameter. Make sure to modify the `DENOISER` variable and adapt
+input/output.
+
+    # Path to the rnnoise model downloaded above.
+    DENOISER="/path/to/audio-denoiser-model-mp.rnnn"
+    input="original.opus"
+    output="denoised.opus"
+    # arnndn applies the RNN-based denoising model to the audio stream.
+    ffmpeg -i "$input" -af "arnndn=m=$DENOISER" "$output"
+
+There’s no need to customize the libopus export settings; the defaults
+are more than enough for human speech.
+
+When you’re done with this step, you can then process the resulting
+audio-track with Audacity, skipping the denoising steps (2 to 5).
+
+
+#### Questions?
+
+If you’ve got any questions about the process, you can get in touch with me (zaeph)!
+
+
<a id="when-captioned"></a>
## When a talk is captioned
@@ -2717,7 +2832,7 @@ Probably focus on grabbing the audio first and seeing what&rsquo;s worth keeping
Make a table of the form
-<table id="org4850a3d" border="2" cellspacing="0" cellpadding="6" rules="groups" frame="hsides">
+<table id="orgdacd95f" border="2" cellspacing="0" cellpadding="6" rules="groups" frame="hsides">
<colgroup>
@@ -3685,7 +3800,7 @@ Where:
Nice if there&rsquo;s an Ansible playbook
sachac&rsquo;s notes:
- <file:///home/sacha/code/docker/emacsconf-publish/>
+ <file:///home/zaeph/code/docker/emacsconf-publish/>
- probably good to set it up on front
It&rsquo;s now on front.
diff --git a/2022/organizers-notebook/index.org b/2022/organizers-notebook/index.org
index 48d50561..1b431945 100644
--- a/2022/organizers-notebook/index.org
+++ b/2022/organizers-notebook/index.org
@@ -1579,6 +1579,119 @@ EmacsConf ${year}, and thank you for submitting the prerecorded video
before the conference!
Sacha Chua
+*** Mastering the prerec’s audio-track
+Mastering is the process of preparing an audio-track for a purpose. For
+us, the purpose is quite simple: maximize the intelligibility of the
+speaker and minimize the noise.
+
+We can get great results with Audacity for the vast majority of
+audio-tracks. Sometimes, however, an audio-track has an intractable
+noise-profile that requires the use of model-based denoising filters,
+which can be applied with ffmpeg.
+
+We’ll start with the standard Audacity workflow, and move on to the
+model-based filters afterwards.
+
+**** Audacity workflow
+When we process a prerec, we extract the audio of the original upload
+and add it to the backstage. You should be able to find it under the
+name --original.$audio_format. If it’s not there, it’s easy to extract
+the audio from the original video, but we’d rather you warned the
+core-organizers about it, since it shouldn’t be missing.
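+
+If you do need to extract it yourself, here is a minimal sketch with
+ffmpeg (the file names are examples; pick an output extension that
+matches the source audio codec):
+
+#+begin_src sh :eval no
+# Drop the video stream and copy the audio stream without re-encoding.
+ffmpeg -i original.webm -vn -c:a copy talk--original.opus
+#+end_src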
+
+We’ve simplified the process down to these steps:
+
+1. Open the audio file in Audacity.
+
+   You might want to increase the size of the waveform by pulling on
+   the bottom edge of the track.
+
+[[file:audacity-demo-resize.webm]]
+
+2. Find a moment of quiet in the video, and select it.
+
+   We ask our speakers to include 5 seconds of quiet at the beginning or
+   end of their prerecs, but even if they don’t, it’s relatively easy to
+   find one.
+
+3. Effects → Noise Reduction → Get Noise Profile
+
+4. Select → All
+
+5. Effects → Noise Reduction → OK
+
+   You can select a spoken portion of the track before applying the
+   effect and preview it to test your settings. The defaults are usually
+   enough (Noise reduction (dB): 12, Sensitivity: 6.00, Frequency
+   smoothing (bands): 3).
+
+6. Tools → Apply Macro → Alpha
+
+ Before you can apply the Alpha macro, you need to save its content to
+ disk and import it via Tools → Macro Manager → Import.
+
+#+begin_src txt :eval no :tangle audacity-macro-alpha.txt
+Reverb:Delay="20" DryGain="5" HfDamping="99" Reverberance="15" RoomSize="70" StereoWidth="25" ToneHigh="0" ToneLow="100" WetGain="-13" WetOnly="0"
+Amplify:Ratio="1"
+FilterCurve:f0="79.621641" f1="101.02321" FilterLength="8191" InterpolateLin="0" InterpolationMethod="B-spline" v0="5.9148936" v1="0.042552948"
+Normalize:ApplyGain="1" PeakLevel="-3" RemoveDcOffset="1" StereoIndependent="1"
+Compressor:AttackTime="0.1" NoiseFloor="-50" Normalize="1" Ratio="2" ReleaseTime="1" Threshold="-30" UsePeak="0"
+#+end_src
+
+7. Save the file to disk with libopus (.opus format).
+   Use the following settings:
+
+#+begin_quote
+Bit Rate: 64 kbps
+VBR Mode: On
+Compression: 10
+Application: Audio
+Frame Duration: 20 ms
+Cutoff: Disabled
+#+end_quote
+
+[[file:audacity-export-settings.png]]
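+
+If you’d rather script this export than click through the dialog, the
+settings above map onto ffmpeg’s libopus encoder roughly as follows (a
+sketch; ~mastered.wav~ stands for the track you exported from
+Audacity):
+
+#+begin_src sh :eval no
+# Encode to Opus with settings matching the Audacity dialog above.
+# Cutoff is left at libopus's default, which is disabled.
+ffmpeg -i mastered.wav -c:a libopus -b:a 64k -vbr on \
+       -compression_level 10 -application audio \
+       -frame_duration 20 talk--processed.opus
+#+end_src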
+
+**** Model-based denoising filter
+If you can’t manage to get a good result with Audacity, chances are it’s
+because there’s too much noise in the video, even after profile-based
+denoising. This usually happens when the noise-pattern of an
+audio-track evolves over the video, or if it has an aperiodic quality.
+For those, we’re going to need a bigger boat.
+
+Model-based denoising means using an AI-generated model to remove the
+audio frequencies that are usually associated with noise and preserve
+those that aren’t. A different context (e.g. noisy room with static,
+noisy room with people chatting, etc.) means a different model; for us,
+this means a model that minimizes background noise and maximizes clear
+voices (the speakers’).
+
+This is the model we’ve been using:
+Source: [[https://github.com/GregorR/rnnoise-models][rnnoise-models]], Model: [[https://raw.githubusercontent.com/GregorR/rnnoise-models/master/marathon-prescription-2018-08-29/mp.rnnn][marathon-prescription]]
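+
+For instance, you can fetch it with (the target path is up to you):
+
+#+begin_src sh :eval no
+# Download the rnnoise model used below.
+wget -O ~/audio-denoiser-model-mp.rnnn \
+     https://raw.githubusercontent.com/GregorR/rnnoise-models/master/marathon-prescription-2018-08-29/mp.rnnn
+#+end_src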
+
+You should always apply the filter to the original’s audio, as opposed
+to an Audacity-processed audio-track. This ensures that we have the
+most information about the signal, which means we can gather the most
+information about the noise-profile.
+
+Following is the ffmpeg incantation for applying the filter-model; it
+uses ffmpeg’s ~arnndn~ filter, which takes the model file as its ~m~
+parameter. Make sure to modify the ~DENOISER~ variable and adapt
+input/output.
+
+#+begin_src sh :tangle audio-denoiser.sh
+# Path to the rnnoise model downloaded above.
+DENOISER="/path/to/audio-denoiser-model-mp.rnnn"
+input="original.opus"
+output="denoised.opus"
+# arnndn applies the RNN-based denoising model to the audio stream.
+ffmpeg -i "$input" -af "arnndn=m=$DENOISER" "$output"
+#+end_src
+
+There’s no need to customize the libopus export settings; the defaults
+are more than enough for human speech.
+
+When you’re done with this step, you can then process the resulting
+audio-track with Audacity, skipping the denoising steps (2 to 5).
+
+**** Questions?
+If you’ve got any questions about the process, you can get in touch with me (zaeph)!
+
** When a talk is captioned
:PROPERTIES:
:CUSTOM_ID: when-captioned