Diffstat (limited to 'captioning.md')
-rw-r--r-- | captioning.md | 201 |
1 files changed, 201 insertions, 0 deletions
diff --git a/captioning.md b/captioning.md
new file mode 100644
index 00000000..895e732b
--- /dev/null
+++ b/captioning.md
@@ -0,0 +1,201 @@
+[[!meta title="Captioning tips"]]
+[[!meta copyright="Copyright © 2021, 2022 Sacha Chua"]]
+
+Captions are great for making videos (especially technical ones!)
+easier to understand and search.
+
+If you see a talk that you'd like to caption, feel free to download it
+and start working on it with your favourite subtitle editor. Let me
+know what you pick by e-mailing me at <sacha@sachachua.com> so that I
+can update the index and try to avoid duplication of work. [Find talks that need captions here](https://emacsconf.org/help_with_main_captions). You can also help by [adding chapter markers to Q&A sessions](https://emacsconf.org/help_with_chapter_markers).
+
+You're welcome to work with captions using your favourite tool. We've
+been using <https://github.com/sachac/subed> to caption things as VTT
+or SRT in Emacs, often starting with autogenerated captions from
+OpenAI Whisper (the .vtt).
+
+We'll be posting VTT files so that they can be included by the HTML5
+video player (demo: <https://emacsconf.org/2021/talks/news/>), so if
+you use a different tool that produces another format, any format that
+can be converted into that one (like SRT or ASS) is fine. `subed` has
+a `subed-convert` command that might be useful for turning WebVTT
+files into tab-separated values (TSV) and back again, if you prefer a
+more concise format.
+
+You can e-mail me the subtitles when you're done, and then I can merge
+them into the video.
+
+You might find it easier to start with the autogenerated captions
+and then refer to any resources provided by the speaker in order to
+figure out spelling. Sometimes speakers provide pretty complete
+scripts, which is great, but they also tend to add extra words when speaking.
+
+# Reflowing the text
+
+First, let's start with reflowing. We like to have one line of
+captions about 60 characters long so that they'll display nicely in
+the stream. If the captions haven't been reflowed yet, you can reflow
+the captions at natural pausing points (e.g., phrases) so that they're
+displayed nicely. You don't have to worry too much about getting the
+timestamps exactly right.
+
+For example, instead of:
+
+- so i'm going to talk today about a
+- fun rewrite i did of uh of the bindat
+- package
+
+you can edit it to be more like:
+
+- So I'm going to talk today
+- about a fun rewrite I did
+- of the bindat package.
+
+You probably don't need to do this step if you're working with the VTT
+files in the backstage area, since we try to reflow things before
+people edit them, but we thought we'd demonstrate it in case people
+are curious.
+
+We start with the text file that OpenAI Whisper generates. We set
+`fill-column` to 50 and use `display-fill-column-indicator-mode` to
+give ourselves a goal column. A little over is fine too. Then we use
+`emacsconf-reflow` from the
+[emacsconf-el](https://git.emacsconf.org/emacsconf-el/) repository to quickly
+split up the text into captions by looking for where we want to add
+newlines and then typing the word or words. We type `'` to join lines.
+Sometimes, if it splits at the wrong place, we just undo it and edit it
+normally.
+
+It took about 4 minutes to reflow John Wiegley's 5-minute presentation.
+
+<video src="https://media.emacsconf.org/reflowing.webm" controls=""></video>
+
+The next step is to align it with
+[aeneas](https://github.com/readbeyond/aeneas) to get the timestamps
+for each line of text. `subed-align` from the subed package helps with that.
+
+<video src="https://media.emacsconf.org/alignment.webm" controls=""></video>
+
+# Edit the VTT to fix misrecognized words
+
+The next step is to edit these subtitles. VTT files are plain text, so
+you can edit them with regular `text-mode` if you want to. If you're
+editing subtitles within Emacs,
+[subed](https://github.com/sachac/subed) can conveniently synchronize
+video playback with subtitle editing, which makes it easier to figure
+out technical words. subed tries to load the video based on the
+filename, but if it can't find it, you can use `C-c C-v`
+(`subed-mpv-find-media`) to play a file or `C-c C-u` to play a URL.
+
+Look for misrecognized words and edit them. We also like to change
+things to follow Emacs keybinding conventions. We sometimes spell out
+acronyms on first use or add extra information in brackets. The
+captions will be used in a transcript as well, so you can add
+punctuation, remove filler words, and try to make it read better.
+
+Sometimes you may want to tweak how the captions are split. You can
+use `M-j` (`subed-jump-to-current-subtitle`) to jump to the caption if
+you're not already on it, listen for the right spot, and maybe use
+`M-SPC` to toggle playback. Use `M-.` (`subed-split-subtitle`) to
+split a caption at the current MPV playing position and `M-m`
+(`subed-merge-with-next`) to merge a subtitle with the next one. Times
+don't need to be very precise. If you don't understand a word or
+phrase, add two question marks in brackets (`[??]`) and move on. We'll ask the
+speakers to review the subtitles and can sort that out then.
+
+If there are multiple speakers, indicate switches between speakers
+with a `[speaker-name]:` tag.
+
+<video src="https://media.emacsconf.org/editing.webm" controls=""></video>
+
+Once you've gotten the hang of things, it might take between 1x and 4x
+the video time to edit captions.
+
+# Playing your subtitles together with the video
+
+To load a specific subtitle file in MPV, use the `--sub-file=` or
+`--sub-files=` command-line argument.
+
+If you're using subed, the video should autoplay if it's named the
+same as your subtitle file. If not, you can use `C-c C-v`
+(`subed-mpv-play-from-file`) to load the video file. You can toggle
+looping over the current subtitle with `C-c C-l`
+(`subed-toggle-loop-over-current-subtitle`), synchronizing player to
+point with `C-c ,` (`subed-toggle-sync-player-to-point`), and
+synchronizing point to player with `C-c .`
+(`subed-toggle-sync-point-to-player`).
+
+# Using word-level timing data
+
+If there is a `.srv2` file with word-level timing data, you can load
+it with `subed-word-data-load-from-file` from `subed-word-data.el` in
+the subed package. You can then split with the usual `M-.`
+(`subed-split-subtitle`), and it should use word-level timestamps when
+available.
+
+# Starting from a script
+
+Some talks don't have autogenerated captions, or you may prefer to
+start from scratch. Whenever the speaker has provided a script, you
+can use that as a starting point. One way is to start by making a VTT
+file with one subtitle spanning the whole video, like this:
+
+```text
+WEBVTT
+
+00:00:00.000 --> 00:39:07.000
+If the speaker provided a script, I usually put the script under this heading.
+```
+
+If you're using subed, you can move point to a good stopping
+point for a phrase, toggle playing with `M-SPC`, and then `M-.`
+(`subed-split-subtitle`) when the player reaches that point. If it's
+too fast, use `M-j` to repeat the current subtitle.
+
+# Starting from scratch
+
+You can send us a text file with just the text transcript in it and
+not worry about the timestamps. We can figure out the timing using
+[aeneas for forced alignment](https://www.readbeyond.it/aeneas/).
+
+If you want to try timing as you go, you might find it easier to start
+by making a VTT file with one subtitle spanning the whole video, like
+this:
+
+```text
+WEBVTT
+
+00:00:00.000 --> 00:39:07.000
+```
+
+Then start playback and type, using `M-.` (`subed-split-subtitle`) to
+split after a reasonable length for a subtitle. If it's too fast, use
+`M-j` to repeat the current subtitle.
+
+# Chapter markers
+
+In addition to the captions, you may also want to add chapter markers.
+An easy way to do that is to add a `NOTE Chapter heading` before the
+subtitle that starts the chapter. For example:
+
+```text
+...
+00:05:13.880 --> 00:05:20.119
+So yeah, like that's currently the problem.
+
+NOTE Embeddings
+
+00:05:20.120 --> 00:05:23.399
+So I want to talk about embeddings.
+...
+```
+
+We can then extract those with
+`emacsconf-subed-make-chapter-file-based-on-comments`.
+
+For an example of how chapter markers allow people to quickly navigate
+videos, see <https://emacsconf.org/2021/talks/bindat/>.
+
+Please let us know if you need any help!
+
+Sacha <sacha@sachachua.com>
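
To make the chapter-marker convention above concrete, here is a minimal Python sketch of how `NOTE` comments could be paired with the start time of the cue that follows them. It is only an illustration and is not how `emacsconf-subed-make-chapter-file-based-on-comments` is implemented; the function name `extract_chapters` and the regular expression are assumptions made for this example, and it only handles single-line `NOTE Title` comments.

```python
import re

# Matches a WebVTT cue timing line such as "00:05:20.120 --> 00:05:23.399"
# and captures the start time. The hours field is optional.
CUE_TIMING = re.compile(
    r"^((?:\d+:)?\d{2}:\d{2}\.\d{3})\s+-->\s+(?:\d+:)?\d{2}:\d{2}\.\d{3}"
)


def extract_chapters(vtt_text):
    """Return (start_time, title) pairs, one per NOTE comment followed by a cue.

    Hypothetical helper for illustration: a "NOTE Title" line names the
    chapter that starts at the next cue's start time.
    """
    chapters = []
    pending_title = None
    for line in vtt_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("NOTE "):
            # Remember the heading until we see the next cue timing line.
            pending_title = stripped[len("NOTE "):].strip()
            continue
        match = CUE_TIMING.match(stripped)
        if match and pending_title:
            chapters.append((match.group(1), pending_title))
            pending_title = None
    return chapters


SAMPLE = """WEBVTT

00:05:13.880 --> 00:05:20.119
So yeah, like that's currently the problem.

NOTE Embeddings

00:05:20.120 --> 00:05:23.399
So I want to talk about embeddings.
"""

print(extract_chapters(SAMPLE))
```

With the sample from the chapter markers section, this pairs the `Embeddings` heading with the `00:05:20.120` start time of the next subtitle.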