diff --git a/captioning.md b/captioning.md
new file mode 100644
index 00000000..895e732b
--- /dev/null
+++ b/captioning.md
@@ -0,0 +1,201 @@
+[[!meta title="Captioning tips"]]
+[[!meta copyright="Copyright © 2021, 2022 Sacha Chua"]]
+
+Captions are great for making videos (especially technical ones!)
+easier to understand and search.
+
+If you see a talk that you'd like to caption, feel free to download it
+and start working on it with your favourite subtitle editor. Let me
+know what you pick by e-mailing me at <sacha@sachachua.com> so that I
+can update the index and try to avoid duplication of work. [Find talks that need captions here](https://emacsconf.org/help_with_main_captions). You can also help by [adding chapter markers to Q&A sessions](https://emacsconf.org/help_with_chapter_markers).
+
+You're welcome to work with captions using your favourite tool. We've
+been using <https://github.com/sachac/subed> to caption things as VTT
+or SRT in Emacs, often starting with autogenerated captions from
+OpenAI Whisper (the .vtt).
+
+We'll be posting VTT files so that they can be included by the HTML5
+video player (demo: <https://emacsconf.org/2021/talks/news/>), so if
+you use a different tool that produces another format, any format that
+can be converted into that one (like SRT or ASS) is fine. `subed` has
+a `subed-convert` command that might be useful for turning WebVTT
+files into tab-separated values (TSV) and back again, if you prefer a
+more concise format.
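+
+If you'd rather script a conversion outside Emacs, here is a minimal
+Python sketch of the SRT-to-VTT direction. It is not what
+`subed-convert` does; `srt_to_vtt` is a hypothetical helper, and it
+assumes simple SRT input with no styling tags.
+
+```python
+import re
+
+# Matches SRT timing lines such as "00:00:01,000 --> 00:00:04,500".
+SRT_TIMING = re.compile(
+    r"^(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})",
+    re.MULTILINE,
+)
+
+
+def srt_to_vtt(srt_text: str) -> str:
+    """Return WebVTT text for simple SRT input (hypothetical helper)."""
+    body = srt_text.replace("\r\n", "\n").strip()
+    # WebVTT uses "." rather than "," before the milliseconds.
+    body = SRT_TIMING.sub(r"\1.\2 --> \3.\4", body)
+    # The SRT cue numbers are left in place; WebVTT reads them as cue identifiers.
+    return "WEBVTT\n\n" + body + "\n"
+```
+
+Only the timing lines are rewritten, so commas in the caption text are
+left alone.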
+
+You can e-mail me the subtitles when you're done, and then I can merge
+them into the video.
+
+You might find it easier to start with the autogenerated captions
+and then refer to any resources provided by the speaker in order to
+figure out spelling. Sometimes speakers provide pretty complete
+scripts, which is great, but they also tend to add extra words.
+
+# Reflowing the text
+
+First, let's start with reflowing. We like to have one line of
+captions about 60 characters long so that they'll display nicely in
+the stream. If the captions haven't been reflowed yet, you can reflow
+the captions at natural pausing points (ex: phrases) so that they're
+displayed nicely. You don't have to worry too much about getting the
+timestamps exactly right.
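+
+To spot lines that still need reflowing, a small check like the
+following Python sketch might help. `long_caption_lines` is a
+hypothetical helper, not part of subed or emacsconf-el, and it assumes
+a plain VTT file where anything that isn't the header, a timing line,
+or a `NOTE` comment is caption text.
+
+```python
+def long_caption_lines(vtt_text: str, limit: int = 60) -> list[tuple[int, str]]:
+    """Return (line number, text) pairs for caption lines longer than limit."""
+    found = []
+    for number, line in enumerate(vtt_text.splitlines(), start=1):
+        text = line.strip()
+        # Skip the header, timing lines, and NOTE comments.
+        if text == "WEBVTT" or "-->" in text or text.startswith("NOTE"):
+            continue
+        if len(text) > limit:
+            found.append((number, text))
+    return found
+```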
+
+For example, instead of:
+
+- so i'm going to talk today about a
+- fun rewrite i did of uh of the bindat
+- package
+
+you can edit it to be more like:
+
+- So I'm going to talk today
+- about a fun rewrite I did
+- of the bindat package.
+
+You probably don't need to do this step if you're working with the VTT
+files in the backstage area, since we try to reflow things before
+people edit them, but we thought we'd demonstrate it in case people
+are curious.
+
+We start with the text file that OpenAI Whisper generates. We set
+`fill-column` to 50 and use `display-fill-column-indicator-mode` to
+give ourselves a goal column. A little over is fine too. Then we use
+`emacsconf-reflow` from the
+[emacsconf-el](https://git.emacsconf.org/emacsconf-el/) repository to quickly
+split up the text into captions by looking for where we want to add
+newlines and then typing the word or words. We type `'` to join lines.
+Sometimes, if it splits at the wrong place, we just undo it and edit it
+normally.
+
+It took about 4 minutes to reflow John Wiegley's 5-minute presentation.
+
+<video src="https://media.emacsconf.org/reflowing.webm" controls=""></video>
+
+The next step is to align it with
+[aeneas](https://github.com/readbeyond/aeneas) to get the timestamps
+for each line of text. `subed-align` from the subed package helps with that.
+
+<video src="https://media.emacsconf.org/alignment.webm" controls=""></video>
+
+# Edit the VTT to fix misrecognized words
+
+The next step is to edit these subtitles. VTT files are plain text, so
+you can edit them with regular `text-mode` if you want to. If you're
+editing subtitles within Emacs,
+[subed](https://github.com/sachac/subed) can conveniently synchronize
+video playback with subtitle editing, which makes it easier to figure
+out technical words. subed tries to load the video based on the
+filename, but if it can't find it, you can use `C-c C-v`
+(`subed-mpv-play-from-file`) to play a file or `C-c C-u` to play a URL.
+
+Look for misrecognized words and edit them. We also like to change
+things to follow Emacs keybinding conventions. We sometimes spell out
+acronyms on first use or add extra information in brackets. The
+captions will be used in a transcript as well, so you can add
+punctuation, remove filler words, and try to make it read better.
+
+Sometimes you may want to tweak how the captions are split. You can
+use `M-j` (`subed-jump-to-current-subtitle`) to jump to the caption if
+you're not already on it, listen for the right spot, and maybe use
+`M-SPC` to toggle playback. Use `M-.` (`subed-split-subtitle`) to
+split a caption at the current MPV playing position and `M-m`
+(`subed-merge-with-next`) to merge a subtitle with the next one. Times
+don't need to be very precise. If you don't understand a word or
+phrase, add two question marks in brackets (`[??]`) and move on. We'll ask the
+speakers to review the subtitles and can sort that out then.
+
+If there are multiple speakers, indicate switches between speakers
+with a `[speaker-name]:` tag.
+
+<video src="https://media.emacsconf.org/editing.webm" controls=""></video>
+
+Once you've gotten the hang of things, it might take between 1x and 4x
+the video time to edit captions.
+
+# Playing your subtitles together with the video
+
+To load a specific subtitle file in MPV, use the `--sub-file=` or
+`--sub-files=` command-line argument.
+
+If you're using subed, the video should autoplay if it's named the
+same as your subtitle file. If not, you can use `C-c C-v`
+(`subed-mpv-play-from-file`) to load the video file. You can toggle
+looping over the current subtitle with `C-c C-l`
+(`subed-toggle-loop-over-current-subtitle`), synchronizing player to
+point with `C-c ,` (`subed-toggle-sync-player-to-point`), and
+synchronizing point to player with `C-c .`
+(`subed-toggle-sync-point-to-player`).
+
+# Using word-level timing data
+
+If there is a `.srv2` file with word-level timing data, you can load
+it with `subed-word-data-load-from-file` from `subed-word-data.el` in
+the subed package. You can then split with the usual `M-.`
+(`subed-split-subtitle`), and it should use word-level timestamps when
+available.
+
+# Starting from a script
+
+Some talks don't have autogenerated captions, or you may prefer to
+start from scratch. Whenever the speaker has provided a script, you
+can use that as a starting point. One way is to start by making a VTT
+file with one subtitle spanning the whole video, like this:
+
+```text
+WEBVTT
+
+00:00:00.000 --> 00:39:07.000
+If the speaker provided a script, I usually put the script under this heading.
+```
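+
+If you'd like to generate that starting file rather than type it, here
+is a minimal Python sketch. `format_timestamp` and `single_cue_vtt` are
+hypothetical helper names, and the sketch assumes you already know the
+video's duration in seconds.
+
+```python
+def format_timestamp(seconds: float) -> str:
+    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
+    total_ms = round(seconds * 1000)
+    hours, rest = divmod(total_ms, 3_600_000)
+    minutes, rest = divmod(rest, 60_000)
+    secs, ms = divmod(rest, 1000)
+    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"
+
+
+def single_cue_vtt(duration_seconds: float, script: str = "") -> str:
+    """Return a WebVTT file with one cue covering the whole video."""
+    timing = f"{format_timestamp(0)} --> {format_timestamp(duration_seconds)}"
+    # A blank line would end the cue early, so drop blank lines from the script.
+    lines = [line.strip() for line in script.splitlines() if line.strip()]
+    return "WEBVTT\n\n" + timing + "\n" + "\n".join(lines) + "\n"
+```
+
+For a video that is 39 minutes and 7 seconds long, you would call
+`single_cue_vtt(39 * 60 + 7, script_text)` and save the result as a
+`.vtt` file next to the video.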
+
+If you're using subed, you can move point to a good stopping
+point for a phrase, toggle playing with `M-SPC`, and then `M-.`
+(`subed-split-subtitle`) when the player reaches that point. If it's
+too fast, use `M-j` to repeat the current subtitle.
+
+# Starting from scratch
+
+You can send us a text file with just the text transcript in it and
+not worry about the timestamps. We can figure out the timing using
+[aeneas for forced alignment](https://www.readbeyond.it/aeneas/).
+
+If you want to try timing as you go, you might find it easier to start
+by making a VTT file with one subtitle spanning the whole video, like
+this:
+
+```text
+WEBVTT
+
+00:00:00.000 --> 00:39:07.000
+```
+
+Then start playback and type, using `M-.` (`subed-split-subtitle`) to
+split after a reasonable length for a subtitle. If it's too fast, use
+`M-j` to repeat the current subtitle.
+
+# Chapter markers
+
+In addition to the captions, you may also want to add chapter markers.
+An easy way to do that is to add a `NOTE Chapter heading` before the
+subtitle that starts the chapter. For example:
+
+```text
+...
+00:05:13.880 --> 00:05:20.119
+So yeah, like that's currently the problem.
+
+NOTE Embeddings
+
+00:05:20.120 --> 00:05:23.399
+So I want to talk about embeddings.
+...
+```
+
+We can then extract those with
+`emacsconf-subed-make-chapter-file-based-on-comments`.
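+
+To illustrate the idea outside Emacs, here is a rough Python sketch
+that pairs each `NOTE` heading with the start time of the next
+subtitle. It is not the implementation of the Emacs Lisp function
+above; `chapters_from_notes` is a hypothetical helper, and it assumes
+single-line `NOTE` comments and `HH:MM:SS.mmm` timestamps.
+
+```python
+import re
+
+# Captures the start time of a cue timing line.
+TIMING = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) --> ")
+
+
+def chapters_from_notes(vtt_text: str) -> list[tuple[str, str]]:
+    """Return (start time, heading) pairs, one per NOTE comment."""
+    chapters = []
+    pending = None  # heading waiting for the next cue's start time
+    for line in vtt_text.splitlines():
+        line = line.strip()
+        if line.startswith("NOTE "):
+            pending = line[len("NOTE "):].strip()
+            continue
+        match = TIMING.match(line)
+        if match and pending is not None:
+            chapters.append((match.group(1), pending))
+            pending = None
+    return chapters
+```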
+
+For an example of how chapter markers allow people to quickly navigate
+videos, see <https://emacsconf.org/2021/talks/bindat/>.
+
+Please let us know if you need any help!
+
+Sacha <sacha@sachachua.com>