
[[!meta title="Harvesting Q&A"]]
[[!meta copyright="Copyright &copy; 2021, 2022 Sacha Chua"]]

During the harvesting phase of the conference, we work on collecting
the ideas that people shared in the Q&A sessions as well as any talks
that were not available as pre-recorded videos. It's a great way to
help speakers get stuff out of their heads and into a form we can all
learn from. Here's a process for doing so.

# Skim the transcript and the videos to see if anything needs to be removed, and which video to use

BigBlueButton gives us the webcams and audio as one video
(`--bbb-webcams.webm`) and the screenshare (if any) as another video
(`--bbb-deskshare.webm`). If the speaker shared their screen, we can
focus on that instead of their webcam. The following ffmpeg command
combines the audio from the webcams (which has been previously
extracted into a separate file, `--bbb-webcams.opus`) with the video
from the screenshare.

    ffmpeg -i example--bbb-webcams.opus -i example--bbb-deskshare.webm -c copy example--answers.webm

We also want to check if people accidentally shared sensitive
information on their screen, or if anyone said something that they
might not have said if they remembered that the Q&A videos will be
shared after the talk. Sometimes there's some time before we get
around to closing the meeting at the end of the Q&A. Usually, a quick
read of the transcript will show anything that needs to be trimmed.
Here's how to stop the recording at a specified time:

    ffmpeg -i input.webm -to hh:mm:ss -c copy output.webm

Cutting something out of the middle of a recording is slightly more
complicated. It might be easier to use a nonlinear video editor such
as Kdenlive. If you want to use ffmpeg, using filters to select the
frames and reencode the video will probably work better than splitting
the file into multiple parts and concatenating them without
reencoding, because stream-copied cuts can only start cleanly on
keyframes. Here's a sample command based on this
[StackOverflow](https://stackoverflow.com/questions/64866231/remove-a-section-from-the-middle-of-a-video-without-concat) answer that removes the section between 15 seconds and
45 seconds:

    ffmpeg -i input.webm \
      -vf "select='not(between(t,15,45))', setpts=N/FRAME_RATE/TB" \
      -af "aselect='not(between(t,15,45))', asetpts=N/SR/TB" \
      output.webm
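
If more than one section needs to come out, the same filter can cover
several ranges at once. Here's a minimal sketch, assuming each range is
given as `START-END` in seconds. `build_select_expr` and
`remove_ranges` are hypothetical helper names, not part of our tooling.
It relies on `between()` returning 1 or 0, so adding the terms together
acts as a logical OR before `not()` inverts the result.

```shell
# Build an ffmpeg select expression that drops every START-END range
# (in seconds) passed as an argument, e.g. 15-45 60-70.
# Assumes at least one range is given.
build_select_expr() {
  terms=""
  for range in "$@"; do
    term="between(t,${range%-*},${range#*-})"
    # Join terms with +, which works as OR for 0/1 values.
    terms="${terms:+$terms+}$term"
  done
  printf 'not(%s)' "$terms"
}

# Hypothetical wrapper: remove_ranges INPUT OUTPUT RANGE...
# Reencodes the video, like the single-range command above.
remove_ranges() {
  input=$1
  output=$2
  shift 2
  sel=$(build_select_expr "$@")
  ffmpeg -i "$input" \
    -vf "select='$sel',setpts=N/FRAME_RATE/TB" \
    -af "aselect='$sel',asetpts=N/SR/TB" \
    "$output"
}
```

For example, `remove_ranges input.webm output.webm 15-45 60-70` would
drop both 0:15-0:45 and 1:00-1:10 in one pass.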

Alternatively, you can let us know which parts need to be trimmed, and
we can take care of it.


# Add chapter markers

Chapter markers make it easier for people to jump to the part of the
Q&A that they're interested in. You can see an example of chapter
headings in the
[Q&A for asmblox](https://emacsconf.org/2022/talks/asmblox/). You can
make a text file with one chapter per line: an hh:mm:ss or mm:ss
timestamp followed by the chapter heading.

    00:00 Introduction
    01:12 Why did you choose an internal state versus many 'state buffers'?
    02:10 Do you have plans to port shenzhen.io to Emacs?
    02:29 Did this use WASM?
    02:59 Why wasm rather than a more traditional Assembly dialect? It wouldn't be harder to implement, right?
    05:08 Any next projects on your mind?
    05:52 Does this work with any other paren-based editing packages?
    06:46 What kind of tool could use this idea?
    07:56 How did you go about designing the puzzles?
    08:39 What are your favorite changes in the upcoming Emacs 29?
    09:07 Are there tools to add more puzzles?
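
A quick sanity check can catch mistyped timestamps before you send the
file in. This is only a sketch, assuming a chapter file laid out like
the one above (timestamp, space, heading). `timestamp_to_seconds` and
`check_chapter_order` are hypothetical helper names.

```shell
# Convert an mm:ss or hh:mm:ss timestamp to seconds.
timestamp_to_seconds() {
  echo "$1" | awk -F: '{ s = 0; for (i = 1; i <= NF; i++) s = s * 60 + $i; print s }'
}

# Print a warning for each chapter line whose timestamp is earlier
# than the one before it. $1 is the chapter file to check.
check_chapter_order() {
  awk '
    NF == 0 { next }                       # skip blank lines
    {
      n = split($1, parts, ":")
      s = 0
      for (i = 1; i <= n; i++) s = s * 60 + parts[i]
      if (seen && s < prev) print "out of order: " $0
      prev = s
      seen = 1
    }
  ' "$1"
}
```

Running `check_chapter_order chapters.txt` prints nothing when every
timestamp is at or after the previous one.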

If you're not sure how something is spelled, you can look at the list
of questions asked during the Q&A sessions by going to the wiki page
for the talk (see the links from [[/2022/talks]]), or you can mark it
with `??`.

Alternatively, you can edit the VTT file (`--bbb-webcams.vtt`) and add
NOTE comments with the chapter headings before the subtitles that are
part of that chapter. If you're using [subed](https://github.com/sachac/subed) to edit subtitles within
Emacs, you can split the subtitle as needed with `M-.`
(`subed-split-subtitle`) so that the subtitle starts with the
question. You don't have to worry about getting the timestamps exact,
as we can re-align them with `M-x subed-align`. Here's what that NOTE
comment can look like:

    NOTE Why did you choose an internal state versus many 'state buffers'?
    
    00:01:12.600 --> 00:01:16.039
    Okay. So, the first question is why did you choose an internal state

These can then be extracted with
`emacsconf-subed-make-chapter-file-based-on-comments` from `emacsconf-subed.el`
and included in our publishing workflow.
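
If you'd like to preview the chapters outside Emacs, the idea can be
approximated with awk. This is a rough sketch, not the Emacs Lisp
implementation: it assumes each chapter heading is a single-line
`NOTE` comment and that every `NOTE` comment in the file is a chapter
heading. `extract_chapters` is a hypothetical name.

```shell
# Print "hh:mm:ss heading" for each single-line NOTE comment, using
# the start time of the first cue that follows it. $1 is the VTT file.
extract_chapters() {
  awk '
    /^NOTE / { heading = substr($0, 6); pending = 1; next }
    pending && / --> / {
      start = $1
      sub(/\.[0-9]+$/, "", start)   # drop the milliseconds
      print start " " heading
      pending = 0
    }
  ' "$1"
}
```

For the NOTE example above, `extract_chapters example--bbb-webcams.vtt`
would print the heading prefixed with `00:01:12`.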

# Edit the transcript

If you want to make it even easier for people to learn from the Q&A,
you can edit the transcript so that it can also be published on the
wiki page. See [[Captioning tips|captioning]].

EmacsConf 2022 status update: asmblox, async, buttons, dbus, detached,
eshell all have large-model Whisper transcripts. The rest have
small-model Whisper transcripts that might need lots of extra editing,
so you can either use them just for chapter markers, wait for the
better transcripts (ETA Dec 15 or so), or work with the ones made with
a small model.