[[!meta title="Enhancing productivity with voice computing"]]
[[!meta copyright="Copyright © 2023 Blaine Mooers"]]
[[!inline pages="internal(2023/info/voice-nav)" raw="yes"]]

<!-- Initially generated with emacsconf-publish-talk-page and then left alone for manual editing -->
<!-- You can manually edit this file to update the abstract, add links, etc. --->

# Enhancing productivity with voice computing
Blaine Mooers (he/him/his) - Pronunciation: pronounced like "moors", blaine-mooers(at)ouhsc.edu, <https://basicsciences.ouhsc.edu/bmb/Faculty/bio_details/mooers-blaine-hm-phd>, <https://twitter.com/BlaineMooers>, <https://github.com/MooersLab>, <https://codeberg.org/MooersLab>, mastodon(at)bhmooers

[[!inline pages="internal(2023/info/voice-before)" raw="yes"]]

Voice computing uses speech recognition software to convert speech into text, commands, or code.
While there is a venerable program called Emacspeak for converting text into speech, an
"EmacsListens" for converting speech into text is not available yet.
The Emacs Wiki describes the underdeveloped situation for speech-to-text in Emacs.
I will explain how two external software packages convert my speech into text and computer
commands that can be used with Emacs.

First, I present some motivations for using voice computing.
These can be divided into two categories: productivity improvement and health-related issues.
In the second category, there is the under-appreciated cure for "standing desk envy";
the cure is achievable with a large dose of voice computing while standing.

I found one software package, Voice In Plus (<https://dictanote.co/voicein/plus/>), to be quite
accurate for speech-to-text (dictation) but less versatile for speech-to-commands.
I have used this package daily, and my daily word count increased threefold almost
immediately.
Of course, there are limits here; you can talk for only so many hours per day.

Second, I found another software package, Talon Voice (<http://talon.wiki/>), that has a less
accurate language model but supports custom commands that can be executed anywhere you can
place the cursor, including in virtual machines and on remote servers.
Talon Voice will appeal to those who like to tinker with configuration files, yet it is easy to
use.

I will explain how I have integrated these two packages into my workflow.
I have developed a library of commands that expand 94 English contractions when spoken.
This library eliminates tedious downstream editing of formal prose where I do not use
contractions.
The library is available on GitHub for both Voice In Plus
(<https://github.com/mooersLab/voice-in-plus-contractions>) and Talon Voice
(<https://github.com/MooersLab/talon-contractions>).
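
To illustrate the idea behind contraction expansion, here is a minimal Python sketch. It is not the code from either repository above; the `CONTRACTIONS` table is a hypothetical five-entry sample (the actual libraries cover 94 contractions), and `expand_contractions` is an illustrative name. It post-processes text that has already been transcribed, whereas the libraries do the replacement at dictation time.

```python
import re

# Hypothetical sample table; keys are lowercase so lookups can ignore case.
CONTRACTIONS = {
    "don't": "do not",
    "can't": "cannot",
    "isn't": "is not",
    "won't": "will not",
    "i'm": "I am",
}

# One alternation that matches any listed contraction as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(c) for c in CONTRACTIONS) + r")\b",
    flags=re.IGNORECASE,
)


def expand_contractions(text: str) -> str:
    """Replace each known contraction in text with its expanded form."""

    def replace(match: re.Match) -> str:
        word = match.group(0)
        expansion = CONTRACTIONS[word.lower()]
        # Keep a leading capital, e.g. "Don't" becomes "Do not".
        if word[0].isupper():
            expansion = expansion[0].upper() + expansion[1:]
        return expansion

    return _PATTERN.sub(replace, text)


print(expand_contractions("Don't worry, I'm sure it isn't hard."))
```

Some contractions are ambiguous (for example, "it's" can mean "it is" or "it has"), so a lookup table like this one works best for unambiguous forms.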

I also supply interactive quizzes for mastering the basic Voice In commands
(<https://github.com/MooersLab/voice-in-basics-quiz>) and the Talon Voice phonetic alphabet
(<https://github.com/MooersLab/talon-voice-quizzes/qTalonAlphabet.py>).
I learned the Talon alphabet in one day by taking the quiz at spaced intervals.
The quiz took only 60 seconds to complete when I was proficient.
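
A quiz of this kind can be sketched in a few lines of Python. The sketch below is not the code in the repositories above. The `ALPHABET` table is an assumed five-letter subset of the Talon community alphabet, so check it against your own Talon configuration; `grade` and `run_quiz` are hypothetical names.

```python
import random

# Assumed subset of the Talon community phonetic alphabet (verify locally).
ALPHABET = {"a": "air", "b": "bat", "c": "cap", "d": "drum", "e": "each"}


def grade(letter: str, answer: str) -> bool:
    """Return True if answer is the code word for letter."""
    return ALPHABET[letter] == answer.strip().lower()


def run_quiz(ask=input, rng=random):
    """Ask for every letter in random order; return (correct, total)."""
    letters = list(ALPHABET)
    rng.shuffle(letters)
    correct = sum(grade(letter, ask(f"Word for '{letter}'? ")) for letter in letters)
    return correct, len(letters)


if __name__ == "__main__":
    score, total = run_quiz()
    print(f"{score}/{total} correct")
```

Passing `ask` as a parameter keeps the scoring logic testable without a terminal. Repeating the quiz at spaced intervals is left to the user.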

About the speaker:

I am an Associate Professor of Biochemistry at the University of
Oklahoma Health Sciences Center. I use X-ray crystallography to study
the structures of RNA, proteins, and protein-drug complexes. I have
been using Python and LaTeX for a dozen years, and Jupyter Notebooks
since 2013. I have been using Emacs every day for 2.5 years. I
discovered voice computing this summer when my chronic repetitive
stress injury flared up while entering data in a spreadsheet. I
tripled my daily word count by using speech-to-text, and I get a
kick out of running remote computers by speech-to-command.

[[!inline pages="internal(2023/info/voice-after)" raw="yes"]]

[[!inline pages="internal(2023/info/voice-nav)" raw="yes"]]