[[!meta title="Enhancing productivity with voice computing"]]
[[!meta copyright="Copyright &copy; 2023 Blaine Mooers"]]
[[!inline pages="internal(2023/info/voice-nav)" raw="yes"]]

<!-- Initially generated with emacsconf-publish-talk-page and then left alone for manual editing -->
<!-- You can manually edit this file to update the abstract, add links, etc. --->

# Enhancing productivity with voice computing
Blaine Mooers (he/him/his) - Pronunciation: "moors", blaine-mooers(at)ouhsc.edu, <https://basicsciences.ouhsc.edu/bmb/Faculty/bio_details/mooers-blaine-hm-phd>, <https://twitter.com/BlaineMooers>, <https://github.com/MooersLab>, <https://codeberg.org/MooersLab>, mastodon(at)bhmooers

[[!inline pages="internal(2023/info/voice-before)" raw="yes"]]

Voice computing uses speech recognition software to convert speech into text, commands, or code.
While there is a venerated program called Emacspeak for converting text into speech, an
"EmacsListens" for converting speech into text is not available yet.
The Emacs Wiki describes the underdeveloped situation for speech-to-text in Emacs.
I will explain how two external software packages convert my speech into text and computer
commands that can be used with Emacs.

First, I present some motivations for using voice computing.
These can be divided into two categories: productivity improvement and health-related issues.
In the second category, there is an underappreciated cure for "standing desk envy":
a large dose of voice computing while standing.

I found one software package (Voice In Plus, <https://dictanote.co/voicein/plus/>) to be quite
accurate for speech-to-text, or dictation, but less versatile for speech-to-commands.
I have used this package daily, and my daily word count increased threefold almost
immediately.
Of course, there are limits here; you can talk for only so many hours per day.

I found a second software package (Talon Voice, <http://talon.wiki/>) that has a less accurate
language model but supports custom commands that can be executed anywhere you can
place the cursor, including in virtual machines and on remote servers.
Talon Voice will appeal to those who like to tinker with configuration files, yet it is easy to
use.

I will explain how I have integrated these two packages into my workflow.
I have developed a library of commands that expand 94 English contractions when spoken.
This library eliminates tedious downstream editing of formal prose where I do not use
contractions.
The library is available on GitHub for both Voice In Plus
(<https://github.com/mooersLab/voice-in-plus-contractions>) and Talon Voice
(<https://github.com/MooersLab/talon-contractions>).
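
The idea behind contraction expansion can be sketched in a few lines of Python. This is only an illustration of the find-and-replace principle, not the contents of either repository: the published libraries are written for each tool's own command system, and the five-entry table and the function name `expand_contractions` below are hypothetical.

```python
import re

# Hypothetical five-entry sample; the libraries described above cover 94 contractions.
CONTRACTIONS = {
    "don't": "do not",
    "can't": "cannot",
    "won't": "will not",
    "it's": "it is",  # assumption: "it's" could also mean "it has"
    "I'm": "I am",
}

# Case-insensitive lookup table keyed by the lowercased contraction.
_LOOKUP = {key.lower(): value for key, value in CONTRACTIONS.items()}

# Longest keys first so that a longer contraction is never shadowed by a shorter one.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(key) for key in sorted(_LOOKUP, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def expand_contractions(text: str) -> str:
    """Replace each known contraction with its expanded form."""

    def _replace(match: re.Match) -> str:
        word = match.group(0)
        expansion = _LOOKUP[word.lower()]
        # Keep a leading capital, for example at the start of a sentence.
        if word[0].isupper():
            expansion = expansion[0].upper() + expansion[1:]
        return expansion

    return _PATTERN.sub(_replace, text)
```

Calling `expand_contractions("Don't worry, it's fine.")` applies the table to a dictated sentence. The sketch handles only the plain ASCII apostrophe; text with typographic apostrophes would need extra entries.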

I also supply interactive quizzes for mastering the basic Voice In commands
(<https://github.com/MooersLab/voice-in-basics-quiz>) and the Talon Voice phonetic alphabet
(<https://github.com/MooersLab/talon-voice-quizzes/qTalonAlphabet.py>).
I learned the Talon alphabet in one day by taking the quiz at spaced intervals.
The quiz took only 60 seconds to complete when I was proficient.
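
A timed alphabet quiz of this kind can be sketched as follows. This is not the code in `qTalonAlphabet.py`; it is a minimal sketch that assumes the default word list of the community Talon command set (check it against your own configuration), and the function names are hypothetical.

```python
import random
import time

# Assumption: the default community alphabet; your Talon setup may differ.
TALON_ALPHABET = dict(
    zip(
        "abcdefghijklmnopqrstuvwxyz",
        (
            "air bat cap drum each fine gust harp sit jury crunch look made "
            "near odd pit quench red sun trap urge vest whale plex yank zip"
        ).split(),
    )
)


def make_questions(seed=None):
    """Return the 26 letters in a shuffled order."""
    letters = list(TALON_ALPHABET)
    random.Random(seed).shuffle(letters)
    return letters


def grade(letters, answers):
    """Return the number of correct answers and the list of missed letters."""
    missed = [
        letter
        for letter, answer in zip(letters, answers)
        if answer.strip().lower() != TALON_ALPHABET[letter]
    ]
    return len(letters) - len(missed), missed


def run_quiz(ask=input, seed=None):
    """Ask for the word for each letter and report the score and elapsed time."""
    letters = make_questions(seed)
    start = time.monotonic()
    answers = [ask(f"{letter}: ") for letter in letters]
    elapsed = time.monotonic() - start
    score, missed = grade(letters, answers)
    print(f"{score}/{len(letters)} correct in {elapsed:.0f} s; review: {missed}")
    return score, missed, elapsed
```

Running `run_quiz()` in a terminal at spaced intervals during the day, and reviewing the missed letters after each round, mirrors the practice routine described above.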

I store my daily writing in a multi-file LaTeX document with one tex file per day.
The 365 daily files are compiled into one PDF per year, which usually runs to about 1000 pages.
I am not going to push my luck with a multiyear document.
Each month is a chapter. The resulting PDF is a breeze to scroll and search.
It has an autogenerated table of contents and an index. I have posted
a blank version for 2023 and another for the upcoming year
(<https://github.com/MooersLab/diary2024inLaTeX>).
One could take a similar approach in org-mode by using Bastian Bechtold's 
org-journal package (<https://github.com/bastibe/org-journal>).
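
The layout of such a diary (one chapter per month, one file per day, a table of contents, and an index) can be generated with a short script. The sketch below is not taken from the diary repository; the `book` class, the `makeidx` package, and the `YYYY-MM-DD.tex` file naming are assumptions chosen for illustration.

```python
import calendar
from datetime import date


def diary_main_tex(year: int) -> str:
    """Build the text of a main LaTeX file that pulls in one file per day."""
    lines = [
        r"\documentclass{book}",
        r"\usepackage{makeidx}",
        r"\makeindex",
        r"\begin{document}",
        r"\tableofcontents",
    ]
    for month in range(1, 13):
        # Each month becomes a chapter.
        lines.append(rf"\chapter{{{calendar.month_name[month]}}}")
        days_in_month = calendar.monthrange(year, month)[1]
        for day in range(1, days_in_month + 1):
            # Hypothetical naming scheme: 2023-01-01.tex, 2023-01-02.tex, ...
            lines.append(rf"\input{{{date(year, month, day).isoformat()}}}")
    lines += [r"\printindex", r"\end{document}"]
    return "\n".join(lines) + "\n"
```

Writing the result of `diary_main_tex(2024)` to a main file, alongside one (possibly empty) tex file per day, gives a blank diary that compiles into a single PDF.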

I gave a 60-minute talk on this topic at the Oklahoma Data Science Workshop
on November 16, 2023 (<https://mediasite.ouhsc.edu/Mediasite/Channel/python>).
This workshop meets once a month and is for people interested in data 
science and scientific computing. You do not have to be an Oklahoma
resident to attend. Send me an e-mail if you want to be added to our mailing list.

# About the speaker:

I am an Associate Professor of Biochemistry at the University of
Oklahoma Health Sciences Center. I use X-ray crystallography to study
the structures of RNA, proteins, and protein-drug complexes. I have
been using Python and LaTeX for a dozen years, and Jupyter Notebooks
since 2013. I have been using Emacs every day for 2.5 years. I
discovered voice computing this summer when my chronic repetitive
stress injury flared up while entering data in a spreadsheet. I
tripled my daily word count by using speech-to-text, and I get a
kick out of running remote computers by speech-to-command.

[[!inline pages="internal(2023/info/voice-after)" raw="yes"]]

[[!inline pages="internal(2023/info/voice-nav)" raw="yes"]]