1 files changed, 198 insertions, 51 deletions
diff --git a/2023/talks/voice.md b/2023/talks/voice.md
index c1bab1b3..2c5c537f 100644
--- a/2023/talks/voice.md
+++ b/2023/talks/voice.md
@@ -1,65 +1,212 @@
-[[!meta title="Improving access to AI-assisted literate programming with voice control"]]
+[[!meta title="Enhancing productivity with voice computing"]]
 [[!meta copyright="Copyright &copy; 2023 Blaine Mooers"]]
 [[!inline pages="internal(2023/info/voice-nav)" raw="yes"]]
 
 <!-- Initially generated with emacsconf-publish-talk-page and then left alone for manual editing -->
 <!-- You can manually edit this file to update the abstract, add links, etc. --->
 
-
-# Improving access to AI-assisted literate programming with voice control
+# Enhancing productivity with voice computing
 Blaine Mooers (he/him/his) - Pronunciation: pronounced like "moors", blaine-mooers(at)ouhsc.edu, <https://basicsciences.ouhsc.edu/bmb/Faculty/bio_details/mooers-blaine-hm-phd>, <https://twitter.com/BlaineMooers>, <https://github.com/MooersLab>, <https://codeberg.org/MooersLab>, mastodon(at)bhmooers
 
 [[!inline pages="internal(2023/info/voice-before)" raw="yes"]]
 
-The audience will learn how to use voice control to create literate
-programming documents in Emacs. After reviewing the benefits of
-literate programming, I will review the prior work done with the voice
-control in Emacs. I will present the reasons why you'd want to use
-voice control; they go beyond the obvious benefit of avoiding or
-working around repetitive stress injuries and include the benefits of
-using voice control while standing to break up long periods of
-sitting, which are detrimental to one's health. There are many options
-for voice control in and out of the Emacs. I will review a list of
-several and then drill in on two: one that is easy but of limited
-extensibility (Voice In Plus (<https://dictanote.co/voicein/plus/>) and
-one that is harder to learn but more extensible (Talon Voice
-(<https://talon.wiki/>)). The latter has a welcoming community of users
-and developers in the Talon Slack channel.
-
-The Voice In Plus is a plugin for the Google Chrome browser that
-allows you to dictate in the text areas on web pages. The dictated
-text can be sent as soon as it appears in the browser to Emacs via
-GhostText and the Atomic-Chrome package. You can insert custom code
-snippets by voice control in the text area using Voice In Plus's
-support for custom snippets. Or, you can insert yasnippet snippets by
-voice control in the corresponding buffer in Emacs. I will demonstrate
-how to set up this workflow and how to use it to create an org mode
-file. This workflow is very effective for the creation of lots of
-prose, but not code.
-
-The second approach uses the open-source software called Talon
-(<http://talon.wiki>), which is good for both prose and code. This
-package enables precise voice control in a wide variety of
-applications including Emacs. This package is also highly configurable
-using Python script and an accompanying Talonscript file, which has a
-simple YAML file format. The general users of Talon who know nothing
-about Python can easily configure their setup using Talonscript files.
-Advanced users can use Python to add modules to the Talon package to
-extend its functionality. I will demonstrate how to write an org mode
-file with executable code blocks with Talon running in Emacs. I will
-edit and run the code blocks by voice control with and without the
-help of generative AI in the form of Copilot.
-
-I also demonstrate an interactive quiz in Python and Elisp that I
-developed to the support the mastery of the voice control commands. By
-running the quiz with voice control, you can accelerate mastery of the
-commands. I learned the Talon alphabet in one day by taking the quiz
-at spaced intervals. The quiz only took 60 seconds to complete when I
-was proficient.
-
-I will conclude with a discussion of lessons learned and opportunities
-for using voice control in Emacs for AI-assisted literate programming.
+[[!template id="help"
+volunteer=""
+summary="Q&A could be indexed with chapter markers"
+tags="help_with_chapter_markers"
+message="""The Q&A session for this talk does not have chapter markers yet.
+Would you like to help? See [[help_with_chapter_markers]] for more details. You can use the vidid="voice-qanda" if adding the markers to this wiki page, or e-mail your chapter notes to <emacsconf-submit@gnu.org>."""]]
+
+Voice computing uses speech recognition software to convert speech into text, commands, or code.
+While there is a venerated program called Emacspeak for converting text into speech, an
+"EmacsListens" for converting speech into text is not available yet.
+The Emacs Wiki describes the underdeveloped situation for speech-to-text in Emacs.
+I will explain how two external software packages convert my speech into text and computer
+commands that can be used with Emacs.
+
+First, I present some motivations for using voice computing.
+These can be divided into two categories: productivity improvement and health-related issues.
+In this second category, there is the underappreciated cure for "standing desk envy";
+the cure is achievable with a large dose of voice computing while standing.
+
+I found one software package (Voice In Plus, <https://dictanote.co/voicein/plus/>) to be
+quite accurate for speech-to-text, or dictation, but less versatile for speech-to-commands.
+I have used this package daily, and I found a three-fold increase in my daily word count almost
+immediately.
+Of course, there are limits here; you can talk for only so many hours per day.
+
+Second, I found another software package that has a less accurate language model (Talon Voice,
+<http://talon.wiki/>) but that supports custom commands that can be executed anywhere you can
+place the cursor, including in virtual machines and on remote servers.
+Talon Voice will appeal to those who like to tinker with configuration files, yet it is easy to
+use.
+
+I will explain how I have integrated these two packages into my workflow.
+I have developed a library of commands that expand 94 English contractions when spoken.
+This library eliminates tedious downstream editing of formal prose where I do not use
+contractions.
+The library is available on GitHub for both Voice In Plus
+(<https://github.com/mooersLab/voice-in-plus-contractions>) and Talon Voice
+(<https://github.com/MooersLab/talon-contractions>).
+
+I also supply interactive quizzes for mastering the basic Voice In commands
+(<https://github.com/MooersLab/voice-in-basics-quiz>) and the Talon Voice phonetic alphabet
+(<https://github.com/MooersLab/talon-voice-quizzes/qTalonAlphabet.py>).
+I learned the Talon alphabet in one day by taking the quiz at spaced intervals.
+The quiz took only 60 seconds to complete when I was proficient.
+
+I store my daily writing in a multi-file LaTeX document with one tex file per day.
+365 files are compiled into one PDF per year. This is usually about 1000 pages.
+I am not going to push my luck with a multiyear document.
+Each month is a chapter. The resulting PDF is a breeze to scroll and search.
+It has an autogenerated table of contents and an index. I have posted 
+a blank version for 2023 and another for the upcoming year 
+(<https://github.com/MooersLab/diary2024inLaTeX>).
+One could take a similar approach in org-mode by using Bastian Bechtold's 
+org-journal package (<https://github.com/bastibe/org-journal>).
+
+I gave a 60-minute talk on this topic to the Oklahoma Data Science Workshop
+on November 16, 2023 (<https://mediasite.ouhsc.edu/Mediasite/Channel/python>).
+This workshop meets once a month and is for people interested in data
+science and scientific computing. You do not have to be an Oklahoma
+resident to attend. Send me an e-mail if you want to be added to our mailing list.
+
+# About the speaker:
+
+I am an Associate Professor of Biochemistry at the University of
+Oklahoma Health Sciences Center. I use X-ray crystallography to study
+the structures of RNA, proteins, and protein-drug complexes. I have
+been using Python and LaTeX for a dozen years, and Jupyter Notebooks
+since 2013. I have been using Emacs every day for 2.5 years. I
+discovered voice computing this summer when my chronic repetitive
+stress injury flared up while entering data in a spreadsheet. I
+tripled my daily word count by using speech-to-text, and I get a
+kick out of running remote computers by speech-to-command.
+# Discussion
+
+## Questions and answers
+
+-   Q: Comment: there is a text-to-command tool called clipea that
+    would be awesome: <https://github.com/dave1010/clipea>
+    -   A: <https://sourceforge.net/projects/sox/> also a good
+        alternative.
+-   Q: Could you comment on how speaking vs. typing affects your
+    logic/content.  Thanks!
+    -   A: I find that this is like the difference between writing your thoughts
+		down on a blank piece of printer paper versus paper bound in a
+		leather notebook. I do not think there is any real difference. I know
+		that some people believe there is a definite difference, but I am
+		using dictation only for the purpose of generating the first draft.
+		Because my skills with using my voice to edit my text are still not
+		very well developed, I am still more efficient using the keyboard
+		for the editing stage.
+
+		So the hardest part about
+		writing generally is getting the first crappy draft written. I
+		have found that dictation is perfectly fine for that phase. I
+		find it actually very conducive for just getting the text out. The
+		biggest problem that most of us have is applying our internal editor and
+		that inhibits us from generating words in a free-flowing
+		fashion. 
+
+		I generally do my generative writing--actually, I divide my writing
+		into two categories: generative writing (generating the first crappy
+		draft) and then rewriting. Rewriting is probably 80-90% of writing
+		where you can go back and rework the order of the sentences, order of
+		paragraphs, the order of words in a sentence and so forth. It is
+		really hard work that is best done later in the day when I am more
+		awake. I do my generative writing first thing in the morning when I
+		feel horrible. That is when my internal editor is not very awake and I
+		can get more words out, more words past that gatekeeper. I can do this
+		sitting down. I can do this standing up. I can do this 20 feet away
+		from my computer looking out the window to give my eyes a break. I find
+		it is just very enjoyable to use it in this fashion. The downside is
+		that I wind up generating three times as much text. That makes for
+		three times as much work when it comes to rewriting the text, and that
+		means I am using the keyboard a lot later on in the day.
+
+		I have not made any progress on recovering from my own repetitive
+		stress injury. I hope that I will add the use of voice commands,
+		speech-to-commands, for editing the text in the future and I will
+		eventually give my hands more of a break.
+
+		This allows you to actually separate those two activities not only by
+		time... So many professional writers will spend several hours in the
+		morning doing the generative part and then they will spend the rest of
+		the day rewriting. They have separated these two activities temporally.
+		What most people actually do is they do the generative part and
+		then they write one sentence, and they apply that internal editor
+		right away because they want to write the first draft as a perfect
+		version, as a final draft, and that is what slows them down
+		dramatically.
+
+		This also allows you to separate these two activities in terms of
+		modality. You are going to do the generative writing by Voice In, the
+		rewriting by keyboard. I think this is like what most people... One way
+		that many people can get into using speech-to-text in a productive way
+		that sounds great...
+    -   A: (not the author, just an audience member): So, for example, when
+        you're talking, you have an immense feeling of the topic you
+        have. You can close your eyes and do your body gestures to
+        manipulate a concept or idea, and you have... I just feel you
+        feel more creative than just tapping. Definitely you have much
+        more speed advantage over tapping, but the more important thing is
+        you use your body as a whole to interact with those ideas.
+        [this one is done via voice...]
+        -   but typing is definitely good for accurate control, such as
+            M-x some-command ...
+-   Q: Have you tried the ChatGPT voice chat interface? If so, how has
+    your experience of it been? Since you are experienced with voice
+    control, I am interested to hear your thoughts, particularly on its
+    performance relative to the open source tools.
+    -   A: I do not have much experience with that particular software. I have
+		used Whisper a little bit, and so that is related. Of course, you have
+		this problem of lag. I find that Whisper is good for spitting out a
+		sentence, maybe for a docstring in a programming file. I find that it
+		is very prone to hallucinations. I find myself spending half my
+		time deleting the hallucinations, and I feel like the net gain is
+		diminished as a result, or there is not much of a net gain in terms of
+		what I am getting out of it.
+-   Q: Are any of these voice command/dictation tools freemium?
+    -   A: To be able to add custom commands, you have to pay
+		$48 a year. The Talon Voice software is free and the only
+		limitation there is access to the language model. If you want to get
+		the beta version, you need to subscribe to Patreon to support the
+		developer. I did that, and I really did not find much of
+		an improvement. I really do not intend to do that in the future.
+		But otherwise in Talon Voice, everything is open and free. The Slack
+		community is incredibly welcoming. Its parallels with
+		the Emacs Community are pretty striking.
+-   Q: How good is Talon compared to Whisper?
+    - A: With Talon, I find that the first part of the sentence will
+		be fairly accurate when I am doing dictation, and then towards
+		the end, the errors... In general, I think its error rate is
+		about five words out of 100 or so. Whisper is
+		wonderful because it will insert punctuation for you, but I
+		guess its errors are longer, and it will hallucinate full
+		sentences for you. So they both have significant error rates.
+		They are just different kinds of errors. Hopefully, both over
+		time... [Talon] errors are generally shorter in extent. It does
+		not hallucinate as long.
+- Q: Are any of those voice command/dictation tools libre? I cannot find that information on the web.
+  - (not the speaker):
+    - This FAQ <https://talon.wiki/faq/> says that Talon Voice is closed source.
+    - Talon Voice is non-free: <https://talonvoice.com/EULA.txt>
+    - Mistral 7B is under the Apache 2.0 license, i.e., no restrictions.
+
+
+## Notes
 
+- From the speaker: I really appreciate the high level of accuracy that I am getting from
+Voice In. I would use Talon Voice for dictation, but at this point,
+there is a significant difference between the level of accuracy of
+Voice In versus Talon Voice. It's a large enough difference that I'll
+probably use Voice In for a while until I can figure out how to get 
+Talon Voice to generate more accurate text.
+-   When you do Org mode and you have the bullets, it allows you to naturally shard your thoughts in a way that is really easy to edit. ... It has a
+summarizing capability. It allows you to, you know, pull back and get an
+overview.
+- Great stuff, definitely going to test-drive Talon
 
 
 [[!inline pages="internal(2023/info/voice-after)" raw="yes"]]