[[!meta title="Reproducible molecular graphics with Org-mode"]]
[[!meta copyright="Copyright © 2021 Blaine Mooers"]]
[[!inline pages="internal(2021/info/molecular-nav)" raw="yes"]]

<!-- You can manually edit this file to update the abstract, add links, etc. --->


# Reproducible molecular graphics with Org-mode
Blaine Mooers

[[!taglink CategoryOrgMode]]

[[!inline pages="internal(2021/info/molecular-schedule)" raw="yes"]]

Research papers in structural biology should include, in the supplemental 
materials, the code used to make the images of molecules in the article. 
Some structural bioinformaticists have started to include
their computer code in the supplemental materials to allow readers
to reproduce their analyses. However, authors of papers reporting new
molecular structures often overlook the inclusion of the code that makes 
the images of the molecules reported in their articles. 
This aspect of reproducible research needs to become standard practice 
to improve the rigor of the science.

In a literate programming document, the author interleaves blocks 
of explanatory prose between code blocks that make the images of molecules. 
The document allows the reader to reproduce the images in the manuscript by running the code. 
The reader can also explore the effect of altering the parameters in the 
code. Org files are one option for making such literate programming 
documents.
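
To make the interleaving concrete, here is a minimal sketch in plain Python that separates the code blocks from the prose in a small Org document. It is an illustration only, not how Org itself processes source blocks; the sample document, the function name `extract_code_blocks`, and the PDB ID `1ehz` are assumptions chosen for the example.

```python
import re

# A tiny Org document: prose interleaved with one source block.
# The PyMOL commands inside the block are illustrative only.
ORG_TEXT = """\
* Figure 1
The cartoon below shows the overall fold.

#+begin_src python
from pymol import cmd
cmd.fetch("1ehz")
#+end_src

The fetch step needs network access.
"""

# Match "#+begin_src <lang> [header args]" up to "#+end_src", ignoring case.
SRC_BLOCK = re.compile(
    r"^[ \t]*#\+begin_src[ \t]+(\S+)[^\n]*\n(.*?)^[ \t]*#\+end_src[ \t]*$",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def extract_code_blocks(org_text, language):
    """Return the bodies of all source blocks written in `language`."""
    return [
        body
        for lang, body in SRC_BLOCK.findall(org_text)
        if lang.lower() == language.lower()
    ]


blocks = extract_code_blocks(ORG_TEXT, "python")
print("\n".join(blocks))
```

The prose stays in the document for the human reader, while the extracted bodies are what a reader would run to regenerate the image.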

We developed a **yasnippet** snippet library called **orgpymolpysnips** for 
structural biologists (<https://github.com/MooersLab/orgpymolpysnips>). 
This library facilitates the assembly of literate programming documents
with molecular images made by PyMOL. PyMOL is the most popular
molecular graphics program for creating images for publication; it has
over 100,000 users, a large user base for molecular biology. PyMOL 
has been used to make many of the images of biological molecules found 
on the covers of Cell, Nature, and Science. 

We used the **jupyter** language in **org-babel** to send commands from 
code blocks in Org files to PyMOL's Python API. PyMOL returns the 
molecular image to the output block below the code block. An Emacs 
user can convert the Org file into a PDF, "tangle" the code blocks 
into a script file, and submit these for non-Emacs users. We describe 
the content of the library and provide examples of running PyMOL 
from Org-mode documents. 
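
As a rough sketch of the kind of Python that such a code block might send to PyMOL, the function below issues a short sequence of PyMOL API calls that fetch a structure and write a ray-traced PNG. This is not a snippet from orgpymolpysnips: the function name `draw_cartoon`, the stand-in class `RecordingCmd`, the PDB ID `1ehz`, and the file name `trna.png` are assumptions made for illustration. The `cmd` object is passed in as an argument so that the sketch runs even where PyMOL is not installed.

```python
def draw_cartoon(cmd, pdb_id, png_path, width=900, height=900, dpi=300):
    """Issue PyMOL API calls that render a ray-traced cartoon image.

    `cmd` is expected to behave like `pymol.cmd`; it is passed in so the
    function can be exercised without PyMOL installed.
    """
    cmd.reinitialize()            # start from a clean PyMOL session
    cmd.fetch(pdb_id)             # download the coordinates by PDB ID
    cmd.hide("everything")
    cmd.show("cartoon")
    cmd.orient()                  # align the molecule with the view axes
    cmd.png(png_path, width=width, height=height, dpi=dpi, ray=1)
    return png_path


class RecordingCmd:
    """Hypothetical stand-in for `pymol.cmd` that only records calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Called only for attributes that do not exist, i.e. PyMOL commands.
        def record(*args, **kwargs):
            self.calls.append((name, args, kwargs))
        return record


fake = RecordingCmd()
draw_cartoon(fake, "1ehz", "trna.png")
for name, args, kwargs in fake.calls:
    print(name, args, kwargs)
```

In a code block backed by a kernel whose environment includes PyMOL, one would instead run `from pymol import cmd` and pass the real `cmd` object to `draw_cartoon`. How the resulting image is displayed in the Org output block depends on the kernel and header arguments in use.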

# Discussion

Pad:

-   Q1: Do you also do any hydrogen-bond analysis in your workflows?
    Also, could your snippet library be extended for other non-Python
    simulation programs like GROMACS?
    -   A: Yes, I have a snippet that generates publication-quality
        hydrogen bonds. Yes, I have thought of making snippet libraries
        for molecular simulation packages like GROMACS and AMBER and
        drug design software packages like AutoDock Vina and RDKit.
        They can help lower the barrier to entry. I made a library for
        crystallographic computing with CCTBX for use in Jupyter. I
        should make it available for Org-mode.
-   Q2: We've seen a few talks regarding managing academic papers and
    citations in emacs/org, what does your workflow look like?
    -   A: I switched to Emacs as my primary editor 3 months ago. I have
        yet to write a paper in Org. I am very comfortable with LaTeX
        and I have been writing my papers on Overleaf in LaTeX for
        several years. I used BibTeX and JabRef to manage my references.
        I have started playing with org-ref. It looks super promising.
-   Q3: Hi Blaine, you mentioned that you have been able to come back to
    a file years later, how do you manage the environment that the org
    file executes in?
    -   A: Good question. The PyMOL code is good for years so the images
        should be reproducible regardless of the version of Org.
        PyMOL's domain-specific language is very stable. The Python
        code largely just wraps around the DSL code.
-   Q4: Have you used Org Mode and PyMOL for publications? Could you
    share a link to any of them?
    -   A: I have yet to use Org in a publication. The first step will
        be to use it for supplemental material.

BBB discussion:

- We've seen a few talks regarding managing academic papers and citations in emacs/org, what does your workflow look like?
  - Blaine: My workflow involves a dozen different software packages and 20-200 GB of data. Complete literate programming is not possible at this time. The smallest possible step towards that goal is to make the molecular images reproducible because the files involved are only 1-100 MB in size.
  - Questioner: I assume that's why there might be lag with several images rendered on an org buffer?
- I was specifically interested in your workflow with managing citations and papers as I'm sure you have to do, is there anything in particular you use for citation management?
  - Blaine: I switched to Emacs as my primary editor 3 months ago. I have yet to write a paper in Org. I am very comfortable with LaTeX and I have been writing my papers on Overleaf in LaTeX for several years. I used BibTeX and JabRef to manage my references. I have started playing with org-ref. It looks super promising.
  - Questioner: I still use zotero and biblatex, but the previous two talks about org-ref got me thinking about my workflow
- Have you used Org Mode and PyMOL for publications? Could you share a link to any of them?
  - Blaine: I have yet to use Org in a publication. The first step will be to use it for supplemental material.
  - thanks, makes sense, I'm off in a part of the python world where code base churn can be pretty severe; but it sounds like pymol is able to avoid those issues
  - Blaine: PyMOL has a domain-specific language that is very stable. The transition from Python 2 to Python 3 was a bit disruptive.
- Hi Blaine, you mentioned that you have been able to come back to a file years later, how do you manage the environment that the org file executes in?
  - Blaine: Good question. The PyMOL code is good for years so the images should be reproducible regardless of the version of Org.

BBB feedback:

- Blaine, great job with the talk. Awesome presentation.
  - I know people loved it in the IRC chat :D
- I can share that I was excited to see how you made things feel so seamless and integrated into Emacs. The results are really eye-popping.


IRC discussion:

- which is the package name for export org mode to pymol?
- the async header argument can be helpful with the problem of the amount of time for generating the images
- think of this as a use case explication for being able to manage and render 3D models in Org
- It might be faster to keep sections folded by default
- This is exactly the sort of thing my users love.

# Outline

-   5-10 minutes: (brief description/outline)
    -   Title slide
    -   Structural Biology Workflow in the Mooers Lab
    -   Cover images made with PyMOL
    
    -   Why develop a snippet library for your field?
    -   PyMOL in Org: kernel specification
    -   Creating a conda env and installing PyMOL
    -   Example code block in Org to make DSSR block model of tRNA
    -   Resulting image
    -   Summary
    -   Acknowledgements

<!--
-   20 minutes: (brief description/outline)
    
    I would prefer to give a 20-minute talk because this allows time to develop the context.

-   Title slide
-   Structural Biology Workflow in the Mooers Lab
-   Cover images made with PyMOL
-   Bar graph of PyMOL's popularity
-   Origin story of PyMOL
-   PyMOL's hybrid open-source model
-   PyMOL's GUI
-   Default molecular representations in PyMOL
-   Example of the PyMOL macro language
-   Same commands in Python
-   Corresponding code in yasnippet snippet
-   Extension of molecular representations with orgpymolpysnips
-   Hermann Ebbinghaus's Forgetting Curve
-   Why develop a snippet library for your field?
-   PyMOL in Org: kernel specification
-   Anatomy of kernel file
-   Creating a conda env and installing PyMOL
-   Example code block to make DSSR block model of tRNA
-   Resulting image
-   Org vs. Jupyter Notebook, JupyterLab, and RStudio
-   Summary
-   Acknowledgements

-->
[[!inline pages="internal(2021/captions/molecular)" raw="yes"]]

[[!inline pages="internal(2021/info/molecular-nav)" raw="yes"]]