[[!meta title="Reproducible molecular graphics with Org-mode"]]
[[!meta copyright="Copyright © 2021 Blaine Mooers"]]
[[!inline pages="internal(2021/info/molecular-nav)" raw="yes"]]
<!-- You can manually edit this file to update the abstract, add links, etc. --->
# Reproducible molecular graphics with Org-mode
Blaine Mooers
[[!inline pages="internal(2021/info/molecular-schedule)" raw="yes"]]
Research papers in structural biology should include, in the supplemental
materials, the code used to make the images of molecules in the article.
Some structural bioinformaticists have started to include
their computer code in the supplemental materials to allow readers
to reproduce their analyses. However, authors of papers reporting new
molecular structures often overlook the inclusion of the code that makes
the images of the molecules reported in their articles. Nonetheless,
this aspect of reproducible research needs to become the standard practice
to improve the rigor of the science.
In a literate programming document, the author interleaves blocks
of explanatory prose between code blocks that make the images of molecules.
The document allows the reader to reproduce the images in the manuscript by running the code.
The reader can also explore the effect of altering the parameters in the
code. Org files are one alternative for making such literate programming
documents.
We developed a **yasnippet** snippet library called **orgpymolpysnips** for
structural biologists (<https://github.com/MooersLab/orgpymolpysnips>).
This library facilitates the assembly of literate programming documents
with molecular images made by PyMOL. PyMOL is the most popular
molecular graphics program for creating images for publication; it has
over 100,000 users, a large user base for a molecular biology program.
PyMOL has been used to make many of the images of biological molecules
found on the covers of Cell, Nature, and Science.
We used the **jupyter** language in **org-babel** to send commands from
code blocks in Org files to PyMOL's Python API. PyMOL returns the
molecular image to the output block below the code block. An Emacs
user can convert the Org file into a PDF, "tangle" the code blocks
into a script file, and submit these for non-Emacs users. We describe
the content of the library and provide examples of running PyMOL
from Org-mode documents.
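
As a minimal sketch of what the body of such a code block might contain, the Python below builds a short list of PyMOL commands and, only if PyMOL is importable in the kernel's environment, sends them through `cmd.do`. The function names, the PDB entry `1lw9`, the output file name, and the image settings are illustrative assumptions, not code taken from orgpymolpysnips. In an Org file, the body would sit between `#+BEGIN_SRC jupyter-python :session py` and `#+END_SRC`; the exact header arguments depend on the local emacs-jupyter setup.

```python
"""Sketch of the body of a jupyter-python code block that drives PyMOL."""


def build_commands(pdb_id, png_path, width=900, height=600, dpi=300):
    """Return PyMOL command-language lines that fetch a structure and save a PNG.

    All names and default values here are hypothetical choices for illustration.
    """
    return [
        f"fetch {pdb_id}, async=0",  # download the entry and wait until it is loaded
        "hide everything",
        "show cartoon",
        "orient",
        # ray=1 asks PyMOL to ray-trace the scene before writing the file
        f"png {png_path}, width={width}, height={height}, dpi={dpi}, ray=1",
    ]


def render(pdb_id, png_path):
    """Send the commands to PyMOL; return png_path, or None if PyMOL is absent."""
    try:
        from pymol import cmd  # assumes PyMOL is installed in the kernel's environment
    except ImportError:
        return None
    for line in build_commands(pdb_id, png_path):
        cmd.do(line)  # cmd.do runs one line of PyMOL's command language
    return png_path
```

Calling `render("1lw9", "1lw9.png")` needs network access for the `fetch` step. One way to show the result below the code block is to end the block with `IPython.display.Image(filename="1lw9.png")`.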
# Discussion
Pad:
- Q1: Do you also do any hydrogen-bond analysis in your workflows?
Also, could your snippet library be extended for other non-Python
simulation programs like GROMACS?
- A: Yes, I have a snippet that generates publication-quality
hydrogen bonds. Yes, I have thought of making snippet libraries for
molecular simulation packages like GROMACS and AMBER and for drug design
software packages like AutoDock Vina and RDKit. They can help
lower the barrier to entry. I made a library for crystallographic
computing with CCTBX for use in Jupyter. I should make it
available for Org-mode.
- Q2: We've seen a few talks regarding managing academic papers and
citations in emacs/org, what does your workflow look like?
- A: I switched to Emacs as my primary editor 3 months ago. I have
yet to write a paper in Org. I am very comfortable with LaTeX
and I have been writing my papers on Overleaf in LaTeX for
several years. I used BibTeX and JabRef to manage my references.
I have started playing with org-ref. It looks super promising.
- Q3: Hi Blaine, you mentioned that you have been able to come back to
a file years later, how do you manage the environment that the org
file executes in?
- A: Good question. The PyMOL code is good for years so the images
should be reproducible regardless of the version of org.
PyMOL's domain specific language is very stable. The Python
code largely just wraps around the DSL code.
- Q4: Have you used Org Mode and PyMOL for publications? Could you
share a link to any of them?
- A: I have yet to use org in a publication. The first step will
be to use it for supplemental material.
BBB discussion:
- We've seen a few talks regarding managing academic papers and citations in emacs/org, what does your workflow look like?
- Blaine: My workflow involves a dozen different software packages and 20-200 GB of data. Complete literate programming is not possible at this time. The smallest possible step towards that goal is to make the molecular images reproducible because the files involved are only 1-100 MB in size.
- Questioner: I assume that's why there might be lag with several images rendered on an org buffer?
- I was specifically interested in your workflow with managing citations and papers as I'm sure you have to do, is there anything in particular you use for citation management?
- Blaine: I switched to Emacs as my primary editor 3 months ago. I have yet to write a paper in Org. I am very comfortable with LaTeX and I have been writing my papers on Overleaf in LaTeX for several years. I used BibTeX and JabRef to manage my references. I have started playing with org-ref. It looks super promising.
- Questioner: I still use zotero and biblatex, but the previous two talks about org-ref got me thinking about my workflow
- Have you used Org Mode and PyMOL for publications? Could you share a link to any of them?
- Blaine: I have yet to use org in a publication. The first step will be to use it for supplemental material.
- thanks, makes sense, I'm off in a part of the python world where code base churn can be pretty severe; but it sounds like pymol is able to avoid those issues
- Blaine: PyMOL has a domain-specific language that is very stable. The transition from Python 2 to Python 3 was a bit disruptive.
- Hi Blaine, you mentioned that you have been able to come back to a file years later, how do you manage the environment that the org file executes in?
- Blaine: Good question. The PyMOL code is good for years so the images should be reproducible regardless of the version of org.
BBB feedback:
- Blaine, great job with the talk. Awesome presentation.
- I know people loved it in the IRC chat :D
- I can share that I was excited to see how you made things so seamless and integrated feeling into Emacs. The results are really eyepopping.
IRC discussion:
- which is the package name for export org mode to pymol?
- the async header argument can be helpful with the problem of the amount of time for generating the images
- think of this as a use-case explication for being able to manage and render 3D models in Org
- It might be faster to keep sections folded by default
- This is exactly the sort of thing my users love.
# Outline
- 5-10 minutes: (brief description/outline)
- Title slide
- Structural Biology Workflow in the Mooers Lab
- Cover images made with PyMOL
- Why develop a snippet library for your field?
- PyMOL in Org: kernel specification
- Creating a conda env and installing PyMOL
- Example code block in Org to make DSSR block model of tRNA
- Resulting image
- Summary
- Acknowledgements
<!--
- 20 minutes: (brief description/outline)
I would prefer to give a 20-minute talk because this allows time to develop the context.
- Title slide
- Structural Biology Workflow in the Mooers Lab
- Cover images made with PyMOL
- Bar graph of PyMOL's popularity
- Origin story of PyMOL
- PyMOL's hybrid open-source model
- PyMOL's GUI
- Default molecular representations in PyMOL
- Example of the PyMOL macro language
- Same commands in Python
- Corresponding code in yasnippet snippet
- Extension of molecular representations with orgpymolpysnips
- Hermann Ebbinghaus's Forgetting Curve
- Why develop a snippet library for your field?
- PyMOL in Org: kernel specification
- Anatomy of kernel file
- Creating a conda env and installing PyMOL
- Example code block to make DSSR block model of tRNA
- Resulting image
- Org vs. Jupyter Notebook, JupyterLab, and RStudio
- Summary
- Acknowledgements
-->
[[!inline pages="internal(2021/captions/molecular)" raw="yes"]]
[[!inline pages="internal(2021/info/molecular-nav)" raw="yes"]]