Diffstat
-rw-r--r-- 2021/captions/emacsconf-2021-molecular--reproducible-molecular-graphics-with-org-mode--blaine-mooers--main.vtt 628
1 file changed, 628 insertions, 0 deletions
diff --git a/2021/captions/emacsconf-2021-molecular--reproducible-molecular-graphics-with-org-mode--blaine-mooers--main.vtt b/2021/captions/emacsconf-2021-molecular--reproducible-molecular-graphics-with-org-mode--blaine-mooers--main.vtt
new file mode 100644
index 00000000..06d92f3a
--- /dev/null
+++ b/2021/captions/emacsconf-2021-molecular--reproducible-molecular-graphics-with-org-mode--blaine-mooers--main.vtt
@@ -0,0 +1,628 @@
+WEBVTT
+
+00:00.880 --> 00:00:02.446
+Hi, I'm Blaine Mooers.
+
+00:00:02.446 --> 00:00:04.160
+I'm going to be talking about
+
+00:00:04.160 --> 00:00:07.919
+the use of molecular graphics in Org
+
+00:07.919 --> 00:00:08.880
+for the purpose of doing
+
+00:00:08.880 --> 00:00:11.840
+reproducible research in structural biology.
+
+00:00:11.840 --> 00:00:13.722
+I'm an associate professor of biochemistry
+
+00:00:13.722 --> 00:00:15.768
+and microbiology at the University of Oklahoma
+
+00:00:15.768 --> 00:00:17.760
+Health Sciences Center in Oklahoma City.
+
+00:00:17.760 --> 00:00:19.600
+My laboratory uses X-ray crystallography
+
+00:00:19.600 --> 00:00:21.920
+to determine the atomic structures
+
+00:00:21.920 --> 00:00:23.439
+of proteins like this one
+
+00:00:23.439 --> 00:00:26.080
+in the lower left, and of nucleic acids
+
+00:26.080 --> 00:27.840
+important in human health.
+
+00:27.840 --> 00:00:29.591
+This is a crystal of an RNA,
+
+00:00:29.591 --> 00:00:31.359
+which we have placed in this
+
+00:00:31.359 --> 00:00:33.200
+X-ray diffraction instrument.
+
+00:00:33.200 --> 00:00:35.600
+And after rotating the crystal
+
+00:00:35.600 --> 00:00:38.000
+in the X-ray beam for two degrees,
+
+00:00:38.000 --> 00:00:40.480
+we obtain this following diffraction pattern,
+
+00:00:40.480 --> 00:00:43.280
+which has thousands of spots on it.
+
+00:43.280 --> 00:00:47.840
+We rotate the crystal for over 180 degrees,
+
+00:47.840 --> 00:00:51.760
+collecting 90 images to obtain all the data.
+
+00:00:51.760 --> 00:00:56.000
+We then process those images
+
+00:56.000 --> 00:00:57.752
+and do an inverse Fourier transform
+
+00:00:57.752 --> 00:00:59.920
+to obtain the electron density.
+
+00:00:59.920 --> 00:01:01.888
+This electron density map has been
+
+00:01:01.888 --> 00:01:04.344
+contoured at the one-sigma level.
+
+00:01:04.344 --> 00:01:06.116
+That level's being shown by
+
+00:01:06.116 --> 00:01:08.640
+this blue chicken wire mesh.
+
+00:01:08.640 --> 00:01:10.152
+Atomic models have been fitted
+
+00:01:10.152 --> 00:01:11.119
+to this chicken wire.
+
+00:01:11.119 --> 00:01:14.240
+These lines represent bonds between atoms,
+
+00:01:14.240 --> 00:01:16.240
+atoms are being represented by points.
+
+00:01:16.240 --> 00:01:18.640
+And atoms are colored by atom type,
+
+00:01:18.640 --> 00:01:21.280
+red for oxygen, blue for nitrogen,
+
+00:01:21.280 --> 00:01:23.040
+and then in this case,
+
+01:23.040 --> 00:01:24.720
+carbon is colored cyan.
+
+00:01:24.720 --> 00:01:27.203
+We have fitted a drug molecule
+
+00:01:27.203 --> 00:01:29.360
+to the central blob of electron density
+
+00:01:29.360 --> 00:01:32.400
+which corresponds to that active site
+
+01:32.400 --> 00:01:35.759
+of this protein, which is RET kinase.
+
+00:01:35.759 --> 00:01:37.439
+It's important in lung cancer.
+
+00:01:37.439 --> 00:01:40.079
+When we're finished with model building,
+
+00:01:40.079 --> 00:01:41.339
+we will then examine
+
+00:01:41.339 --> 00:01:43.006
+the result of the final structure
+
+00:01:43.006 --> 00:01:45.200
+to prepare images for publication
+
+00:01:45.200 --> 00:01:47.439
+using a molecular graphics program.
+
+01:47.439 --> 00:01:48.108
+In this case,
+
+00:01:48.108 --> 00:01:50.000
+we've overlaid a number of structures,
+
+00:01:50.000 --> 00:01:53.600
+and we're examining the distance between
+
+01:53.600 --> 00:01:55.680
+the side chain of an alanine
+
+00:01:55.680 --> 00:01:58.880
+and one of two drug molecules.
+
+00:01:58.880 --> 00:02:00.719
+This alanine side chain actually blocks
+
+00:02:00.719 --> 00:02:02.159
+the binding of one of these drugs.
+
+00:02:02.159 --> 00:02:03.439
+The most popular program
+
+02:03.439 --> 02:06.320
+for doing this kind of analysis
+
+02:06.320 --> 00:02:07.280
+and for preparing images
+
+00:02:07.280 --> 00:02:09.520
+for publication is PyMOL.
+
+02:09.520 --> 02:11.440
+PyMOL was used to prepare these images
+
+02:11.440 --> 02:14.720
+on the covers of these featured journals.
+
+02:14.720 --> 00:02:17.520
+PyMOL is favored because
+
+00:02:17.520 --> 00:02:19.520
+it has 500 commands
+
+00:02:19.520 --> 00:02:22.128
+and 600 parameter settings
+
+00:02:22.128 --> 00:02:23.360
+that provide exquisite control
+
+00:02:23.360 --> 00:02:24.959
+over the appearance of the output.
+
+00:02:24.959 --> 00:02:28.480
+PyMOL has over 100,000 users,
+
+02:28.480 --> 00:02:30.000
+reflecting its popularity.
+
+00:02:30.000 --> 00:02:31.599
+This is the GUI for PyMOL.
+
+00:02:31.599 --> 00:02:35.120
+It shows in white the viewport area
+
+00:02:35.120 --> 00:02:36.080
+where one interacts
+
+00:02:36.080 --> 00:02:37.840
+with the loaded molecular object.
+
+00:02:37.840 --> 00:02:41.920
+We have rendered the same RET kinase
+
+02:41.920 --> 00:02:49.788
+with a set of preset parameters
+
+00:02:49.788 --> 00:02:51.200
+that have been named "publication".
+
+00:02:51.200 --> 00:02:52.720
+The other way of applying
+
+02:52.720 --> 00:02:54.319
+parameter settings and commands
+
+00:02:54.319 --> 00:02:56.720
+is to enter them at the PyMOL prompt.
+
+00:02:56.720 --> 00:03:00.159
+Then the third way is to load and run scripts.
+
+00:03:00.159 --> 00:03:03.120
+PyMOL is actually written in C for speed,
+
+00:03:03.120 --> 00:03:06.159
+but it is wrapped in Python for extensibility.
+
+03:06.159 --> 03:09.680
+In fact, there are over 100 articles
+
+03:09.680 --> 00:03:11.599
+about various plugins and scripts
+
+00:03:11.599 --> 00:03:12.400
+that people have developed
+
+00:03:12.400 --> 00:03:15.120
+to extend PyMOL over the years.
+
+03:15.120 --> 00:03:16.480
+Here are some examples
+
+00:03:16.480 --> 00:03:18.959
+from the snippet library that I developed.
+
+03:18.959 --> 03:21.280
+On the left is a default
+
+03:21.280 --> 03:24.640
+cartoon representation of an RNA hairpin.
+
+03:24.640 --> 03:27.040
+I find this reduced representation
+
+03:27.040 --> 00:03:30.799
+of the RNA hairpin to be too stark.
+
+03:30.799 --> 00:03:32.319
+I prefer these alternate ones
+
+00:03:32.319 --> 00:03:33.840
+that I developed.
+
+03:33.840 --> 03:37.519
+So, these three to the right of this one
+
+03:37.519 --> 00:03:39.519
+are not available through
+
+00:03:39.519 --> 00:03:40.720
+pull-down menus in PyMOL.
+
+00:03:40.720 --> 00:03:42.748
+So why develop a PyMOL
+
+00:03:42.748 --> 00:03:44.879
+snippet library for Org?
+
+03:44.879 --> 00:03:47.040
+Well, Org provides great support
+
+00:03:47.040 --> 00:03:48.560
+for literate programming,
+
+00:03:48.560 --> 00:03:49.840
+where you have code blocks
+
+00:03:49.840 --> 00:03:52.000
+that contain code that's executable,
+
+00:03:52.000 --> 00:03:53.040
+and the output is shown
+
+00:03:53.040 --> 00:03:54.959
+below that code block.
+
+03:54.959 --> 00:03:56.720
+And then you can fill
+
+00:03:56.720 --> 00:03:58.959
+the surrounding area in the document
+
+03:58.959 --> 00:04:00.799
+with the explanatory prose.
+
+00:04:00.799 --> 00:04:02.000
+Org has great support
+
+00:04:02.000 --> 00:04:04.480
+for editing that explanatory prose.
+
+00:04:04.480 --> 00:04:08.080
+Org can run PyMOL through PyMOL's Python API.
+
+04:08.080 --> 00:04:11.280
+One of the uses of such an Org document
+
+00:04:11.280 --> 00:04:14.487
+is to assemble a gallery of draft images.
+
+00:04:14.487 --> 00:04:16.563
+We often have to look at
+
+00:04:16.563 --> 00:04:19.840
+dozens of candidate images
+
+00:04:19.840 --> 00:04:22.000
+with the molecule in different orientations,
+
+00:04:22.000 --> 00:04:23.520
+different zoom settings,
+
+04:23.520 --> 00:04:25.032
+different representations,
+
+00:04:25.032 --> 00:04:27.280
+different colors, and so on.
+
+00:04:27.280 --> 00:04:30.639
+And to have those images along with…,
+
+00:04:30.639 --> 00:04:31.840
+adjacent to the code
+
+04:31.840 --> 00:04:33.680
+that was used to generate them,
+
+00:04:33.680 --> 00:04:37.199
+can be very effective for
+
+04:37.199 --> 00:04:39.680
+further editing the code
+
+00:04:39.680 --> 00:04:40.880
+and improving the images.
+
+00:04:40.880 --> 00:04:44.080
+Once the final images have been selected,
+
+04:44.080 --> 00:04:46.320
+one can submit the code
+
+00:04:46.320 --> 00:04:48.479
+as part of the supplemental material.
+
+00:04:48.479 --> 00:04:52.400
+Finally, one can use the journal package
+
+04:52.400 --> 00:04:54.608
+to use the Org files as
+
+00:04:54.608 --> 00:04:57.120
+an electronic laboratory notebook,
+
+00:04:57.120 --> 00:04:59.600
+which is illustrated with molecular images.
+
+00:04:59.600 --> 00:05:01.039
+This can be very useful
+
+00:05:01.039 --> 00:05:04.080
+when assembling manuscripts
+
+05:04.080 --> 00:05:05.440
+months or years later.
+
+00:05:05.440 --> 00:05:08.320
+This shows the YASnippet pull-down menu
+
+05:08.320 --> 00:05:12.720
+after my library has been installed.
+
+00:05:12.720 --> 00:05:15.360
+I have an Org file open,
+
+00:05:15.360 --> 00:05:17.120
+so I'm in Org mode.
+
+05:17.120 --> 00:05:20.880
+We have the Org mode submenu,
+
+05:20.880 --> 00:05:23.919
+and under it, all my snippets
+
+00:05:23.919 --> 00:05:26.880
+are located in these sub-sub-menus
+
+05:26.880 --> 00:05:30.880
+that are prepended with pymolpy.
+
+00:05:30.880 --> 00:05:33.840
+Under the molecular representations menu,
+
+00:05:33.840 --> 00:05:36.479
+there is a listing of snippets.
+
+00:05:36.479 --> 00:05:38.563
+The top one is for the ambient occlusion effect,
+
+00:05:38.563 --> 00:05:39.840
+which we're going to apply
+
+00:05:39.840 --> 00:05:41.039
+in this Org file.
+
+00:05:41.039 --> 00:05:44.240
+So these lines of code,
+
+00:05:44.240 --> 00:05:48.479
+as well as these flanking lines
+
+05:48.479 --> 00:05:50.240
+that define the source block,
+
+00:05:50.240 --> 00:05:53.280
+were inserted by clicking on that line.
+
+05:53.280 --> 00:05:55.120
+Then I've added some additional code.
+
+00:05:55.120 --> 00:05:56.880
+So, the first line defines
+
+00:05:56.880 --> 00:05:59.039
+the language that we're using.
+
+00:05:59.039 --> 00:05:59.768
+We're going to use
+
+00:05:59.768 --> 00:06:02.639
+the jupyter-python language.
+
+06:02.639 --> 00:06:04.560
+Then you can define the session,
+
+00:06:04.560 --> 00:06:06.400
+and the name of this is arbitrary.
+
+00:06:06.400 --> 00:06:09.680
+Then the kernel is our means
+
+00:06:09.680 --> 00:06:11.360
+by which we gain access
+
+00:06:11.360 --> 00:06:14.880
+to the Python API of PyMOL.
+
+06:14.880 --> 00:06:17.039
+The remaining settings apply to the output.
+
+00:06:17.039 --> 00:06:18.319
+To execute this code
+
+00:06:18.319 --> 00:06:21.199
+and to get the resulting image,
+
+00:06:21.199 --> 00:06:25.120
+you put the cursor inside this code block,
+
+00:06:25.120 --> 00:06:26.560
+or on the top line,
+
+00:06:26.560 --> 00:06:29.840
+and enter Control c Control c (C-c C-c).
+
+06:29.840 --> 00:06:32.240
+This shows the resulting image
+
+00:06:32.240 --> 00:06:33.600
+has been loaded up.
+
+00:06:33.600 --> 00:06:37.280
+It takes about 10 seconds for this to appear.
+
+06:37.280 --> 00:06:38.479
+So the downside of this is
+
+00:06:38.479 --> 00:06:40.729
+if you have a large number of these,
+
+00:06:40.729 --> 00:06:43.919
+the Org file can lag quite a bit
+
+00:06:43.919 --> 00:06:45.120
+when you try to scroll through it,
+
+00:06:45.120 --> 00:06:48.319
+so you need to close up these result drawers,
+
+00:06:48.319 --> 00:06:50.960
+and only open up the ones
+
+00:06:50.960 --> 00:06:53.199
+that you're currently examining.
+
+00:06:53.199 --> 00:06:54.319
+These are features I think
+
+06:54.319 --> 06:56.240
+are important in practical work.
+
+06:56.240 --> 00:06:59.840
+So, a plus is a feature that's present,
+
+00:06:59.840 --> 00:07:01.120
+and a minus is one that's absent.
+
+00:07:01.120 --> 00:07:03.199
+I think tab stops and tab triggers
+
+00:07:03.199 --> 00:07:04.800
+are really important.
+
+07:04.800 --> 00:07:05.680
+Triggers are important for
+
+00:07:05.680 --> 00:07:06.720
+the fast insertion of code,
+
+00:07:06.720 --> 00:07:08.639
+tab stops are important for
+
+07:08.639 --> 00:07:10.560
+complete, accurate editing of code.
+
+00:07:10.560 --> 00:07:12.735
+I already addressed the rendering speed
+
+00:07:12.735 --> 00:07:14.560
+and scrolling issue.
+
+00:07:14.560 --> 00:07:15.759
+I think the way around this
+
+00:07:15.759 --> 00:07:19.199
+is just to export the Org document to a PDF file
+
+00:07:19.199 --> 00:07:23.360
+and do your evaluation of different images
+
+00:07:23.360 --> 00:07:25.199
+by examining them in the PDF
+
+00:07:25.199 --> 00:07:26.560
+rather than the Org file.
+
+00:07:26.560 --> 00:07:30.400
+The path to PDF is lightning fast in Emacs
+
+00:07:30.400 --> 00:07:32.240
+compared to Jupyter,
+
+00:07:32.240 --> 00:07:35.280
+where it's cumbersome in comparison.
+
+00:07:35.280 --> 00:07:38.400
+This is a snapshot of my initialization file.
+
+00:07:38.400 --> 00:07:41.840
+These parts are relevant to doing this work.
+
+00:07:41.840 --> 00:07:43.039
+A full description of them
+
+00:07:43.039 --> 00:07:46.319
+can be found in the README file
+
+07:46.319 --> 00:07:48.639
+of this repository on GitHub.
+
+00:07:48.639 --> 00:07:49.456
+I'd like to thank the
+
+00:07:49.456 --> 00:07:51.840
+Nathan Shock Data Science Workshop
+
+00:07:51.840 --> 00:07:54.319
+for feedback during presentations
+
+00:07:54.319 --> 00:07:56.160
+I've made about this work.
+
+00:07:56.160 --> 00:07:57.628
+And I would also like to thank
+
+00:07:57.628 --> 00:08:00.240
+the following funding sources for support.
+
+00:08:00.240 --> 00:08:03.879
+I will now take questions. Thank you.
+
+00:08:03.879 --> 00:08:03.986
+[captions by Blaine Mooers and Bhavin Gandhi]
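
A minimal sketch, in Python, of the source block setup described in the captions above: a jupyter-python language tag, an arbitrary session name, and a kernel that gives access to PyMOL's Python API. The helper `org_src_block`, the session and kernel names, the `:results raw drawer` header arguments, and the file names `model.pdb` and `draft.png` are illustrative assumptions, not taken from the talk or its repository. The PyMOL calls are kept as text so that the sketch runs without PyMOL installed.

```python
"""Sketch: compose an Org source block that drives PyMOL via its Python API."""


def org_src_block(body, session="pymol-demo", kernel="python3",
                  results="raw drawer"):
    """Return the text of an Org source block for the jupyter-python language.

    session: arbitrary label for the Jupyter session (any name works).
    kernel:  name of a Jupyter kernel whose Python can import pymol
             (hypothetical default; substitute the kernel you registered).
    results: header arguments that control the output (assumed values).
    """
    header = (f"#+BEGIN_SRC jupyter-python :session {session} "
              f":kernel {kernel} :results {results}")
    return "\n".join([header, body.strip("\n"), "#+END_SRC"])


# Body of the block: ordinary PyMOL Python API calls, stored as a string.
# The input and output file names are placeholders.
PYMOL_BODY = """
from pymol import cmd
cmd.load("model.pdb")
cmd.show_as("cartoon")
cmd.bg_color("white")
cmd.png("draft.png", width=800, height=600, dpi=300, ray=1)
"""

if __name__ == "__main__":
    print(org_src_block(PYMOL_BODY))
```

Pasting the printed text into an Org file and pressing C-c C-c inside the block would evaluate it, provided the emacs-jupyter package is configured and the named kernel can import pymol.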