Free online training in bioinformatics and biostatistics!

Many biomedical research graduate programs have been compelled to “go online” with little or no notice due to the COVID-19 pandemic. Because many professors are distinctly uncomfortable in front of a camera, students may miss some of the topics they expected to cover. Since my move to South Africa in 2015, I have tried to post video to YouTube (and PDFs of the slides to Google Drive) every time I have delivered a lecture for students. I teach a fairly broad range of coursework, so the catalog of topics may be useful for students working from home or for departments that are trying to ensure their students get the training they need. This page indexes the series of lectures I have made available. I hope you enjoy watching them as much as I have enjoyed creating them! (Please see the note about copyright and attribution at the bottom.)

Sequencing Bioinformatics

DNA Replication, courtesy of Wikimedia Commons

I would like to start with the core part of my curriculum. Most of the researchers I train need help to analyze the reams of data coming from massively parallel sequencing experiments. I have delivered these five lectures in a variety of revisions at three different universities in the Western Cape. This year I chose a new structure to put mapping and assembly right up front and to give myself a bit more room for some of the other topics. The amplicon lecture started at the University of the Western Cape but now has evolved strongly enough in the microbiome direction that it will be taught at my own campus, as well.

Title | Slides | Length
Sequencing, Mapping, and Assembly | PDF | 1:29:58
Gene-Finding and Sequence Alignment | PDF | 1:30:00
Hands On: Recognizing protein orthologs with DIAMOND | none | 0:26:46
Genetic Variants, Phenotype, and GWAS | PDF | 1:08:20
Gene Expression and Differentiation | PDF | 1:20:14
Amplicon and Microbiome Informatics | PDF | 1:15:28
Hands On: Oligo7 Primer Design Software | none | 0:06:01

Bioinformatics Beyond Sequences

fluorescence microscopy of mitotic LLC-PK1 cells, courtesy of Wikimedia Commons

Of course, bioinformatics is critical for producing information from all kinds of data, not just those from sequencers. I feel it is important for bioinformatics students to learn about basics like ASCII and the representation of image data in memory. An emphasis on the essentials of biological pathways always finds its way into my teaching line-up; I feel grateful that my colleague Bing Zhang was willing to explain so much of this topic to me!

Title | Slides | Length
Computer Science Concepts in Bioinformatics | PDF | 1:14:21
Image Analysis and Flow Cytometry | PDF | 0:55:34
Relational Databases and Repositories | PDF | 0:52:43
What does this gene do? | PDF | 0:49:33
Biological Pathways and Networks | PDF | 1:16:48

Bottom-Up Proteomics

The ATP synthase complex, courtesy of the Protein Data Bank Molecule of the Month

For twenty years, I have been publishing in the field of proteomics, and its rich informatics is something I want to share with every student! Frequently, though, I find that proteomics content is reduced to just a single lecture for our division’s B.Sc. Honours students. I was pleased as I looked back through my YouTube channel to discover lectures on a much broader range of subjects. Matching MS/MS spectra to peptide sequences continues to be my “wheelhouse,” but many aspects of this field fascinate me just as much today as they did a decade ago.

Title | Slides | Length
Why Bother with Proteomics? | PDF | 0:28:03
Peptide and Protein Identification from MS/MS Data | PDF | 1:27:14
Hands On: MSFragger Database Search with IDPicker Protein Assembly | none | 1:10:16
Hands On: MSFragger Database Search with Philosopher Protein Assembly | none | 0:42:08
Identification of Post-Translational Modifications | PDF | 0:58:14
Label-Free Quantitation | PDF | 1:01:25
Seven basic tools of quality for biological MS | PDF | 0:56:56
Proteomic Quality Control | PDF | 0:49:07
Adventures at the Proteome-Genome Boundary | PDF | 0:46:02
Proteogenomics in Mycobacteria | PDF | 0:48:13
The Secret Lives of Amino Acids | none | 0:42:34

Top-Down Proteomics

This Fragment Map illustrates the extent of MS/MS evidence for a particular proteoform from H. pylori

Many biological questions can only be answered if we measure intact proteoforms without use of digestive enzymes. These “top-down” proteomics strategies have emerged from specialist laboratories to become more broadly applied; some MS facilities are beginning to offer top-down analysis as a service. These training videos explain why top-down proteomics offers special challenges. They also examine why the process of identification is more daunting in these MS/MS collections. The demonstrations of deconvolution in FreeStyle and visualizing PrSMs are intended to support users of targeted or inclusion-list experiments.

Title | Slides | Length
Top-down proteome informatics | PDF | 1:13:23
Top-down proteomics applications | PDF | 1:08:35
Protein Structure Determination and Estimation | PDF | 1:00:04
Hands On: Creating a subset database in MSFragger and Philosopher | none | 0:41:58
Hands On: Identifying MS/MS experiments with pTop 1.2 | none | 0:30:54
Hands On: Identifying MS/MS experiments with TopPIC Suite | none | 0:39:07
Hands On: Interpreting TopPIC and TopMG Proteoforms | none | 1:03:25
Hands On: Top-Down Deconvolution in Thermo FreeStyle | none | 0:53:58
Hands On: Visualizing Proteoform-Spectrum Matches in ProSight Lite and ClipsMS | none | 0:49:01
Hands On: Conducting searches in ProSight Proteome Discoverer | none | 0:45:30

Metabolomics

Cell membrane diagram showing phospholipids and complex carbohydrates from Wikimedia Commons

I am definitely a visitor in the field of metabolomics. Only recently have enough researchers been trained in this field that departments wanting this emphasis can hire a professor in the space. In my final year at Vanderbilt University (2015), I organized a course demonstrating two of the most widely used tools in the field: XCMS and METLIN. From what I have read in metabolomics papers, 2015 is essentially before the last ice age. Still, I felt these lectures could be useful to newcomers in the field.

Title | Slides | Length
The Origin and Tools of the ProteoWizard Project | PDF | 0:40:46
Advanced Options in ProteoWizard msConvert | PDF | 0:40:30
XCMS Feature Finding and Retention Time Warping | PDF | 0:41:30
Statistics of differentiation | PDF | 0:35:55
METLIN and MassBank | PDF | 0:46:55
Lipid Identification with Greazy and LipidLama | PDF | 0:32:51

Clinical Biomarkers

96 well plate image from Wikimedia Commons

The Department of Biotechnology at University of the Western Cape sought to broaden the medical content available in their B.Sc. Honours curriculum, so I teamed with talented post-doc Dr. Caroline Beltran to create a module. 2020 will represent the third year the course is presented; in 2019, when this was filmed, Dr. Byron Reeve assisted by teaching the gene expression segment.

Title | Slides | Length
Introduction to Biomarker Research | PDF | 0:48:14
Proteomics for Biomarker Discovery | PDF | 0:54:12
Immunoassays | PDF | 0:50:26
Luminex Data Analysis | PDF | 0:48:36
Inborn and Acquired Genetic Biomarkers | PDF | 1:18:15
Gene Expression Biomarkers | PDF | 1:02:31
Receiver Operating Characteristic Curves | PDF | 0:47:27
Introduction to Machine Learning | PDF | 1:08:59

Statistically Speaking

This 1930s poster from the Eugenics Society demonstrates the language used to make their forced sterilization policies palatable. (courtesy of Wellcome Library)

Many students experience statistics as a very dry topic. When the University of the Western Cape asked me in 2017 to assemble a course in biostatistics, I decided it was a good opportunity to “decolonize” the subject. Many of the great advances in statistics came from researchers who had strong ties to the burgeoning eugenics movement. As I felt more comfortable at the helm over the twelve-week course, I incorporated more and more of the social context in which these researchers were laboring. Source scripts can be found in Google Drive. [Sept. 28, 2021: I updated the links to PDFs of slides to reflect a change in Google Drive security.]

Title | Slides | Length
Measurements and Distributions | PDF | 1:03:14
Spread and Conformance | PDF | 0:51:12
Correlation is not Causation | PDF | 0:50:11
Linear Regression | PDF | 0:56:07
Difference Tests | PDF | 0:44:52
ANOVA and the Tukey HSD Test | PDF | 0:58:33
Contingency Tables | PDF | 0:57:49
Two-way ANOVA and Repeated Measures | PDF | 1:01:46
Dimensionality Reduction and Principal Components Analysis | PDF | 0:57:44
Agglomerative and Divisive Hierarchical Clustering | PDF | 1:00:08
Multiple Testing Correction | PDF | 1:07:40
Power Analysis | PDF | 0:51:34
Hands On: Create a Histogram and Boxplot in Excel | Inputs | 0:25:06
Hands On: Create a Histogram and Boxplot in GraphPad Prism | Inputs | 0:14:42
Hands On: Create a Histogram and Boxplot in R | Inputs | 0:15:45

Introduction to R Statistical Environment

The R logo, from CRAN

The free R language has become one of the most commonly used for biostatistics because of its inherently vectorized operations and its ability to incorporate libraries of advanced functionality, particularly the “Tidyverse.” This five-part series is designed to teach the “first rung of the ladder” to newcomers. The scripts to support the class can be found in Google Drive.

Title | Slides | Length
Why bother learning R? and basics | PDF | 1:06:42
Reading and Writing Files | PDF | 0:39:34
Conditional Execution, Looping, and Functions | PDF | 0:45:33
Visualizing Data with GGPlot2 Library | PDF | 0:43:38
Interpreting Difference Tests | PDF | 0:51:48

Introduction to Python for Bioinformatics

Biopython represents a powerful suite of tools for computational biology.

Python is still quite a new language for me, but it clearly has a lot of momentum on its side! For students who want to create software of broader capability than what R can support, Python is a great choice. I decided to build my six-session workshop on the project of reading a FASTA file and evaluating and visualizing the frequencies of amino acids. It’s a helpful way to get started with the basics of this very powerful language! The Python code can be found in Google Drive.
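For readers who want a feel for that workshop project before watching, here is a minimal sketch of reading FASTA text and tallying amino acid frequencies. It is not the course code from Google Drive. The function names `read_fasta` and `amino_acid_frequencies` and the two made-up protein records are assumptions chosen for illustration, and a text bar chart stands in for a real plot so that only the standard library is needed.

```python
from collections import Counter
from io import StringIO


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def amino_acid_frequencies(records):
    """Return each residue's fraction of all residues across the records."""
    counts = Counter()
    for _header, sequence in records:
        counts.update(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {residue: n / total for residue, n in counts.items()}


# Hypothetical two-protein example, invented for illustration only.
EXAMPLE_FASTA = """>protein_1 made-up example
MKVL
AAGK
>protein_2 made-up example
MAAK
"""

frequencies = amino_acid_frequencies(read_fasta(StringIO(EXAMPLE_FASTA)))
for residue, fraction in sorted(frequencies.items(), key=lambda item: -item[1]):
    # A crude text bar chart stands in for a real plot.
    print(f"{residue} {fraction:6.1%} {'#' * round(fraction * 40)}")
```

To run this on a real file, replace `StringIO(EXAMPLE_FASTA)` with an opened file, for example `open("proteins.fasta")`, where the file name is again only a placeholder.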

Title | Slides | Length
Why Python? and Essential Concepts | PDF | 0:47:41
Sets and Iterating Loops | PDF | 0:42:00
Reading Files | PDF | 0:28:22
Exceptions and the Collections Data Structure | PDF | 0:40:28
Data Structures and Plotting | PDF | 0:43:22
BioPython and NumPy | PDF | 0:42:06

Careers in Research

Scientists from iEmoji.com

From time to time, I get the chance to help graduate students with some of the essential skills of being a scientist. Perhaps my favorite of these talks is the first, which explains that going to graduate school is not a mistake, even for someone who decides that they will not stay in biomedical research afterwards! Researchers can really benefit from encouragement, just like anybody else.

Title | Slides | Length
Research Skills for a Stronger South Africa | PDF | 0:38:36
The Unholy Trinity of Research Ethics | PDF | 0:43:59
Creating a solid research poster | PDF | 0:37:13
Preparing and Delivering Scientific Presentations | PDF | 0:51:12
Zotero Citation Manager | none | 0:49:48
Social Media for Scientists | none | 0:45:36

Copyright and Usage Information

If you are a professor using these lectures as part of a class, you are welcome! Please let me know that you are using them by email (I am dtabb over at sun.ac.za) or by a comment below. If you need to use a slide or two from my presentations, I am willing to send you the PPTX rather than leaving you to copy images from the PDF.

I choose to license these materials as Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0), meaning “You are free to:
Share — copy and redistribute the material in any medium or format
Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.”

56 thoughts on “Free online training in bioinformatics and biostatistics!”

  1. Reuben Kombe

    I am a molecular biology and biotechnology graduate from UDSM, currently a Quality Assurance Officer … This knowledge is really very helpful in research studies.

  2. Dr. Mutiu A. Alabi

    Thank you Prof. Since it’s an online course, can’t there be evaluation and assessment at the end of the course as well as certification? Thank you as I await your response.

    1. dtabb1973 Post author

      Hi, Dr. Alabi. While I have made the lectures available online, this is not a “MOOC.” Addressing questions and setting assessments is something I do for the groups I am teaching directly, currently limited to universities I visit in South Africa and workshops I run at conferences. Creating a MOOC is something that would take a lot of effort, and I’m not funded to do that right now. I would try to help with questions if you are planning to teach this yourself, though! Thanks.

  3. Maryam Naseer

    Please provide Python-based bioinformatics questions to solve, to give an idea of what exactly Python looks like in bioinformatics and what we can do with Python in bioinformatics.

    1. dtabb1973 Post author

      The support materials for the Introduction to Python course (six sessions) include source code that we run through the Python interpreter. Six hours is probably not enough to convey more than the idea of the language, though. For a fuller introduction, I would suggest someone like Peter van Heusden (https://twitter.com/pvanheus) or Nathan Edwards (https://edwardslab.bmcb.georgetown.edu/) or Sam Payne (https://biology.byu.edu/sam-payne-lab). Really, the internet is littered with tutorials in Python; mine is just intended to give biomedical students their first look at the language.

      1. SSALI IBRAHIM

        This is very helpful. Thanks for the kind heart to help fellow scientists. I am a biomedical laboratory technologist. I am now pursuing a master’s degree in Bioinformatics.


  4. Dativa pereus

    Thanks, Prof. I’m a graduate in molecular biology and biotechnology and hold an M.Sc. in molecular plant systematics. I am currently looking for a PhD in bioinformatics, and I hope the course will be helpful.

  5. Kulsoom

    Thank you so much for the course outline you have created. I am a research student in the field of proteomics, and this will be quite helpful in revising and reinforcing the concepts. Thanks again for the effort and time you are putting in, making things more feasible for others.

    1. dtabb1973 Post author

      I would suggest you start by clicking on the title of a lecture topic that appeals to you; it should open the YouTube video corresponding to the title. It will be a lot easier to follow along if you have also downloaded a PDF of the slides, which is available from the link labeled “PDF.”

  6. Andrea Lius

    I’m starting my PhD program this fall. I will start my rotation in a proteomics lab this August, and the PI sent me a link to this site so I could prepare well. Thanks, Dr. Tabb! This will be really useful!

  7. Peace Onwuzurike

    So glad to be here and to have access to this long-sought information. I am a PhD student working on P. falciparum resistance studies. Please can you share the PowerPoint version with me, because I have great passion and need to teach upcoming students?
    My email is evangelonwuzurike@gmail.com

    1. dtabb1973 Post author

      Hi, I am glad you are finding these useful! No, I will not be offering certificates; I am blind to who is watching what, and I have no way to estimate the information people have absorbed. For that, you would need to attend a massive, open, online course!

  8. Reynaldo (@Reynaldommelo)

    Thank you very much, professor. Great course; all the lectures I’ve seen so far are very interesting and exciting, and they really hold my attention. I have already learned a lot from them and intend to watch them all, especially the statistical ones. I’m a Ph.D. student in Brazil. I’m currently preparing some lectures for future classes in proteomics, and yours will surely inspire me in their development. Once again, thank you for your initiative.

