Building a Bioinformaticist

Stellenbosch University, like many universities worldwide, requires researchers who attain the status of “full” professor to deliver an inaugural lecture.  These talks are intended to be accessible to a broad audience.  For my lecture, delivered August 9th, 2018, I was asked to create a short booklet that explained my career.  I opted for a somewhat earthy version of my biography.  I hope you will enjoy it!


From left to right: Prof. Gerhard Walzl, Vice-Rector Prof. Eugene Cloete, Prof. David Tabb, Dean Prof. Jimmy Volmink, Vice-Dean Prof. Julia Blitz, Vice-Dean Prof. Nico Gey van Pittius. Photo Credit: Wilma Stassen

This document contains several names in bold print. I credit all of them as mentors who helped a frequently stubborn, competitive and anxious young man to develop the set of skills necessary for a life in research (including “soft” skills such as diplomacy, empathy and patience).

Short biography

David L. Tabb was born in November, 1973 near Kansas City, Missouri, in the United States. After graduating high school in Blue Springs, Missouri, in 1992, he attended the University of Arkansas as a Sturgis Fellow for a four-year Bachelor of Science program. He graduated in 1996 with a major in Biology and a minor in Computer Science. His choice of the PhD program in Molecular Biotechnology at the University of Washington proved significant as he trained under John R. Yates III, a pioneer in the nascent field of proteomics. David defended his dissertation in 2003, propelled forward by five first-author publications. After a two-year postdoctoral fellowship at Oak Ridge National Laboratory, he joined the faculty of Vanderbilt University in Nashville, Tennessee, as an assistant professor in the Department of Biomedical Informatics. He was tenured and promoted to associate professor in 2011. In 2015, he left Vanderbilt and moved to Cape Town, South Africa, to join the Division of Molecular Biology and Human Genetics at Stellenbosch University’s Faculty of Medicine and Health Sciences.

The key themes that motivate Prof Tabb’s research are revealing biological information from experimental data, removing roadblocks to quality proteomics experimentation, and bioinformatics education.

Did the biology come first, or did the computers?

One of the most important events in my life took place in 1981. I was seven years old when my parents decided to acquire a Commodore VIC-20 computer for our home. My father came to my elder brother, Tom, and solemnly handed him the manual that explained the BASIC computer programming language. Being the oldest, Tom would have the first opportunity to learn programming. Happily, I was able to persuade him to give me the manual very soon thereafter. From that night forward, that book changed the course of my life.


William Shatner advertised the computer on which I learned to program.

I was fascinated by our new computer. I spent every spare hour I could with it. The computer proudly announced at boot that it had 3 583 bytes free for my programs and data. While that might sound impressive, keep in mind that the storage in any digital watch you can buy today dwarfs that size. We eventually acquired a cassette tape drive and a printer, but in those first days, my programs lasted only so long as the computer remained powered up. Nevertheless, I was hooked. Learning to program taught me many concepts long before they were introduced to my class in junior high or high school.

Blue Springs High School, near Kansas City, Missouri, was home to one of the finest biology departments in the state. By the time I graduated high school in 1992, I had taken a full year of Life Science, a full year of Biology, a full year of Advanced Placement Biology, and a semester of Microbiology. I even got the opportunity to be a research assistant through the summer months of 1991 in a molecular biology laboratory under Michael Lockhart at Northeast Missouri State University. I loved the subject, and fully expected to become a “bench” molecular biologist. Arriving at the University of Arkansas, I was one of the first students to major in Biology. The Arkansas campus created a unified department from pre-existing ecology, microbiology and pre-medical programs, and I was thrilled at my exposure to cell biology under Douglas Rhoads, who became my mentor. One of my favorite pieces of advice from him came when I considered changing my major to Computer Science: “David, if you stay in Biology, nobody will ever object to your programming. If you stay in Computer Science, you’ll have to fight to do anything in Biology.” Because I was part of the four-year honors program at Arkansas, I benefited tremendously from conversations with Suzanne McCray, the then honors adviser (and now vice-provost for enrollment and dean of admissions for the university).


Suzanne McCray kept me on course to graduate despite my many interests.

I experienced something of a crisis in my third year of college, though. I spent the first half of the year in an internship at the University of Lyon, France, under Thierry Massé. This was my first experience spending all day, every day, at the lab bench. I developed my pipetting thumb, learned the joys of the fume hood, and even spent some quality time with an electroporation apparatus. One day, however, I was casting a polyacrylamide gel when I made the mistake of looking too closely at the print on the bottle. “Cumulative neurotoxin,” it warned. Staring back at the bottle, I silently considered how many gels of this type I was likely to pour in my career. My squeamish nature grew panicky.

Instead of continuing my planned year of research, I opted to take part in a “cooperative education” internship during the latter half of that year. I became a “scientific applications programmer” at G.D. Searle Pharmaceuticals in Skokie, Illinois, under Jonathan Boettcher and Lori K. Raymond. I developed some skills as a FoxPro database programmer. I also gained some experience managing a team of temporary workers tasked with updating the computing hardware and software across the company campus. My primary takeaway from the six-month experience was that I could work professionally in computers, not just play with them as a hobby, and would still enjoy the interaction.

I returned to the University of Arkansas, determined to jam as much computer science coursework into my final year of college as possible. It was a steep climb because I enrolled in the demanding data structures and algorithm analysis course at the same time as its prerequisite, and both were taught in a different language from the one I had seen in the first-semester programming course! I was grateful that the chair of Computer Science, Greg Starling, agreed to serve on my undergraduate honors thesis committee in 1996.

Proteome informatics: building the bridge as we walked across it

My 1996 arrival at the University of Washington to begin my PhD found me in a more subdued and uncertain state than ever before. I had sought graduate programs where I could train as a computational biologist or bioinformaticist, but I worried that I had a far less mathematical background than others in the field. My initial adviser, Debbie Nickerson, pointed down the hall to the proteomics laboratory. “I hear that those guys need people with bioinformatics skills,” she reported. “Go do a rotation with John Yates.” I am very glad I did.


For years, I was the only graduate student in John Yates’s laboratory.

John R. Yates III has an interesting attitude about graduate students: He thinks of them as underperforming postdocs. As the first PhD student to be mentored by John, I know we each learned some useful lessons about doctoral advising along the way. John was a rather unusual analytical chemist in that he invested considerable effort in pairing the key instrument of our trade (the tandem mass spectrometer) with algorithms that could help us identify proteins in mixtures very sensitively and quantify those proteins with precision. Therefore, it was no surprise when he was awarded the Biemann Medal by the American Society for Mass Spectrometry in 2004. I joined his laboratory soon after his 1994 publication of SEQUEST, an algorithm to identify peptides represented in MS/MS (tandem mass spectrometry) scans by comparing their fragments with those expected to result from sequence database peptides. It is quite difficult to imagine the field of proteomics without such “database search algorithms”, and more than 30 similar algorithms have been published in the two decades since then. Arriving at Yates’s lab in 1996 gave me the chance to be part of the proteome informatics community almost from its genesis.
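For readers who want to see the idea of a database search in miniature, here is a rough Python sketch. It is not SEQUEST, which scores candidates by cross-correlation; it simply counts how many predicted fragment ions land near an observed peak. The function names, the 0.5 m/z tolerance and the candidate peptides are all assumptions made for illustration; the residue masses are standard monoisotopic values.

```python
# Monoisotopic amino acid residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_mz(peptide):
    """Predict singly charged b- and y-ion m/z values for a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b ion: N-terminal piece
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y ion: C-terminal piece
    return ions


def shared_peak_count(observed, peptide, tolerance=0.5):
    """Count predicted fragments that have an observed peak within tolerance."""
    return sum(
        any(abs(peak - mz) <= tolerance for peak in observed)
        for mz in fragment_mz(peptide)
    )


def best_match(observed, candidates, tolerance=0.5):
    """Return the candidate peptide whose predicted fragments match best."""
    return max(candidates,
               key=lambda pep: shared_peak_count(observed, pep, tolerance))


# Hypothetical usage: a noiseless "spectrum" generated from one candidate.
observed_peaks = fragment_mz("PEPTIDE")
print(best_match(observed_peaks, ["GASVLKR", "PEPTIDE", "EDITPEP"]))
```

A real search engine would also filter candidates by precursor mass, handle multiple charge states and noise peaks, and use a far more discriminating score than a simple count.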


In graduate school, I was lucky to occupy a compact office, with its own door!

My graduate school experience consisted of four years of wandering, followed by three years of explosive writing. I frequently battled with feelings of failure in graduate school: I had excelled at high school and in my undergraduate years, but graduate school required a very different set of skills. The clouds and limited sunlight of Seattle contributed their own feelings of immobility. When John Yates moved his laboratory to The Scripps Research Institute in 2000, though, I was filled with new energy.

Mike Washburn and Laurence Florens made a huge difference when they convinced me that coding work I had already completed was worthy of publication. Like many graduate students, I needed help to understand the right amount of content needed for a paper. I soon submitted a manuscript describing my “DTASelect” software to a new journal from the American Chemical Society, and it became the very first research article that the Journal of Proteome Research ever published. I was delighted to have “points on the board” at last, and I scrambled to add more manuscripts to my roster.

How does the sequence of a peptide affect the set of fragment ions we observe in its tandem mass spectrum? This question had been investigated by studying MS/MS from sets of peptides with strong relationships among their sequences. With DTASelect, though, I could develop collections spanning thousands of distinct peptide sequences. With significant and persistent guidance from Vicki Wysocki at the University of Arizona, I was able to publish two very early looks at fragmentation. Unknowingly, I had “scooped” a large consortium that was publishing a more statistically sophisticated analysis.


Vicki Wysocki ensured that my bioinformatics training was grounded in chemistry. Image credit: Lauren Owens.

Graduate school also marked my first brush with spectral clustering, an attempt to find spectra that are produced repeatedly from proteome experiments, whether or not they were identified. In the days before routine Fourier transform (FT) or time-of-flight (TOF) measurement of ions, we frequently identified less than 10% of the MS/MS spectra we collected from a sample, so clustering represented a way to dig into this “dark matter”. I was glad to return to this field in 2005, and again in 2016. It is encouraging to see that a topic that seemed interesting yet tangential when I was in graduate school can do genuinely heavy lifting for repositories containing thousands of experiments.

At this early stage for proteomics, it was clear that the data enabled many kinds of analyses. Ironically, my graduate project from John Yates automated an alternative means of identifying tandem mass spectra – one that had been proposed by my adviser’s rival, Matthias Mann. If we could make it work well, “sequence tagging” could potentially compete with the SEQUEST “database search” software that John’s group had created in 1994, by allowing for highly flexible identification capable of recognizing novel post-translational modifications in our datasets. The 2003 paper unveiling the core of my PhD project began a thread of research that would eventually become my first National Institutes of Health (NIH) research project grant (R01) once I became an assistant professor.

During my time in graduate school, proteomics was transitioning from the “Wild West” to become a more restrained field. Journals such as Molecular and Cellular Proteomics developed rules to protect against a flood of false positive findings, and the field of proteome informatics began emerging as a recognisable discipline. Controlling the error rate of published identifications became a priority, and using parsimony rules to guide the inference of protein identities from peptide lists became the norm. Software workflows such as the Trans-Proteomic Pipeline entered widespread use, particularly as the new companies LabKey and Proteome Software produced commercial implementations. This, however, is not to say that the fields of bioinformatics and computational biology had accepted their newest outgrowth as part of their core. To this day, in fact, one can easily find a syllabus for a bioinformatics class that entirely omits algorithms for protein identification.
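The parsimony idea mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a greedy set-cover heuristic, which approximates a minimal protein list but is not the specific rule set of any tool named in this text, and the function name and the toy protein and peptide labels are hypothetical.

```python
def parsimonious_proteins(protein_to_peptides):
    """Greedily choose proteins until every observed peptide is explained.

    protein_to_peptides maps a protein name to the set of identified
    peptides that could have come from that protein.
    """
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # Take the protein that explains the most still-unexplained peptides;
        # sorting the names first makes ties resolve alphabetically.
        best = max(sorted(protein_to_peptides),
                   key=lambda prot: len(protein_to_peptides[prot] & uncovered))
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen


# Hypothetical example: protein B adds nothing beyond what A already explains.
hits = {
    "A": {"pep1", "pep2", "pep3"},
    "B": {"pep2", "pep3"},
    "C": {"pep4"},
}
print(parsimonious_proteins(hits))
```

Here protein B is left off the list because every peptide that maps to it is already accounted for by protein A, which is the essence of reporting the smallest set of proteins that explains the evidence.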

Who am I? Building a professional identity

“You’re throwing away your career.” These words from an adviser in my graduate department were the shocking appraisal of what I could expect from the site I had chosen for my 2003 postdoc: Oak Ridge National Laboratory (ORNL). This was neither the first nor the last time I would be faced with such a negative appraisal. I can still hear my friend from high school lambasting my choice of the University of Arkansas. He broke out every tired stereotype about the American South, telling me I would never be able to leave. When I decided in 2015 to move outside the United States, I would hear the same refrain, this time with dire warnings added about the risk of violent death. I am convinced, though, that staking out your own place as a professional requires making the occasional unpopular decision.


While my initial office was in the Computational Sciences building (pictured), I soon moved to the venerable 4500S complex.

At ORNL, I experienced a tension that would recur throughout my career. Am I an analytical chemist or a bioinformaticist? I defended my dissertation in a mass spectrometry laboratory, but now I worked in a genome analysis and systems modelling group that was almost entirely dedicated to genome informatics and computational biology. Unfortunately, my diplomacy skills were still problematic, and my formal adviser and I were quickly at loggerheads. Frank Larimer, the head of the group, soon stepped in and explained I would work directly for him instead. “I hear the people in mass spectrometry need some tools,” he said. “Why don’t you go see what they need?” His quiet word neatly cut the Gordian knot: I was now officially on loan to the Chemical Sciences Division. I quickly gravitated to the team of proteomics researchers under Bob Hettich, which formed part of the Organic and Biological Mass Spectrometry team under Gary van Berkel.

I enjoyed the research in the Chemical Sciences Division, and was taken with the quiet countryside of Tennessee. I published my first database search engine at ORNL, and working to support the massive Genomes-To-Life project was an interesting challenge too. In working for ORNL, I felt I was working for the public good – although the site is operated by a business, it is a lab of the federal government. Unfortunately, I did not feel I could stay. The tension with my initial adviser remained and it seemed clear he would stand in the way of my becoming a staff scientist with the lab.


I was delighted to be on the same team as Dan Liebler.  He recruited me to Vanderbilt, and his departure signaled me to find another home as well.

I was intrigued by a telephone call from Daniel Liebler at Vanderbilt University in Nashville. The university had invested significant funds in the creation of the top-flight Mass Spectrometry Research Center, but felt they needed to establish a bioinformatics laboratory next door to the center to extract as much information as possible from each experiment. Soon I was an assistant professor in the Department of Biomedical Informatics under chairman Dan Masys and later Kevin Johnson. The position gave me the best of both worlds: I was next door to the mass spectrometrists, but I reported to informaticists!

Within three years of arriving at Vanderbilt, I had taken responsibility for the semester-long BMIF 310 Foundations of Bioinformatics class. Because its prior incarnation had frustrated student and professor alike, I was given free rein to write it from scratch. This challenge from the vice-chair for education, Cynthia Gadd, intimidated me: Was I the right person to define the set of topics people at Vanderbilt would call “bioinformatics”? I decided to portray bioinformatics as the set of informatics techniques that have grown alongside molecular biotechnology/systems biology. I tried hard to frame a course that emphasized the essentials of the field, whether or not I would feel comfortable teaching all the lectures. I was particularly grateful that I could rely on my friend Bing Zhang to cover gene expression and pathways and networks – topics I found mysterious at first. I was very fortunate that other departments at Vanderbilt bought into this plan of attack. Students in Biomedical Informatics were required to take my course, but they were usually accompanied by an equal number of students from Human Genetics and other departments who used the class as an elective, not to mention any number of auditors. In 2012, I even won the “Outstanding Educator” award for my department – a commendation of enduring pride for me.


In the summer of 2011, my Vanderbilt group reached its maximum size.

The experience of creating BMIF 310 taught me a lot about how broad a discipline bioinformatics has become. Essentially, all molecular biologists and all geneticists will be forced to interact with bioinformatics as part of their research. This may come as a brutal disappointment for those who hoped to escape mathematics by entering biology! We might compartmentalize bioinformatics into a three-way split: raw data processing, field-specific bioinformatics, and data integration. Many of our good ideas in raw data processing have found broad application, such as the detection of fluorescence (as essential in a massively parallel sequencer as it is in confocal microscopy) or the fast Fourier transform (used for detecting the “beat” of electropherograms for Phred base-call quality scores or to speed up cross-correlation in proteomics). My laboratory at Vanderbilt was a co-creator of the ProteoWizard library, which is a ubiquitous tool for converting raw data from mass spectrometers. Certainly if one is working within a single discipline, plenty of key tools will be used with high frequency (such as PLINK for genome-wide association studies, or a database search engine for proteomics). Perhaps the most widely used tool of this type with which I have been associated is Skyline – easy-to-use software for handling quantitative proteomics experiments. As we move to data integration, spanning many different disciplines, we rely more on biological pathways and networks to relate metabolomics, proteomics, transcriptomics and genomics findings. Here I would turn to my friend Bing Zhang, who created the free WebGestalt and NetGestalt tools. Of course, it is lovely to publish an algorithm that yields more widgets than the last one published, but bioinformatics really thrives when it can produce tools that enable bench researchers who don’t care about bioinformatics to do something new with their data.


Matthew Chambers was frequently the engine behind the laboratory’s greatest software accomplishments. Image credit: Lauren Owens.

Bioinformaticists must get their hands dirty or they may never come to understand which parts of their work are useful in other fields. My association with Dan Liebler led to my double involvement in the Clinical Proteomic Technology Assessment for Cancer (CPTAC) program of the National Cancer Institute (NCI). CPTAC funded the first R01 application I ever wrote, and the NCI soon expanded its three-year budget to a five-year one. CPTAC also funded a five-year U24 application (NIH application code for resource-related research projects – cooperative agreements) from Dan Liebler, which called on another 25% of my effort, and then they funded another U24 from him for a second set of five years. In short, CPTAC was the most enduring collaborative effort for my ten years at Vanderbilt University. It certainly played a major part in my receiving tenure and promotion to associate professor in 2011.

I wrote two papers on behalf of a working group for CPTAC, and both came at substantial cost. Proteomics as a field has struggled with the perception that it is irreproducible: If the same sample is processed in two different laboratories, one may receive rather different peptide and protein lists from the two sites. We decided to centralize sample preparation and data handling for a few different preparations of a yeast proteome sent to six different instruments (three linear ion traps and three Orbitraps). I then used these data to characterize the repeatability (technical replicate variation) and reproducibility (cross-lab, cross-instrument, etc.) in the identifications produced from the sample. Writing a manuscript that pleased all 32 authors was quite the struggle, with multiple restarts and scope changes. After a particularly nasty peer review from a Nature family journal, I nearly lost control of the manuscript; colleagues sought to carve it into a few paragraphs for other papers from the working group! Yet I am very proud of the paper that resulted, and Google Scholar reports that more than 300 other papers have cited it so far. After this experience, I was very leery about taking on another such effort for CPTAC working groups, but I did it again when Dan Liebler asked a few years later. This time, I was wrestling with quantitation spanning more than 1 000 liquid-chromatography MS/MS experiments for a pair of mouse xenograft tumors. I submitted that manuscript only after I retired from Vanderbilt University in 2015, and it was accepted within my first month serving at Stellenbosch University.


From left to right: Vice-Chair Cindy Gadd, Dr. Ze-Qiang Ma, Associate Prof. David Tabb, and Chair Kevin Johnson

So, if all was going so very well, why would I retire from Vanderbilt University? And why is a middle-aged academic using words such as “retire” anyway? The truth of the matter is that while the external signs showed me to be a highly productive professor, inside I frequently felt like a failure. I knew my contract with Vanderbilt specified that I was to cover 80% of my salary through external grants, but I never reached that number (though I did get close). I submitted a fair number of R01 and other grant applications, but I frequently saw the words “not discussed” in response. In 2013, I handled a challenging personal crisis that left me feeling quite empty. By 2015, I had developed vestibular migraines – a condition that occasionally found me lying on my office floor. I needed to restructure my career in a way that revitalized me, whether or not it seemed less prestigious to some of my colleagues.

Even a middle-aged dog can learn new tricks: new life in South Africa


Gerhard Walzl is the reason I came to South Africa.

Gerhard Walzl and the South African Medical Research Council (MRC) constructed a role I could play here in South Africa. It has become clear to many tuberculosis researchers that building and executing a data analysis plan for biotechnology-heavy experiments has become the limiting step to completing projects. South Africa needs bioinformatics researchers to drive its future. The MRC agreed to pay 60% of my effort for this five-year contract. My marching orders are quite clear: “Replace yourself!”

My team for this goal has several layers. The core group comprises Gerard Tromp, Gian van der Spuy and me – the three professors at the heart of the South African Tuberculosis Bioinformatics Initiative, or SATBBI. Our team mentors trainees at the postdoctoral, doctoral, master’s and B.Sc. honours level. As a group, we have implemented modules for bioinformatics and biostatistics education as part of the B.Sc. honours training offered by the Division of Molecular Biology and Human Genetics, and we were delighted this year to have students joining us from the Division of Medical Physiology B.Sc. honours program as well. Collectively, these topics are called “numeracy” training, and we are glad to see that it is gaining momentum on campus.


Gerard Tromp (second from left) and Gian van der Spuy (rightmost) inspire me with their unflagging motivation.

Working at Stellenbosch has engaged my research interests at many levels. My friend Mare Vlok manages the CAF Proteomics Laboratory. My interactions there have ranged from constructing a file server to publishing a complex “proteogenomics” analysis of Mycobacterium tuberculosis data in conjunction with the Samantha Sampson and Rob Warren laboratories. When friends from mass spectrometry ask what has changed in my research, though, I almost always answer: “I have fallen in with immunologists!” It is true that this has stretched me as a scientist. Immunologists make much greater use of assays based on flow cytometric separation of cell types, as well as immunoassays for cytokines and chemokines. Developing my skills in these areas to become useful to the professors around me is very much a work in progress. I have been delighted to be included in work on wildlife tuberculosis as well. Work with Sven Parsons and Michelle Miller in sequencing the spotted-hyena transcriptome will be the paper I anticipate most highly for 2019. Coming to Stellenbosch University has pushed me in new directions, and each day that I get to work on research recharges my batteries.

I view Stellenbosch’s Faculty of Medicine and Health Sciences as my home campus, and I have been grateful for the welcome I have received here. The MRC, however, has made it clear that they want to see me having an impact on research at other local institutions also. As a result, I sometimes joke that I have become a “circuit bioinformaticist”, travelling among four institutions every week. I spend my Tuesdays at Groote Schuur Hospital, which affords me the opportunity to visit the team assembled by Jonathan Blackburn in the University of Cape Town (UCT) Division of Chemical and Systems Biology. Since his team produces high-quality proteomics data on a regular basis, I get to indulge myself in my favorite field through that weekly interaction. Working with Nicki Mulder’s Computational Biology team has brought me broader exposure via the H3ABioNet, an annual training program for more than 30 research groups across the continent. In 2018, we will add proteomics content to this program for the first time. I also try to spend some time with Tom Scriba at the South African Tuberculosis Vaccine Initiative, whose team has patiently worked with me to ensure I understand flow cytometry and its more recent derivative, CyTOF. One day a week for more than two years can add up to quite a lot of collaboration. In 2016, I was named an honorary professor in the UCT Division of Chemical and Systems Biology.


Prof. Jonathan Blackburn’s team at the University of Cape Town graciously hosts me every Tuesday.

Those Tuesday visits to Groote Schuur also enable me to interact with managing director Reinhard Hiller and his team at the Centre for Proteomic and Genomic Research (CPGR). I always enjoy my discussions with Liam Bell, the proteomics application specialist for CPGR. His laboratory serves the needs of proteomics users throughout greater Cape Town, and my interactions at other sites frequently involve data produced at CPGR. Reinhard Hiller and Tim Newman at CPGR are instrumental to the DIPLOMICs effort, which is intended to make it easier for researchers who need biotechnology services anywhere in South Africa to know exactly who can confidently provide those services at what price. The relationships I have built through DIPLOMICs were quite helpful when Liam and I penned a recent paper detailing the growth of mass spectrometry throughout South Africa.


The Singing Sensations make joyful noises at each Spring Gala. Image credit: Nardus Engelbrecht.

After spending my Wednesday back at Stellenbosch’s Faculty of Medicine and Health Sciences, I am once again on the road for Thursday at the University of the Western Cape (UWC) Department of Biotechnology. Working with Ndiko Ludidi, first as head of department and now as deputy dean, has been very rewarding. His leadership has energized efforts to see that Biotechnology graduates are ready to serve as developers instead of mere users of biotechnology tools. I have enjoyed working with Ashwil Klein, the manager of the Proteomics Research and Service Unit (also supported by the Agricultural Research Council), to acquire new instruments for the unit and to ensure that the team can publish their work in international journals. Our recent forays have begun involving faculty from the Statistics and Population Studies Programme in analyzing data from proteomics for two-factor studies.  UWC has also represented a place where I can develop my skills in didactic training. In this year alone, I will contribute seven different lectures to the B.Sc. honours program under Bronwyn Kirby, encompassing next-generation sequencing informatics, protein identification through MS/MS, and an entirely new Special Topics module on clinical biomarkers. For me, teaching a 12-week program in Biostatistics was certainly a highlight of 2017. That opportunity targeted graduate students of the Institute for Microbial Biotechnology and Metagenomics under Marla Trindade. I was very grateful to be named extraordinary professor in the UWC Department of Biotechnology in 2016.

I had expected that moving to South Africa would pull me out of the never-ending conference circuit. I had anticipated that I would travel to the United States every year for the annual conference of the American Society for Mass Spectrometry and would spend the rest of my year on the African continent. How very wrong I was! Soon after I moved to Cape Town, I was asked to chair the Quality Control working group for the Human Proteome Organisation’s Proteomics Standards Initiative (HUPO-PSI). The annual workshop for the group changes location each year, but has never come to Africa. I have made it my goal to see HUPO-PSI hosted by South Africa. We are one of the contenders for 2019, though we have stiff competition. As with all things in academia, one opportunity led to another; I was delighted in 2017 to get the chance to teach quality control at a clinical proteomics meeting in Russia. This was my first visit to the nation, and I was very conscious that not many of my friends from the United States would ever see its cities. Rather than representing a caesura in my travel schedule, my time at Stellenbosch has marked an upswing in my chance to see the world.

The future of South African bioinformatics

One of my favorite moments in lecturing B.Sc. honours students does not appear on my PowerPoint slides, and I have written no script to guide me through it. I point to myself, and just as the students begin worrying whether I am crazy, I confirm it by saying: “I am NOT the future of South African bioinformatics.” I then point to the class and say: “You are.” Our students need to hear that we are training them to become our colleagues, not to mention our intellectual heirs. Reaffirming that they have been painstakingly chosen for this training and that we are committed to their success really matters. When students struggle with assignments, I have repeatedly seen their scores climb when the professor takes them aside to say he or she knows that they can win through these challenges.


Who knows where we can go together? Image credit: Luigi Bennet.

What does my own life story tell me about building bioinformaticists in my new home?

  • We must lay the groundwork early.

From primary school onwards, I expected to go to college. We were never without a computer in our home after I reached the age of eight. My high school offered a variety of special Advanced Placement classes to prepare students to enter college at an advanced level. Obviously, these advantages are not available to many learners at the primary and secondary education levels in South Africa. This has led to a “pump priming” problem at the college and graduate level: excellent training programs at universities are not always able to find excellent candidates. University personnel must actively engage local high schools to shape “rough diamonds.”

  • All biologists need numeracy training.

When we teach bioinformatics and biostatistics to our division’s B.Sc. honours students, we SATBBI professors are not directing our lectures only to those students who have shown an interest in a bioinformatics PhD. We teach this content because we know that all of these students will encounter problems that require this skill set and these tools. A biomedical researcher who does not understand how sample size relates to power, or who has no idea how to discern which sequence variants have phenotypic effect, is at a considerable disadvantage.

  • Many specialties can be adapted to bioinformatics excellence.

In my career, I have worked with bioinformaticists who completed Ph.D.s in physical chemistry, physics, astronomy, mechanical engineering, biotechnology, biochemistry and any number of other fields, but bioinformatics professors who have bioinformatics Ph.D.s are incredibly rare. One side effect is that our field sometimes struggles with imposter syndrome: “I cannot possibly be a real bioinformaticist!” The reality is that many of our field’s greatest accomplishments have come from just such individuals.

  • A bioinformaticist cannot sit in an ivory tower.

When a university concentrates all its bioinformatics personnel in one space, their computational skills may benefit, but their ability to solve real-world problems may well suffer. I am convinced that my close proximity to biological mass spectrometry throughout my career has been essential to my understanding of the tasks that most needed attention. Eventually, bioinformatics may have its own department on each campus, but for now, many practitioners in my field are housed with Pathology, Organic Chemistry, or Genetics. I believe this is a healthy arrangement.

  • Caring mentors make the difference for a student to feel at home in a strange environment.

Students need to hear: “I believe in you.” They need to know that their professors are committed to seeing that each of them gets a chance at success. And people do not stop needing encouragement when they become staff members either. Our colleagues should not feel they must apologize when traumatic events away from work cause a lapse in their work schedule. The world of academia will be healthier when departments see themselves as teams rather than competitors. Whenever I might have given up on research, someone who cared made the difference. I owe my colleagues the same level of attention. Thank you for making a difference.

