As DNA sequencing becomes faster and cheaper, biologists are increasingly turning to their counterparts in the bioinformatics field to help interpret the reams of data generated by their instruments.
The bioinformatics field is wide, and encompasses a range of different jobs, from doing preliminary analysis on data at a core lab to writing new algorithms to better find relevant patterns in genomic datasets. Stanford University School of Medicine's Yannick Pouliot—who's been involved in bioinformatics since the mid-1990s—says bioinformaticians are increasingly taking the lead in the lab, designing their own experiments and answering important biological questions. "[Bioinformatics] can be working with bench biologists who have a large bolus of data that they need to analyze, and therefore you're writing code and supporting them," Pouliot says. He calls that role "a service kind of position." On the other hand, he says, the job can be "much more investigational where you're doing actual research using computational tools, where this is now doing science on your own with computation."
Because so much of sequencing and data generation is done through automated instruments, bench positions are becoming less critical, he adds. "Yes, you do need someone who's doing the experiment, but frankly the important part is designing the experiment and interpreting the results, and the piece in the middle is less and less valuable," Pouliot says.
In 1992, when the Human Genome Project was getting started, the Harvard School of Public Health's John Quackenbush was one of the first people to apply for and receive a career development award from the National Institutes of Health to develop the field that would come to be known as bioinformatics. He has seen the field change in meaningful ways since then. "When I started working in this area, the big questions were, 'How are we going to sequence the genome, how are we going to put it together, how are we going to find the genes, and how are we going to annotate them?' Those were significant challenges," Quackenbush says. "There was no clear roadmap on how to solve any of them, so there was a tremendous amount of development of new methods, new approaches that were constantly challenged by the existing datasets. New technology, as it comes online, gives us the ability to generate new kinds of data in quantities that we had previously not had access to."
Thanks to the variety of programs and software now available, and the open source movement that allows researchers to share algorithms, databases, and datasets, Pouliot says bioinformatics is now "more an issue of coming up with good research questions than writing code."
Also at Stanford, Purvesh Khatri—who switched to a career in bioinformatics 13 years ago after working in software and communications engineering—says the advances being made in bioinformatics are leading to breakthroughs in the clinic. "In about 1999 or 2000, people were discussing how to normalize a microarray," Khatri says. "Now, instead of just figuring out how to analyze the data, we're trying to figure out how this will impact health. Because of the data that's available publicly, we can take the lead and ask questions that people didn't think were possible to ask."
Indeed, Pouliot says, bioinformaticians are "rewriting biology." By applying analytical tools to different diseases, bioinformatics is changing the way diseases are classified. Presently, diseases are "basically defined for historical reasons," Pouliot says. "And then you start applying these analytical tools and you realize it's much broader than originally described." Bioinformatics analysis can detect patterns, for instance, that indicate that drugs that are effective in disease A are also effective in disease B. "Because these aren't two diseases, but one disease manifesting differently," he says. In that way, Pouliot says, bioinformatics science has a practical side, leading to diagnostics and potential therapeutics.
Because bioinformatics encompasses such a wide range of jobs, it's important to have the right training and education. Quackenbush says bioinformatics sits at the nexus of biology, statistics, and computer science. A bioinformatician could work on anything from managing and collecting data, to developing laboratory information management systems, to doing large-scale comparisons of genomic datasets, so it's important to be well rounded, he says.
Having a firm grasp of basic programming languages, a working knowledge of data structures, and a good understanding of statistics is very important, according to Quackenbush. But it's also essential to have a firm grounding in the underlying biology.
Bioinformaticians who aren't biologists should at least learn the basics, like what a microRNA is and what role DNA methylation plays in transcription, Stanford's Khatri adds. He also makes it a point to read the available literature on a given subject before starting his own research, to familiarize himself with what's already been done.
Pouliot gives advice to prepare postdocs for careers in bioinformatics: "Get good at both designing experiments and thinking in statistical terms, because when you have vast amounts of data, the only way to interpret and get a signal is if you understand the properties of large datasets. You need to think statistically." He says he often sees those uninitiated in bioinformatics making the mistake of designing traditional experiments instead of statistical experiments. "They get their data and they see a pattern and think this is meaningful and go off chasing after the pattern, but it will almost certainly be noise."
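Pouliot's warning about chasing noise can be illustrated with a small simulation. The sketch below is not from anyone quoted here. It assumes a hypothetical experiment in which 10,000 "genes" are measured in two groups of five samples each, with no real difference between the groups. The function name, the gene and sample counts, and the choice of a simple z-test with a Bonferroni correction are all illustrative assumptions.

```python
import random
from statistics import NormalDist, fmean


def count_false_positives(n_genes=10_000, n_per_group=5, alpha=0.05, seed=0):
    """Simulate pure-noise data and count 'significant' genes.

    Every measurement is drawn from the same standard normal
    distribution, so any gene that looks significant is a false positive.
    Returns (hits at the naive threshold, hits after Bonferroni correction).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Standard error of the difference of two means when the
    # per-sample variance is known to be 1.
    standard_error = (2 / n_per_group) ** 0.5

    p_values = []
    for _ in range(n_genes):
        group_a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        group_b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        z = (fmean(group_a) - fmean(group_b)) / standard_error
        # Two-sided p-value from the standard normal distribution.
        p_values.append(2 * (1 - std_normal.cdf(abs(z))))

    naive_hits = sum(p < alpha for p in p_values)
    # Bonferroni: divide the threshold by the number of tests performed.
    corrected_hits = sum(p < alpha / n_genes for p in p_values)
    return naive_hits, corrected_hits


if __name__ == "__main__":
    naive, corrected = count_false_positives()
    print(f"'Significant' genes at p < 0.05: {naive}")
    print(f"After Bonferroni correction:     {corrected}")
```

With these assumed settings, roughly 5 percent of the genes (several hundred) should clear the uncorrected threshold even though no real signal exists, while few or none should survive the correction. Bonferroni is only one of several possible corrections and is used here because it is the simplest to show.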
Pouliot adds that there are people who come to the field from outside biology and learn the biological essentials, and biologists who learn programming and statistics. "There's a big mass of information you need to master," Pouliot says. "Take programming courses, data structure courses, data mining courses, statistics courses, write code, read books, and think translationally."
And try not to reinvent the wheel, he adds. Many techniques that have already been invented for other fields also work well in bioinformatics, and recognizing the similarities is a valuable skill.
Having the necessary skills could also pay off, both in salary and opportunity. "One thing people often ask me is, 'Is this a good career path?'" Quackenbush says. "It's a fundamentally valuable set of skills to have, and what we're starting to see is that because people can generate large datasets, they're starting to realize that they desperately need people with the requisite skills to interpret those datasets." At Harvard, he says, PhDs with the ability to do genomic analysis and computational biology "are being recruited for jobs almost faster than we can train them." Harvard's School of Public Health is also developing a master's program in computational biology to keep up with the demand from both students and employers, he adds.
If a postdoc were to ask Khatri whether bioinformatics is a good career path, "I would say yes before they even finish the question," he says. A postdoc in bioinformatics can expect to earn about 50 percent more than a postdoc in biology, he adds, and a bioinformatics research associate can expect to earn almost 50 percent more than a counterpart in biology.
Those estimates might be an exaggeration: Genome Technology magazine's 2012 salary survey put average salaries of bioinformaticians with up to 10 years of experience at between $50,000 and $75,000 and those with 10 or more years at between $75,000 and $100,000. But bioinformatics scientist Nicolas Robine says there are many reasons more important than money to consider a career in the field. "For a scientific mind, or for somebody who wants to contribute to major changes in medicine, it's an exciting time and bioinformatics is both a great career choice and a very useful skill," Robine says. He sees his employer, the New York Genome Center, as an ideal place to apply bioinformatics to transforming healthcare and biomedical research.
That exciting and ever-evolving nature of the field is what keeps Khatri, Pouliot, and Quackenbush enthusiastic about their work too. "I realized there are so many problems that I could basically spend my life working on different problems and never get bored," Khatri says. "This is my fourth career and I know I am not shifting anymore."
Quackenbush says bioinformatics has become "an essential foundation for all molecular and genomic science," one that gives him the opportunity to inquire, discover, and do things no one has ever been able to do.
Christie Rizk is a reporter and editor based in New York. She is a regular contributor to the New York Genome Center, and her work has appeared in Genome Technology magazine, Techonomy, Reuters, and The Brooklyn Paper.