The Top 3 Challenges in Scientific Data Analysis: A Conversation with Benjamin Haibe-Kains
At Code Ocean, we’re thrilled and honored to work with brilliant scientists like Benjamin Haibe-Kains, Senior Scientist at Princess Margaret Cancer Center, Associate Professor at University of Toronto, and Scientific Advisor to the Break Through Cancer Foundation.
We sat down with him recently to talk through the larger state of genomic research and what he sees as the major challenges and opportunities facing the field today. This three-part series covers his response across three main areas.
“Today’s rapid advancements in computational biology, combined with plummeting costs of data and cloud computing, should lead to an acceleration in discovery of new therapies,” said Haibe-Kains.
“Yet they often present new challenges that slow down progress. If not addressed, they can lead to higher costs – including the opportunity cost of not providing patients with effective therapies in a timely manner.”
Haibe-Kains summarized three major areas of focus that should be addressed: onboarding, reproducibility and collaboration.
Challenge 1: Hiring and Onboarding
The crux of this issue lies within the tremendous opportunity provided by larger changes in technology.
“The costs of sequencing and cloud computing have come down dramatically in the past few decades,” says Haibe-Kains. “But that means massive amounts of data are being generated, creating bottlenecks in analysis.”
He recounts that, 30 years ago, sequencing DNA was extremely tedious and expensive, which made it the rate-limiting step rather than the computational analysis of the resulting data. Since then, sequencing and cloud computing have become so inexpensive that sequencing data are now a commodity.
The bottleneck is now the computational analysis. You still need an expert who can analyze the data and talk to the people who generated it to figure out the best way to analyze it. This has become a major bottleneck across all industries including life sciences, social sciences, and engineering.
Shrinking average tenure and a scarcity of qualified candidates mean that you can’t waste time asking someone to recreate the code of staff who recently left.
As a result, the first challenge is closely tied to the second, reproducibility, which is often listed as one of the main challenges in science overall. How can current knowledge be made ready not only for the next scientist, but also for the next experiment? Haibe-Kains will cover that topic in the next post of the series.