September 10, 2024

Pre-production: the missing ML link in Biotech & Pharma


There’s a huge gap between ML model hype and adoption because developers can’t—or don’t—involve domain experts early enough.

While some companies have built AI/ML at their core, most Biotech and Pharma organizations struggle to add this work on top of an already heavy list of priorities:

  • They usually have only a few projects testing an ML idea at any given time
  • Of those tested, only a handful will end up in production
  • Of those in production, few bring any value to domain experts (e.g. pharmacologists, cell biologists, epidemiologists, chemists, etc.)

We know this because we speak to people in this position all the time across companies of every size, application, and skill level.

This is a big problem: ML has a lot of potential but only if people can actually work with it.

Dev teams often develop models that domain experts later invalidate

Here are the main reasons why ML model adoption is poor:

  • Data quality and quantity: there is either not enough data, or it is so disorganized that the project never moves past the organizing and cleaning phase
  • Knowledge gaps: sometimes the dev team does not know how to train and debug a neural network beyond the standard tutorials
  • Gaps in expectations: sometimes the dev team creates what they think is a good model, but when it finally reaches a real user it falls short of expectations

It’s that last one, developing a model you think is good but nobody wants to use, that represents a really bad outcome. It’s a huge expense in time and cost (think of all those GPUs and all that expensive AI talent) that essentially goes to waste.

But how does that happen?

ML work is wasted when domain experts aren’t involved

Imagine you’re on a team developing a new model. After many hours of work and a lot of GPU compute, you conclude that your model’s performance is great! It’s achieved an AUC (area under the curve) score of 0.95.

But… is it great? How do you know? Do you know for sure whether your training and test data contain examples that are actually tough enough? Do you know the price of a failure? Maybe any error here is a big deal: patients could be at risk. There could be significant liability claims.
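
To make that concrete, here’s a minimal sketch (synthetic data, scikit-learn; everything below is hypothetical) of how the same model can post a near-perfect AUC on an easy test set and a much weaker one on the borderline cases a domain expert would actually choose:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Hypothetical binary classifier trained on well-separated synthetic data
X_train = np.vstack([rng.normal(-2, 1, (500, 5)), rng.normal(2, 1, (500, 5))])
y_train = np.array([0] * 500 + [1] * 500)
model = LogisticRegression().fit(X_train, y_train)

# "Easy" test set: drawn from the same well-separated distribution
X_easy = np.vstack([rng.normal(-2, 1, (200, 5)), rng.normal(2, 1, (200, 5))])
y_easy = np.array([0] * 200 + [1] * 200)

# "Hard" test set: the borderline cases a domain expert would pick
X_hard = np.vstack([rng.normal(-0.3, 1, (200, 5)), rng.normal(0.3, 1, (200, 5))])
y_hard = np.array([0] * 200 + [1] * 200)

print("AUC, easy test set:", roc_auc_score(y_easy, model.predict_proba(X_easy)[:, 1]))  # close to 1.0
print("AUC, hard test set:", roc_auc_score(y_hard, model.predict_proba(X_hard)[:, 1]))  # noticeably lower
```

The headline number only means something if the test set reflects the cases that matter to the people who will use the model.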

Trust needs to be earned, and domain experts will only trust a model if they can feed it specific examples they already know the answers to and “see if the model can figure out what the real answer is.” Giving them a number isn’t enough.

But if your domain expert is somebody who has never even opened a command-line interface, let alone written any code (i.e. the vast majority of people), how are they going to do that?

Typical ML workflows fail because they move to production too soon

Here’s a simplified version of what an ML workflow usually looks like today.

Step 1: development

  • Ideation/need identified to create a model for a specific use case
  • Assembly, organization, and cleaning of data useful for training the model
  • Training of the model with prepared data, plus hyperparameter tuning for performance against a test dataset (a minimal sketch follows this list)
  • If necessary, iteration over different model architectures
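
For context, here is a minimal sketch of that training-and-tuning step, assuming tabular data in a hypothetical assay_data.csv with a binary label column; the model choice and parameter grid are placeholders, not a recommendation:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Assembly, organization, and cleaning (heavily simplified)
df = pd.read_csv("assay_data.csv").dropna()  # hypothetical file and schema
X, y = df.drop(columns=["label"]), df["label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Training with hyperparameter tuning via cross-validation
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"n_estimators": [100, 300], "max_depth": [None, 10]},
    scoring="roc_auc",
    cv=5,
)
search.fit(X_train, y_train)

# Evaluate the tuned model on the held-out test split
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print(f"Best params: {search.best_params_}, test AUC: {test_auc:.3f}")
```

Note that nothing in this loop asks a domain expert whether the test split, the labels, or the success criterion actually make sense.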

Step 2: productization

  • Make sure there’s reproducibility, lineage, and proper documentation
  • Quality assurance, privacy, compliance, adherence to guidelines
  • Serving and managing the model
  • Performance monitoring (e.g. for model drift; see the sketch after this list)
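
As one illustration of what that monitoring could look like, here is a minimal sketch of a data-drift check using the population stability index (PSI); the data, function, and threshold are illustrative assumptions, not a description of any specific tool:

```python
import numpy as np

def population_stability_index(expected: np.ndarray, observed: np.ndarray, bins: int = 10) -> float:
    """PSI between a training-time sample (expected) and a production sample (observed)."""
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf  # catch values outside the training range
    e_frac = np.clip(np.histogram(expected, edges)[0] / len(expected), 1e-6, None)
    o_frac = np.clip(np.histogram(observed, edges)[0] / len(observed), 1e-6, None)
    return float(np.sum((o_frac - e_frac) * np.log(o_frac / e_frac)))

# Hypothetical feature values: production data has drifted upward
training_values = np.random.default_rng(0).normal(0.0, 1.0, 10_000)
production_values = np.random.default_rng(1).normal(0.5, 1.2, 1_000)

psi = population_stability_index(training_values, production_values)
print(f"PSI = {psi:.3f}")  # a common rule of thumb flags PSI above ~0.2 for review
```

Checks like this catch shifting inputs, but they still can’t tell you whether the model’s answers make scientific sense; only a domain expert can judge that.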

Where in this workflow does it make sense to involve a domain expert with limited technical ability? We're left with two options:

  1. Involve domain experts in the development process, but struggle through because of their lack of technical understanding
  2. Push the model to production and build a no-code front end from scratch (exposing yourself to pushback if it doesn’t perform well), then send it for validation

Neither option is good.

We need ML model development with domain experts in the loop

The challenge for dev teams is obvious: how do you let domain experts quickly play around with a work-in-progress (WIP) model and give feedback without requiring them to use scripts, write code, access a command line, or navigate (for example) the AWS console, and without building a dedicated UI or involving IT to set up dedicated test servers?

While the machine learning lifecycle is usually split into development and production phases, in reality there is another phase that rarely gets addressed on its own. This is the “pre-production” phase, in which the dev team has finished tweaking the model and the model is being tested by its first early adopters with real-life data.

These early adopters are characterized by their domain expertise and by their relative lack of engineering skills. They are, however, uniquely able to judge the model’s outputs, assess the assumed success criteria, pick relevant test data, and provide valuable feedback.

The challenge in this phase is to place a WIP model in the hands of these domain experts without requiring them to set up a test environment, make API calls, or use scripts in a terminal. 

A “pre-production” phase solves this problem

Our latest product release, Code Ocean 3.0, contains unique capabilities that make pre-production possible:

  • Pick a model from your model dashboard
  • Click “Create Inference Capsule”
  • Convert the Capsule into a parameterized app
  • Share directly with the domain expert

The domain expert can then swap out the data, run the Capsule, and review the results using the UI alone.
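
For a sense of what the dev team actually packages, here’s a sketch of what the inference script inside such a Capsule might look like, assuming a /data folder for inputs and a /results folder for outputs; the model file and feature columns are hypothetical placeholders:

```python
import glob
import os

import joblib
import pandas as pd

MODEL_PATH = "/code/model.joblib"             # model shipped inside the Capsule (placeholder path)
FEATURE_COLUMNS = ["feature_1", "feature_2"]  # placeholder feature names

model = joblib.load(MODEL_PATH)

# Score whatever CSV files the domain expert attached as input data
for path in glob.glob("/data/*.csv"):
    df = pd.read_csv(path)
    df["predicted_probability"] = model.predict_proba(df[FEATURE_COLUMNS])[:, 1]
    out_path = os.path.join("/results", os.path.basename(path))
    df.to_csv(out_path, index=False)
    print(f"Wrote predictions for {path} to {out_path}")
```

The domain expert never touches this script; they just swap the input files and run it through the parameterized app’s UI.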

Inference Capsules are easy for the dev team to set up, and they eliminate the need for domain experts to do anything other than test the model. They’re unique to Code Ocean because Capsules are containerized, shareable execution environments.

If you want to find out more, head over to our product page, book a demo with us, or simply add me on LinkedIn if you have any questions.

Also: I'm running a webinar about ML models in bioinformatics on Wednesday, September 25th. You can register here (and get an email with the recording if you can’t make it live).

Daniel Koster, PhD

VP Product, Code Ocean
