Healthcare - Enabling innovative, accurate diagnostics for early bladder cancer detection
Impact
Challenges
Medical image annotation is laborious, highly time-consuming work
Without a thorough review and collaboration workflow, it is very hard to achieve the required level of annotation quality and keep it consistent
Reviewing crowdsourced annotations is very time-consuming and challenging
Solutions
Consensus feature facilitating review process
Advanced reviewing workflow and collaboration tools breaking silos between departments
Effective machine learning in the loop solution to accelerate the image annotation process
Context
VitaDX, a French healthcare company focused on bladder cancer diagnosis, is developing a new AI-powered diagnostic model to provide a simple, reliable, non-invasive, and more efficient tool for detecting early bladder cancer as a complement to urinary cytology.
However, the company found it challenging to industrialize its data pre-processing to build the tool while keeping data accuracy in check, as accuracy is critical in medical diagnostics. VitaDX partnered with Kili Technology to enable not only a simplified process but also a more structured, collaborative workflow.
How it all started
Aiming for perfection: when accurate diagnosis could save people’s lives
Globally, almost 3 million people are currently diagnosed with or treated for bladder cancer, with 400 thousand new cases each year. As with almost all other cancer types, the survival rate of a bladder cancer patient increases when the cancer is detected earlier. However, urinary cytology, the non-invasive diagnostic method available, performs poorly in detecting early bladder cancer.
Because artificial intelligence is at the core of its diagnostic tool, the company understood that the accuracy of the data used to train the model is critical. It was already clear that to provide a precise diagnosis of bladder cancer, the AI model needs to be trained with clean, accurately labeled images of cytology slides. The image labeling process is therefore essential: it determines the precision of the diagnosis, which will impact people’s lives.
We build an artificial intelligence model that will determine whether a person is sick. At the end of the day, our model will impact the life of a human being. This is why it needs to be accurate: quality should be the first priority from the very beginning, starting with data preparation.
Challenge
Diagnosing cytology slides at scale
Before partnering with Kili, the company relied on crowdsourcing to annotate cytology slides. At face value, the cost of crowdsourcing seemed lower, as all of the labelers worked part-time. The annotation process also ran without a dedicated resource in charge of data operations, so each crowdsourced labeler had their own annotation stream and flow. However, VitaDX realized that when it came to delivery, it was especially difficult to achieve the required level of annotation quality and to keep that quality consistent. With no one in charge of data operations, monitoring these crowdsourced labelers was a challenging task.
Precision is very important, it determines the quality. The quality of annotation will impact the quality of our machine learning model, and in the end the quality of the medical device we are building.
Another challenge is that medical image annotation is laborious and highly time-consuming work. Annotating cytology slides involves performing semantic segmentation of the images, which requires pixel-level precision. Each labeler therefore needs to label carefully and with sharp focus, so labeling a single image takes a long time. Reviewing the labeled cytology slides was also tiresome, as doctors needed to sit together and discuss whenever they disagreed over the label of a presumed cancer cell.
Limitations of crowdsourcing for annotation
No dedicated resource in charge of data operations to manage the crowdsourced labelers
Each labeler had their own stream and flow, making it difficult to reach consistent quality
Lower cost but a tiresome process, especially at the reviewing stage with doctors
Difficulty achieving the level of quality required
Solution
Finding a solution to structure and streamline the process of annotating cytology slides
Reflecting on these challenges, VitaDX realized that to perform cancer diagnosis at scale it needed a solution to structure and streamline both the annotation of cytology slides and the collaboration between doctors, the machine learning team, and data labelers. After benchmarking several top players in the market, the company found that Kili offers a combination of features suited to its needs. More importantly, Kili stood out when it comes to quality management of the annotations.
Project-based collaborative labeling
Partnering with Kili, VitaDX was able to structure the annotation process much more clearly. One person was appointed to take charge of data operations and use the quality management features on Kili, so quality control is kept in check. The reviewing features on Kili also allow more efficient collaboration between doctors, machine learning teams, and data labelers, breaking down the silos that previously existed between departments.
Labeling predictions and machine learning in the loop thanks to high-quality training data
Moreover, the company found itself able to label images much more precisely and at a much faster pace. Online learning was enabled as a form of machine learning in the loop: VitaDX plugged its machine learning model into Kili to automate part of the labeling process. This allowed a pre-annotation step, in which the model produced labeling predictions that reduced the effort required from the annotators. Kili was also flexible enough to be used on devices suited to delivering pixel-precise quality.
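The pre-annotation loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not VitaDX's pipeline or the Kili API: ThresholdModel, simulated_annotator, and label_with_model_in_the_loop are hypothetical names, the "model" is a toy intensity threshold standing in for a real segmentation network, and the images are tiny made-up grids. The point is the shape of the loop: the model proposes a mask, a human corrects it, and the model is updated from the corrected mask before the next image.

```python
class ThresholdModel:
    """Toy stand-in for a segmentation model (hypothetical, for illustration).

    A pixel is marked as 'cell' (1) when its intensity exceeds a threshold.
    """

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def predict(self, image):
        # Produce a pre-annotation mask with the same shape as the image.
        return [[1 if px > self.threshold else 0 for px in row] for row in image]

    def update(self, image, mask):
        # Online update: move the threshold halfway between the brightest
        # background pixel and the darkest foreground pixel of the human mask.
        fg = [px for row, mrow in zip(image, mask) for px, m in zip(row, mrow) if m]
        bg = [px for row, mrow in zip(image, mask) for px, m in zip(row, mrow) if not m]
        if fg and bg:
            self.threshold = (max(bg) + min(fg)) / 2


def count_differences(mask_a, mask_b):
    """Number of pixels on which two masks disagree."""
    return sum(a != b for ra, rb in zip(mask_a, mask_b) for a, b in zip(ra, rb))


def label_with_model_in_the_loop(images, model, annotator):
    """Pre-annotate each image, let a human correct it, then update the model."""
    final_masks, corrections = [], []
    for image in images:
        pre_annotation = model.predict(image)          # model proposes labels
        final_mask = annotator(image, pre_annotation)  # human reviews and fixes
        corrections.append(count_differences(pre_annotation, final_mask))
        model.update(image, final_mask)                # learn from the correction
        final_masks.append(final_mask)
    return final_masks, corrections


def simulated_annotator(image, pre_annotation):
    # Assumption for the demo: the "true" boundary is intensity > 0.3, and the
    # annotator always returns the true mask regardless of the pre-annotation.
    return [[1 if px > 0.3 else 0 for px in row] for row in image]


# Made-up 2x2 "images" of pixel intensities.
images = [
    [[0.1, 0.4], [0.35, 0.9]],
    [[0.2, 0.45], [0.6, 0.25]],
]
masks, corrections = label_with_model_in_the_loop(
    images, ThresholdModel(), simulated_annotator
)
print(corrections)  # pixels the annotator had to fix per image
```

In a real project the annotator step would be a person working in the labeling tool and the update would be a training step on the segmentation model; the number of corrections per image is one simple way to track how much effort the pre-annotations save.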
We found the quality management features – especially consensus – very useful, as it allows us (doctors) to gather up only when an asset has a low level of consensus, which means the cell image is very complex to identify. Otherwise, there is no need to review together, hence it saves us time.
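To make the idea of reviewing only low-consensus assets concrete, here is a minimal Python sketch. It assumes that consensus is measured as the mean pairwise intersection-over-union (IoU) between the masks drawn by different labelers; that metric, the 0.7 threshold, the function names, and the sample data are all assumptions for illustration and are not taken from Kili's implementation.

```python
from itertools import combinations


def iou(mask_a, mask_b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = mask_a | mask_b
    if not union:
        return 1.0  # two empty masks agree trivially
    return len(mask_a & mask_b) / len(union)


def consensus_score(masks):
    """Mean pairwise IoU across all labelers' masks for one asset."""
    pairs = list(combinations(masks, 2))
    if not pairs:
        return 1.0  # a single labeler cannot disagree with anyone
    return sum(iou(a, b) for a, b in pairs) / len(pairs)


def assets_needing_review(annotations, threshold=0.7):
    """Return the asset ids whose consensus falls below the threshold."""
    return [
        asset_id
        for asset_id, masks in annotations.items()
        if consensus_score(masks) < threshold
    ]


# Made-up example: three labelers per slide, masks as sets of pixel coordinates.
annotations = {
    "slide_001": [
        {(0, 0), (0, 1), (1, 0), (1, 1)},
        {(0, 0), (0, 1), (1, 0), (1, 1)},
        {(0, 0), (0, 1), (1, 0)},
    ],
    "slide_002": [
        {(0, 0), (0, 1)},
        {(1, 0), (1, 1)},
        {(0, 1), (1, 1)},
    ],
}

to_review = assets_needing_review(annotations)
print(to_review)
```

With this kind of filter, doctors would only meet to discuss the slides in to_review, where the labelers' masks diverge the most, and skip joint review for the rest.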
Impact
We speed up the process five times
Industrializing image annotation of cytology slides to detect bladder cancer enabled VitaDX to perform cancer diagnosis early, more accurately, and at scale. Streamlining the whole labeling process, along with online learning as machine learning in the loop, reduced the time taken to annotate images by 70% and improved the efficiency and productivity of the team.
The walls between departmental silos, which previously hindered collaboration and workflow transparency, are also gone, drastically improving productivity and accelerating the diagnosis process. The overall cost of conducting cancer diagnosis is reduced.
If we don’t have high-quality data, the process of our diagnostic device could have been much slower. The bonus is that we speed up the process five times
Lessons Learned
Accurately labeled images play a critical role in determining the accuracy of a cancer diagnostic tool since it involves decisions that impact people’s lives. Hence investing time and effort in data labeling is of the utmost importance to ensure the success of the project.
The flexibility to apply machine learning in the loop during image annotation is essential to speed up data labeling, here yielding a 70% time saving.
It is important to select a suitable data annotation partner to develop a cancer detection tool. Factors such as robustness, meticulous quality management, and simplicity of collaboration are key.