Imagine a Michelin-starred restaurant. The success of each dish relies on three key elements: premium ingredients (Food), the right tools (Tool), and a skilled team (Team).
In this analogy, Foodvisor provides the premium ingredients: plenty of raw data. They possess a vast database of images and are looking to label it. Once these images are labeled, they are fed into computer vision algorithms to build their AI-driven nutritional coaching application. This application enables quick and accurate food analysis from photos, with the capability to classify over 1,500 food types.
People for AI, in charge of annotating and refining Foodvisor's data, take on the role of the skilled culinary team in our analogy. As the data labeling company, People for AI employs a team of expert labelers to ensure that the images are annotated accurately and efficiently, maintaining the right pace and precision.
However, even the finest ingredients and most talented team can't reach their full potential without a high-end kitchen and state-of-the-art equipment. This is where Kili Technology's Data Labeling Platform is crucial. Thanks to the capabilities of Kili Technology, People for AI was able to efficiently label and refine 20,000 images from Foodvisor every month, illustrating the critical role that a sophisticated tool plays in this process.
Test your knowledge of Japanese food (answers at the end of the blog post).
The Initial Data Labeling Process
The initial two-step process, which involved localizing and classifying every food item in an image, was efficient, but it was later replaced by an improved three-step process:
Geometry: The process began with localizing food items in the image. Various label types were used on the same image (key points, bounding boxes, segments, etc.).
The labels were then fed to a dedicated Foodvisor algorithm to predict the class for each localized food item.
Validation: Next, labelers reviewed and corrected algorithm-suggested classifications, with items grouped by food type to improve focus and training (meats, cheeses, fruits, vegetables, etc.).
The new labels were then fed to another Foodvisor algorithm to check whether the labeled classes seemed okay.
Consensus Review: Finally, if the labeler and the algorithm disagreed, the class was reviewed in a consensus process involving three labelers. If these three labelers still disagreed, a three-level review took place (local manager, expert project manager, and finally the client).
Initial annotation process: separation of geometry and classification. This makes it possible to mobilize non-experts (for geometry) alongside expert annotators (for classification).
This new process reduced the time spent on each image because each labeler's task became much simpler and quicker. Instead of labeling every item on an image in one pass (which could take several minutes), labelers went through the images several times, handling one item at a time (a few seconds each).
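The routing between validation, consensus, and escalation described above can be sketched in a few lines of Python. This is only an illustration, not Foodvisor's or People for AI's actual code: the function name, the status strings, and the `min_agreement` threshold are all assumptions. In particular, the sketch assumes that the three consensus labelers must be unanimous for a class to be accepted.

```python
from collections import Counter
from typing import Optional, Sequence

# Review levels named in the text, in escalation order.
ESCALATION_CHAIN = ["local manager", "expert project manager", "client"]


def route_item(
    labeler_class: str,
    checker_accepts: bool,
    consensus_votes: Optional[Sequence[str]] = None,
    min_agreement: int = 3,
) -> dict:
    """Decide the next step for one localized food item (hypothetical helper).

    min_agreement is an assumption: 3 means the three consensus labelers
    must be unanimous; 2 would accept a simple majority instead.
    """
    # The checking algorithm agrees with the labeler: the class is final.
    if checker_accepts:
        return {"status": "accepted", "final_class": labeler_class}

    # Disagreement, but the consensus round has not happened yet.
    if not consensus_votes:
        return {"status": "needs_consensus", "final_class": None}

    top_class, top_count = Counter(consensus_votes).most_common(1)[0]
    if top_count >= min_agreement:
        return {"status": "accepted", "final_class": top_class}

    # The consensus labelers disagree: escalate, starting with the local manager.
    return {
        "status": "escalated",
        "final_class": None,
        "next_reviewer": ESCALATION_CHAIN[0],
    }


print(route_item("brie", checker_accepts=True))
print(route_item("brie", False, ["brie", "camembert", "brie"]))
```

Keeping the threshold as a parameter makes it easy to compare a strict unanimity rule with a majority rule on the same batch of items.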
Tailored Labeling Pipeline: A Closer Look at Kili’s Role
Intuitive Interface Management for Classes & Labels
Kili's interface stood out for its flexibility, enabling the setup of custom interfaces adapted to each phase of the labeling task. The incorporation of various input methods like checkboxes, radio buttons, lists, and text fields ensured that the labeling interface could evolve with the project’s requirements.
A great annotation platform allows you to easily add, modify, or remove label classes as your project evolves, all from a transparent, user-friendly front end. Kili is especially proficient in this area.
Simplifying Team Management and Access Control
The platform's capability to manage user roles smoothly was essential when the project needed to be scaled up. Professional platforms like Kili include:
Easy Onboarding: Quickly bring new annotators or reviewers onto the platform without hurdles.
Granular Access Control: Assign rights based on roles, ensuring that team members have appropriate levels of access to projects and data. This ensures that the right people are working on the right tasks, optimizing the annotation workflow.
At the Foodvisor project’s peak, the team included over 15 labelers, three production reviewers (local managers), one expert reviewer, and two administrators. Kili made it effortless to onboard new team members, adjust roles, and ensure that each individual had access only to the data they needed to label and review efficiently and accurately.
Annotation Versioning and ‘Explore’ mode
The labeling tool can help you to iterate on the data that needs to be labeled. The versioning feature of professional data labeling platforms ensures that every change to a data label is captured and stored. For data scientists, this means:
Historical Tracking: You can trace back through various versions of labeled data, ensuring no data iteration is lost.
Model Refinement: As models evolve, being able to revert to, or compare against, earlier label versions ensures a robust training environment.
Kili’s versioning system enabled the different reviewers to get visibility on the multi-level labeling and review process. Precise training and improvement of labelers/reviewers are only possible thanks to this versioning feature.
Foodvisor AI-assisted classification process. The annotation stage and three levels of review are visible. The annotation stage is carried out in consensus mode.
Through Kili’s ‘Explore’ mode, project managers and data scientists could easily filter, navigate, and audit the labels, ensuring laser-focused review rounds and precise corrections when needed. This facilitated the continuous improvement of the labeled dataset.
Insights from a data labeling platform: Team and Progress Statistics
Visibility into the data labeling process is key for maintaining project timelines and ensuring quality. Statistics allow the team's performance to be monitored in real time. They track the pace of annotations and give insights into individual and overall team productivity. They also allow project managers to keep an eye on overall progress, ensuring that deadlines are met and quality standards are upheld.
To guarantee quality in the highly complex Foodvisor project, we used Kili's dashboard and statistics to follow the labelers’ work and progress. Based on this assessment and quality reviews, we selected the best labelers to perform the reviews.
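As a rough illustration of this kind of assessment, the sketch below computes each labeler's pace and their rate of agreement with the reviewed labels, then ranks labelers on those two figures. It does not reproduce Kili's dashboard or its API: the record fields (`labeler`, `seconds`, `label`, `reviewed_label`) and the ranking rule are assumptions made for the example.

```python
from collections import defaultdict


def labeler_stats(records):
    """Aggregate per-labeler volume, pace, and agreement with review.

    Each record is a dict with the hypothetical keys
    'labeler', 'seconds', 'label', and 'reviewed_label'.
    """
    totals = defaultdict(lambda: {"items": 0, "seconds": 0.0, "agreed": 0})
    for record in records:
        entry = totals[record["labeler"]]
        entry["items"] += 1
        entry["seconds"] += record["seconds"]
        entry["agreed"] += int(record["label"] == record["reviewed_label"])
    return {
        name: {
            "items": entry["items"],
            "avg_seconds": entry["seconds"] / entry["items"],
            "agreement": entry["agreed"] / entry["items"],
        }
        for name, entry in totals.items()
    }


def rank_labelers(stats, min_items=1):
    """Rank by agreement (highest first), then by pace (fastest first)."""
    eligible = [(name, s) for name, s in stats.items() if s["items"] >= min_items]
    eligible.sort(key=lambda pair: (-pair[1]["agreement"], pair[1]["avg_seconds"]))
    return [name for name, _ in eligible]


# Made-up records, for illustration only.
records = [
    {"labeler": "ana", "seconds": 4, "label": "brie", "reviewed_label": "brie"},
    {"labeler": "ana", "seconds": 6, "label": "ham", "reviewed_label": "ham"},
    {"labeler": "ben", "seconds": 3, "label": "brie", "reviewed_label": "camembert"},
    {"labeler": "ben", "seconds": 5, "label": "ham", "reviewed_label": "ham"},
]
stats = labeler_stats(records)
print(rank_labelers(stats))
```

The `min_items` filter is there so that a labeler with only a handful of reviewed items is not ranked on too small a sample.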
A cloud-based tool like Kili also minimizes the black box effect by allowing the data science team to keep an eye on the labeling work in real time.
By using Kili’s robust API, the team could effortlessly import the results of one labeling step into the following steps. This pre-labeling capability was key to keeping the labeling process consistent, particularly when using the outputs of Foodvisor’s algorithms as a starting point for human labeling. Using scripts to manage this process saved us time with each iteration.
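A minimal sketch of such a script is shown below. It takes the items localized in the geometry step, attaches the class predicted by the algorithm as a pre-label, and groups the items by food type for the validation step. The data shapes, the `CLASS_TO_GROUP` mapping, and the function name are hypothetical. The upload to Kili is deliberately left out, because the exact API calls used in the project are not described here.

```python
from collections import defaultdict

# Hypothetical mapping from a predicted class to its food-type group.
CLASS_TO_GROUP = {"brie": "cheeses", "camembert": "cheeses", "apple": "fruits"}


def build_validation_tasks(geometry_labels, predicted_classes, class_to_group):
    """Turn geometry-step output into pre-labeled validation tasks per food group.

    geometry_labels: list of dicts with hypothetical keys 'image_id', 'item_id', 'bbox'.
    predicted_classes: dict mapping item_id to the class predicted by the algorithm.
    """
    tasks = defaultdict(list)
    for item in geometry_labels:
        predicted = predicted_classes.get(item["item_id"])
        # Items without a prediction, or with an unmapped class, go to 'unknown'.
        group = class_to_group.get(predicted, "unknown")
        tasks[group].append({**item, "pre_label": predicted})
    return dict(tasks)


# Made-up geometry output and predictions, for illustration only.
geometry = [
    {"image_id": "img1", "item_id": "a", "bbox": [0.1, 0.1, 0.4, 0.4]},
    {"image_id": "img1", "item_id": "b", "bbox": [0.5, 0.2, 0.9, 0.6]},
    {"image_id": "img2", "item_id": "c", "bbox": [0.2, 0.3, 0.7, 0.8]},
]
predictions = {"a": "brie", "b": "apple"}

tasks = build_validation_tasks(geometry, predictions, CLASS_TO_GROUP)
for group, items in tasks.items():
    print(group, [item["item_id"] for item in items])
```

Each group could then be loaded into its own validation project, so that a labeler stays focused on a single food type at a time.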
Kili’s responsive support was critical in maintaining the momentum of the project. As the process evolved, particularly when shifting to a super fast labeling task, performance issues were quickly resolved, minimizing loading time and keeping the project on a fast track. This responsiveness not only addressed immediate technical concerns but also demonstrated Kili's commitment to the project's and client’s long-term success.
Enhanced Custom Tools
Beyond Kili’s inherent features, the API's flexibility facilitated the development of dedicated tools to further streamline the workflow:
Project and Data Management: Custom scripts used the API to create new projects and handle data efficiently for each labeling step.
Progress Monitoring: Real-time dashboards, created thanks to the communication with the API, provided insights into the progress of multiple concurrent projects.
Export Functions: Tailored export options facilitated the extraction of data specific to each stage of labeling.
Specialized Review Processes: The API enabled the creation of a three-level ad hoc review process, ensuring data accuracy and consistency during the whole project.
Consensus Building: Kili has its own consensus feature, but we decided to add some specific features and create a custom-built consensus mechanism.
Labeler Skill Assessment: Kili's APIs were used to extract data and create specialized training projects for labelers, for example training sessions focused on specific class groups (vegetables, exotic fruits, pastries, etc.). These enhanced labeling accuracy and provided a method for monitoring the labelers' proficiency.
A quick break before we proceed:
Test your knowledge of French cooked meats (answers at the end of the blog post).
Impactful Results and Savings
Back to the program: by leveraging Kili's platform, People for AI achieved remarkable outcomes for Foodvisor:
Time Efficiency: A significant reduction in labeling time, saving over 1000 hours monthly.
Cost-Effectiveness: Considerable cost savings, approximately 3 to 4 times less expensive than internal annotation efforts.
Enhanced Accuracy: A 20% improvement in the algorithm's accuracy.
Superior Data Quality: A leap in data quality compared to crowdsourced annotation methods.
Through the partnership with Kili Technology, Foodvisor and People for AI have achieved significant progress in the efficiency of the data labeling process and the accuracy of the labeled data. People for AI’s advanced yet user-friendly data labeling process led to time and cost savings, along with a clear improvement in the precision of Foodvisor’s algorithm. This collaboration has not only improved Foodvisor's AI models but also demonstrated the transformative power of effective team and technology integration in the domain of AI-driven nutrition.
Did you get the answers right?
The realm of data science and AI constantly evolves, and with this change comes the increasing importance of well-structured, accurately labeled data. As discussed, using a professional data labeling platform has many advantages—such as label versioning for tracking data iterations, streamlined team management, comprehensive insights through statistics, or refined class and label management, not forgetting the ability to quickly refine and correct annotations.
This might be the right time to re-evaluate your data labeling practices. Think about how you're currently leveraging these platforms and consider integrating the best practices we've mentioned. Remember, the real value doesn't just lie in choosing a platform but in how you use it.
In conclusion, as AI continues to shape the future, investing in data labeling isn't just a good practice—it is crucial. Never stop learning, refine your strategy, and let your data be the driving force behind your next breakthrough in AI.
And finally, here are the answers to our test on French food!