Why is annotation much more complicated than it seems?

Data annotation for machine learning should not be underestimated.

Kili Technology

Jun 5, 2023


Labeled data is the missing building block for AI

To build artificial intelligence, you need three components:

Computing power

It is now widely available, easily scalable, and relatively inexpensive thanks to the cloud and GPUs. Computing power is also growing exponentially: your iPhone has more computing power than the entire Apollo program!

Algorithms

The state of the art is largely available on GitHub thanks to publications from Google and Facebook. You can now build a translation stack that embeds Google's latest Transformer architectures. Open source development, which benefits from the network effect of the internet, is also growing exponentially. In the process, deep learning algorithms are becoming ever hungrier for training data.

Labeled data

There is a lot of data. Since the digital revolution, almost all of the world's information has been recorded in digital format. But this data has to be annotated. That is what Facebook gets you to do when it asks you to comment on a picture or identify a friend in one. Every day, we upload more than a billion pictures to Facebook and annotate them. In doing so, we produce a huge training dataset for Facebook, which has enabled it to develop models that can identify people from behind. In most companies, however, the data is siloed and unstructured.

Teams are not equipped to scale annotation

That's what Kili Technology is all about :-)

Discover Kili Technology

At BNPP, to address this shortage of training data, we set up annotation interfaces. In doing so, we ran into the following problems:

UX of annotation interfaces

How can we develop, at a reasonable cost, interfaces that are both comfortable and accessible for business users and data scientists?
For example, annotating cancer cells requires pixel-precise annotation. If I have no trackpad and must annotate with a mouse on a PC, which UX is the most efficient?
As a business user, I want a graphical interface that is polished, attractive, ergonomic, and intuitive, and that integrates with my business tools. As a data scientist, I want to drive my annotation from my Python IDE, through an API or, ideally, a Python module. How can the two be reconciled?
How can we build versatile interfaces that cover every ML problem (text, image, voice, OCR), each of them customizable and combinable, given that most projects are not a single annotation task but a mix of several different ML tasks?

Control of the quality of the annotated data

For example, in speech-to-text, should I annotate with or without punctuation? Should spelling mistakes be kept or corrected?
How do I make sure that the annotators assigned to my task have understood what I expect and apply it correctly? How do I check the quality of their work when I have no ground truth to compare the annotations against?
On complex, highly subjective projects, such as judging the sentiment of a video, you will want a high consensus setting, for example 100% with 5 annotators. When it is only a matter of basic quality control, on the other hand, a 10% consensus with 2 annotators is enough. In both cases, we want to be able to measure a consensus score between the annotators.
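One simple way to put a number on consensus is pairwise agreement: the share of annotator pairs that gave the same label to an asset. The sketch below assumes categorical labels; the function name and the sample labels are hypothetical, and this is not necessarily the metric any particular tool computes.

```python
from itertools import combinations

def consensus_score(labels):
    """Fraction of annotator pairs that gave the same label to one asset."""
    pairs = list(combinations(labels, 2))
    if not pairs:
        return 1.0  # fewer than two labels: nothing to disagree on
    agreeing = sum(1 for a, b in pairs if a == b)
    return agreeing / len(pairs)

# Hypothetical sentiment labels from 5 annotators on two videos.
video_1 = ["positive", "positive", "positive", "positive", "positive"]
video_2 = ["positive", "negative", "positive", "neutral", "positive"]

print(consensus_score(video_1))  # 1.0
print(consensus_score(video_2))  # 0.3
```

A low score on an asset can flag either an ambiguous item or unclear annotation guidelines, which is why it is useful even without a known correct answer.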

Number of annotators

For example, a retail customer is working on a project to provide a customer experience similar to Amazon Go. This requires annotating more than a million images, so you need to be able to onboard people and measure their performance and the quality of their output. To give an order of magnitude: with a basic, not very powerful tool, annotating 10k images takes one person 3 months. A project like this therefore needs many annotators, and they have to be coordinated. How do I manage data access rights?
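To make that order of magnitude concrete, here is the arithmetic as a short script. The throughput and image count come from the paragraph above; the 6-month and 12-month deadlines are assumptions chosen only for illustration.

```python
import math

BATCH_SIZE = 10_000        # images one person annotates...
MONTHS_PER_BATCH = 3       # ...in this many months, with a basic tool
TOTAL_IMAGES = 1_000_000   # size of the retail project

# Total effort if the work were done by a single person.
person_months = TOTAL_IMAGES * MONTHS_PER_BATCH / BATCH_SIZE

def annotators_needed(deadline_months):
    """Annotators required to finish within the deadline, rounded up."""
    return math.ceil(person_months / deadline_months)

print(person_months)           # 300.0
print(annotators_needed(6))    # 50 (hypothetical 6-month deadline)
print(annotators_needed(12))   # 25 (hypothetical 12-month deadline)
```

That is 25 years of work for a single annotator, which is why coordination and tooling matter as much as the annotation itself.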

Annotation remains very slow!

Even if you are good at interfaces and annotator coordination, annotation still takes a long time. It therefore has to be accelerated intelligently, by getting the best out of what humans and machines can each do: machines are very good at repetitive tasks, where humans tire quickly, while humans are good at distinguishing and assimilating new nuances.

How to be faster with Kili

Typically, we want to do the following:

Online learning

That is, start training a model while the data is still being annotated, and use it to pre-annotate the remaining data.
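As a minimal sketch of this idea, the toy model below learns from each new human label and can suggest a label for the next asset at any time. It uses a nearest-centroid classifier purely for illustration; the class name, feature vectors, and labels are hypothetical, and a real project would plug in a proper model.

```python
from collections import defaultdict

class OnlineCentroidModel:
    """Toy incremental classifier: keeps a running sum of features per label."""

    def __init__(self):
        self.sums = {}
        self.counts = defaultdict(int)

    def update(self, features, label):
        """Learn from one freshly annotated asset."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(features)
        self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def pre_annotate(self, features):
        """Suggest the label with the closest centroid, or None before any training."""
        best_label, best_dist = None, float("inf")
        for label, sums in self.sums.items():
            n = self.counts[label]
            dist = sum((f - s / n) ** 2 for f, s in zip(features, sums))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label

model = OnlineCentroidModel()
print(model.pre_annotate([0.9, 0.1]))  # None: nothing annotated yet
model.update([1.0, 0.0], "cat")        # a human annotates two assets
model.update([0.0, 1.0], "dog")
print(model.pre_annotate([0.9, 0.1]))  # cat
```

The annotator then only has to confirm or correct the suggestion, and each correction goes back into `update`.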

Active learning

That is, choosing which assets to annotate first. Typically, if you do deep learning, you want maximum diversity from the very beginning of training. How do you manage an optimal prioritization queue, rather than simply annotating files in alphabetical order?
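One possible way to favor diversity early on is a greedy farthest-point ordering: always pick next the asset that is farthest from everything already queued. This is only one heuristic among many (uncertainty sampling is another), and the sketch assumes each asset already has a feature vector; the file names and embeddings are hypothetical.

```python
def diversity_order(features):
    """Order asset indices so each next asset is the farthest from those already picked."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    if not features:
        return []
    order = [0]  # arbitrary starting point: the first asset
    # Distance from every asset to its nearest already-selected asset.
    nearest = [sq_dist(f, features[0]) for f in features]
    while len(order) < len(features):
        candidates = [i for i in range(len(features)) if i not in order]
        nxt = max(candidates, key=lambda i: nearest[i])
        order.append(nxt)
        nearest = [min(d, sq_dist(f, features[nxt])) for d, f in zip(nearest, features)]
    return order

# Hypothetical 2-D embeddings; a.jpg and b.jpg are near-duplicates.
assets = {"a.jpg": [0.0, 0.0], "b.jpg": [0.1, 0.0], "c.jpg": [5.0, 5.0], "d.jpg": [0.0, 4.0]}
names = list(assets)
queue = [names[i] for i in diversity_order(list(assets.values()))]
print(queue)  # ['a.jpg', 'c.jpg', 'd.jpg', 'b.jpg']
```

Alphabetical order would have served the near-duplicate `b.jpg` second; here it comes last.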

Weakly supervised learning

How can business rules be used to pre-annotate at scale? For example, if I need to annotate product names in text, I can extract a dictionary of names from my product repository and use it to pre-annotate.
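The product-name example can be sketched with a plain dictionary lookup. The catalog, the email text, and the function name below are hypothetical; the matches are meant as pre-annotations for a human to validate, not as final labels.

```python
import re

def pre_annotate_products(text, product_names):
    """Return (start, end, matched_text) spans for every dictionary match in the text."""
    if not product_names:
        return []
    # Longest names first so that a longer name wins over its own prefix.
    ordered = sorted(product_names, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(name) for name in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]

# Hypothetical dictionary extracted from a product repository.
catalog = ["steel beam", "steel beam XL", "copper wire"]
email = "Please ship 3 Steel Beam XL and 20 m of copper wire."

print(pre_annotate_products(email, catalog))
# [(14, 27, 'Steel Beam XL'), (40, 51, 'copper wire')]
```

Such rules are noisy (they miss misspellings and match homonyms), so they speed up annotation rather than replace it.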

Perfect data is not enough to reach 100% performance!

Even if we manage to produce an annotated dataset of sufficient quantity and quality to train a good model with acceptable results, we never reach 100% performance.

Indeed, all of this only covers the initial training: we extract the data from your systems and annotate it to create the dataset.

This yields a model, but one that is far from the 100% performance the business expects.
Its performance will also tend to deteriorate over time (model drift) as new examples arrive.
And business users never accept the risk of automating a task without a guarantee of 100% performance.

Imagine automatically entering customer orders read from emails directly into the CRM, where each order launches the production of a 12-ton metal bale.

On customer order emails, for example, we achieve well over 95% performance on classification and over 80% on named entity recognition, which is already very good. But to enter only 100% reliable data into the systems, and so guarantee the integrity of the orders, it is essential to orchestrate human supervision in production.

You have to keep the human in the loop and capture their feedback, so that the models in production keep learning through annotation.
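A common way to organize this supervision is to route each prediction by model confidence and to store every human correction as new training data. The sketch below is an assumption about how such a loop could look, not a description of a specific product: the class, the function names, and the 0.99 threshold are hypothetical, and a stricter setup would send every order to review.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    order_id: str
    fields: dict
    confidence: float  # model's own confidence, between 0 and 1

def route(prediction, threshold=0.99):
    """Send confident predictions straight through, everything else to a human reviewer."""
    return "auto" if prediction.confidence >= threshold else "human_review"

def review(prediction, corrected_fields, training_set):
    """Record the human-validated result so it can be used for retraining."""
    training_set.append((prediction.order_id, corrected_fields))
    return corrected_fields

training_set = []
p1 = Prediction("order-1", {"product": "metal bale", "quantity": 1}, 0.999)
p2 = Prediction("order-2", {"product": "metal bale", "quantity": 12}, 0.80)

print(route(p1))  # auto
print(route(p2))  # human_review
review(p2, {"product": "metal bale", "quantity": 1}, training_set)
print(len(training_set))  # 1
```

The corrected examples collected in `training_set` are what lets the model keep learning and counter drift.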

It's key to keep humans in the loop when you want to put AI into production. #Human in the loop
