Chatbot training data

Kili is a text annotation tool that helps you build chatbot training datasets faster and solve the conversational NLP challenges that affect your organization: gathering data, defining the chatbot’s style, interpreting messages, keeping agents consistent, and monitoring the quality of incoming and outgoing questions.

What is chatbot training data?

Algorithms learn from data. They find relationships, develop understanding, make decisions, and evaluate their confidence from the training data they’re given. And the better the training data is, the better the model performs.

Chatbot training data is information that helps a chatbot understand what users are saying and how to respond. Chatbots need intent classification, entity extraction, relationship extraction, syntactic analysis, sentiment analysis, and sometimes even translation.
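As an illustration, a single annotated utterance could be represented as follows. This is a minimal sketch, not Kili’s export format: the class names, field names, and labels (`AnnotatedUtterance`, `report_lost_card`, `PRODUCT`, `LOCATION`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entity:
    start: int   # character offset where the entity begins
    end: int     # character offset just past the entity
    label: str   # hypothetical entity type, e.g. "PRODUCT"


@dataclass
class AnnotatedUtterance:
    text: str
    intent: str      # hypothetical intent label
    sentiment: str   # hypothetical sentiment label
    entities: List[Entity] = field(default_factory=list)

    def entity_texts(self) -> Dict[str, str]:
        """Map each entity label to the text it covers."""
        return {e.label: self.text[e.start:e.end] for e in self.entities}


def span(text: str, phrase: str, label: str) -> Entity:
    """Build an Entity from the first occurrence of phrase in text."""
    start = text.index(phrase)
    return Entity(start, start + len(phrase), label)


text = "I lost my credit card in Paris yesterday"
example = AnnotatedUtterance(
    text=text,
    intent="report_lost_card",
    sentiment="negative",
    entities=[span(text, "credit card", "PRODUCT"), span(text, "Paris", "LOCATION")],
)
print(example.entity_texts())
```

Storing entities as character offsets rather than copied strings keeps each annotation tied to its exact position in the utterance.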

Where and how can I find enough data to build a chatbot?

Some datasets are available on the internet. However, public data alone is not sufficient because it is too generic.

Chatbots need a lot of domain-specific training data to learn how to respond effectively to different human interactions.
So, to create an effective chatbot, you first need to collect and annotate information, which can come from your company’s FAQ web pages, customer service tickets, chat scripts, call logs, help-desk email inbox, and other written sources. You can also draw training material directly from the first-hand knowledge of your sales representatives.

Why choose Kili to generate my chatbot training data?


Kili offers specialized interfaces for all chatbot-related annotation tasks: intent classification, intent variation, entity extraction, relationship extraction, sentiment analysis, translation, and more.


Kili’s state-of-the-art quality management system allows intensive collaboration and rigorous review throughout the life of the project to ensure clean, high-quality chatbot training datasets.


With Kili, you can annotate wherever you want with whomever you want. On-premises or SaaS, with your annotators or ours, remotely or on your premises, we adapt to your constraints!


Annotating can be expensive. By supporting online learning, active learning, weakly supervised learning, and data augmentation, Kili lets you drastically reduce the cost of annotation!


Kili has access to a unique network of 80+ professional annotation companies around the globe, so we can quickly create large, custom chatbot training datasets in 50+ languages.

Some of Kili’s chatbot training data interfaces

Intent Classification

Categorize utterances into relevant predefined intent groups. Use our intent classification tool to accurately match utterances to the specific intents your chatbot needs to understand.

Intent Variation

Create custom intent variation datasets that cover the different ways users from different demographic groups might express the same intent. Leverage state-of-the-art methods such as counterfactually augmented data annotation.

Entity and relation extraction

Add structure and semantic information to previously unstructured utterances at the word level. Take advantage of our weakly supervised learning service to apply business rules such as regular expressions and dictionaries and annotate at scale before any human intervention.
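The idea of rule-based pre-annotation can be sketched in a few lines of Python. This is an illustrative sketch using only the standard library, not Kili’s weakly supervised learning service: the patterns, the dictionary terms, the labels, and the `pre_annotate` function are all assumptions made for the example.

```python
import re

# Hypothetical business rules: regular expressions for well-formed values.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "ORDER_ID": re.compile(r"\bORD-\d{6}\b"),
}

# Hypothetical dictionaries: known terms for each entity type.
DICTIONARIES = {
    "PRODUCT": ["credit card", "savings account"],
}


def pre_annotate(text):
    """Return (start, end, label) spans proposed by the rules, sorted by position."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    for label, terms in DICTIONARIES.items():
        for term in terms:
            # Whole-word, case-insensitive dictionary lookup.
            term_pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            for match in term_pattern.finditer(text):
                spans.append((match.start(), match.end(), label))
    return sorted(spans)


utterance = "My order ORD-123456 for a Credit Card never arrived, email me at jo@example.com"
proposed = [(utterance[s:e], label) for s, e, label in pre_annotate(utterance)]
print(proposed)
```

Spans proposed this way are only suggestions; human annotators would still confirm, correct, or reject them during review.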

Quality monitoring

Plug Kili into your production system and use it to classify the quality of conversations and identify when the bot is slipping. Highlight bad sentences, define a new intent, or reclassify an existing one.
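One simple way to spot conversations worth reviewing is to flag predictions whose confidence falls below a threshold. The sketch below assumes the chatbot logs a confidence score with each predicted intent; the record layout, the 0.6 threshold, and the function names are hypothetical and are not part of Kili’s product.

```python
def flag_for_review(predictions, threshold=0.6):
    """Return the predictions whose confidence is below the threshold."""
    return [p for p in predictions if p["confidence"] < threshold]


def flagged_rate(predictions, threshold=0.6):
    """Share of predictions flagged for review (0.0 when there are none)."""
    if not predictions:
        return 0.0
    return len(flag_for_review(predictions, threshold)) / len(predictions)


# Hypothetical production log entries.
predictions = [
    {"text": "I want to block my card", "intent": "block_card", "confidence": 0.94},
    {"text": "card thing not working abroad??", "intent": "block_card", "confidence": 0.41},
    {"text": "what are your opening hours", "intent": "opening_hours", "confidence": 0.88},
    {"text": "can my dog have an account", "intent": "open_account", "confidence": 0.35},
]

to_review = flag_for_review(predictions)
rate = flagged_rate(predictions)
print(len(to_review), rate)
```

A rising flagged rate over time is one possible signal that the bot is slipping, and the flagged utterances are natural candidates for reclassification or for new intents.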

Last but not least, create your own interfaces for your specific tasks with Kili’s interface builder!

Ready to simplify labelling in your company?

Discover the solution now

Success stories


Kili allowed a major European bank to develop a chatbot from scratch even though no internal data was available. A rigorous annotation plan combined with data augmentation techniques made it possible to build the necessary dataset at a reasonable cost.


Kili has enabled one of the world’s largest luxury groups to identify intents not covered by its chatbot and create the corresponding training data.


Kili enables a popular American digital bank to monitor the quality of more than 1 million interactions per day. The tool allows the bank to react to the slightest performance variance and ensures customers have an easy, intuitive way to communicate and ask questions.