Better Training Data, Better AI. Since the 1980s, the AI paradigm has been Better Models = Better AI. Today the limitations of this paradigm are clear: significant effort for marginal performance improvements, restricted access to overspecialized engineers, low explainability, low control, and prohibitive project costs. At Kili Technology, we are believers. Three years ago, Edouard d’Archimbaud … Continue reading Better Training Data, Better AI
My State-Of-The-Art Machine Learning Model does not reach its accuracy promise: What can I do? Data quality as a first response. Introduction: The ultimate goal of every data scientist or company that builds ML models is to create the best model, with the highest predictive accuracy in production. Usually, we start with state-of-the-art algorithms being … Continue reading My State-Of-The-Art Machine Learning Model does not reach its accuracy promise: What can I do?
What Dataset should I use to retrain my model? Retraining is a necessary step in a model's development. What should I do if new data becomes available? What should I do if my model performs worse than it did when I trained and tested it? Suppose that … Continue reading What Dataset should I use to retrain my model?
Data annotation: leveraging interactive segmentation to achieve state of the art quality and speed. Machine learning models have proven to be extremely powerful for automating tasks. Automatic image recognition, for example, has seen an incredible leap thanks to the development of convolutional neural networks. We see that our models today have not yet reached their … Continue reading Data annotation: leveraging interactive segmentation to achieve state of the art quality and speed
Label engineering: Bounding box vs Polygon. When setting up an object detection project, you will have to choose your annotation tool. The most commonly used tool in machine learning and artificial intelligence projects is the bounding box. However, other tools such as polygons also exist in the industry. But what are the differences and which … Continue reading Label engineering: Bounding box vs Polygon
How to annotate the right data? When starting an annotation project, it is important to select the best data to annotate. In other words: the data that would most improve your model’s accuracy. Most of the time, you will have more data than you can annotate. Therefore, what is the best strategy … Continue reading How to annotate the right data?
How To Monitor Machine Learning Models In Production? One key challenge in modern AI applications is to maintain high levels of performance with production models. Engineers spend an increasing amount of time working on the production life cycle of their models. In the paper on the system "Overton", Apple engineers state that "In the life cycle … Continue reading How To Monitor Machine Learning Models In Production?
How Kili Technology and AutoML Helped Me Scale Email Classification for Customer Services. Emails are the de facto standard for written communication between companies and customers. They are the preferred channel of communication, with 91% of all US consumers still using email daily. That's why businesses and individuals face massive volumes of unprioritized … Continue reading How Kili Technology and AutoML Helped Me Scale Email Classification for Customer Services
How to build a state of the art Machine Learning platform in 2021? Traditionally, machine learning has been a domain reserved for academics who have the mathematical skills to develop complex ML models but lack the software engineering skills required to put these models into production. It may come as a surprise to young data scientists to … Continue reading How to build a state of the art Machine Learning platform in 2021?
What Is The Best Image Segmentation Tool? Data annotation is the process of labeling training data to make it usable in supervised learning tasks. The 2018 McKinsey survey "What AI can and can’t do (yet) for your business" states that the first limitation to AI applications is the lack of labeled data. To … Continue reading What is the best image segmentation tool?