How do you ensure that your video machine learning projects perform well and solve real-life problems? Algorithms are trained on data, and the challenge becomes more complex when that data comes from videos.
Companies today get access to more and more unstructured data, and the share of video assets is greatly increasing. Leveraging videos to build ML applications requires a lot of work, starting with video annotation, the process of labelling video clips to prepare datasets for training your ML models. These trained models are then used for computer vision applications, such as inspection, surveillance, process management, and quality control, among many other use cases.
Building high-quality training datasets is key to shipping high-performing ML projects into production. As we are launching a new and enhanced version of our video labelling interface that will improve and speed up the labelling capabilities of our customers, we decided to look back at the basics of how best to prepare and manage the training of your models. Let’s go over the 5 best video labelling practices to make your life easier.
1. Acknowledge the video labelling challenges and prepare your datasets carefully
Videos are one of the mediums that deliver the most information. They are visual, they unfold over time, and they can also include audio and more advanced features (such as night vision, thermal detection, etc.) depending on the recording device. As a result, labelling videos is much more complex and costly than labelling static images, since a video contains far more information. In a video labelling project, teams need to localize an object’s position frame by frame, possibly track it as it moves, identify whether an action is being performed, and so on. With hundreds of hours of video to label, the task at hand can seem impossible. Let’s go over the additional challenges to consider in video labelling projects:
Include the required preparation in your framework: a lot of time-consuming pre-processing work might be necessary to prepare the data, specifically breaking the video into the frames that you want available for your labelling project.
Consider the annotation stage: labelling can be an extremely painful and slow process due to the number of frames.
Tackle the complexity: datasets in which objects are obstructed, disappear, or reappear make correct labelling even harder.
Acknowledge video labelling platform performance: the speed of the training stage and the productivity of your workforce depend heavily on the platform you use. Many interfaces cannot read videos, only images (frames), which makes it considerably slower to import video assets, play them, and label them. The more powerful, UX-friendly, and intuitive your video labelling solution is, the more productive your team will be when training your model.
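To make the pre-processing step more concrete, here is a minimal sketch of the arithmetic behind breaking a video into frames: it computes which frame indices to keep when sampling a clip at a lower rate than it was recorded. The function name `sample_frame_indices` and its parameters are hypothetical, and actually decoding the frames would require a video library, which is left out here.

```python
def sample_frame_indices(total_frames: int, source_fps: float, target_fps: float) -> list[int]:
    """Return the indices of the frames to keep when downsampling a video.

    Hypothetical helper: it only does the arithmetic, it does not decode video.
    """
    if total_frames < 0 or source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame count must be >= 0 and frame rates must be > 0")
    # Never sample faster than the source was recorded.
    target_fps = min(target_fps, source_fps)
    step = source_fps / target_fps  # number of source frames per kept frame
    indices = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    return indices


# A 10-second clip recorded at 30 fps, sampled at 2 fps for labelling.
frames_to_label = sample_frame_indices(total_frames=300, source_fps=30, target_fps=2)
print(len(frames_to_label), frames_to_label[:3])  # 20 [0, 15, 30]
```

Sampling at 2 fps instead of 30 fps cuts the number of frames to label from 300 to 20 in this example. The right rate depends on how fast the objects in your footage move.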
2. Be data-centric: train your model as part of a continuous learning and iterating workflow
According to Gartner, about 80% of ML projects never reach production deployment, and those that do are only profitable about 60% of the time. Most often, the reason is that models fail to perform in the real world despite passing the training phase. Let’s see how to set your model up for success with a data-centric approach.
Preparing your dataset is key, as your model will only be as good as the training it receives. The first factor is the diversity of the data within your dataset. The more diverse your dataset is, the better your chances of avoiding model biases, since it is more likely to cover the edge cases that occur in the real world. For example, when building an ML model that identifies people crossing a street, make sure that your dataset includes all types of people, wearing different kinds of clothing, in all seasons and weather conditions.
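One simple way to check diversity is to tag each clip with a few attributes and list the combinations that no clip covers. The sketch below assumes such metadata exists; the attribute names, values, and the `missing_combinations` helper are all hypothetical.

```python
from collections import Counter
from itertools import product


def missing_combinations(clips, attributes):
    """List the attribute combinations that no clip in the dataset covers."""
    names = list(attributes)
    seen = Counter(tuple(clip[name] for name in names) for clip in clips)
    all_combos = product(*(attributes[name] for name in names))
    return [combo for combo in all_combos if seen[combo] == 0]


# Hypothetical metadata for a pedestrian-crossing dataset.
attributes = {"season": ["summer", "winter"], "lighting": ["day", "night"]}
clips = [
    {"season": "summer", "lighting": "day"},
    {"season": "summer", "lighting": "night"},
    {"season": "winter", "lighting": "day"},
]

print(missing_combinations(clips, attributes))  # [('winter', 'night')]
```

Here the dataset has no winter night footage, so that is where new clips would add the most diversity.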
Working on the quality of your dataset is not a one-time effort, but a continuous process. Data quality work needs to be informed by the results that your models produce, whether in training or in production. This data-centric approach aims at better understanding the structure of your dataset, spotting its weaknesses, and implementing strategic adjustments.
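As an illustration of feeding model results back into the dataset, the sketch below computes per-class recall from evaluation results and lists the classes that fall below a threshold. These are the classes where more or better labelled data is likely to help. The `weakest_classes` helper, the 0.8 threshold, and the sample labels are assumptions made for this example.

```python
from collections import defaultdict


def weakest_classes(results, min_recall=0.8):
    """Return (label, recall) pairs below min_recall, worst first.

    results: iterable of (true_label, predicted_label) pairs.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for true_label, predicted_label in results:
        totals[true_label] += 1
        if predicted_label == true_label:
            hits[true_label] += 1
    recall = {label: hits[label] / totals[label] for label in totals}
    weak = [(label, value) for label, value in recall.items() if value < min_recall]
    return sorted(weak, key=lambda pair: pair[1])


# Hypothetical evaluation results: 10 pedestrians and 5 cyclists.
results = (
    [("pedestrian", "pedestrian")] * 9
    + [("pedestrian", "cyclist")]
    + [("cyclist", "cyclist")] * 2
    + [("cyclist", "pedestrian")] * 3
)

print(weakest_classes(results))  # [('cyclist', 0.4)]
```

Running a check like this after each training round turns dataset improvement into a loop: evaluate, find the weak spots, collect or relabel data for them, and retrain.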
3. Choose a powerful, UX-friendly platform
To enable your workforce, you will need to select a platform, tools, and processes that help you get the work done. The right platform can considerably improve and accelerate the labelling process, to the point where labelling video can actually become faster than labelling individual images.
Choose time-saving tools: if labelling video data takes up a lot of your time, look for solutions that enable you to upload, play and label videos directly instead of only frames. This feature will considerably speed up the preparation stage of your project.
Ensure that your team works with UX-friendly solutions: pick a platform with an intuitive interface and a wide range of powerful features to address all of your possible use cases. An efficient platform will shorten the reviewing process drastically. Expect tools and features, such as bounding box, polyline, segmentation, object classification in a single frame and multi-frames, object detection in multi-frames, capability to annotate directly on the video, and more.
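One reason annotating directly on the video can save time is that a tool can fill in the frames between two manually labelled keyframes. The sketch below shows linear interpolation of a bounding box as a simple illustration. It is not a description of how any particular platform implements this, and the `interpolate_boxes` name and the `(x_min, y_min, x_max, y_max)` box format are assumptions.

```python
def interpolate_boxes(keyframes):
    """Linearly interpolate a bounding box between labelled keyframes.

    keyframes: dict mapping frame index -> (x_min, y_min, x_max, y_max).
    Returns a box for every frame from the first to the last keyframe.
    """
    frames = sorted(keyframes)
    boxes = {}
    for start, end in zip(frames, frames[1:]):
        for frame in range(start, end):
            t = (frame - start) / (end - start)  # 0.0 at start, approaching 1.0 at end
            boxes[frame] = tuple(
                a + t * (b - a) for a, b in zip(keyframes[start], keyframes[end])
            )
    if frames:
        # The last keyframe is not covered by the loop above.
        boxes[frames[-1]] = tuple(float(v) for v in keyframes[frames[-1]])
    return boxes


# Two manual labels, five frames covered: the object moves 8 pixels to the right.
boxes = interpolate_boxes({0: (0, 0, 10, 10), 4: (8, 0, 18, 10)})
print(boxes[2])  # (4.0, 0.0, 14.0, 10.0)
```

Linear interpolation only works well when the object moves steadily between keyframes. Abrupt motion or occlusion calls for additional keyframes.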
Kili video labelling interface in action:
4. Leverage the power of collaboration
Collaboration is key in ensuring the success of any labelling project. Check that your platform makes it easy to onboard experts and an offshore labelling workforce. Make sure that comprehensive governance tools are available to easily assign roles, distribute tasks, and provide guidelines and feedback to the labellers, so that you can quickly improve the efficiency of your labelling team and scale.
5. Do not neglect QA
Constantly test the reviewing flow and the tools of the platforms which you are considering. Best-in-class platforms enable you to quickly scan through labelled data and identify trends in how the data is being labelled. This information enables your project managers to quickly provide precise feedback to the workforce, adjust labelling guidelines, or make adjustments to the dataset.
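A common QA check is to have two labellers annotate the same clip and flag the frames where they disagree. The sketch below measures agreement with intersection over union (IoU) between their bounding boxes. The helper names, the one-box-per-frame format, and the 0.5 threshold are assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def frames_to_review(labels_a, labels_b, threshold=0.5):
    """Flag frames where one labeller has no box or the boxes overlap too little.

    labels_a, labels_b: dicts mapping frame index -> box, one per labeller.
    """
    flagged = []
    for frame in sorted(set(labels_a) | set(labels_b)):
        box_a, box_b = labels_a.get(frame), labels_b.get(frame)
        if box_a is None or box_b is None or iou(box_a, box_b) < threshold:
            flagged.append(frame)
    return flagged


# Hypothetical labels from two labellers on the same three frames.
labels_a = {0: (0, 0, 10, 10), 1: (0, 0, 10, 10), 2: (0, 0, 10, 10)}
labels_b = {0: (0, 0, 10, 10), 1: (5, 5, 15, 15)}

print(frames_to_review(labels_a, labels_b))  # [1, 2]
```

Frame 1 is flagged because the boxes barely overlap, and frame 2 because the second labeller drew no box at all. Frames flagged repeatedly for the same reason usually point to a gap in the labelling guidelines.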
Kili supports AI teams with best-in-class video labelling solutions
At Kili, we work hard to help companies deliver successful ML projects consistently. We support large industrial companies by helping them identify defects in their production lines; we help the retail sector improve its customer experience, and we help the defence sector leverage its surveillance data to gain valuable intelligence.
Our data-centric video labelling solution provides all of the processes and tools to empower any ML team. You can train your models on even the most complex scenarios and deploy them faster, while also saving a significant amount of money, thanks to this well-developed and efficient solution.
To learn more about our video labelling solution and Kili, visit our website or reach out to us directly.