How Jellysmack improved their NLP and Video models’ performance by 5x
Jellysmack is a multinational social video company based in France. Their technology helps video creators drive massive social audience growth and amplify monetization. Their catalog of 300+ creators in categories ranging from gaming to sports and beauty generates 10B+ monthly views across all social platforms, including YouTube, Facebook, Snapchat, Instagram, Twitter, and TikTok.
Industry: social media
Use cases: NLP & Video
project management and progress tracking
intuitive annotation tools
collaboration workflows between DS and labelers
Jellysmack needed more than their in-house solution to train their models and label data at scale
Jellysmack started its ML journey by analyzing video performance on Facebook. They had to evaluate how much to spend on ads and how to position their content most profitably. A video that is 10% less effective in retention sees its number of views divided by a factor of 100; that’s why video quality is critical. They quickly expanded to other platforms, including YouTube, Instagram, TikTok, and Snapchat. Today, their algorithm continuously analyzes more than 638 million videos (on YouTube alone) to help them identify creators with high potential and new trends in viewership. 26 to 30 million videos are added to their monitoring catalog every month.
“Both data and automation are key to our business. We need to quickly identify the creators and the best-performing content and distribute it efficiently. We also need to scale very rapidly.”
Andrea Colonna, Head of Data at Jellysmack
Jellysmack identified several use cases that could yield even more insights and strengthen its services. However, they quickly realized that their in-house annotation tools and model training processes would not be sustainable.
“We could have done it internally, but this task is very time-consuming, and we did not have an efficient interface,” explained Andrea. “We decided to partner with Kili to empower our data scientists. Our internal forces quickly became autonomous with their ML projects.”
One central hub to address all their use cases, from NLP to Video annotation
Jellysmack had several use cases to address, spanning text and video. With Kili, they found a platform to centralize and annotate any type of data, streamlining their training pipeline through a single training hub.
NLP annotation to automate the content categorization
As a first step, Jellysmack needed a solution to better categorize YouTube videos. Since YouTube does not distinguish categories beyond large verticals like sports or music, Jellysmack needed to differentiate the content with a higher level of granularity (basketball, American football, European football, or tennis instead of “sports”). To build their classification, Jellysmack outsourced the training of their algorithm, based on 200-300 YouTube channels (with 10 videos per channel).
Jellysmack partnered with Kili to train the content categorization model using the text description of the videos.
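To make the idea of categorizing videos from their text descriptions concrete, here is a minimal sketch of a text classifier trained on labeled descriptions. It is an illustration only: the source does not say which model Jellysmack or Kili used, so the Naive Bayes approach, the class name `NaiveBayesCategorizer`, the category labels, and the sample descriptions are all assumptions.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a description and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesCategorizer:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, descriptions, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of examples
        for text, label in zip(descriptions, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, description):
        tokens = tokenize(description)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            # Log prior plus the smoothed log likelihood of each token.
            score = math.log(doc_count / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up annotated descriptions, standing in for labeled training data.
train = [
    ("NBA playoffs highlights dunk three pointer", "basketball"),
    ("Best dunk and buzzer beater of the NBA season", "basketball"),
    ("Grand slam final serve and volley on clay court", "tennis"),
    ("Wimbledon tennis match point ace serve", "tennis"),
    ("Champions League goal penalty kick striker", "european_football"),
    ("Premier League derby goal and red card", "european_football"),
]
model = NaiveBayesCategorizer().fit(*zip(*train))
print(model.predict("Incredible dunk in the NBA finals"))
```

The quality of such a classifier depends almost entirely on the annotated descriptions it is trained on, which is where a labeling platform comes in.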
Video annotation for quicker content distribution across platforms
To optimize content distribution and performance, Jellysmack needed its videos available in various formats so they could be published on platforms as different as YouTube, TikTok, or Instagram. To ease their video editors’ work, Kili was asked to train an algorithm that automatically detects transitions and cuts within videos, so editors could rapidly identify different content segments and repurpose them.
Today, Jellysmack’s data scientists are using Kili’s video labeling interface to train their model and accurately annotate thousands of videos.
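As a rough illustration of what detecting cuts involves, the sketch below flags a cut whenever the intensity histogram changes sharply between two consecutive frames. This is a classic baseline, not the trained model described above; the function names, the `bins` and `threshold` values, and the synthetic frames are all assumptions made for the example.

```python
def histogram(frame, bins=8):
    """Return a normalized intensity histogram for a non-empty list of 0-255 pixels."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def detect_cuts(frames, threshold=0.5):
    """Return every index i where a cut is assumed between frame i-1 and frame i."""
    if not frames:
        return []
    cuts = []
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        # Half the L1 distance between two normalized histograms lies in [0, 1].
        distance = sum(abs(a - b) for a, b in zip(previous, current)) / 2
        if distance > threshold:
            cuts.append(index)
        previous = current
    return cuts


# Synthetic grayscale frames: a dark scene, a bright scene, then dark again.
dark = [10, 20, 15, 12] * 4
bright = [240, 250, 245, 235] * 4
frames = [dark, dark, dark, bright, bright, dark]
print(detect_cuts(frames))  # [3, 5]
```

A fixed threshold like this misses gradual transitions such as fades, which is one reason a model trained on accurately annotated videos is preferable in practice.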
Content categories: Gaming, Beauty, Sports, Reaction, Cooking, News, …
NLP annotation for sentiment analysis to better understand content performance
Jellysmack wanted to assess whether a video was performing well using more than the Like/Dislike ratio. The popularity of a video and the audience’s sentiment are not always reflected by the number of views. That’s why they built an algorithm in 2019 to determine whether comments were positive, neutral, or negative. In the long term, negative comments might impact the performance of a video, even if that specific video has a great number of views. This enables the analytics team to make strategic editorial and business recommendations.
Kili’s platform has become the central hub for all their annotation projects and the single source for managing all their datasets, which has considerably facilitated and accelerated their machine learning workflow.
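Once comments carry a positive, neutral, or negative label, they can be summarized per video and read alongside view counts. The sketch below shows one simple way to do this; the `sentiment_summary` function, the net score definition, and the example video are hypothetical and not taken from Jellysmack’s pipeline.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def sentiment_summary(comment_labels):
    """Return the share of each label and a net score in [-1, 1]."""
    counts = Counter(comment_labels)
    total = sum(counts[label] for label in LABELS)
    if total == 0:
        return {"shares": {label: 0.0 for label in LABELS}, "net_score": 0.0}
    shares = {label: counts[label] / total for label in LABELS}
    # Net score: share of positive comments minus share of negative comments.
    return {"shares": shares, "net_score": shares["positive"] - shares["negative"]}


# Made-up example: a highly viewed video whose comments skew negative.
video = {
    "views": 2_000_000,
    "comment_labels": ["negative"] * 6 + ["neutral"] * 2 + ["positive"] * 2,
}
summary = sentiment_summary(video["comment_labels"])
print(video["views"], summary["shares"], round(summary["net_score"], 2))
```

In this example the net score is negative despite the high view count, which is the kind of signal the view count alone would hide.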
Jellysmack improved the performance of their NLP & Video ML models by 5x
The collaboration with Kili started in 2018. The reliable support provided by Kili and the smooth integration of the algorithms developed were certainly keys to sustainable success.
Kili’s integration into Jellysmack’s workflow was smooth and enabled the team to be operational almost immediately. “Kili’s integration was easy. Their API docs are very well explained, and it enabled us to get operational quickly,” explained Barthelemy Pavy, Data Scientist at Jellysmack.
Another key differentiator for Jellysmack was the level of support provided. “Kili’s customer support is best-in-class and has a direct impact on our performance. It enabled us to launch projects and solve issues much faster than what we were used to,” added Andrea Colonna, Head of Data at Jellysmack.
Thanks to the smooth integration and great support, Kili became a key contributor to making Jellysmack’s algorithms 4-5x more efficient than they were 4 years ago. This level of enhanced performance enabled Jellysmack to scale with confidence as they expanded their business and helped them provide better services to content creators.
“Kili had a real impact on our roadmap and the level of service we provide to our customers. Kili enabled us to improve our models’ performance and scale our AI projects as fast as our business needs.” - Andrea Colonna, Head of Data at Jellysmack
Efficient workflows and UX give data scientists full autonomy, effective collaboration, and scalability
Back in 2018, Jellysmack had about 7 data scientists (DS). In 2021, they reached the milestone of 25+ DS and they expect to get up to 35 by the end of the year. This hyper-growth reflects how important AI is to strengthening Jellysmack’s offering. The growing volume of data to process is one of their biggest challenges.
Kili has positioned itself as a key partner in the performance of Jellysmack’s data science team.
“A critical issue at Jellysmack is to scale our capabilities quickly. Kili plays a big part as their solution enables our data scientists to be fully autonomous with building and managing training datasets and ship models into production faster.”
Andrea Colonna, Head of Data at Jellysmack
Kili’s intuitive platform and powerful workflows empower DS to manage their projects, increasing their team’s productivity and shortening the training phase of their models by a factor of 2 to 10. Data Scientist Barthelemy Pavy mentioned how Kili’s interface made his life easier on a daily basis: “I can easily get an overview of my projects with useful KPIs such as the distribution of my assets and the status of the labeling work, so I can quickly jump onto what will bring the most value to me.”
Kili continuously improves its solution, which is a significant asset for Jellysmack’s data science team.
“Kili gives us a reliable and powerful video labeling solution that they keep on improving to facilitate our most complex projects. The interface provides all the tools we need to make our workforce more productive and annotate with accuracy.” - Thomas Michel, Data Scientist at Jellysmack
Collaboration is another key factor when annotating complex datasets. To be successful, it is critical that the various project members can work and communicate together efficiently.
“Kili makes it easy to collaborate. We can quickly onboard new project members, and take questions or provide feedback from and to our annotators to quickly solve issues or help them improve their skills.”
Barthelemy Pavy, Data Scientist at Jellysmack
A collaboration that enabled Jellysmack to address exponential growth
Addressing localization might be the next challenge for Jellysmack. Jellysmack is adding more and more countries to its portfolio, including India and South Korea. Challenges will most certainly arise and will grow beyond the language itself as each culture has its own way of talking and commenting on content. The company will need to train its algorithms to understand these specificities.
This is why Jellysmack follows the data-centric AI revolution with great interest. Lots of ready-to-use models do not require much rework internally to fit their use cases. The real challenge lies in the data. To get the expected level of performance, Jellysmack will need to gather a significant volume of data and build strong training datasets. Then, without changing the code, the datasets will need to be improved continuously and adjusted based on the observed performance of the models. Continuous iteration on the dataset will be key to building better AI, faster.