Geospatial annotation is a machine learning task that isn't as widely used as image or text annotation but remains essential in the world of AI. Whether for environmental protection, defense, or the private sector, geospatial data is heavily used by government agencies, NGOs, NASA, and ESA to analyze the world we live in.
However, extracting valuable information from geospatial data is not an easy task. Why is that?
As a general concept, aerial imagery refers to images that are captured "from the skies". Geospatial imagery can be captured by UAVs (e.g., drones), aircraft (planes, helicopters, etc.), and satellites. Running geospatial annotation on these images comes with a set of challenges.
The Challenges of Geospatial Annotation
One of the most significant challenges in geospatial image annotation is the large size of the files. Geospatial imagery files are typically much larger than other image files: they contain many pixels, each corresponding to a real-life distance.
For example, depending on the sensor, a single pixel in a satellite image can cover anywhere from less than a meter to tens of meters on the ground. Geospatial images are also multi-band images, meaning that satellite imagery often contains many more channels than the regular red, green, and blue.
Geospatial imagery includes geodata, which is georeferencing information embedded in the data. Geodata can include map projections, coordinate systems, band types, etc.
All in all, this means that labeling geospatial imagery already comes with the complexity inherent to the nature of the files.
This large size of geospatial files poses significant challenges for annotators regarding file loading times, processing power, and memory usage. It can be challenging to work with and manage data annotation with these large files, and annotators must be mindful of these constraints when working with geospatial data.
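To make the memory constraint concrete, here is a rough back-of-the-envelope sketch in Python. The image dimensions, band counts, and bit depths below are hypothetical values chosen only for illustration, and the calculation ignores compression and file overhead.

```python
def raster_size_bytes(width_px: int, height_px: int, bands: int, bytes_per_sample: int) -> int:
    """Uncompressed in-memory size of a raster, in bytes."""
    return width_px * height_px * bands * bytes_per_sample


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 1024**3 bytes)."""
    return num_bytes / 1024**3


# Hypothetical images, for illustration only.
photo = raster_size_bytes(4_000, 3_000, bands=3, bytes_per_sample=1)    # 8-bit RGB photo
scene = raster_size_bytes(50_000, 50_000, bands=8, bytes_per_sample=2)  # 16-bit, 8-band scene

print(f"RGB photo:    {to_gib(photo):.3f} GiB")
print(f"8-band scene: {to_gib(scene):.2f} GiB")
```

Under these assumptions, the multi-band scene is about a thousand times larger than the everyday photo, which is why loading the whole file at once is rarely practical.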
The second challenge, especially for object detection, comes from geometric distortion. Geometric distortion occurs when objects in an image are displaced relative to their true position in the real world. In geospatial imagery, objects also appear much smaller and more numerous than in everyday photos. Combined with geometric distortion, this makes object detection a difficult task.
Fortunately, orthophotography can correct the issue: the image is geometrically corrected (orthorectified) so that objects sit at their real-world coordinates and the scale is uniform across the image. But this means that labeling teams must find a tool supporting geospatial files composed of geocoordinates, which cannot be treated as regular image files.
Long story short: to work with geospatial data, your tool of choice needs to help you annotate images that are actually composed of geocoordinates.
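As an illustration of what "composed of geocoordinates" means in practice, the sketch below maps pixel positions to map coordinates and back with an affine transform. It follows the six-coefficient ordering used by GDAL geotransforms; the coefficient values (a 10 m pixel grid in a projected coordinate system) are hypothetical, and the function names are ours.

```python
from typing import Tuple

# (origin_x, pixel_width, row_rotation, origin_y, column_rotation, pixel_height)
GeoTransform = Tuple[float, float, float, float, float, float]


def pixel_to_world(gt: GeoTransform, col: float, row: float) -> Tuple[float, float]:
    """Map a pixel position (col, row) to map coordinates (x, y)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y


def world_to_pixel(gt: GeoTransform, x: float, y: float) -> Tuple[float, float]:
    """Invert the affine transform: map coordinates back to (col, row)."""
    det = gt[1] * gt[5] - gt[2] * gt[4]
    if det == 0:
        raise ValueError("geotransform is not invertible")
    dx, dy = x - gt[0], y - gt[3]
    col = (gt[5] * dx - gt[2] * dy) / det
    row = (-gt[4] * dx + gt[1] * dy) / det
    return col, row


# Hypothetical north-up image: 10 m pixels, y decreasing as rows go down.
gt = (500_000.0, 10.0, 0.0, 4_600_000.0, 0.0, -10.0)

x, y = pixel_to_world(gt, col=100, row=200)
print(x, y)
print(world_to_pixel(gt, x, y))
```

A tool that treats the file as a plain image loses this transform, so a label drawn at pixel (100, 200) can no longer be tied to a place on the ground.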
While some geospatial data is stored as vectors, most geospatial images, including GeoTIFF files, are raster images. A raster represents the scene as a grid of pixels, with each pixel covering a real-life distance; rasterization is the process of converting vector data into such a grid. These files are saved in specific formats, such as .img, .tiff, and .rst. Most of these files also include metadata like geocoordinates. As a result, image annotation of such files requires annotation tools that can support these formats.
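To show the difference between vector and raster data, here is a minimal sketch that rasterizes a vector polygon onto a pixel grid by testing each pixel center with a ray-casting point-in-polygon check. The polygon and grid size are hypothetical, and production libraries use faster scanline methods; this is only meant to illustrate the idea.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: Sequence[Point]) -> bool:
    """Ray casting: count how many polygon edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(polygon: Sequence[Point], width: int, height: int) -> List[List[int]]:
    """Return a height x width grid with 1 where the pixel center lies inside the polygon."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, polygon) else 0 for col in range(width)]
        for row in range(height)
    ]


# Hypothetical vector shape: a square with corners at (1, 1) and (4, 4), in pixel units.
square = [(1, 1), (4, 1), (4, 4), (1, 4)]
for row in rasterize(square, width=5, height=5):
    print("".join(str(v) for v in row))
```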
Finally, many issues arise directly in the labeling process:
when doing object detection, many objects can be contiguous with each other and create overlapping bounding boxes (e.g., trees in a forest);
rotating the image is a challenge because of the file size, but necessary for specific image annotation tasks;
creating layers to focus annotation on vegetation in one, wildlife in another, and human infrastructure in another is often not possible;
mapping geo-coordinates with objects is a hassle if the tool is not set up for it;
and many more!
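The overlapping bounding boxes mentioned in the list above are usually quantified with intersection over union (IoU). The sketch below is a generic illustration, not a description of any particular tool; the box coordinates are hypothetical pixel values.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    """Area of an axis-aligned box; degenerate boxes have area 0."""
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes, in [0, 1]."""
    inter_w = max(0.0, min(a.x_max, b.x_max) - max(a.x_min, b.x_min))
    inter_h = max(0.0, min(a.y_max, b.y_max) - max(a.y_min, b.y_min))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


# Two hypothetical tree crowns whose boxes overlap by half their width.
tree_a = Box(0, 0, 10, 10)
tree_b = Box(5, 0, 15, 10)
print(f"IoU: {iou(tree_a, tree_b):.3f}")
```

A high IoU between two boxes of the same class is a common signal that they may be duplicates or that neighboring objects need a tighter shape, such as a polygon.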
Finding the Right Geospatial Annotation Tool
If your machine learning project involves geospatial data, finding the right annotation tool is essential to your success. Here are the three key things to look for when choosing annotation tools.
Navigating inside the massive GeoTIFF files is essential to make labeling as easy as possible.
When working with an annotation platform that doesn't support geospatial files, you have to pre-process the images yourself to format and upload them, which is very time-consuming. If your tool or platform supports geospatial data, the pre-processing is managed directly by the tool, saving you time and money.
As these files are massive, loading times to start working on an image can be very long and decrease productivity. That's why the Kili Technology platform splits the image into tiles, only loading the tile where labeling is being done.
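The general idea of tiling can be sketched as follows. This is a generic illustration and not Kili Technology's implementation; the tile size of 512 pixels and the function names are assumptions made for the example.

```python
from typing import Iterator, Tuple

# (col_offset, row_offset, width, height), all in pixels
Window = Tuple[int, int, int, int]


def tile_windows(width: int, height: int, tile_size: int = 512) -> Iterator[Window]:
    """Yield windows that cover the image; edge tiles are clipped to the image bounds."""
    for row_off in range(0, height, tile_size):
        for col_off in range(0, width, tile_size):
            yield (
                col_off,
                row_off,
                min(tile_size, width - col_off),
                min(tile_size, height - row_off),
            )


def tile_index_for_pixel(col: int, row: int, tile_size: int = 512) -> Tuple[int, int]:
    """Return the (tile_col, tile_row) index of the tile containing a pixel."""
    return col // tile_size, row // tile_size


# Hypothetical 1000 x 600 pixel image.
for window in tile_windows(1000, 600):
    print(window)
print(tile_index_for_pixel(700, 100))
```

With this layout, a viewer only needs to read the windows that intersect the current viewport instead of decoding the entire raster.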
It is also important to ensure that the geospatial annotation platform of your choice has a comfortable and user-friendly interface for object detection on small items, with a smooth zoom that can dive into the corners and details of the image.
Geospatial imagery includes geocoordinates, so it is essential that annotation tools manage this information correctly by matching every bounding box, polygon, or label with a latitude and longitude. In addition, it is important to take advantage of the geocoordinates, for example to detect an object that is 1 km away from a previous label.
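Once every label carries a latitude and longitude, distance queries like the 1 km example become simple. The sketch below uses the haversine formula on a spherical Earth (mean radius 6,371 km), which is an approximation; the label names and coordinates are hypothetical.

```python
import math
from typing import Dict, List

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def labels_within(anchor: Dict, labels: List[Dict], radius_m: float) -> List[Dict]:
    """Return the labels whose center lies within radius_m of the anchor label."""
    return [
        label
        for label in labels
        if haversine_m(anchor["lat"], anchor["lon"], label["lat"], label["lon"]) <= radius_m
    ]


# Hypothetical labels: the centers of annotated objects.
anchor = {"name": "building_1", "lat": 48.8566, "lon": 2.3522}
others = [
    {"name": "car_1", "lat": 48.8616, "lon": 2.3522},    # roughly 0.5 km north
    {"name": "car_2", "lat": 48.8766, "lon": 2.3522},    # roughly 2 km north
]
print([label["name"] for label in labels_within(anchor, others, radius_m=1_000)])
```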
Display settings are an incredibly important feature in image and video annotation. It is essential to be able to adjust the brightness and saturation of the image so that annotators can identify the objects to label.
For example, interpreting Synthetic Aperture Radar (SAR) images can be challenging. SAR is used to create 2D or 3D reconstructions of environments, typically landscapes. Analyzing these can be tricky, so labelers must be able to manipulate the image's display parameters. This helps ensure that the final labeled image is of high quality and accuracy.
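One common display adjustment for imagery with a wide value range is a percentile contrast stretch, which clips extreme values and rescales the rest to the 0-255 display range. The sketch below is a generic illustration, not a description of any tool's display pipeline; the sample values and percentile settings are hypothetical.

```python
from typing import List, Sequence


def percentile(sorted_vals: Sequence[float], q: float) -> float:
    """Linearly interpolated percentile of pre-sorted values, with q in [0, 100]."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def stretch_to_8bit(values: Sequence[float], low_pct: float = 2, high_pct: float = 98) -> List[int]:
    """Clip values to the [low_pct, high_pct] percentile range and rescale to 0-255."""
    ordered = sorted(values)
    lo = percentile(ordered, low_pct)
    hi = percentile(ordered, high_pct)
    if hi <= lo:
        return [0] * len(values)  # flat image: nothing to stretch
    out = []
    for v in values:
        clipped = min(max(v, lo), hi)
        out.append(round((clipped - lo) / (hi - lo) * 255))
    return out


# Hypothetical raw pixel values with one very bright outlier.
raw = [12, 15, 14, 18, 20, 16, 13, 4000]
print(stretch_to_8bit(raw, low_pct=2, high_pct=85))
```

Without the clipping step, the single bright outlier would compress every other pixel into the darkest few display levels, hiding the detail annotators need.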
Labeling geospatial images can be time-consuming, often taking up to several hours to complete. Imagine labeling all cars on a 10 km wide geospatial image of a highway! An autosave feature is critical to ensure data labelers can move efficiently through these images. Autosave allows labelers to take breaks and pick up where they left off without worrying about losing their progress.
Why Kili Technology is the Right Tool for Geospatial Annotation
At Kili Technology, we believe that focusing on high-quality training data that is consistently labeled is the way to unlock the value of AI. Today, we continue our journey to empower all businesses to transform unstructured data into high-quality data and dramatically accelerate the building of reliable AI.
To do so, we provide one tool to label all asset types, including geospatial files, find and fix issues, and simplify LabelingOps, all wholly integrated into your existing ML stack.
With the Kili Technology platform, you can improve your labeling productivity by rapidly and accurately annotating unstructured data with customizable annotation tasks. All annotations are available via an intuitive interface optimized for productivity and quality or via a powerful API.
We also help our customers boost the quality of their training datasets by allowing the supervision of quality levels and improvements to ensure low-error datasets in the tool. Build advanced team collaboration workflows, leverage programmatic QA, or easily explore your datasets to identify the most critical data.
All of this functionality is integrated natively into your ML stack. You can easily orchestrate your data pipeline, structure projects, manage users, and oversee the entire data lifecycle of your ML project on Kili Technology.
Use cases of geospatial annotation are many and varied, and a lot of data is already available online. If you want to start a project that involves object detection on geospatial imagery, here are a few datasets to get started:
BigEarthNet: BigEarthNet is a benchmark archive comprising 590,326 pairs of Sentinel-1 and Sentinel-2 image patches for computer vision.
BrazilDAM Dataset: BrazilDAM is a multi-sensor and multi-temporal dataset that consists of multispectral images of ore tailings dams throughout Brazil.
BH-WATERTANKS: BH-WaterTanks is a dataset of water tanks composed of imagery from several neighborhoods in Belo Horizonte, Minas Gerais, Brazil. The data was acquired through the Google Earth Pro tool.
EuroSAT: land use and land cover classification using Sentinel-2 satellite imagery.
DSCR: Public Dataset for Ship Classification in Remote sensing pictures.
SN6: Multi-Sensor All-Weather Mapping
If you're more on the academic side, here are a few papers that should be of interest:
FAQ on Geospatial Data and Data Labeling
What are the labeling formats supported by Kili Technology?
In addition to geospatial data annotation, the Kili Technology platform supports computer vision (image and video) and text annotation (rich text, documents, PDFs, and conversations). Kili Technology is a text, image, and document annotation tool. You can do image classification, image segmentation, object tracking, text classification, named entity recognition, and more.
Is Kili Technology a free image annotation tool?
Yes, you can use Kili Technology for free to do computer vision, video, text, document, or image annotation. Our free plan allows you to label small datasets, and our pricing plans let you scale as your dataset grows. It's an excellent starting point to try our platform for yourself and experiment with our different annotation types (bounding boxes, superpixel, etc.).
Is Kili Technology open-source software?
Kili Technology is not open-source software. However, you can use our free plan to do image annotation and computer vision tasks, use our geospatial tools, run data visualization, and do everything needed to turn training data into powerful datasets. Note that on the free plan, some tools may not be available at full capacity.
How is Kili Technology different from other image annotation tools?
The Kili Technology platform is different because we put quality at the core of our product. Many low-cost labeling tools focus on improving labeling productivity, which we do as well, but they neglect the quality of the data being created.
At Kili Technology, we build our geospatial image annotation features to allow users to create high-quality datasets that yield the best machine learning model performance.
How does Kili Technology ensure data security?
Kili Technology is a secure image annotation tool with SOC2, ISO 27001, and HIPAA certifications. We also provide different deployment options to fit the data security needs of our customers. Note that data management options may vary depending on your hosting mode (Cloud or On-premise).
Does Kili Technology provide automatic annotation?
Kili Technology's API is accessible to our users. Therefore, you can connect your machine-learning model to generate pre-annotations on computer vision, image annotation, object detection, image segmentation, video annotation, and more.
Does Kili Technology offer integrations?
Kili Technology has been designed to integrate seamlessly into your existing ML stack. You can easily import and export data, create and manage labeling projects, and manage your ML project's entire training data lifecycle on Kili Technology. Use the CLI and our SDK to quickly upload and download vast amounts of data.