Your Guide to Practical Computer Vision
Computer vision enables computers to derive information from images, videos and other inputs. Learn about its applications as well as potential future use cases.
Computer vision is a field of artificial intelligence (AI) enabling computers to derive information from images, videos, and other visual media. Learn all about the varied applications of computer vision, from its uses in healthcare to farming, retail, video games, government, and plenty more. Read on to discover how computer vision is used today, as well as learn about potential future use cases.
A deep dive into computer vision and use cases
The use of computer vision and the range of its applications have grown rapidly over the last decade. Computer vision is now used in industries as varied as healthcare, retail, transportation, and agriculture. This article gives real-world computer vision examples for each of these industries, along with potential future applications of computer vision.
What is computer vision?
Computer vision is the application of artificial intelligence to analyze and understand the content of visual media, such as images, videos, and real-time video feeds. The ability to identify objects in images and videos has many varied applications, from autopilot features in electric cars to cancer detection and diagnostic tools in healthcare. A computer vision model can be trained to accurately and consistently spot almost any kind of pattern across a range of visual media.
How does computer vision work?
Computer vision takes visual input data, such as images and videos, and applies different artificial intelligence models to analyze the input and match it to known patterns. These can take the form of several different tasks, such as:
Object classification
Object classification involves determining what kind of objects can be recognized in a given scene. Models are trained to classify objects by a range of different parameters, such as shape, size, and color. A sufficiently trained model will be able to not only recognize known objects but classify similar objects in novel input.
As an example, a computer model trained to recognize cars may be fed thousands of images of different cars until it builds up a model of what the common characteristics of a car look like. It can then be fed novel images and identify cars in them.
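As a rough illustration of the train-then-classify idea described above, the sketch below uses a nearest-centroid classifier over hand-written feature vectors. This is a deliberately simplified stand-in: production systems typically learn features from raw pixels with neural networks, and the features, labels, and values here (length, height, wheel count) are hypothetical.

```python
from math import dist

# Hypothetical feature vectors: (length in metres, height in metres, wheel count).
TRAINING_DATA = {
    "car": [(4.5, 1.5, 4), (4.2, 1.4, 4), (4.8, 1.6, 4)],
    "bicycle": [(1.8, 1.1, 2), (1.7, 1.0, 2), (1.9, 1.1, 2)],
}


def train(examples):
    """Average each label's feature vectors into one centroid per label."""
    centroids = {}
    for label, vectors in examples.items():
        centroids[label] = tuple(sum(axis) / len(vectors) for axis in zip(*vectors))
    return centroids


def classify(centroids, features):
    """Return the label whose centroid is closest to the new feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train(TRAINING_DATA)
# A previously unseen object with car-like measurements.
print(classify(centroids, (4.4, 1.5, 4)))
```

The model never saw the final measurements during training, yet it can still label them because they sit closest to the average of the car examples.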
Object identification
Computer models may also be trained to identify unique objects among a set. For example, a model can be trained to identify each individual car in a group of different cars. This is commonly used with real-time video systems, such as vehicle tracking as part of traffic management systems.
Object tracking
Object tracking involves identifying individual objects and tracking them as they move about. One of the most useful applications of this is in traffic management, where individual cars may be identified and tied to their license plate. Cars caught speeding can immediately and automatically be identified and tracked, as well as having their license plate read and recorded.
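One simple way to keep the same identity attached to an object from frame to frame is to match each new detection to the existing track whose bounding box it overlaps most. The sketch below assumes a separate detector already supplies boxes as (x1, y1, x2, y2) tuples. The class name, the greedy matching strategy, and the 0.3 overlap threshold are illustrative choices, not a description of any particular traffic system.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


class IouTracker:
    """Greedy tracker: a detection joins the unclaimed track it overlaps most."""

    def __init__(self, min_iou=0.3):
        self.min_iou = min_iou
        self.tracks = {}       # track id -> last known box
        self._ids = count(1)   # source of fresh track ids

    def update(self, detections):
        assigned = {}
        unclaimed = dict(self.tracks)
        for box in detections:
            best = max(unclaimed, key=lambda t: iou(unclaimed[t], box), default=None)
            if best is not None and iou(unclaimed[best], box) >= self.min_iou:
                track_id = best
                del unclaimed[best]
            else:
                track_id = next(self._ids)  # no good match: start a new track
            assigned[track_id] = box
        # Tracks with no matching detection in this frame are dropped.
        self.tracks = assigned
        return assigned


tracker = IouTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(52, 50, 62, 60), (1, 0, 11, 10)])
frame3 = tracker.update([(100, 100, 110, 110)])
print(frame1, frame2, frame3)
```

Even though the detections arrive in a different order in the second frame, each box keeps the id it was given in the first frame, while the distant box in the third frame starts a new track.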
Computer vision applications in healthcare
Artificial intelligence and computer vision applications in healthcare range from diagnostic assistance tools to health and disease monitoring and even treatment suggestions. Generally speaking, by automating routine tasks, computer vision enables healthcare professionals to spend more time with their patients.
One such use of computer vision is its application to medical scans, such as MRI scans, CT scans, and ultrasound images. AI models are trained on thousands of medical images in order to be able to identify abnormalities that may warrant medical attention. They can aid in medical diagnoses as well as help to prevent false diagnoses.
Another example of computer vision and AI being applied to healthcare is COVID-19 detection and patient monitoring, where AI models have been trained to examine CT scans of lungs and identify signs of COVID-19. Given the nature of the pandemic and significant caseloads, these applications of computer vision help ease the burden on overworked medical staff and monitor disease progression in patients.
Computer vision applications in retail
Computer vision has been put to use in retail in a number of ways. One of the most prominent examples is Amazon Go, a chain of retail stores built around computer vision. Computer vision is used to identify individual customers as they enter the shop and track each one as they move around the shop.
Computer vision is also used to identify each individual product so that when a customer picks up a product and puts it in their basket, the store is able to add that to a customer’s virtual basket and track it. The customer may simply walk out the door when they are done shopping. When a customer leaves, the products in their virtual basket are charged to the customer’s Amazon account.
Amazon Go was the first of its kind, and it demonstrates how computer vision may be applied to retail to help cut costs and increase profit margins. Allowing stores to operate on a cashless basis drastically reduces exposure to theft and the expenses involved in transporting cash. It also means retailers require fewer staff, as there is no need for cashiers when the entire checkout process is automated and frictionless. Staff are also less exposed to criminal behavior, as shops have no need to handle cash.
Computer vision applications in video games
Computer vision has been used to create some fun and unique video games. The Xbox Kinect was a camera for Xbox consoles that could recognize players and allow them to use their bodies as a controller. Real-time gesture recognition and body skeletal detection meant that the Kinect could identify players and let them use both overall body movement and more intricate hand gestures to control games.
Many of the most popular games that made use of the Kinect involved sports or dancing. Computer vision enabled players to make the same gestures or dance moves that they would when actually playing a sport, and have them recognized and used as input inside the video game. Players could learn dance moves, and computer vision would then track and analyze their performance by comparing the player’s dance to an example recorded by a professional. The player’s dance moves would then be graded based on how closely they matched the professional’s.
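To give a feel for how a player’s pose might be graded against a reference, the sketch below compares two sets of 2D skeleton keypoints using cosine similarity after removing differences in position and size. This is an assumption made for illustration only. It is not how the Kinect or any specific game computed scores, and the keypoints and the 0 to 100 scale are hypothetical.

```python
from math import sqrt


def normalise(pose):
    """Centre a pose on its mean point and scale it to unit length."""
    n = len(pose)
    cx = sum(x for x, _ in pose) / n
    cy = sum(y for _, y in pose) / n
    centred = [(x - cx, y - cy) for x, y in pose]
    scale = sqrt(sum(x * x + y * y for x, y in centred)) or 1.0
    return [(x / scale, y / scale) for x, y in centred]


def pose_score(player, reference):
    """Cosine similarity of two normalised poses, mapped to a 0-100 grade."""
    a, b = normalise(player), normalise(reference)
    similarity = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    return round(max(0.0, similarity) * 100)


# Hypothetical keypoints: head, left hand, right hand, hips, feet.
reference = [(0, 2), (-1, 1.5), (1, 1.5), (0, 1), (0, 0)]
# Same pose performed by a smaller player standing elsewhere in the room.
same_pose = [(3 + x * 0.5, 1 + y * 0.5) for x, y in reference]
# Arms lowered instead of raised.
arms_down = [(0, 2), (-1, 0.5), (1, 0.5), (0, 1), (0, 0)]

print(pose_score(same_pose, reference))
print(pose_score(arms_down, reference))
```

Normalising first means a player is not penalised for being shorter than the professional or for standing in a different spot, only for moving differently.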
Computer vision applications in transportation
One of the most well-known examples of computer vision in transportation is Tesla. Tesla is a company that builds electric cars and utilizes AI to develop self-driving capabilities known as Autopilot. The end goal is full self-driving, but as of yet, Autopilot is used to enhance the driving experience while still requiring a human driver to be in control.
Computer vision is used in a range of different applications in self-driving cars. Tesla utilizes neural networks in order to be able to take input from a range of different cameras and sensors positioned around a car to sense and track the local environment. All of these different inputs need to be processed in a matter of milliseconds in order to provide an AI-powered autopilot that can navigate roads at speed and respond to the innumerable range of possibilities that can occur when driving. Some of the tasks that require computer vision include the following:
Pedestrian detection
Self-driving cars must be able to identify and track pedestrians in their field of view. This needs to happen in real-time in order to feed important information to the autopilot so that it can take corrective measures when necessary.
Pedestrian tracking also includes intent prediction. This uses deep learning to analyze pedestrian behavior and make judgment calls about what a pedestrian may do next, such as the direction they are heading. Computer vision analyzes a range of cues, such as the pedestrian’s current heading, pace of movement, and position relative to the curbside. This information is fed to the autopilot so that it can use it as part of the decision-making process while driving.
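The cues mentioned above (heading, pace, and distance to the curb) can be combined in a very simple baseline: extrapolate the pedestrian’s last observed velocity and check whether they would cross the curb line soon. The sketch below is a constant-velocity stand-in for the learned models that real systems use. The coordinate convention (y increases toward the road), the curb position, and the two-second horizon are all hypothetical.

```python
def predict_position(track, seconds_ahead):
    """Extrapolate the last observed velocity of a track of (t, x, y) points."""
    (t0, x0, y0), (t1, x1, y1) = track[-2], track[-1]
    dt = t1 - t0
    vx, vy = (x1 - x0) / dt, (y1 - y0) / dt
    return x1 + vx * seconds_ahead, y1 + vy * seconds_ahead


def will_enter_road(track, curb_y, horizon=2.0):
    """True if the extrapolated position reaches the curb line within the horizon."""
    _, y_future = predict_position(track, horizon)
    return y_future >= curb_y


# Hypothetical tracks: time in seconds, position in metres.
walking_toward_road = [(0.0, 5.0, 0.0), (0.5, 5.0, 0.6)]
walking_along_pavement = [(0.0, 5.0, 1.0), (0.5, 5.7, 1.0)]

print(will_enter_road(walking_toward_road, curb_y=2.5))
print(will_enter_road(walking_along_pavement, curb_y=2.5))
```

A baseline like this ignores everything a learned model could pick up on, such as body orientation or a glance over the shoulder, but it shows how position and pace alone already support a basic judgment.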
Vehicle detection
Another necessary part of self-driving cars is vehicle detection. This uses computer vision in much the same way as pedestrian detection does, only applied to vehicles instead. This involves identifying and tracking vehicles within the car’s proximity. It also includes intent prediction, so that the autopilot can make judgments about when it should brake, accelerate, or take evasive maneuvers.
Traffic light and road sign detection
Computer vision must also be able to recognize and identify the characteristics of various road signs and traffic lights. Since both traffic lights and road signs can come in many different designs, large datasets of the various different kinds of lights and signs are required to teach an AI model how to recognize and respond to them.
Lane management
Autopilot is not yet capable of full self-driving, but it is capable of lane management, including merging and overtaking. This requires the use of multiple cameras and sensors placed around the car to identify the road, road markings (lanes), and other vehicles around it. Lane management is a complex process involving several different applications of computer vision, intent prediction, and neural networks to arrive at the appropriate time, position, and speed of any lane maneuvers.
Computer vision applications in drones
Drones are another area that benefits from various applications of computer vision. Drones need to be able to recognize objects in their surroundings so that they can navigate the environment without collisions. They must be able to recognize buildings, trees, and many different types of terrain in real-time in order to fly safely. Beyond navigation, drones also use computer vision for more specialized tasks, such as the crop and livestock monitoring described below.
Computer vision applications in agriculture
Agriculture is another industry that has seen benefits from computer vision and other technological advances. So-called smart farming, or e-farming, makes use of image processing and deep learning for various use cases, from monitoring farmland to augmenting decision-making about important factors like planting, fertilizing, and harvesting. Below are some of the use cases of computer vision in agriculture.
Monitoring crops
Automatic crop management using drones is possible thanks to computer vision and artificial intelligence. Drones allow for a bird’s eye perspective of the environment, granting them the ability to see over acres of property far quicker than possible from the ground. Computer vision is used for a range of different purposes with regard to crops.
Initially, drones can be used for aerial surveys and imaging. This helps farmers to get a detailed image of the state of their farmland and can help inform them about the best locations for planting. Multispectral images also allow farmers to keep an eye on the condition of their crops and spot signs of disease or stress.
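One widely used way to turn multispectral imagery into a crop health signal is the normalized difference vegetation index (NDVI), which compares near-infrared and red reflectance. The article does not say which index any given system uses, so NDVI is an assumption here, and the reflectance grids and the 0.4 stress threshold below are hypothetical values chosen for illustration.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), ranging from -1 to 1."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def stressed_cells(nir_band, red_band, threshold=0.4):
    """Return (row, col) positions whose NDVI falls below the threshold."""
    flagged = []
    for r, (nir_row, red_row) in enumerate(zip(nir_band, red_band)):
        for c, (nir, red) in enumerate(zip(nir_row, red_row)):
            if ndvi(nir, red) < threshold:
                flagged.append((r, c))
    return flagged


# Hypothetical 2x2 reflectance grids from a drone survey (values between 0 and 1).
nir_band = [[0.80, 0.70],
            [0.40, 0.75]]
red_band = [[0.10, 0.10],
            [0.30, 0.10]]

print(stressed_cells(nir_band, red_band))
```

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so a low index value in one part of a field can point a farmer toward plants that may need attention.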
Computer vision can also be used to monitor soil moisture and help with water management. This is a very important part of crop farming as you may imagine, and the ability for farmers to accurately measure and monitor the condition of their soil allows for more efficient use of soil and less water waste.
Monitoring livestock
Another use of computer vision is in the identification and tracking of livestock. Drones are particularly useful for helping to manage large herds of livestock, such as cows or sheep, spread across wide-open areas. Individual animals can be monitored, and farmers can be alerted to signs of stress or to lost livestock. Currently, computer vision and neural networks are mainly used for anomaly detection, so that abnormal behavior that would warrant a farmer’s concern, like excessive aggression or lethargy, can be brought to their attention.
Though much of this technology is not yet widely commercially available, computer vision shows great promise for animal husbandry. More advanced behavior monitoring, body weight tracking, and even gait monitoring can all help to spot signs of concern that warrant attention, possibly before even the farmer notices.
Once the animals have been sent to the abattoir, computer vision is used to help assess cuts of meat for grading purposes and disease monitoring. Being able to analyze cuts of meat at different wavelengths gives computer vision an advantage over human inspection, as it can spot issues that are not immediately obvious to the naked eye and can process far more cuts of meat with lower margins for error.
Computer vision applications in the military
As you may imagine, many of the aforementioned uses of computer vision have also been applied by militaries around the world. In the military, computer vision is generally used to optimize mission planning and information extraction, which can help save the lives of both soldiers and civilians.
Autonomous vehicles and robots
One of the primary use cases of computer vision in the military is to allow for robots and machines to take the place of humans in high-risk missions when there is a serious risk of loss of life. Computer vision is used to allow robots to not only navigate unpredictable environments like warzones but crucially also to identify both ally and enemy combatants at a glance. They can also be used to spot IEDs or other booby traps so that they may be safely triggered without harm to humans.
Target recognition and precision
Another use of computer vision, in combination with other technologies such as GPS data, is to allow for greater accuracy and precision of autonomous weaponry. Weapons such as missiles and combat drones may be equipped with visual sensors that help them refine their target trajectory and make adjustments autonomously mid-flight in order to enhance precision and reduce collateral damage.
Surveillance and risk detection
Computer vision also plays an important role in automated surveillance across a range of industries. Facial recognition, for example, is used for civilian, police, and military purposes. One of the most prominent uses of computer vision in general, and facial recognition in particular, is in fighting terrorism.
Computer vision makes it possible to identify individuals within a crowd in real-time, even when they are wearing masks and other face coverings. The technology works in extremely low lighting conditions, low-quality images or video feeds, and densely packed crowds, all conditions under which human operators would struggle to work as efficiently. This takes the burden off of manual identification and can help pinpoint dangerous individuals in a crowd before they are able to take action.
Clearing landmines
Another use of computer vision relies on neural networks to identify landmines. Most importantly, these networks learn the topology differences between land with landmines and land without landmines. This means that the system is capable of identifying not only landmines under ideal conditions, but also partially concealed or rotated landmines. This use of computer vision not only saves lives and reduces maiming from hidden unexploded ordnance, but also makes clearing landmines much safer and helps to make landmine-infested battlefields habitable once more.
Computer vision applications in insurance
Computer vision is being used in the insurance industry in a number of different ways. It is used to conduct more consistent and accurate damage assessments for things like vehicles, homes, and other valuables. This data is typically combined with other AI and deep learning models to both help automatically detect instances of fraud and also to streamline the claims process for legitimate claimants.
Computer vision applications in government
Computer vision also has a range of applications for use in the public sector. Asset management, including public infrastructure and equipment, can be monitored and assessed using computer vision to determine where funds are most needed. Predictive maintenance can also be used to flag and prioritize assets showing the most signs of wear, so that they may be repaired before they become a problem.
One particular area where computer vision has extensive use in government is in customs. Detecting contraband, drug trafficking, and human trafficking are all very labor-intensive and it is not feasible to manually inspect each and every person or package that passes through customs. This results in plenty of people and contraband entering or leaving countries when they shouldn’t. Computer vision becomes an aid that can help detect and flag signs of concern for customs agents to investigate, saving them time on analysis and identification and instead allowing them to focus their attention on suspicious packages and people.
Computer vision is also often used more menially to quickly identify and categorize documents and paperwork. Many governments have very standard labels, regulations, and legal documents that must be properly inspected and filed. This is a tedious job that is prone to human error, but computer vision is ideal for identifying and categorizing paperwork. It is also able to flag errors and violations so that humans only need deal with the problematic paperwork that requires their attention.
IBM solutions
IBM has been at the forefront of creating digital solutions to business problems and computer vision is no exception to this. IBM offers two application suites that take advantage of AI-powered remote monitoring to help businesses protect their assets and optimize their operations.
IBM Maximo Visual Inspection
IBM Maximo Monitor was created to help businesses more effectively monitor and manage their assets, with actionable alerts when things go wrong combined with insights into the causes behind the alerts for more effective problem-solving. This is useful for a range of reasons.
For asset-intensive organizations, like energy and utilities or transportation and logistics, there may be hundreds of thousands to millions of assets under their care at any moment. For example, FedEx Express delivers about 6 million parcels per day, and FedEx Express is just one of many large courier services. These couriers all need to handle the logistics of picking up, sorting, shipping, and delivering these millions of parcels every day.
Computer vision is used in a range of ways to help businesses more effectively manage their assets. In the context of couriers, computer vision is used to read address labels and automatically sort parcels according to their destination. It can also identify signs of damage or problematic parcels that may be incorrectly labeled or unable to be shipped. All of this helps to optimize logistical operations and help couriers effectively handle and deliver millions of parcels every day.
IBM Maximo Visual Inspection Mobile
IBM Maximo Visual Inspection also offers a range of AI-powered computer vision tools that can be used on mobile devices, such as smartphones, smartwatches, and industry-specific devices like those used in healthcare. This allows, for example, factory inspectors to use their mobile phones to discover defects and other issues based on image classification models powered by IBM Maximo Visual Inspection.
The portability of these devices means that the power of computer vision can be brought to even the most remote of locations. For example, NHS Highland used IBM-powered computer vision to monitor their assets that were spread across roughly half of Scotland, including both large cities and also small, remote healthcare facilities. Not only is the manual workload of monitoring all of these devices drastically reduced, but computer vision also offers other benefits.
For example, once an engineer is called to the scene, they can take a picture during the initial assessment and then again after the work has been completed. Not only does this provide an accurate digital history of repairs that can easily be referenced later, but in the future it may also help engineers to diagnose issues, perhaps even before they reach the scene, by having a patient take an image of the faulty hardware.
Future applications of computer vision
Computer vision has been around in some form since the late 1950s and early 1960s, but it is only within the last decade that its use has become as widespread as it is now. Thanks to artificial intelligence, deep learning, and neural networks, machines can recognize objects and respond to them in real-time. The range of practical uses for this technology is ever-increasing, especially as the models powering these uses improve further and further as they are introduced to more data.
One of the biggest drawbacks currently affecting these devices is their reliance on data centers and cloud services to interpret and process visual data before returning results. This means that devices in the field often require an internet connection to work properly, as the technical work behind object identification, classification, and prediction requires processing power beyond what can be achieved in real-time with mobile devices.
Being able to run these algorithms on the mobile devices themselves is known as edge computing. Significant work has been undertaken in the industry to increase the capacity for mobile devices to perform these kinds of visual interpretation and analysis on the device itself, reducing the reliance on internet connectivity and making it more practical for even more use cases.
Computer vision has come a long way since its earliest days, but it still has a long way to go. As computer models are trained on more data and smaller devices become more capable of handling tasks without having to offload data to external servers, the application of computer vision to different industries and use cases will only continue to grow.