The advancement of artificial intelligence, neural networks, and deep learning has absolutely exploded in recent years. For many people, these terms are often used rather generically and their specific meaning is obscured. What is deep learning and how does it factor into artificial intelligence? This article will cover deep learning, how it is useful, and also provide several deep learning examples.
What is Deep Learning and how is it useful?
Deep learning is a subfield of artificial intelligence, built on artificial neural networks, that is intended to loosely mimic the way neural networks in the human brain work and learn novel tasks. Rather than being given direct instructions and rulesets for a task, deep learning neural networks are instead provided with data and algorithms for recognizing patterns, which are used to generate rules that can be applied to tasks without direct human involvement. Deep learning is concerned with enabling machines to make sense of data without being explicitly taught how to use or understand it; instead, a trained network applies what it has already learned to comprehend and react to data it has not previously encountered.
Deep learning is a structured learning method that goes ‘deeper’ by using a number of layers to analyze and understand data. The roots of the term ‘deep learning’ stretch back to the 1980s, but it is mostly within the past decade that cloud computing and general computer power have been able to crunch the vast data sets and many tasks needed to get results in reasonable timeframes.
Deep learning has many applications, ranging from chatbots to facial recognition, speech recognition, and more. These tasks are too difficult or unwieldy for programmers to handle by manually accounting for every possible outcome or permutation. Instead, a model is trained on related data sets and tasks, such as learning the features of a face. Machine learning algorithms are used to build a model of what a human face looks like: two eyes, two ears, one nose, and one mouth. With sufficient data, deep learning systems can then identify faces in images they have not specifically been trained on, because they can apply the learned model to new images without being explicitly taught what each individual face looks like.
How does Deep Learning work?
Deep learning works differently from traditional computer programming. In traditional computer programming, the developers write a set of rules which can be applied to the input to generate an expected output. This output is usually (but not always) deterministic, such that the developer can write tests for their rules, knowing ahead of time what the answer should be.
By contrast, deep learning works somewhat backward - a neural network is provided with large datasets and expected outputs, from which it derives a set of rules itself. With enough training, a deep learning neural network can then respond to novel data by applying those rules without requiring an explicitly programmed response.
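The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule `2x + 1`, the function names, and the single-neuron "network" are all invented for this example, and a real deep learning model would have many layers rather than one weight and one bias. The point is simply that the second function is never told the rule; it recovers it from example inputs and expected outputs.

```python
def handwritten_rule(x):
    # Traditional programming: the developer writes the rule directly.
    return 2.0 * x + 1.0


def learn_rule(inputs, targets, lr=0.05, epochs=2000):
    """Derive a rule (a weight and a bias) from examples by gradient descent.

    Hypothetical single-neuron model: prediction = w * x + b.
    """
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(inputs, targets):
            err = (w * x + b) - y          # how wrong the current rule is
            grad_w += 2.0 * err * x / n    # gradient of mean squared error
            grad_b += 2.0 * err / n
        w -= lr * grad_w                   # nudge the rule to reduce the error
        b -= lr * grad_b
    return w, b


# Training data: inputs paired with expected outputs, nothing else.
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
targets = [handwritten_rule(x) for x in inputs]

w, b = learn_rule(inputs, targets)
prediction_for_new_input = w * 10.0 + b   # 10.0 was never in the training data
```

After training, `w` and `b` end up very close to 2 and 1, so the learned rule gives nearly the same answer as the handwritten one on an input it never saw.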
There are several different deep learning techniques in use to help train neural networks, but in general, deep learning transforms data into vectors and uses mathematical functions to map the input to output. Networks are trained to recognize patterns in data (as well as filter out irrelevant or redundant information) by passing these vectors through many different layers of nodes, working somewhat like a process of deduction. Through optimization processes such as gradient descent with backpropagation, which repeatedly adjust the network’s weights to reduce the gap between predicted and expected outputs, deep learning neural networks can learn from very large datasets and, once trained, arrive at conclusions within short timeframes.
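As a rough sketch of what "passing vectors through layers" means, the Python below pushes one input vector through two small layers. The weights, layer sizes, and function names are made up for illustration; in a real network the weights would be learned during training rather than typed in by hand.

```python
import math


def dense(vector, weights, biases):
    # One layer: each node computes a weighted sum of the inputs plus a bias.
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]


def relu(vector):
    # Non-linearity between layers: negative values are set to zero.
    return [max(0.0, v) for v in vector]


def sigmoid(v):
    # Squashes the final value into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-v))


def forward(vector, layers):
    # Pass the vector through every layer in turn.
    for i, (weights, biases) in enumerate(layers):
        vector = dense(vector, weights, biases)
        if i < len(layers) - 1:
            vector = relu(vector)
    return [sigmoid(v) for v in vector]


# Hypothetical, hand-picked weights: 3 inputs -> 2 hidden nodes -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]

output = forward([1.0, 2.0, 3.0], layers)
```

The result is a single number between 0 and 1, which could be read as a score for one class. Training consists of adjusting the numbers in `layers` until such scores match the expected outputs.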
Deep Learning Techniques
Dropout is a method of dealing with overfitting in deep learning. Models are trained on a set of training data, and overfitting occurs when a model reflects that training data too closely. This prevents the model from being able to generalize and apply its knowledge to new data. This can happen for a number of different reasons, such as the model having a high variance, the training data containing a lot of noise that the model focuses on too much, or the training data set not being large enough for the model to be able to generalize.
This is primarily a problem of complex networks with large weights. Dropout, or adding dropout layers, tackles this by probabilistically dropping neurons from the model during training in order to reduce complexity. As a rule of thumb, it is generally better to go with a less complex model than a more complex model, all else being equal. Not only are simpler models easier to reason about, but it is easier to add complexity at a later point as necessary than it is to remove it.
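A minimal sketch of the idea, assuming the common "inverted dropout" variant (the article does not say which variant it has in mind): during training each activation is zeroed with probability `rate`, and the surviving activations are scaled up so the expected total stays the same. At prediction time nothing is dropped. The function name and parameters are illustrative only.

```python
import random


def dropout(activations, rate, training=True, rng=random):
    """Randomly zero activations during training (inverted dropout sketch)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At prediction time every neuron is kept.
        return list(activations)
    keep = 1.0 - rate
    # Keep each activation with probability `keep`, scaling survivors by
    # 1 / keep so the expected sum of the layer is unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)              # fixed seed so the example is repeatable
acts = [1.0] * 10000                # stand-in for one layer's outputs
dropped = dropout(acts, rate=0.5, rng=rng)
fraction_dropped = dropped.count(0.0) / len(dropped)
```

With a rate of 0.5, roughly half of the values in `dropped` are zero and the rest are doubled, so the network cannot rely too heavily on any single neuron.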
Deep learning from scratch
Training from scratch is pretty much what it sounds like: training a neural network from scratch, rather than building off of existing data sets or models. This is not a particularly common choice, given that it takes longer to build and train a model this way, but for new applications and novel data sets, it is sometimes used.
Rather than starting from scratch, transfer learning involves making use of previously trained models as a baseline. Pre-existing networks are then trained on new data, with adjustments being made for more specific goals. This method is desirable due to the drastic reduction in time required to develop both the model and training data as compared to starting from scratch. However, that time saving depends on how similar the new task is to the one the previously trained model was built for.
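One common way to do this, sketched below under heavy simplification, is to freeze the earlier layers of the existing network and train only a small new output layer (a "head") on the new data. Everything here is hypothetical: `frozen_features` stands in for a pretrained network with made-up weights, and the four training examples are invented so that the result is easy to check.

```python
def frozen_features(x):
    """Stand-in for a pretrained network's earlier layers.

    The weights are made up for illustration and are never updated below.
    """
    a, b = x
    return [max(0.0, a + b), max(0.0, a - b)]


def train_new_head(examples, lr=0.2, epochs=2000):
    """Train only a new output layer on top of the frozen features."""
    weights, bias = [0.0, 0.0], 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, target in examples:
            feats = frozen_features(x)     # reused as-is, not retrained
            err = sum(w * f for w, f in zip(weights, feats)) + bias - target
            grad_w = [g + err * f / n for g, f in zip(grad_w, feats)]
            grad_b += err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Small data set for the new task (invented values).
examples = [
    ((0.0, 0.0), 0.5),
    ((1.0, 0.0), 3.5),
    ((0.0, 1.0), 1.5),
    ((1.0, 1.0), 2.5),
]

head_weights, head_bias = train_new_head(examples)
```

Only three numbers are learned here (two weights and a bias), which is why transfer learning can get by with far less data and training time than building the whole network from scratch.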
Learning rate decay
In machine learning, the learning rate is a hyperparameter - that is, a parameter that controls the learning process itself. It determines how much a model’s weights are adjusted at each update in response to the error the model has made. Learning rates that are set too low may make training take too long and risk getting stuck, while learning rates that are too high may result in a volatile training process.
Learning rate decay is a process that adjusts this hyperparameter over time, typically starting with a larger learning rate and gradually reducing it as training progresses, in order to increase performance while also reducing the training time.
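There are many decay schedules; the sketch below shows one simple option, exponential decay, where the learning rate is multiplied by a fixed factor after every epoch. The starting rate of 0.1 and the factor of 0.5 are arbitrary values chosen for illustration, not recommendations.

```python
def exponential_decay(initial_lr, decay_rate, epoch):
    # Learning rate after `epoch` epochs: shrink by `decay_rate` each epoch.
    return initial_lr * (decay_rate ** epoch)


# Large steps early in training, progressively smaller steps later on.
schedule = [exponential_decay(0.1, 0.5, epoch) for epoch in range(4)]
# schedule is [0.1, 0.05, 0.025, 0.0125]
```

In a training loop, the value for the current epoch would be used as the step size when updating the model’s weights.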
Deep learning applications
Since deep learning essentially involves training neural networks in how to respond to novel conditions, there is a wide variety of different applications and use cases for it in the real world. Below are just some of the practical deep learning applications currently in use across various industries.
Deep Learning in Transportation
Self-driving cars are probably one of the most exciting and modern applications of deep learning. Learning to drive involves not simply how to physically move and manipulate a car, but also how to respond to the environment around it. There are countless variables that must be accounted for, from road markings to road signs, other vehicles, and pedestrians. Self-driving cars have a variety of cameras and sensors placed all around the vehicle in order to collect as much data as possible, and that data must be analyzed and used to produce a response within incredibly short time windows. This is a huge challenge and remains ongoing work within the industry, though incredible strides have been made in recent years, and driverless vehicles may soon become one of the most iconic applications of deep learning in the real world.
Deep Learning in Entertainment
Deep learning has many different applications within the entertainment industry, but probably one of the most widespread use cases (even if people don’t realize it) is recommendations. Services like Netflix and Spotify are able to leverage the power of deep learning to recommend other shows or songs based on personalized preferences and viewing or listening history.
Deep learning can also be combined with facial recognition tech in order to gauge people’s emotions. For example, certain sports events such as Wimbledon have used this technology to analyze the emotions of both players and the audience in order to automatically generate highlight clips based on people’s reactions. It allows machines to judge which are the most exciting moments and to show them again as replays, which saves a lot of time and effort over manually doing so.
Deep Learning in Financial Services
The banking and financial services industry deals with millions of transactions on a daily basis and relies on deep learning and neural networks for automated fraud detection. Machine learning is used to understand the regular spending and saving patterns of customers, which are then used to flag anomalous and potentially fraudulent behavior. Doing so saves billions of dollars and helps keep clients’ money safe. A lot of these cases still require human intervention at some stage, but being able to identify suspicious transactions automatically drastically cuts down on the number of fraudulent transactions that slip through the net.
Deep Learning in Healthcare
The healthcare industry is rapidly becoming digitalized, not only as more patient records and healthcare data are being moved to digital systems, but also because machine learning is being used to augment the work of physicians and other medical professionals. The range of use is as varied as helping to identify and diagnose illnesses in patients, monitoring patient health, and lowering readmission rates, as well as being used in clinical research to help find cures to currently untreatable diseases.
Deep Learning in Historical Preservation
Image colorization is the process of taking black and white images and applying color to them. This can be a tedious and painstaking manual process, involving a lot of guesswork and creativity on behalf of the artist. However, deep learning technology is now increasingly being used to colorize old photographs and help bring historical imagery to life. Using this technology, greyscale images can be analyzed for their tones and combined with recognizable objects to create colorized images that closely match our best estimates of what these photographs would have looked like in full color.
These techniques are not limited to still imagery or colorization. Similarly, machine learning can also be used to add sounds to silent videos. By analyzing objects within a scene, it’s possible for these neural networks to identify actions like a bat hitting a baseball or a drumstick striking a drum. From there, the appropriate sound for the situation can be retrieved from a vast database of audio and added to the video, simulating a realistic sound where there was none in the original video.
Deep Learning in Translation
One obvious application of machine learning is natural language processing, which involves training neural networks to understand natural language as a human would. This involves understanding not only the explicit meaning of words, but also the wider scope of intent, emotion, and other factors.
What makes this particularly interesting is that it can be combined with other deep learning use cases, such as image recognition. The Google Translate app is a particularly good example of this: it allows users to use their phone’s camera to take a picture or live video feed of text in one language and have it automatically recognized, translated, and replaced with the text of another language. This allows tourists to point their phones at signs or documents in a language they don’t understand and have it translated for them into their native language without any human translators being involved.
The current limitations of Deep Learning
Though there are many practical use cases for the technology already, it is still in its relative infancy. Much of the learning done by artificial intelligence is still guided by humans, using carefully curated training data. In the far future, the aim is for machines to be able to learn from unguided, unstructured data and develop novel use cases for that data with little or no human involvement at all.
As it stands, these models and algorithms still need a lot of human guidance and structured data in order to learn. Data from very far outside of the norm represents a challenge that most models cannot handle. In addition to this, particularly complex models require vast amounts of computing resources, which very often means data scientists are reliant on cloud computing to provide the necessary resources. Running the largest of these models on consumer devices without an internet connection remains impractical.
The future of Deep Learning
Ultimately, the goal is to empower machines to learn and respond to novel data much as a human would. Humans (and animals) are able to absorb vast amounts of information from their surrounding environment independent of any task. They are also much better at adapting to highly unexpected events than artificial intelligence. The ability to generalize is something that humans do much better than machines, so eventually, the goal is for researchers to bridge this gap and provide machines with the ability to learn in a largely unguided manner, without the need for vast curated data sets to learn from.
This is known as self-supervised learning. Essentially, the idea behind this is for systems to be able to accept some input like an image or piece of text, with some amount of data intentionally masked or missing, and then be able to guess at what is missing. As an example, imagine a picture of a house with some large black boxes over the windows and door. To a human, it would be obvious what is behind the black boxes, whereas deep learning systems would currently struggle. The idea for self-supervised learning is to empower these models to make inferences based on the context surrounding them and make guesses as to what is missing. This method currently enjoys some success in natural language processing, but much more work is needed to take it beyond simple text input and to be able to generalize about the wider world.
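For text, the masking idea can be sketched very simply: take raw tokens, hide some of them, and keep the hidden originals as the answers the model must guess. No human labelling is needed because the data supplies its own targets. The function name, the `[MASK]` placeholder, and the 30% masking rate below are assumptions made for this illustration, and the sketch covers only the data preparation step, not the model that learns to fill in the gaps.

```python
import random

MASK = "[MASK]"   # placeholder token, chosen here for illustration


def make_masked_example(tokens, mask_rate=0.3, rng=random):
    """Hide some tokens and remember what was hidden.

    Returns the masked token list and a dict mapping each masked
    position to the original token the model should predict.
    """
    masked, targets = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_rate:
            masked.append(MASK)
            targets[position] = token    # the "label" comes from the data itself
        else:
            masked.append(token)
    return masked, targets


rng = random.Random(1)   # fixed seed so the example is repeatable
tokens = "the house has two windows and one red door".split()
masked, targets = make_masked_example(tokens, rng=rng)
```

A model would then be trained to predict each entry in `targets` from the surrounding unmasked tokens in `masked`, which is the text equivalent of guessing what is behind the black boxes in the picture of the house.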