One of the ways that computers, and the machines they control, can become more intelligent is through deep learning. Deep learning allows them to see, speak and, perhaps, think like us, and it does so through algorithms that are loosely inspired by our understanding of the brain.
Using these algorithms, computers can extract patterns from large amounts of data, which could be in the form of human speech, images, music or video. Once a computer can see objects and understand what they are, it could be put to a variety of uses, such as self-driving cars and smart glasses.
Wikipedia tells us that deep learning typically uses artificial neural networks. The levels in these learned statistical models correspond to distinct levels of concepts, where higher-level concepts are defined from lower-level ones, and the same lower-level concepts can help to define many higher-level concepts. In effect, deep learning is meant to discover multiple levels of features that work together to define increasingly abstract aspects of the data.
The ideas behind deep learning and the associated neural architectures go back to the 1980s, but the growth of the Internet of Things (with its multiple sources of data input) and big data (with its ability to store and retrieve large amounts of data) has run in parallel with the growth in research into deep learning. Of course, the ultimate aim of deep learning is to create Artificial Intelligence (AI).
According to claims made recently at the Machine Learning Conference in San Francisco, Google no longer understands how its deep learning decision-making computer systems have made themselves so good at recognizing things in photos. Google software engineer Quoc V. Le explained that deep learning involves large clusters of computers looking at and automatically classifying data, such as things in pictures. This kind of technology can be used for services such as Android’s voice-controlled search, image recognition and Google Translate.
As mentioned earlier, deep learning works hierarchically: the bottom-most layer of the neural network can detect changes in color in an image’s pixels, and the layer above may be able to use that to recognize certain types of edges. There then follow various other layers, each looking for different features, and eventually you have a system that can recognize faces, etc. If you follow neuroscience, this is pretty much how the occipital lobe of the brain works. One layer of cells recognizes vertical lines in messages from the eye, the next horizontal lines, the next diagonal lines, and so on until the brain can identify an object or a letter of the alphabet.
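To make the layering idea concrete, here is a minimal sketch in Python. It is only an illustration: the two "layers" are written by hand rather than learned from data, and the tiny image, the threshold and the function names are all assumptions made up for this example. The first layer measures color changes between neighboring pixels, and the second layer uses those changes to find a vertical edge.

```python
# Illustrative sketch only: hand-written "layers", not a trained network.
# The image, the threshold and the function names are assumptions.

def color_changes(image):
    """Layer 1: absolute difference between each pixel and its right-hand neighbor."""
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
            for row in image]

def vertical_edges(changes, threshold=0.5):
    """Layer 2: a column counts as a vertical edge if every row changes strongly there."""
    width = len(changes[0])
    return [x for x in range(width)
            if all(row[x] > threshold for row in changes)]

# A 4x6 grayscale image: dark on the left half, bright on the right half.
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(4)]

changes = color_changes(image)    # layer 1 output
edges = vertical_edges(changes)   # layer 2 output, built on layer 1
print(edges)                      # column index where dark meets bright
```

In a real deep learning system the filters in each layer are learned from large amounts of data rather than written by hand, and there are many more layers stacked on top of these two.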
According to Google’s director of research, Peter Norvig, the data-heavy models used by Google should be able to provide reliable speech recognition and understanding.
It’s not just Google that is working on the problem. It seems that Facebook plans to use deep learning approaches to understand its users. It’s been suggested that deep learning could help Facebook recognize and categorize the content of News Feed posts so that it can show them to people who would be interested. So if you frequently discuss a particular topic, it will detect posts from friends about that topic and put them on your home page. It’s also been suggested that deep learning would let Facebook identify landmarks in photos for location tagging, or pick out those of your photos that are good enough to share with others.
And like Google, Bing plans to use deep learning to provide better search results by connecting similar images via a giant graph. People at Bing recently said that two images can be connected if the distance between their respective features, as learned through deep learning, is small enough. Extending this concept to all images on the Web, trillions of connected images form a gigantic graph in which each image is connected via semantic links to other images. Using deep learning features, the image of one motorcycle is connected to other images of motorcycles of different colors and shapes. Using traditional features such as colors and edges, the same image of a motorcycle is connected to images of different entities such as bicycles or even waterfalls and landscapes. Deep learning, in contrast, keeps the semantics in the image neighborhood even when the visual patterns are not very similar.
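The "connect two images if their features are close enough" idea can be sketched in a few lines of Python. This is not Bing's implementation: the feature vectors below are invented numbers standing in for what a deep network might produce, and the image names, the Euclidean distance measure and the 0.3 cut-off are all assumptions chosen for illustration.

```python
import math
from itertools import combinations

# Hypothetical feature vectors, invented for this example. In a real system
# they would come from a trained deep network and be far longer.
features = {
    "red_motorcycle":  [0.9, 0.1, 0.8],
    "blue_motorcycle": [0.8, 0.2, 0.9],
    "bicycle":         [0.4, 0.7, 0.3],
    "waterfall":       [0.1, 0.9, 0.1],
}

def distance(a, b):
    """Euclidean distance between two feature vectors (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_graph(features, max_distance):
    """Link every pair of images whose feature distance is small enough."""
    graph = {name: set() for name in features}
    for a, b in combinations(features, 2):
        if distance(features[a], features[b]) <= max_distance:
            graph[a].add(b)
            graph[b].add(a)
    return graph

graph = build_graph(features, max_distance=0.3)
for name in graph:
    print(name, "->", sorted(graph[name]))
```

With these made-up numbers, only the two motorcycles end up linked, because their feature vectors are close even though the colors differ. Comparing every pair of images like this would not scale to trillions of images, so a production system would need some form of approximate nearest-neighbor search instead.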
Microsoft has also demonstrated some deep learning skills with a live English-to-Mandarin translation tool. Microsoft hopes that deep learning can help it provide more compelling experiences on its various platforms.
Yahoo has bought two deep-learning-based image-recognition start-ups, IQ Engines and LookFlow, which one can only assume it will use with Flickr.
And IBM has Watson, which is an amalgamation of numerous data-analysis techniques, including deep learning. IBM also has its cognitive computing plans, which include deep learning and on which it is working in partnership with four universities.
Look out for more about deep learning as we go through 2014.