The future is binary. If you haven’t heard about Machine Learning or Deep Learning yet, you’re riding a bullock cart on the highway of information. For the uninitiated, Machine Learning is a set of algorithms that let a computer learn how to do a task without anyone programming explicit rules for that task. For example, a computer can learn to “see” and “recognize” digits, or to translate from one language to another, just by looking at examples. Machine Learning can be applied to tasks that were previously thought to belong to the human realm, and hundreds of companies are now using these innovations to solve their problems. Machine Learning, which is a subset of Artificial Intelligence, comes in many forms. In this piece, we are going to take a look at a computer’s ‘seeing’ ability.
Computer vision is the field that deals with automating human vision and the understanding implicit in it. The data boom brought about by the internet, radical hardware speed-ups and the advent of the Convolutional Neural Network (CNN) have together revolutionized this space. CNNs are an integral part of most of the novel architectures used in computer vision tasks.
These neural network architectures are already achieving human-level performance on many vision tasks. The following are some tasks that AI has proven able to tackle effectively:
In the object detection task, the algorithm is tasked with recognizing the objects in a scene and outputting the coordinates of a box bounding each categorized entity. This field has advanced in leaps. As of now, state-of-the-art detection algorithms are able to detect more than 80 kinds of objects in a scene in real time.
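To make the idea of a "box bounding the categorized entity" concrete, here is a minimal Python sketch. It assumes a detection is a label, a confidence score and a box given as `(x_min, y_min, x_max, y_max)` pixel coordinates, and it computes intersection-over-union (IoU), a widely used measure of how well two boxes overlap. The `Detection` class and the `iou` function are illustrative names, not part of any particular detection library.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


@dataclass
class Detection:
    """One detected object: what it is, how confident, and where."""
    label: str
    score: float
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes, in [0, 1]."""
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical prediction compared against a hand-labelled reference box.
predicted = Detection(label="car", score=0.91, box=(0.0, 0.0, 2.0, 2.0))
reference = (1.0, 1.0, 3.0, 3.0)
overlap = iou(predicted.box, reference)
```

An IoU of 1 means the two boxes coincide and 0 means they do not overlap; the cut-off that counts as a correct detection is a choice made per benchmark or application.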
Facial recognition is the vision task that requires a machine to correctly identify a person based on a few reference images. Today, the technology has become so refined and optimized that it runs on many flagship handheld devices, where it is used to unlock the phone. We at Knowledge Lens are in the process of automating our own attendance system using the same technology.
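One common way to identify a person from a few reference images is to compare numeric "embeddings" of faces. The sketch below assumes some model has already turned each face image into a vector; the vectors, the names and the `0.8` threshold are made up for illustration, and this is not a description of the Knowledge Lens attendance system.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def identify(query, references, threshold=0.8):
    """Return the best-matching name, or None if nobody is similar enough.

    `references` maps a person's name to a few embedding vectors, one per
    reference image. The threshold is an assumed value to be tuned on data.
    """
    best_name, best_score = None, threshold
    for name, vectors in references.items():
        for ref in vectors:
            score = cosine_similarity(query, ref)
            if score >= best_score:
                best_name, best_score = name, score
    return best_name


# Toy 3-dimensional embeddings; real ones typically have many more dimensions.
references = {
    "alice": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "bob": [[0.0, 1.0, 0.0]],
}
match = identify([0.95, 0.05, 0.0], references)
stranger = identify([0.0, 0.0, 1.0], references)
```

Returning `None` for an unfamiliar face matters as much as naming a familiar one: without a threshold, every visitor would be matched to whoever happens to be closest.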
According to a 2017 survey, the most sought-after AI solution is one for detecting and deterring security intrusions. A security intrusion can be either a breach of a computer network or a physical intrusion into a secure facility. In the case of unauthorized physical access, the algorithm is fed the video stream directly from the source, detects the presence of any suspicious entity and raises an alert.
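The alerting step can be sketched as a small loop over per-frame detections. This sketch assumes an upstream detector has already produced a list of `(label, score)` pairs for every frame, and it only raises an alert after several consecutive suspicious frames so that a single noisy frame does not trigger it. The function name, the labels and the default thresholds are all hypothetical.

```python
def alert_frames(frames, watch_labels, min_score=0.5, min_consecutive=3):
    """Yield the index of each frame at which an alert should be raised.

    `frames` is an iterable of per-frame detection lists, where each
    detection is a (label, score) pair from an assumed upstream detector.
    """
    streak = 0
    for index, detections in enumerate(frames):
        suspicious = any(
            label in watch_labels and score >= min_score
            for label, score in detections
        )
        # Count consecutive suspicious frames; any clean frame resets it.
        streak = streak + 1 if suspicious else 0
        # Alert once, at the moment the streak reaches the required length.
        if streak == min_consecutive:
            yield index


# Made-up detections for eight frames of a restricted area.
frames = [
    [("person", 0.9)],
    [("person", 0.8)],
    [],
    [("person", 0.9)],
    [("person", 0.7)],
    [("person", 0.95)],
    [("person", 0.9)],
    [("cat", 0.9)],
]
alerts = list(alert_frames(frames, watch_labels={"person"}))
```

Requiring a run of frames trades a short delay for fewer false alarms; how long that run should be depends on the frame rate and on how costly a missed intrusion is.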
A number of companies are now using computer vision to:
- Develop autonomous vehicles, e.g. Tesla, Uber, Waymo
- Segment useful areas from satellite images
- Train a machine to describe what it sees in an image, known as image captioning
- Identify actions in a video
- Find anomalies in radiology scans
- Automatically detect and recognize text written in a picture
- Analyze audio spectrograms
With unstructured data being generated at the petabyte scale, the possibilities for applying computer vision to your problems are limited only by your imagination. At Knowledge Lens, we are constantly pursuing cutting-edge research in computer vision and scaling these solutions to the enterprise level.