What are image recognition APIs, and what can they do for you? This article will clear up what image recognition is, what an API does, and how it can help you or your business get more out of the internet. Image recognition has huge potential for businesses and for individual visually impaired internet users.
What is an image recognition API?
Image recognition is where a piece of software detects the characteristics of an image and categorizes it accordingly. For example, if you upload an image of a Ferrari 458 to an image recognition API, it should recognize that it is a car and that it is (or should be) red. Depending on the API and the type of image you’re using, further classification may be possible.
This may seem really simple (humans can look at a picture and tell you what it shows without trying, most of the time), but it has been a hard problem to teach computers to solve. A lot of work has gone into figuring out how a computer can understand what things look like, and we’ve made big strides, from the ability to do reverse image searches to Google’s famous Deep Dream network.
An API is an Application Programming Interface. It is essentially a middleman between programs: it tells one piece of software how to work with another, or provides the tools it needs to perform a function. There are many types of API, serving all sorts of goals and usable from a range of programming languages. In this context, an image recognition API is the tool you can use to access the deep learning power of some commercial image recognition systems.
You need a lot of computing power to perform image recognition. You need masses of labeled data (public datasets such as ImageNet are one source) and the power to interpret it all. Most users simply don’t have the resources to build their own deep learning system. Big names such as Google (with its Vision API), Microsoft (with its Face API), and others have such systems and allow access to them through APIs, either for free or for a fee. This lets businesses of all sizes access this power, and users get new experiences as a result.
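To make the idea of "access through an API" concrete, here is a minimal sketch in Python of the two halves of such an exchange: packaging an image into a request and reading the labels that come back. The JSON field names (`image`, `max_labels`, `labels`, `name`, `confidence`) and the sample response are assumptions made up for illustration. They are not the format of Google's, Microsoft's, or any other real service, and the sketch does not make a network call.

```python
import base64
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Label:
    name: str
    confidence: float  # assumed to range from 0.0 to 1.0


def build_request(image_bytes: bytes, max_labels: int = 5) -> str:
    """Encode an image into a JSON request body (hypothetical format)."""
    payload = {
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "max_labels": max_labels,
    }
    return json.dumps(payload)


def parse_response(body: str) -> List[Label]:
    """Turn a JSON response (hypothetical format) into labels, most confident first."""
    data = json.loads(body)
    labels = [Label(item["name"], float(item["confidence"])) for item in data["labels"]]
    return sorted(labels, key=lambda label: label.confidence, reverse=True)


# A made-up response of the kind a service might return for a photo of a red car.
sample_response = (
    '{"labels": [{"name": "red", "confidence": 0.88},'
    ' {"name": "car", "confidence": 0.97}]}'
)

request_body = build_request(b"fake image bytes")
labels = parse_response(sample_response)
for label in labels:
    print(f"{label.name}: {label.confidence:.2f}")
```

In a real integration the request body would be sent to the provider's endpoint along with an API key, and the field names would follow that provider's documentation.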
How is image recognition going to change our internet experience?
Different internet users will get different advantages from image recognition. Let’s look at a hypothetical website owner and a hypothetical user to see how both sides can benefit.
The business benefits of image recognition
As an example, let’s say you run a self-sell portal similar to Etsy or a dating website. You want to manage the quality and suitability of all the images uploaded by users. You want to block all adult or unsuitable images and sort the rest into the appropriate categories, but you can’t possibly do it all by hand.
Enter the image recognition API. You can use the API, along with a suitable image recognition system behind it, to scan every single image and classify it against set criteria. So you could scan the library of images for indecent images and delete them. You could scan the images and sort ones that contain food into the “food” category and knitwear into the “woolen” category. Once you tell the API what to do, the process is automated.
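The moderation and sorting workflow described above can be sketched in a few lines of Python. This assumes the API has already returned, for each upload, a set of labels with confidence scores. The label names, the 0.8 threshold, the file names, and the `review_image` helper are all hypothetical choices for illustration, not part of any particular service.

```python
from typing import Dict

# Hypothetical rules: labels that cause a block, and labels that map to a category.
BLOCKED_LABELS = {"adult", "violence"}
CATEGORY_RULES = {"food": "food", "knitwear": "woolen"}
MIN_CONFIDENCE = 0.8  # ignore labels the service is less sure about


def review_image(labels: Dict[str, float]) -> str:
    """Return "blocked", a category name, or "uncategorized" for one image."""
    confident = {name for name, score in labels.items() if score >= MIN_CONFIDENCE}
    if confident & BLOCKED_LABELS:
        return "blocked"
    for label, category in CATEGORY_RULES.items():
        if label in confident:
            return category
    return "uncategorized"


# Made-up label scores standing in for what an API might return per upload.
uploads = {
    "pie.jpg": {"food": 0.95, "table": 0.60},
    "scarf.jpg": {"knitwear": 0.91},
    "flagged.jpg": {"adult": 0.99},
    "blurry.jpg": {"food": 0.40},
}

results = {filename: review_image(labels) for filename, labels in uploads.items()}
for filename, decision in results.items():
    print(f"{filename}: {decision}")
```

Checking the block list before the category rules means an unsuitable image is never filed into a category, and the confidence threshold leaves low-certainty images uncategorized so a person can look at them.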
There are also opportunities here for augmented reality and interactive image and video. You can use image recognition to have a program recognize objects in the real world. For example, you could take a picture of a pair of sneakers someone is wearing in the street. If the program recognizes the sneakers, the picture could be augmented with a link to purchase them for yourself. This benefits business (it offers an immediate sales opportunity) and benefits the user (they get what they want right now).
The user benefits of image recognition
The sneaker example above is just one obvious way users can benefit from image recognition. Augmented reality means we could instantly access reviews, price information, and lots of data simply by taking a picture of a product. That gives users massive amounts of data to help them make a buying decision.
Mark Zuckerberg summed up an often overlooked benefit of image recognition in his speech on AI earlier this year. He envisioned image recognition that could “read” an image and describe it out loud for blind or partially sighted people. This could have a major impact for visually impaired internet users and, with augmented reality, out in the real world some time down the line.
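As a rough illustration of the "describe what it sees" step, the Python sketch below turns a set of recognized labels into a sentence that a screen reader could speak. It assumes labels with confidence scores are already available from some recognition service. The `describe` function, the 0.7 threshold, and the sample labels are invented for this example, and a real system would produce far richer descriptions.

```python
from typing import Dict


def describe(labels: Dict[str, float], min_confidence: float = 0.7) -> str:
    """Build a short spoken-style description from labels, most confident first."""
    ranked = sorted(labels.items(), key=lambda pair: pair[1], reverse=True)
    names = [name for name, score in ranked if score >= min_confidence]
    if not names:
        return "No description is available for this image."
    if len(names) == 1:
        listed = names[0]
    else:
        listed = ", ".join(names[:-1]) + " and " + names[-1]
    return f"This image appears to contain: {listed}."


# Made-up labels for a beach photo; "cat" falls below the threshold and is dropped.
description = describe({"dog": 0.96, "beach": 0.89, "frisbee": 0.74, "cat": 0.20})
print(description)
```

The resulting text could be handed to a text-to-speech engine or used as an image's alternative text.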
Image recognition also plays a part in vehicle safety. The new autonomous braking and collision avoidance technologies being introduced work similarly to the APIs we’ve been talking about. They scan and assess images many times a second to keep you and your car safe while on the road. This is also the technology that tells autonomous cars what is around them.
Image recognition APIs aren’t going to revolutionize our internet experience on their own. They work alongside existing technology to add a layer of interaction and immersion to the world we see. While this article’s examples are limited, there is huge potential for gaming, movies, the auto industry, retail, entertainment, and any technology-enabled industry. This is just the beginning of what intelligent systems can achieve!