This blog post was originally published in the late August 2017 edition of BDTI's InsideDSP newsletter. It is reprinted here with the permission of BDTI.
On a recent vacation, I was struck by how indispensable smartphones have become for travelers. GPS-powered maps enable us to navigate unfamiliar cities. Language translation apps help us make sense of unfamiliar languages. Looking for a train, taxi, museum, restaurant, shop or park? A few taps of the screen and you’ve found it.
And yet, there’s a vast amount of useful information that isn’t at our fingertips. Where’s the nearest available parking space? How crowded is that bus, restaurant, or museum right now?
This got me thinking about the potential for computer vision to enable cities – and the people in them – to operate more efficiently. The term “smart cities” is often used to describe cities that adopt modern processes and technology to enhance efficiency, convenience, safety and livability. Most of these improvements require lots of data about what’s going on throughout the city. Embedded vision – the combination of image sensors, processors, and algorithms to extract meaning from images – is uniquely capable of producing this data.
For example, consider street lights. Today, street lights are very simple; they use ambient light detectors to switch lights on at sunset and off at sunrise. But what if we exchanged the light sensor for an embedded vision module? Then street lights could reduce brightness when no people or vehicles are present, saving energy. And they could monitor parking space occupancy to enable drivers to quickly find the nearest vacant space – without requiring installation of a sensor in each parking space. They could spot potholes that need filling and blocked storm drains that need clearing. A network of such sensors could provide data about pedestrian and vehicle traffic flows to enable optimization of traffic signals.
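As a rough illustration of the street light idea, here is a minimal sketch of the decision logic such a module might run once a vision model has summarized a frame. Everything in it is an assumption made for illustration: the `FrameSummary` fields, the `lamp_brightness` and `vacant_spaces` helpers, and the 30% dim level are hypothetical, and the vision model itself is not shown.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class FrameSummary:
    """Hypothetical per-frame output of an on-board vision model."""
    people: int = 0
    vehicles: int = 0
    occupied_spaces: Set[str] = field(default_factory=set)


def lamp_brightness(summary: FrameSummary, is_dark: bool,
                    dim_level: float = 0.3, full_level: float = 1.0) -> float:
    """Return lamp brightness in [0, 1] (levels are illustrative assumptions)."""
    if not is_dark:
        return 0.0  # daytime: lamp off
    if summary.people > 0 or summary.vehicles > 0:
        return full_level  # someone is present: full brightness
    return dim_level  # dark but empty street: dim to save energy


def vacant_spaces(summary: FrameSummary, all_spaces: List[str]) -> List[str]:
    """List the parking spaces in view that are not seen as occupied."""
    return sorted(set(all_spaces) - summary.occupied_spaces)


# Example: a dark, empty street with one of three parking spaces taken.
summary = FrameSummary(people=0, vehicles=0, occupied_spaces={"A2"})
level = lamp_brightness(summary, is_dark=True)
free = vacant_spaces(summary, ["A1", "A2", "A3"])
```

The point of the sketch is that one camera feed drives both decisions; no separate sensor per parking space is involved.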
Three key technologies are required to enable these types of capabilities to proliferate. First is embedded hardware: We need inexpensive but powerful microprocessors to run complex computer vision algorithms, along with cheap image sensors and wireless modems. Second, we need robust algorithms that can reliably extract the needed information from images that are often noisy and cluttered (for example, in low light, or with raindrops on the lens). And third, we need ubiquitous wireless connectivity so that the valuable insights extracted by these devices can be shared.
To me, the really exciting thing about this opportunity is that these technologies are all available today.
Deep learning techniques are making it possible to create robust algorithms for challenging visual recognition tasks with much less engineering effort than was typically required to create traditional, special-purpose computer vision algorithms.
And, thanks to the increased focus on computer vision and machine learning by processor designers, processor performance and efficiency for computer vision and deep learning algorithms are improving fast – not by 10 or 20% per year, but by an order of magnitude or more in two or three years.
I don’t mean to suggest that creating sophisticated computer-vision-based solutions is simple. It’s not. But increasingly, creating such solutions is becoming possible for those with an idea and a skilled engineering team. For example, companies like ParkAssist and Sensity Systems are already deploying camera-based systems to improve parking.
Companies that succeed in getting such systems widely deployed early will enjoy growing opportunities over time. This is because images contain huge amounts of data, enabling a single embedded vision system to collect many diverse types of information – that is, to be a software-defined sensor. So, over time, improved algorithms, processors and sensors will allow more capabilities to fit into the same cost, size and power envelope. For example, a system initially deployed for managing parking spaces might later be upgraded to also monitor pedestrian and vehicle traffic, trash and road surface problems.
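To make the "software-defined sensor" idea concrete, here is a minimal sketch in which new capabilities are added to a deployed device purely in software. The `SoftwareDefinedSensor` class, the analyzer functions, and the dictionary standing in for a processed frame are all hypothetical; a real system would run vision models on actual images.

```python
from typing import Any, Callable, Dict

# Stand-in for what a vision model extracts from one image (an assumption).
Frame = Dict[str, Any]
Analyzer = Callable[[Frame], Any]


class SoftwareDefinedSensor:
    """One camera, many measurements: each analyzer is a software add-on."""

    def __init__(self) -> None:
        self._analyzers: Dict[str, Analyzer] = {}

    def install(self, name: str, analyzer: Analyzer) -> None:
        """Add or replace a capability without touching the hardware."""
        self._analyzers[name] = analyzer

    def process(self, frame: Frame) -> Dict[str, Any]:
        """Run every installed analyzer on the same frame."""
        return {name: fn(frame) for name, fn in self._analyzers.items()}


def count_vacant_spaces(frame: Frame) -> int:
    """Count parking spaces marked as not occupied."""
    return sum(1 for occupied in frame["spaces"].values() if not occupied)


def count_pedestrians(frame: Frame) -> int:
    """Count detections labeled as people."""
    return frame["labels"].count("person")


frame = {
    "spaces": {"A1": True, "A2": False, "A3": False},
    "labels": ["person", "car", "person"],
}

sensor = SoftwareDefinedSensor()
sensor.install("parking", count_vacant_spaces)   # initial deployment
first_report = sensor.process(frame)

sensor.install("pedestrians", count_pedestrians)  # later software upgrade
second_report = sensor.process(frame)
```

After the second `install`, the same hardware reports both parking and pedestrian data, which is the upgrade path described above.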
And this opportunity isn’t limited to outdoor environments. Inside of shops, for example, companies like RetailNext and GfK are already using vision-based systems to provide retailers with insights to optimize merchandise layout and staffing. And a start-up called Compology is even monitoring the contents of trash receptacles to optimize collection schedules, reducing costs and pollution.
Wherever there are people, or things that people care about, today we have unprecedented opportunities to add value by extracting useful information from images. Already, we’re seeing a few pioneering examples of innovative products delivering on this promise. But they are just the tip of the iceberg.
Speaking of algorithms: If you’re based in Europe and are developing vision algorithms, check out the new full-day, hands-on training class, “Deep Learning for Computer Vision with TensorFlow,” presented by the Embedded Vision Alliance in Hamburg, Germany on September 7th. For details, visit the event web page.
Co-Founder and President, BDTI
Founder, Embedded Vision Alliance