What Is Computer Vision?

Blog Posts, NVIDIA / October 29, 2020

This blog post was originally published at NVIDIA’s website. It is reprinted here with the permission of NVIDIA.

Computer vision is achieved with convolutional neural networks that can use images and video to perform segmentation, classification and detection for many applications.

Computer vision has become so good that the days of general managers screaming at umpires in baseball games in disputes over pitches may become a thing of the past.

That’s because developments in image classification along with parallel processing make it possible for computers to see a baseball whizzing by at 95 miles per hour. Pair that with image detection to help geolocate balls, and you’ve got a potent umpire tool that’s hard to argue with.

But computer vision doesn’t stop at baseball.

What Is Computer Vision?

Computer vision is a broad term for the work done with deep neural networks to develop human-like vision capabilities for applications, most often run on NVIDIA GPUs. It can include specific training of neural nets for segmentation, classification and detection using images and videos for data.

Major League Baseball is testing AI-assisted calls at the plate using computer vision. Judging balls and strikes on baseballs that can take just .4 seconds to reach the plate isn’t easy for human eyes. It could be better handled by a camera feed run on image nets and NVIDIA GPUs that can process split-second decisions at a rate of more than 60 frames per second.

Hawk-Eye, based in London, is making this a reality in sports. Hawk-Eye’s NVIDIA GPU-powered ball tracking and SMART software is deployed in more than 20 sports, including baseball, basketball, tennis, soccer, cricket, hockey and NASCAR.

Yet computer vision can do much more than just make sports calls.

What Is Computer Vision Beyond Sports?

Computer vision can handle many more tasks. Developed with convolutional neural networks, computer vision can perform segmentation, classification and detection for a myriad of applications.

Computer vision has infinite applications. With industry changes from computer vision spanning sports, automotive, agriculture, retail, banking, construction, insurance and beyond, much is at stake.

3 Things to Know About Computer Vision

Segmentation: Image segmentation is about classifying pixels to belong to a certain category, such as a car, road or pedestrian. It’s widely used in self-driving vehicle applications, including the NVIDIA DRIVE software stack, to show roads, cars and people. Think of it as a sort of visualization technique that makes what computers do easier to understand for humans.

Classification: Image classification is used to determine what’s in an image. Neural networks can be trained to identify dogs or cats, for example, or many other things with a high degree of precision given sufficient data.

Detection: Image detection allows computers to localize where objects exist. It puts rectangular bounding boxes — like in the lower half of the image below — that fully contain the object. A detector might be trained to see where cars or people are within an image, for instance, as in the numbered boxes below.

What You Need to Know: Segmentation, Classification and Detection

Segmentation	Classification	Detection
Good at delineating objects	Is it a cat or a dog?	Where does it exist in space?
Used in self-driving vehicles	Classifies with precision	Recognizes things for safety

NVIDIA’s Deep Learning Institute offers courses such as Getting Started with Image Segmentation and Fundamentals of Deep Learning for Computer Vision.

Scott Martin
Senior Writer, Corporate Communications Team, NVIDIA

Here you’ll find a wealth of practical technical insights and expert advice to help you bring AI and visual intelligence into your products without flying blind.

Pages

Topics

Contact

Address

Berkeley Design Technology, Inc.
PO Box #4446
Walnut Creek, CA 94596

Phone

Phone: +1 (925) 954-1411