Enhancing Object Detection: The Impact of Visidon CNN-based Noise Reduction

This blog post was originally published at Visidon’s website. It is reprinted here with the permission of Visidon.

In the realm of computer vision, object detection plays a vital role in various applications, including surveillance systems, autonomous driving, and image recognition. However, accurate object detection can be challenging in real-world scenarios due to the presence of noise, which can significantly impact the performance of detection algorithms.

Noise refers to unwanted variations or distortions that can corrupt an image, making it difficult for object detection algorithms to distinguish relevant features from irrelevant ones. Different sources of noise, such as sensor noise, compression artifacts, atmospheric interference, or even cluttered backgrounds and motion, can all contribute to inaccuracies in object detection. Consequently, mitigating the effects of noise becomes crucial for achieving reliable and precise object detection results.

Visidon has developed CNN-based noise reduction technology that not only improves visual quality but also makes object recognition more accurate. We have done extensive research on the topic, and we will publish some of the results on our blog in the near future. For full access and more demos, please contact our sales team at [email protected]

In this first article, we examine the effect of the algorithm on OCR (optical character recognition) accuracy in static scenes. The effect of motion will be studied in upcoming articles.

Starting the research process

The effect of noise removal on OCR accuracy was studied using an open-source OCR model called docTR, which runs locally. An FHD video camera with a Sony STARVIS CMOS sensor was chosen as the test camera because its settings are flexible and controllable.

Raw images, images denoised with different networks, and images corresponding to the ground truth were given as input to the OCR algorithm. The variable in the test was the brightness of the paper containing the text. The font size of the text on the paper decreases towards the bottom edge.

RAW image (0.5 lux)

Denoised with Visidon noise reduction (0.5 lux)

First, OCR accuracy in natural text recognition was examined. Images like the one above, captured at different brightness levels, were used as input. Accuracy is defined here using the Levenshtein distance.
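As a rough illustration of a Levenshtein-based accuracy score, here is a minimal sketch. It assumes the edit distance between the recognized text and the reference text is normalized by the length of the longer string; that normalization and the function names `levenshtein` and `ocr_accuracy` are assumptions made for this example, not necessarily the definition used in the tests.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def ocr_accuracy(predicted: str, reference: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string.
    Returns a value between 0.0 (nothing matches) and 1.0 (identical)."""
    longest = max(len(predicted), len(reference))
    if longest == 0:
        return 1.0  # two empty strings are trivially identical
    return 1.0 - levenshtein(predicted, reference) / longest
```

With this definition, a single misread character in a seven-character string costs one seventh of the score, so the metric degrades gradually as noise makes more characters unreadable.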

From the graphs below, you can see the difference in accuracies achieved with images processed in different ways. “n” means the reading shown by the luminance meter and “m” is an index that increases as the luminance increases by a fraction of a lux. Stacked reference means an average of several images of the same scene.
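To illustrate why a stacked reference is useful, the sketch below averages several noisy captures of the same static scene pixel by pixel. The scene, the Gaussian noise model, the noise level, and the frame count are all hypothetical values chosen for the example; they are not measurements from the test setup.

```python
import random
import statistics


def stack_frames(frames):
    """Pixel-wise mean of equally sized frames.
    Each frame is a flat list of pixel values."""
    if not frames:
        raise ValueError("need at least one frame")
    length = len(frames[0])
    if any(len(frame) != length for frame in frames):
        raise ValueError("all frames must have the same size")
    count = len(frames)
    return [sum(pixels) / count for pixels in zip(*frames)]


def noisy_frame(scene, sigma, rng):
    """Simulate one capture: the scene plus zero-mean Gaussian noise."""
    return [pixel + rng.gauss(0.0, sigma) for pixel in scene]


rng = random.Random(0)            # fixed seed so the example is repeatable
scene = [100.0] * 10_000          # hypothetical flat grey patch
frames = [noisy_frame(scene, 10.0, rng) for _ in range(16)]
stacked = stack_frames(frames)

# Noise level before and after stacking (standard deviation of pixel values).
single_noise = statistics.pstdev(frames[0])
stacked_noise = statistics.pstdev(stacked)
```

For independent zero-mean noise, averaging N frames reduces the noise standard deviation by roughly a factor of the square root of N, which is why a stack of a static scene can serve as a near noise-free reference.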

Accuracy vs illuminance in 0.0-2.0 lux

Accuracy vs illuminance in 1.0-10.0 lux

It is easy to see from the graphs that in these cases, Visidon noise reduction improves accuracy significantly in low light. The algorithm even gets close to the ground truth level.

Comparison to the test camera’s own noise reduction

As mentioned, our test camera was an FHD video camera with a Sony STARVIS CMOS sensor. The test camera’s own noise reduction was also included in the test, and it is notable that the camera’s own algorithm significantly degrades the accuracy of object recognition.

Below is an example of a raw image and an image processed with the camera’s own noise reduction.

Image denoised with test camera’s own algorithm (0.5 lux)

VD denoised (0.5 lux)

License plate recognition – phase one

Next, the effects on license plate recognition (format ABC-123) were investigated. Here, accuracy describes the share of registration numbers that were predicted completely correctly.
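The whole-plate metric can be sketched as follows. This is a minimal example that assumes a prediction counts as correct only when it matches the reference string exactly after surrounding whitespace is trimmed; the helper names and the sample plates are hypothetical.

```python
import re

# Three uppercase letters, a hyphen, three digits (the ABC-123 format).
PLATE_PATTERN = re.compile(r"[A-Z]{3}-\d{3}")


def is_valid_plate(text: str) -> bool:
    """True if the whole string has the ABC-123 format."""
    return PLATE_PATTERN.fullmatch(text) is not None


def plate_accuracy(predictions, references) -> float:
    """Share of plates whose complete registration number was read correctly.
    A single wrong character makes the whole plate count as a miss."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = sum(
        1 for predicted, reference in zip(predictions, references)
        if predicted.strip() == reference
    )
    return correct / len(references)


# Hypothetical example: two of four plates are read completely correctly.
references = ["ABC-123", "ABC-123", "XYZ-999", "XYZ-998"]
predictions = ["ABC-123", "ABD-123", " XYZ-999", "XYZ-99"]
score = plate_accuracy(predictions, references)
```

Because each plate contributes a single pass or fail, this metric is coarser than a character-level score computed on the same images.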

Noisy input

Denoised output

The results follow the same main lines as in the previous test. Here, however, the discriminating ability between the different methods is weaker, because the accuracy is measured by “words” and not characters, and there are far fewer words than characters.

Another test was carried out with the license numbers, in which the distance between the camera and the photographed object was varied instead of the brightness. The goal was to illustrate how much denoising improves the detection distance. The distances measured in the laboratory were converted into real-world distances based on the ratio between the size of the text in the test and the size of a real license plate. The brightness in this case was approximately 1 lux.
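The distance conversion can be sketched with simple proportional scaling. The example assumes that a printed plate and a real plate are equivalent for the camera when they cover the same angle of view, so the distance scales with the size ratio. The function name and all numeric values below are hypothetical and are not the values used in the laboratory.

```python
def real_distance(lab_distance_m: float,
                  printed_plate_width_m: float,
                  real_plate_width_m: float) -> float:
    """Scale a laboratory distance to the distance at which a full-size
    plate would appear the same size in the image (assumed proportional
    scaling, valid when the target is small compared to the distance)."""
    if printed_plate_width_m <= 0 or real_plate_width_m <= 0:
        raise ValueError("plate widths must be positive")
    if lab_distance_m < 0:
        raise ValueError("distance must be non-negative")
    scale = real_plate_width_m / printed_plate_width_m
    return lab_distance_m * scale


# Hypothetical numbers: a print one tenth the width of a real plate,
# photographed from 2 m, corresponds to a real plate about 20 m away.
equivalent_m = real_distance(
    lab_distance_m=2.0,
    printed_plate_width_m=0.052,
    real_plate_width_m=0.52,
)
```

The same scale factor applies to every laboratory distance in a series, so the shape of an accuracy-versus-distance curve is preserved and only the distance axis is rescaled.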

From these results, we further derived how large the difference between the best denoising method and the raw image is at different distances, in percentage points, and how much farther (in %) the denoised image reaches a given accuracy threshold: