This blog post was originally published at Hailo’s website. It is reprinted here with the permission of Hailo.

In this blog post, we present Hailo’s License Plate Recognition (LPR) implementation, also known as Automatic Number Plate Recognition (ANPR). The presented solution can be used in Intelligent Transportation Systems (ITS) and is a good example of how Hailo-8™ is being utilized in a real-life deployment of machine learning in AI-based products. We distinguish between two deployment scenarios: one where the LPR pipeline runs on the camera itself, and another where a ruggedized processing device is connected to one or more cameras that feed it. In this blog, we focus on the former case to show that the capabilities of a high-performance AI processor can meet even the stringent constraints imposed by a camera-attached system. The device includes a full HD camera, a camera processor, a Hailo-8™ AI processor, and a GStreamer application that integrates a Computer Vision (CV) pipeline with multiple neural networks.


An Automatic License Plate Recognition (ALPR) system is one of the most popular video analytics applications for smart cities. The system can be deployed on highways, at toll booths, and in parking lots to enable fast vehicle identification, congestion control, vehicle counting, law enforcement, automatic fare collection, and more.

Figure 1 – ALPR system output. The system is able to detect and track the vehicles as well as detect their license plates and recognize them

With a powerful edge AI processor, ALPR can be deployed on edge devices and run in real-time which is crucial for:

  • Improving product miss rates with better-performing NNs that are more resilient to a wide range of scenarios.
  • Improving overall detection latency.
  • Lowering the overall TCO compared to existing systems, including installation and maintenance costs.
  • Increasing data protection and improving privacy by eliminating the need to send raw video.

The high compute power of Hailo-8™ also enables processing several vehicles concurrently, from a long distance and with high accuracy. The accuracy of traditional object detection networks tends to drop by roughly 5x for small objects. For instance, a vehicle at 100 meters occupies only a few hundred pixels in an FHD frame, which calls for high-capacity NNs.
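To get intuition for that pixel count, here is a back-of-the-envelope pinhole-camera calculation. The 60° horizontal field of view and the 1.8 m × 1.5 m vehicle size are illustrative assumptions, not parameters of the deployed system:

```python
import math

def focal_length_px(image_width_px: int, hfov_deg: float) -> float:
    """Focal length in pixels for a pinhole camera with the given horizontal FOV."""
    return (image_width_px / 2) / math.tan(math.radians(hfov_deg / 2))

def object_size_px(size_m: float, distance_m: float, f_px: float) -> float:
    """Projected size in pixels of an object of size_m meters at distance_m meters."""
    return f_px * size_m / distance_m

# Illustrative: FHD sensor, 60 deg horizontal FOV, 1.8 m x 1.5 m vehicle at 100 m.
f_px = focal_length_px(1920, 60.0)
width_px = object_size_px(1.8, 100.0, f_px)   # roughly 30 px wide
height_px = object_size_px(1.5, 100.0, f_px)  # roughly 25 px tall
area_px = width_px * height_px                # a few hundred pixels in total
```

Under these assumptions the vehicle spans on the order of 30×25 pixels, i.e. a few hundred pixels of area out of the two million in an FHD frame.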

The Hailo TAPPAS ALPR system is implemented with GStreamer on a Hailo-8™ M.2 card and Kontron’s pITX-iMX8M board with NXP’s i.MX8 processor, running in real-time (without batching) with a USB camera at FHD input resolution.

Figure 2 – System drawing of the ALPR application running with Hailo-8™ and i.MX8

Application Pipeline

The Hailo ALPR application pipeline is depicted in the following diagram. The pipeline includes three NNs that run on the Hailo-8™ device, which 1) detect the vehicles, 2) detect the license plates, and 3) recognize the license plate characters (LPRNet). The entire pipeline runs within the GStreamer framework.

To optimize the application latency, we use the Hailo GStreamer Tracker to avoid running unnecessary calculations on vehicles that have already been recognized, and quality estimation to avoid running the LPRNet on blurred license plates. The pipeline was designed to meet the challenging requirement of running in real-time at 1080p input resolution with several vehicles in each frame.
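The gating idea can be sketched in Python. The state layout, parameter names, and default thresholds below are illustrative, not the actual TAPPAS implementation:

```python
def should_run_lprnet(track_id: int,
                      recognized_tracks: set,
                      plate_w: int, plate_h: int,
                      quality_score: float,
                      min_size: int = 24,
                      min_quality: float = 0.5) -> bool:
    """Decide whether to spend an LPRNet inference on a detected plate crop."""
    if track_id in recognized_tracks:             # vehicle already recognized: skip
        return False
    if plate_w < min_size or plate_h < min_size:  # plate too small to read: skip
        return False
    if quality_score < min_quality:               # plate crop too blurry: skip
        return False
    return True

recognized = {7}  # track IDs whose plates were already read in earlier frames
```

With this gate, a sharp plate on a new track triggers LPRNet, while already-recognized or blurred plates are skipped, keeping the per-frame compute bounded.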

All the NN models were compiled using the Hailo Dataflow Compiler and we released the pre-trained weights and precompiled models in the Hailo Model Zoo. The Hailo Model Zoo also supports retraining of the models on custom datasets to ease transferability to other environments. We note that all models were trained on relatively generic use cases and can be optimized (in terms of size/accuracy/fps) for a specific scenario with dedicated datasets.

Figure 3 – CV pipeline of the ALPR application with Hailo-8™. In blue – blocks that run on the Hailo-8™ device; in orange – blocks that run on the embedded host

Vehicle Detection

For vehicle detection we used a network based on YOLOv5m with a single class that aggregates all types of vehicles. YOLOv5 is a strong single-stage object detector released in 2020 and trained with PyTorch. To train the vehicle detection network we took several different datasets and aligned them to the same annotation format. Note that different datasets may separate vehicle types differently, so this network was trained to map all kinds of vehicles to the same class. Using a high-capacity NN such as YOLOv5m means we can detect vehicles with very high accuracy and at great distance, enabling the application to detect and track vehicles even on highways.
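Aligning heterogeneous datasets to a single vehicle class might look like the following sketch. The source label names and annotation layout are hypothetical, chosen only to illustrate the idea:

```python
# Source labels (hypothetical) that should all collapse into one "vehicle" class.
VEHICLE_LABELS = {"car", "truck", "bus", "van", "motorcycle", "trailer"}

def unify_annotations(annotations):
    """Keep only vehicle boxes and collapse all vehicle types into one class."""
    unified = []
    for ann in annotations:
        if ann["label"] in VEHICLE_LABELS:
            unified.append({"bbox": ann["bbox"], "label": "vehicle"})
    return unified

sample = [
    {"bbox": [10, 10, 50, 40], "label": "truck"},
    {"bbox": [60, 15, 90, 45], "label": "pedestrian"},  # dropped: not a vehicle
]
```

Running each dataset through such a mapping yields one consistent single-class training set, regardless of how finely each source dataset split its vehicle types.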

Parameters | Compute (GMAC) | Input Resolution | Training Data | Validation Data | Accuracy
21.47M | 25.63 | 640x640x3 | 370k images (internal dataset) | 5k images (internal dataset) | 46.5 AP*

*A YOLOv5m network trained on COCO2017 achieves 33.9 AP on the same validation dataset.

Figure 4 – Vehicle detection example comparing SSD-MobileNet-v1 (on the left) and our YOLOv5m (on the right)

License Plate Detection

Our license plate detection network is based on Tiny-YOLOv4 with a single class. Tiny-YOLOv4 is a compact single-stage object detector released in 2020 and trained with the Darknet framework. Although the model achieves relatively modest accuracy on the COCO dataset (19 mAP), we found it satisfactory for the task of detecting a license plate within a single-vehicle image. To train it, we used several license plate detection datasets along with many negative samples (images of vehicles that do not contain license plates).

Parameters | Compute (GMAC) | Input Resolution | Training Data | Validation Data | Accuracy
5.87M | 3.4 | 416x416x3 | 100k images (internal dataset) | 5k images (internal dataset) | 73.45 AP

Figure 5 – License plate detector examples for images-from-the-wild


License Plate Recognition
LPRNet is a CNN with variable-length sequence decoding driven by connectionist temporal classification (CTC) loss, trained with PyTorch. This network was trained mostly on auto-generated synthetic datasets of Israeli license plates, which makes it suitable for recognizing license plates containing digits only. To adapt LPRNet to other regions, we recommend using a mix of synthetic and real datasets that represent the license plates of that region, and changing the number of classes if needed (for example, to add a different alphabet). In the Hailo Model Zoo, we provide re-training instructions and a Jupyter notebook that shows how to generate the synthetic dataset that was used for training the LPRNet.
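At inference time, CTC output is typically decoded greedily: take the per-timestep argmax, collapse repeated symbols, then drop the blank token. A minimal sketch, where the digits-only alphabet and blank index are illustrative assumptions:

```python
DIGITS = "0123456789"
BLANK = len(DIGITS)  # index of the CTC blank token (assumed placement)

def ctc_greedy_decode(best_path):
    """Collapse repeats, then remove blanks: the standard CTC best-path rule."""
    chars = []
    prev = None
    for idx in best_path:
        if idx != prev and idx != BLANK:
            chars.append(DIGITS[idx])
        prev = idx
    return "".join(chars)

# Illustrative per-timestep argmax sequence over the LPRNet logits:
plate = ctc_greedy_decode([1, 1, BLANK, 1, 2, 2, BLANK, 3])  # -> "1123"
```

Note how the blank between the two 1s is what allows the decoder to emit the same digit twice in a row, which is exactly why CTC training inserts blanks.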

Parameters | Compute (GMAC) | Input Resolution | Training Data | Validation Data | Accuracy
7.14M | 18.29 | 75x300x3 | 4M images (internal dataset) | 5k images (internal dataset) | 99.96%*

*Percentage of license plates that are fully recognized (from the entire validation dataset)

Figure 6 – Synthetic dataset examples for the LPRNet training. We used a combination of real and synthetic license plates with different augmentations for training

Deploying ALPR using the Hailo TAPPAS

We have released the ALPR application sample as part of the Hailo TAPPAS. The ALPR sample application builds the pipeline using GStreamer in C++ and allows developers to run the application either from a video file or from a USB camera. Additional arguments let you control the application, including parameters for the detectors (for example, the detection threshold), the tracker (for example, the keep/lost frame counts), and the quality estimation (minimum license plate size and quality threshold).

The Hailo Model Zoo also allows you to re-train the NNs with your own data and port them to the ALPR TAPPAS application for fast domain adaptation and customization. The goal of the Hailo ALPR application is to give you a solid baseline for building your ALPR product by implementing the full ML application pipeline on Hailo-8™ and an embedded host processor, to be deployed at the edge.

What is GStreamer?

GStreamer is an open-source media framework for building powerful and complex media application pipelines. A GStreamer pipeline is constructed by connecting different GStreamer plugins together. Each plugin is responsible for certain functionality, and the combination of all of them creates the full pipeline. For example, a simple GStreamer pipeline to display a video file would include one plugin to read the file, a second to decode its format, and a third to display the decoded frames. Each plugin declares its inputs (sinks) and outputs (sources), and the framework assembles the full pipeline from these LEGO-like building blocks.
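The file-playback pipeline described above can be expressed as a pipeline description for the standard gst-launch-1.0 tool (the file name is a placeholder; element names are from stock GStreamer plugin sets):

```shell
# Read a file, let decodebin pick a decoder, convert, and display the frames.
gst-launch-1.0 filesrc location=video.mp4 ! decodebin ! videoconvert ! autovideosink
```

Each `!` links one plugin's source pad to the next plugin's sink pad, which is exactly the LEGO-like chaining the framework is built around.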

Figure 7 – A simple GStreamer pipeline

Hailo’s GStreamer Support

As part of the HailoRT (Hailo’s runtime library), we release a GStreamer plugin for AI inferencing on the Hailo-8™ AI processor (libgsthailo). This plugin takes care of the entire configuration and inference process on the device which makes the integration of the Hailo-8™ to GStreamer pipelines easy and straightforward. It also enables inference of a multi-network pipeline on a single Hailo-8™ to facilitate a full ML system.

In addition to the standard HailoRT plugin, the ALPR application also uses further GStreamer plugins that are released with the TAPPAS package – the Hailo GStreamer tools (libgsthailotools):

  • Tracking: this GStreamer plugin implements a Kalman Filter tracker and is responsible for tracking a general object in an image. It receives updates from each run of the detection network and associates objects across past frames to assign them a unique ID. The tracker can also predict the location of objects in unseen frames.
  • Quality estimation: this plugin is able to estimate the quality of an image by calculating the variance of its edges. It receives an input image and calculates its blurriness (score).
  • Crop & Resize: this plugin is able to generate different crops of an image by specific locations. It receives an image and a series of ROIs (Regions of Interest or boxes) and generates several images of fixed size.
  • Hailo Filter: a general plugin that allows you to embed C++ code into the pipeline. For example, postprocessing functionality.
  • Hailo overlay: to draw the final output of the application we use a specialized plugin that aggregates all the predictions, draws the bounding boxes and metadata and generates the final output frame.
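The "variance of its edges" blur measure used by the quality estimation plugin can be sketched as a variance-of-Laplacian score over a grayscale crop. This is a pure-Python, unoptimized illustration; the plugin's exact operator may differ:

```python
def laplacian_variance(img):
    """Blur score for a 2D grayscale image (list of rows); higher = sharper."""
    h, w = len(img), len(img[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # 4-neighbour Laplacian: strong response at edges, ~0 on flat regions.
            lap = img[y-1][x] + img[y+1][x] + img[y][x-1] + img[y][x+1] - 4 * img[y][x]
            responses.append(lap)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

flat = [[128] * 8 for _ in range(8)]                                  # flat: score 0
sharp = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]   # high contrast
```

A blurred plate crop produces weak Laplacian responses and hence a low variance, so a simple threshold on this score is enough to reject crops not worth sending to LPRNet.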


Performance
The following table summarizes the performance of the ALPR application on Hailo-8™ and i.MX8 with a USB camera at FHD input resolution (1920×1080), as well as a breakdown of the standalone NN performance.

Configuration | FPS | Latency | Accuracy
Full Application (x86) | 42 | – | 99.96%
Full Application (i.MX) | 20 | – | 99.96%
Standalone Vehicle Detection | 54 | 1.39 ms | 46.5 AP
Standalone License Plate Detection | 1386 | 44.37 ms | 73.45 AP
Standalone LPRNet | 303 | 5.81 ms | 99.96%

Figure 8 – Performance table of the ALPR application running on Hailo-8™


The Hailo ALPR application presents an end-to-end framework for deploying AI for intelligent transportation applications on the edge. It includes the entire application pipeline deployed in GStreamer with Hailo TAPPAS, and re-training capabilities for each neural network via the Hailo Model Zoo to enable customization. This application provides a solid baseline for building an ALPR product with Hailo-8™. For more information, check out the Hailo TAPPAS documentation.

This work is a collaboration by Tamir Tapuhi, Nadiv Dharan, Gilad Nahor, Rotem Bar, Itai Ofir, Yuval Belzer, and Yuval Bernstein.


References
Bochkovskiy, A., Wang, C.-Y., & Liao, H.-Y. M. (2020). YOLOv4: Optimal Speed and Accuracy of Object Detection. Tech Report.

Laroca, R., Zanlorensi, L. A., Gonçalves, G. R., Todt, E., Schwartz, W. R., & Menotti, D. (2019). An Efficient and Layout-Independent Automatic License Plate Recognition System Based on the YOLO detector. IET Intelligent Transport Systems.

Silva, S. M., & Jung, C. R. (2018). License Plate Detection and Recognition in Unconstrained Scenarios. ECCV.

Zherzdev, S., & Gruzdev, A. (2018). LPRNet: License Plate Recognition via Deep Neural Networks. arXiv.

Niv Vosco
Machine Learning Group Manager, Hailo



