NebulOuS’ Fresh Food Supply use case has seen significant progress in recent months. During August and September, the solution was scaled up from a testing scenario to a full network of cameras deployed at every intersection in Mercabarna’s congested zone.
These cameras have been placed at the road intersections (two per intersection), on the roofs of the adjacent buildings, to capture as much information as possible, including license plates. The cameras record traffic in both directions and stream the video over the internal network via RTSP.
As the hardware deployment moves forward, we have also been enhancing the algorithms that process the video stream and convert it into valuable traffic information for Mercabarna’s roads. The detection system consists of two main components.
The Vehicle Detection and Cropping (VDC) component detects objects in motion and their direction using low computational resources. VDC runs on fog devices, and its pipeline is based on traditional computer vision techniques. The use of lightweight techniques ensures that VDC can run without specialized processing units such as GPUs. At this stage, the direction of travel is also computed before the crops are sent to the next component. Simulations have been conducted to assess the performance and efficiency of this component more accurately, as shown in the following figure.
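To make the VDC idea concrete, here is a minimal sketch of lightweight motion detection, cropping, and direction estimation using only NumPy. This is not the project’s actual implementation: the function names, the frame-differencing-against-background approach, and the thresholds are all illustrative assumptions, chosen to show how such a step can run without a GPU.

```python
import numpy as np

def detect_motion(background, frame, threshold=25):
    """Binary mask of pixels that differ from the static background.

    Illustrative frame differencing; assumes grayscale uint8 frames.
    """
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return diff > threshold

def crop_motion(frame, mask):
    """Crop the bounding box around the moving region (None if no motion)."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    return frame[ys.min():ys.max() + 1, xs.min():xs.max() + 1]

def motion_direction(mask_t0, mask_t1):
    """Estimate horizontal travel direction from the centroid shift
    of the motion mask between two time steps."""
    def centroid_x(mask):
        xs = np.nonzero(mask)[1]
        return xs.mean() if xs.size else None
    c0, c1 = centroid_x(mask_t0), centroid_x(mask_t1)
    if c0 is None or c1 is None:
        return "none"
    if c1 > c0:
        return "right"
    if c1 < c0:
        return "left"
    return "stationary"
```

In a real deployment the background model would be updated over time (e.g. a running average) rather than kept static, but the principle, cheap per-pixel arithmetic followed by a crop that is forwarded to the heavier LPFE stage, is the same.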
The License Plate and Feature Extraction (LPFE) component uses deep learning models for detection and feature extraction to determine whether the object in motion is a vehicle and to extract data that can link the vehicle to another place or time. LPFE chains four models to identify and extract valuable data from the crops. First, a YOLO model identifies and classifies the type of vehicle. Then, a Vision Transformer (ViT) embedding model links the results across crops when needed; the vector the ViT computes for each crop is also stored as a feature of the vehicle. Finally, a second YOLO model detects possible license plates in the crops and, if one is found, an Optical Character Recognition (OCR) model reads the plate (which is then anonymized).
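The four-stage LPFE flow can be sketched as a simple orchestration function. This is only an illustration of the control flow described above: the model callables (`detect_vehicle`, `embed`, `detect_plate`, `ocr`, `anonymize`) are hypothetical stand-ins injected as parameters, not the actual YOLO/ViT/OCR models used in the project.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VehicleRecord:
    """Data extracted from one crop: class, embedding, anonymized plate."""
    vehicle_type: str
    embedding: list
    plate_text: Optional[str] = None

def process_crop(crop, detect_vehicle, embed, detect_plate, ocr, anonymize):
    """Run the four LPFE stages on one crop from the VDC component.

    All five callables are injected; in the real pipeline they would wrap
    the YOLO, ViT, and OCR models.
    """
    vehicle_type = detect_vehicle(crop)   # stage 1: YOLO vehicle classification
    if vehicle_type is None:              # object in motion is not a vehicle
        return None
    embedding = embed(crop)               # stage 2: ViT feature vector
    plate_text = None
    plate_region = detect_plate(crop)     # stage 3: YOLO plate detection
    if plate_region is not None:
        plate_text = anonymize(ocr(plate_region))  # stage 4: OCR + anonymization
    return VehicleRecord(vehicle_type, embedding, plate_text)
```

Separating the stages this way keeps the expensive plate OCR conditional: it only runs when a plate is actually detected, and crops that are not vehicles are discarded after the first stage.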
The results, gathered by running multiple tests on Mercabarna’s roads, show a substantial improvement in identification and classification after applying these modifications to the LPFE workflow, as illustrated in the following confusion matrix.
While we have achieved strong results in identifying forklifts and motorcycles, there is still room for improvement in the YOLO vehicle detection model: cars are sometimes misclassified due to occlusions, and trucks and vans are hard to distinguish because of their similarity.
NebulOuS’ Fresh Food Supply solution has significantly improved vehicle detection and classification in recent months. Future work will focus on integrating multi-camera data fusion for enhanced tracking, and additional feature extraction models should reduce the remaining accuracy issues. These advances will make the system more robust, supporting more effective traffic management across Mercabarna’s roads.