YOLOv7 and YOLOv5 Comparison on Embedded Devices and Computer Systems
Computer vision is rapidly evolving across industries, covering agriculture, retail, security, and many other use cases. This article discusses benchmarks of computer vision's latest state-of-the-art (SOTA) object detection algorithms, YOLOv5 and YOLOv7, on CPU/GPU systems and embedded devices.
So, let's start. The topics this article covers are listed below.
- Accuracy Comparison
- Speed Comparison on CPU and GPU Systems
- Speed on Embedded Devices
- General Comparison
- Discussion and Conclusion
Accuracy Comparison
Undoubtedly, every new YOLO variant that reaches the market provides better accuracy than its predecessors. At the same time, this does not mean that every new variant overcomes all the drawbacks of previous YOLO variants; some of those drawbacks can persist in new variants too.
YOLOv7 is better than YOLOv5 in terms of accuracy. The mAP (mean average precision) of YOLOv5 on the COCO dataset is 55.0%, while YOLOv7 reaches 56.8%. The research reports that this gain comes with roughly 35-40% fewer parameters and about half the computation, for both standard and embedded model variants.
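If you want to reproduce COCO mAP numbers like these yourself, the YOLOv5 repository ships a validation script. The sketch below is a minimal example under some assumptions: the ultralytics/yolov5 repository is cloned into a yolov5/ folder, its requirements are installed, the COCO val2017 split is set up as described in its data/coco.yaml, and yolov5s.pt simply stands in for whichever checkpoint you want to compare. YOLOv7 ships its own evaluation script in its repository.

```python
import subprocess

# Evaluate a pretrained checkpoint on COCO val2017 and report mAP.
# Assumes the ultralytics/yolov5 repo is cloned into "yolov5/" and
# COCO val2017 is downloaded as laid out in its data/coco.yaml.
subprocess.run(
    [
        "python", "val.py",
        "--weights", "yolov5s.pt",  # swap in the checkpoint you want to compare
        "--data", "coco.yaml",      # COCO val2017, as configured in the repo
        "--img", "640",             # evaluation image size
    ],
    cwd="yolov5",  # path to the cloned repository (assumption)
    check=True,
)
```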
Speed Comparison on CPU and GPU Systems
When YOLO variants are used in commercial use cases, the most pressing question is the algorithm's speed: if an algorithm provides good accuracy but not an acceptable speed, it cannot be used in production and remains limited to research. YOLO's latest variants provide both good speed and good accuracy and can easily be used for different use cases.
We at Cameralyze tested different YOLO variants on a standard GPU system with the specifications listed below.
- 11th-generation computer system
- Nvidia Quadro P2200 GPU (5 GB memory)
- 8 cores, 16 logical processors
- Keyboard and mouse
The results are shown in the video below.
The above results were obtained with models pre-trained on the COCO dataset. As you can see, YOLOv5 is faster than YOLOv7 on CPUs and mid-range GPUs (e.g., Quadro P2200, Nvidia GTX 1650). In general, however, YOLOv7 is faster than YOLOv5 on high-end GPUs such as the Tesla A100 and Nvidia RTX 3090.
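For reference, a rough version of this kind of measurement can be written in a few lines of Python. This is only a sketch, not our exact benchmarking harness: it assumes YOLOv5 is loaded from PyTorch Hub, bus.jpg is a placeholder for any local test image, and the warm-up and iteration counts are arbitrary choices. PyTorch Hub picks the GPU automatically when one is available and otherwise falls back to the CPU.

```python
import time

import torch
from PIL import Image

# Load a pretrained YOLOv5 checkpoint from PyTorch Hub (downloads on first use).
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

img = Image.open("bus.jpg")  # any local test image (placeholder name)

# Warm-up: exclude one-time costs such as CUDA initialization.
for _ in range(10):
    model(img)

# Timed loop; 100 iterations is an arbitrary choice.
n = 100
start = time.time()
for _ in range(n):
    model(img)
elapsed = time.time() - start
print(f"{n / elapsed:.1f} FPS  ({1000 * elapsed / n:.1f} ms per frame)")
```

Running the same loop with a YOLOv7 checkpoint from its own repository gives a directly comparable number on the same machine.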
Speed on Embedded Devices
Embedded devices (e.g., Jetson Nano Developer Kit, Jetson Xavier NX) have gained great popularity in the last few years. The reasons include cost-effectiveness and lower resource consumption. Clients now prefer computer vision solutions deployed on embedded devices rather than on a full computer system.
When the Jetson Nano Developer Kit was first launched, there was no support for running the YOLO series in embedded use cases. In the last few years, however, the TensorRT engine developed by Nvidia has solved this issue: TensorRT provides the functionality to run SOTA (state-of-the-art) algorithms on edge devices (e.g., Jetson Nano Developer Kit, Jetson Xavier NX).
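One common route to a TensorRT engine, sketched below under some assumptions, is the export script that ships with the YOLOv5 repository. The sketch assumes the ultralytics/yolov5 repository is cloned into a yolov5/ folder on the Jetson itself (TensorRT engines are built for the specific GPU they will run on) and that JetPack already provides TensorRT and a CUDA-enabled PyTorch build; the yolov5n.pt checkpoint is just an example.

```python
import subprocess

# Export the YOLOv5-nano checkpoint to a TensorRT engine on the device itself,
# since TensorRT engines are tied to the GPU they are built on.
subprocess.run(
    [
        "python", "export.py",
        "--weights", "yolov5n.pt",  # nano checkpoint (example choice)
        "--include", "engine",      # ask for a TensorRT .engine file
        "--device", "0",            # TensorRT export needs the GPU
        "--half",                   # FP16 is the usual choice on Jetson-class hardware
    ],
    cwd="yolov5",  # path to the cloned ultralytics/yolov5 repo (assumption)
    check=True,
)

# The resulting yolov5n.engine can then be used with the repo's detect.py/val.py
# scripts or loaded directly through the TensorRT runtime.
```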
We at Cameralyze tested different YOLO variants on a Jetson Nano Developer Kit. The device used for testing had the following specifications:
- 4 GB RAM (memory shared between the CPU and GPU cores)
- Monitor for display (connected to the Jetson Nano through an HDMI-to-VGA adapter)
- Keyboard and mouse
- External SD card or SSD for storage (the Jetson Nano does not ship with a storage device)
The results are shown in the video below.
After performing these experiments, we observed that YOLOv5-nano is faster than YOLOv7-tiny on an embedded device (Jetson Nano Developer Kit).
General Comparison
Note: This comparison was published on 06 December 2022. These benchmarks may change in the future if major changes are made to the official YOLOv5 and YOLOv7 code.
Discussion and Conclusion
Many factors can raise or lower the benchmarks discussed above and, more generally, the performance of SOTA object detection models such as YOLOv5 and YOLOv7. The major factors include, but are not limited to, the following:
Model input size: The input image size is a major factor for both speed and accuracy. Reducing the model input size increases speed but decreases accuracy, while increasing the input size decreases speed but increases accuracy (a short sketch after these factors illustrates the trade-off).
Camera position: The camera position matters most when considering accuracy. If you have trained a model to detect people from a perspective view, its accuracy will definitely drop on a top-down (overhead) camera view.
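To make the input-size trade-off concrete, here is a rough timing sketch, assuming YOLOv5 loaded from PyTorch Hub; the chosen sizes, the placeholder test image bus.jpg, and the single-image timing are illustrative choices, not our benchmark setup.

```python
import time

import torch
from PIL import Image

# Pretrained YOLOv5 model from PyTorch Hub; runs on GPU if one is available.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
img = Image.open("bus.jpg")  # any local test image (placeholder name)

for size in (320, 640, 1280):
    model(img, size=size)  # warm-up at this input resolution
    start = time.time()
    results = model(img, size=size)
    ms = (time.time() - start) * 1000
    # Larger inputs are slower but usually recover more (small) objects.
    print(f"size={size}: {ms:.1f} ms, {len(results.xyxy[0])} detections")
```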
That is all for this comparison of the state-of-the-art (SOTA) algorithms YOLOv5 and YOLOv7.