Lightweight Object Detection: A Study Based on YOLOv7 Integrated with ShuffleNetv2 and Vision Transformer
Abstract
As mobile computing technology rapidly evolves, deploying efficient object detection algorithms on mobile devices emerges as a pivotal research area in computer vision. This study zeroes in on optimizing the YOLOv7 algorithm to boost its operational efficiency and speed on mobile platforms while ensuring high accuracy. Leveraging a synergy of advanced techniques such as Group Convolution, ShuffleNetV2, and Vision Transformer, this research has effectively minimized the model's parameter count and memory usage, streamlined the network architecture, and fortified the real-time object detection proficiency on resource-constrained devices. The experimental outcomes reveal that the refined YOLO model demonstrates exceptional performance, markedly enhancing processing velocity while sustaining superior detection accuracy.
Turn this paper into a full lesson
ArcXiv compiles a staged curriculum from this paper: 8-12 lessons across beginner → advanced, synthesised section guides, visuals, flashcards, a quiz, exercises, and on-demand deep dives per section. Grounded in the abstract, never invented.