ISBN: (print) 9798350377040; 9798350377033
We propose a computer vision framework that integrates multi-scale signal processing with an enhanced ConvNeXt-YOLO architecture for robust object detection. The framework addresses three challenges in visual recognition: multi-scale feature representation, signal quality enhancement, and model generalization. For image preprocessing, it implements a two-stage signal processing pipeline. First, an adaptive resolution normalization algorithm maintains consistent feature quality across varying input dimensions. Second, a context-aware Gaussian filtering mechanism improves the signal-to-noise ratio while preserving essential feature characteristics. Together, these preprocessing steps improve the framework's ability to extract discriminative features and maintain computational stability. To optimize the learning process, we introduce a systematic data augmentation strategy that incorporates both geometric and signal-level transformations. Our approach combines predetermined rotation sampling (90°, 180°, 270°) with continuous-space ROI augmentation during inference. This hybrid strategy gives the framework rotation invariance and better generalization, which is particularly beneficial in complex object detection scenarios. The core innovation is the architectural integration of ConvNeXt with YOLO. We redesign the feature extraction backbone using hierarchical ConvNeXt blocks, enabling efficient multi-scale feature learning. The cross-branch information fusion mechanism, coupled with our signal-aware design, substantially improves the model's representational capacity. Experimental results on standard computer vision benchmarks demonstrate superior performance, achieving state-of-the-art accuracy (improvement of X%) and recall rates (improvement of Y%) compared to conventional approaches.
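The predetermined rotation sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `rotate_image_and_boxes` and `fixed_rotation_views`, the `[x_min, y_min, x_max, y_max]` pixel box format, and the use of NumPy are all assumptions made for illustration. It only shows how an image and its bounding boxes might be rotated together by 90°, 180°, and 270°.

```python
import numpy as np


def rotate_image_and_boxes(image, boxes, k):
    """Rotate an image by k * 90 degrees counter-clockwise and remap its boxes.

    Hypothetical helper: `boxes` is assumed to be an (N, 4) array of
    [x_min, y_min, x_max, y_max] in pixel coordinates (x = column, y = row).
    """
    h, w = image.shape[:2]
    rotated = np.rot90(image, k=k, axes=(0, 1))
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    x0, y0, x1, y1 = boxes.T

    k = k % 4
    if k == 0:
        new = [x0, y0, x1, y1]
    elif k == 1:
        # 90 degrees CCW maps a point (x, y) to (y, w - x).
        new = [y0, w - x1, y1, w - x0]
    elif k == 2:
        # 180 degrees maps a point (x, y) to (w - x, h - y).
        new = [w - x1, h - y1, w - x0, h - y0]
    else:
        # 270 degrees CCW maps a point (x, y) to (h - y, x).
        new = [h - y1, x0, h - y0, x1]
    return rotated, np.stack(new, axis=1)


def fixed_rotation_views(image, boxes):
    """Return (angle, image, boxes) for the three fixed rotations."""
    views = []
    for k in (1, 2, 3):
        rotated, new_boxes = rotate_image_and_boxes(image, boxes, k)
        views.append((90 * k, rotated, new_boxes))
    return views
```

Under these assumptions, `fixed_rotation_views(image, boxes)` yields three additional training views per sample. The continuous-space ROI augmentation mentioned in the abstract is not covered by this sketch.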