ISBN (print): 9798350374292; 9798350374285
In today's security-conscious environment, effective real-time weapon detection is essential, especially in public spaces and sensitive areas. The primary objective of this research is to build a real-time weapon detection system using two deep learning techniques, VGG16 and Faster RCNN. The system aims to detect and categorize weapons accurately in photos and video data, issue prompt notifications, and improve security procedures. The project background covers the difficulties encountered, including data gathering, accuracy requirements, real-time processing, privacy, and deployment considerations. The system achieves high accuracy and real-time detection by building a comprehensive dataset manually and training the models on GPUs. Python is used as the programming language for its flexibility and simplicity in development, with OpenCV for image processing and Keras for the deep learning models. The Tkinter framework provides a graphical user interface (GUI) that supports user operations such as uploading datasets, generating models, processing images and videos, detecting weapons, and visualizing results. The methodology follows a systematic sequence of stages: data preprocessing, model building, training, testing, and result analysis. The combination of VGG16 and Faster RCNN offers a trade-off between speed and accuracy, with Faster RCNN performing better in real-time detection. The project objectives include compiling a dataset, training a model, evaluating accuracy, implementing real-time detection, and establishing alerting methods. The system's scope covers a wide range of surroundings, lighting conditions, and orientations, making it flexible and suitable for a variety of security applications. To sum
The use of AR technology in image-guided neurosurgery enables visualization of lesions that are concealed deep within the brain. Accurate AR registration is required to precisely match virtual lesions with anatomical structures displayed under a microscope. The purpose of this work was to develop a real-time augmented surgical navigation system using contactless line-structured light registration, microscope calibration, and visible optical tracking. A contactless, discrete, sparse line-structured light point cloud is used to perform patient-image registration. Microscope calibration optimization with a dimensional invariant calibrator is employed to enable real-time tracking of the microscope. The visible optical tracking integrates a 3D medical model with the surgical microscope video in real time, generating an augmented microscope stream. The proposed patient-image registration algorithm yielded an average root mean square error (RMSE) of 0.78 ± 0.14 mm. The pixel match ratio error (PMRE) of the microscope calibration was 0.646%. The RMSE and PMRE of the system experiments were 0.79 ± 0.10 mm and 3.30 ± 1.08%, respectively. Experimental evaluations confirmed the feasibility and efficiency of microscope AR surgical navigation (MASN) registration. Using this registration technology, MASN overlays virtual lesions onto the microscopic view of the real lesions in real time, which can help surgeons localize lesions hidden deep in tissue.
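The RMSE figures quoted above summarize how far registered points land from their reference positions. The following is a minimal sketch of that metric, assuming paired 3D points in millimetres; the point values are hypothetical and are not data from the paper, and the function name `rmse` is chosen here for illustration.

```python
import math


def rmse(registered, reference):
    """Root mean square of the Euclidean distances between paired 3D points."""
    if len(registered) != len(reference) or not registered:
        raise ValueError("point lists must be non-empty and of equal length")
    squared_distances = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in zip(registered, reference)
    ]
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Hypothetical paired points in millimetres (illustrative only).
registered = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
reference = [(0.6, 0.0, 0.0), (10.0, 0.8, 0.0), (0.0, 10.0, 1.0)]
error_mm = rmse(registered, reference)
```

A reported value such as 0.78 ± 0.14 mm would then be the mean and standard deviation of this quantity over repeated registration trials.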
ISBN (print): 9798350349405; 9798350349399
In this paper, we introduce a novel unsupervised video denoising deep learning approach that can help to mitigate data scarcity issues and show robustness against different noise patterns, enhancing its broad applicability. Our method comprises three modules: a Feature generator creating feature maps, a Denoise-Net generating denoised but slightly blurry reference frames, and a Refine-Net re-introducing high-frequency details. By leveraging the coordinate-based network, we can greatly simplify the network structure while preserving high-frequency details in the denoised video frames. Extensive experiments on both simulated and real-captured videos demonstrate that our method can effectively denoise real-world calcium imaging video sequences without prior knowledge of noise models and data augmentation during training.
Vehicle detection in high-resolution remote sensing imagery faces challenges such as varying scales, complex backgrounds, and high intra-class variability. We propose an enhanced YOLOv8 framework, incorporating three key advancements: the Adaptive Feature Pyramid Network (AFPN), Omni-Dimensional Convolution (ODConv), and a Slim Neck with Generalized Shuffle Convolution (GSConv). These enhancements improve vehicle detection accuracy, computational efficiency, and visual AI capabilities for applications such as computer animation and virtual worlds. Our model achieves a Mean Average Precision (mAP) of 0.7153, representing a 4.99% improvement over the baseline YOLOv8. Precision and recall increase to 0.9233 and 0.9329, respectively, while box loss is reduced from 1.213 to 1.054. This framework supports real-time surveillance, traffic monitoring, and urban planning. The NEPU-OWOD V2.0 dataset, used for evaluation, includes high-resolution images from multiple regions and seasons, along with diverse annotations and augmentations. Our modular approach allows for separate assessments of each enhancement. The dataset and source code are available for future research and development at (https://***/10.5281/zenodo.13075939).
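The precision, recall, and box-loss figures above can be combined into two derived quantities that the abstract does not state: the F1 score and the relative reduction in box loss. The sketch below only does that arithmetic on the reported numbers; the helper names are illustrative, and treating the loss change as a relative reduction is an assumption about how one might summarize it.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def relative_reduction(before, after):
    """Fraction by which a quantity dropped relative to its starting value."""
    return (before - after) / before


# Values taken from the abstract above.
f1 = f1_score(0.9233, 0.9329)
box_loss_drop = relative_reduction(1.213, 1.054)
```

With the reported values, the F1 score comes out at roughly 0.928 and the box loss falls by roughly 13.1%.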
ISBN (print): 9798350362077
Efficient video transmission enables a wide range of applications in underwater environments, such as seabed survey, subsea equipment maintenance, oil pipe and bridge inspection, and marine life sample collection. At present, real-time underwater video transmission over underwater acoustic communication is commonly believed to be challenging because of complex underwater environments and the limitations of underwater acoustic communication. In this paper, we propose an adaptive real-time underwater video transmission system using underwater communication. The system consists of three modules, i.e., a video pre-processing module, a video transmission module, and a video post-processing module. In the first two modules, the sender adaptively adjusts the compression bitrate and transmission rate according to the video quality and channel conditions. The third module applies a deep learning-based video reconstruction algorithm to recover underwater image information. The efficacy of the system is verified with real underwater videos collected in several sea areas. The results show that the proposed system can transmit video successfully and efficiently in the underwater environment.
Multiple object tracking (MOT) in a video sequence is performed by detecting and distinguishing the objects that appear in the sequence. Robust multi-object tracking is a difficult problem in computer vision. Visual tracking of multiple objects is a vital part of an autonomous vehicle's vision technology. Wide-area video surveillance increasingly uses advanced imaging devices with higher megapixel resolutions and frame rates. As a result, demand is growing rapidly for high-performance computing systems that can process high-resolution surveillance video in real time. In this paper, we use a single-stage framework to solve the MOT problem. We propose a novel architecture that makes efficient use of one or multiple GPUs to process Full High Definition video in real time. For high-resolution video and images, the suggested approach is real-time multi-object detection based on Enhanced Yolov5-7S on Multi-GPU Vertex. We add one more layer at the top of the backbone to increase the resolution of the extracted feature maps, which helps detect small objects and increases the accuracy of the model. In terms of speed and accuracy, our proposed approach outperforms the state-of-the-art techniques.
ISBN (print): 9781728198354
Remote control vehicles require the transmission of large amounts of data, and video is one of the most important information sources for the driver. To ensure reliable video transmission, the encoded video stream is transmitted simultaneously over multiple channels. However, this solution incurs a high transmission cost. To address this issue, more efficient video encoding methods are needed that make the video stream robust to noise. Moreover, the method should have low complexity to meet the real-time requirement. In this paper, we propose a low-complexity, low-latency 2-channel Multiple Description Coding (MDC) solution with an adaptive Instantaneous Decoder Refresh (IDR) frame period, which is compatible with the HEVC standard and provides adaptive redundancy adjustment. This method shows better resistance to high packet loss rates with lower complexity.
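To illustrate the general idea behind 2-channel MDC, the sketch below uses temporal odd/even frame splitting, a textbook MDC scheme. It is not the HEVC-compatible method proposed in the abstract, and it omits IDR period adaptation and redundancy adjustment. The function names and the frame-repetition concealment are assumptions made for this example.

```python
def split_descriptions(frames):
    """Split a frame sequence into two descriptions: even-indexed and odd-indexed frames."""
    return frames[0::2], frames[1::2]


def reconstruct(even, odd, total):
    """Rebuild the sequence from whichever descriptions arrived.

    Pass None for a description that was lost. Missing frames are concealed
    by repeating the nearest earlier received frame (or the first received
    frame when nothing earlier exists).
    """
    out = [None] * total
    if even is not None:
        out[0::2] = even
    if odd is not None:
        out[1::2] = odd
    last = next((f for f in out if f is not None), None)
    for i, frame in enumerate(out):
        if frame is None:
            out[i] = last
        else:
            last = frame
    return out


frames = ["A", "B", "C", "D", "E"]  # stand-ins for encoded frames
even, odd = split_descriptions(frames)
both_received = reconstruct(even, odd, len(frames))
odd_channel_lost = reconstruct(even, None, len(frames))
```

When both channels arrive, the sequence is restored exactly; when one channel is lost, the video still plays at half the temporal resolution, which is the property that makes MDC attractive on lossy links.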
Edge-aware image smoothing refers to the removal of fine details while preserving edges. It is an essential topic in image processing and computer graphics. In this paper, to achieve better edge preservation than existing models, we propose a robust edge-preserving image filtering method based on a complementary weighting scheme. Both isotropic and anisotropic weights are used in our model to adapt the fidelity and regularization terms. To solve the proposed model efficiently, we introduce an algorithm based on additive half-quadratic minimization, the alternating direction method of multipliers, and Fourier domain optimization strategies. We experimentally validate the proposed filter on several low-level vision tasks. Both quantitative and qualitative results show that the proposed filter significantly outperforms existing techniques. Furthermore, the filter is highly efficient and processes 720P color images in real time (over 10 fps) on an NVIDIA RTX 3070. It is therefore practical for real-world applications.
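As a concrete picture of what edge-preserving smoothing does, the sketch below applies a 1D bilateral filter, a classic edge-aware filter swapped in here purely for illustration. It is not the complementary weighting model or the optimization scheme described in the abstract, and the parameter values are arbitrary assumptions.

```python
import math


def bilateral_1d(signal, radius=2, sigma_s=1.5, sigma_r=10.0):
    """Smooth a 1D signal while preserving large jumps (edges).

    Each output sample is a weighted average of its neighbours. A weight
    falls off with spatial distance (sigma_s) and with intensity
    difference (sigma_r), so samples across an edge contribute almost nothing.
    """
    out = []
    for i, center in enumerate(signal):
        num = 0.0
        den = 0.0
        lo = max(0, i - radius)
        hi = min(len(signal), i + radius + 1)
        for j in range(lo, hi):
            spatial = (i - j) ** 2 / (2 * sigma_s ** 2)
            intensity = (signal[j] - center) ** 2 / (2 * sigma_r ** 2)
            weight = math.exp(-spatial - intensity)
            num += weight * signal[j]
            den += weight
        out.append(num / den)
    return out


# A noisy step edge: small fluctuations around 10, then around 100.
noisy_step = [10, 12, 9, 11, 10, 100, 102, 99, 101, 100]
smoothed = bilateral_1d(noisy_step)
```

The small fluctuations on each side of the step are averaged out, while the jump between the two plateaus stays sharp because samples across the edge receive near-zero weight.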
The mixed reality conference system proposed in this paper is a robust, real-time videoconference application software that makes up for the simple interaction and lack of immersion and realism of traditional videoconference, which realizes the entire process of holographic videoconference from client to cloud to the *** paper mainly focuses on designing and implementing a videoconference system based on AI segmentation technology and mixed *** mixed reality conference system components are discussed, including data collection, data transmission, processing, and mixed reality *** data layer is mainly used for data collection, integration, and video and audio *** network layer uses Web-RTC to realize peer-to-peer data *** data processing layer is the core part of the system, mainly for human video matting and human-computer interaction, which is the key to realizing mixed reality conferences and improving the interactive *** presentation layer explicitly includes the login interface of the mixed reality conference system, the presentation of real-time matting of human subjects, and the presentation *** the mixed reality conference system, conference participants in different places can see each other in real time in their mixed reality scene and share presentation content and 3D models based on mixed reality technology to have a more interactive and immersive experience.
Current traffic status detection methods heavily rely on historical traffic flow data and vehicle counts. However, these methods often fail to meet the stringent real-time requirements of state detection, especially on edge devices with limited computing *** address these challenges, this study develops a traffic alert model using temporal video frame analysis and grayscale aggregation quantization techniques. Initially, the model uses distance mapping between pixel features and frames of road traffic videos to construct a comprehensive road environment and vehicle segmentation model. The model also establishes a mapping between pixel equidistant lines and actual distances, enabling precise congestion detection. This approach significantly reduces costs associated with traditional traffic detection methods as it does not rely on historical data. Performance evaluation using fixed-point road monitoring data indicates that the proposed model outperforms traditional traffic state detection models, with a performance improvement of approximately 4.7% to 9.5%. Additionally, the model improves computing resource efficiency by approximately 72.5% and demonstrates substantial real-time detection capabilities.
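The mapping between pixel equidistant lines and actual distances mentioned above can be pictured as a calibration table that converts an image row into a road distance. The sketch below assumes piecewise-linear interpolation between calibrated rows; the calibration values, the function name `row_to_distance`, and the interpolation choice are all hypothetical and are not taken from the paper.

```python
from bisect import bisect_right


def row_to_distance(row, calibration):
    """Map an image row to a road distance in metres.

    `calibration` is a list of (pixel_row, metres) pairs sorted by pixel_row,
    one pair per equidistant line marked in the camera view. Rows between
    two lines are interpolated linearly.
    """
    rows = [r for r, _ in calibration]
    if not rows[0] <= row <= rows[-1]:
        raise ValueError("row lies outside the calibrated range")
    k = min(bisect_right(rows, row), len(rows) - 1)
    (r0, d0), (r1, d1) = calibration[k - 1], calibration[k]
    t = (row - r0) / (r1 - r0)
    return d0 + t * (d1 - d0)


# Hypothetical calibration: rows near the top of the frame are farther away.
calibration = [(200, 40.0), (300, 20.0), (500, 10.0), (700, 0.0)]
distance_m = row_to_distance(400, calibration)
```

With such a table, the pixel extent of a queue of segmented vehicles can be converted into metres of occupied road, which is one way a congestion threshold could be expressed in physical units.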