Crowd counting, the task of estimating the total number of people in an image, is essential for intelligent surveillance. Integrating a well-trained crowd counting network into edge devices, such as intelligent CCTV systems, enables applications across various domains, including the prevention of crowd collapses and urban planning. To be embedded in edge devices, a model needs robust performance, a small parameter count, and fast response times. This study proposes a lightweight yet powerful model called TinyCount, which has only 60k parameters. TinyCount is a fully convolutional network consisting of a feature extract module (FEM) for robust and rapid feature extraction, a scale perception module (SPM) for perceiving scale variation, and an upsampling module (UM) that restores the feature map to the size of the original image. TinyCount demonstrated competitive performance on three representative crowd counting datasets while using approximately 3.33 to 271 times fewer parameters than other crowd counting approaches. The model achieved relatively fast inference times by leveraging the MobileNetV2 architecture with dilated and transposed convolutions. The application of an SE block, together with findings from existing studies, further demonstrated its effectiveness. Finally, we evaluated TinyCount on multiple edge devices, including the Raspberry Pi 4, NVIDIA Jetson Nano, and NVIDIA Jetson AGX Xavier, to demonstrate its potential for practical applications.
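To put the parameter ratios above in absolute terms, the following is a minimal arithmetic sketch. It assumes the "60k" figure is exactly 60,000 and that the quoted ratios are plain multiples of it; the function name `implied_params` is hypothetical and not taken from the paper.

```python
# Assumption: "only 60k parameters" is taken as exactly 60,000.
TINYCOUNT_PARAMS = 60_000


def implied_params(ratio, base=TINYCOUNT_PARAMS):
    """Parameter count of a model that is `ratio` times larger than `base`."""
    return round(base * ratio)


# The abstract quotes "approximately 3.33 to 271 times fewer parameters".
smallest_compared = implied_params(3.33)
largest_compared = implied_params(271)
print(smallest_compared, largest_compared)
```

Under these assumptions, the compared models would range from roughly 0.2 million to roughly 16 million parameters.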
Defect anomaly detection is beneficial in the production cycle of various industries. It is widely used in areas such as the metal surface and fabric industries. This paper focuses on deep-learning-driven defect detection models using energy-efficient computing. We concentrate on a segmentation-based defect detection model for metal surface anomaly detection, and on a deconvolution-based defect detection model for fabric defects. We propose a depth-wise convolution structure for the segmentation-based visual defect detection model. In addition, we apply the optimizations supported by the inference engine to both models. Inference with the segmentation-based defect detection model is approximately 10 times faster than with the original. Furthermore, the fabric defect detection model meets the real-time requirement on a lightweight vision processing unit (VPU) device with a power consumption of only 1.5 watts. The practical value of this work is multifaceted, offering substantial benefits in terms of cost reduction, product quality, real-time processing, energy efficiency, and scalability. These advancements not only improve operational efficiency but also contribute to sustainability efforts and provide a competitive advantage in the industry.
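The abstract does not specify the proposed depth-wise structure, so the sketch below only illustrates the general reason depth-wise designs are lighter: it compares the weight count of a standard convolution with that of a depth-wise separable convolution (a depth-wise filter per input channel followed by a 1x1 point-wise convolution). The layer sizes are hypothetical and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (no bias)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depth-wise conv plus a 1x1 point-wise conv (no bias)."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 64 input channels, 128 output channels.
std = standard_conv_params(3, 64, 128)
dws = depthwise_separable_params(3, 64, 128)
print(std, dws, round(std / dws, 2))
```

For this hypothetical layer the separable variant needs about 8.4 times fewer weights; the saving reported in the paper may differ because it depends on the actual architecture and the inference-engine optimizations.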
Medical image fusion is a crucial element of image-based health care diagnostics and therapies, and of generic computer vision applications. However, the majority of existing methods suffer from noise distortion that affects the overall output. When images are distorted by noise, classical fusion techniques perform badly. Hence, fusion techniques that comprehensively preserve information from multiple faulty images need to be created. This work presents Enhanced Lion Swarm Optimization (ELSO) with Ensemble Deep Learning (EDL) to address these issues. The primary steps in this study include image fusion, segmentation, noise reduction, feature extraction, image classification, and feature selection. Adaptive Median Filters are first used for noise removal to enhance image quality. The MRI and CT images are then segmented using the Region Growing-based k-Means Clustering (RKMC) algorithm to separate the images into their component regions or objects. The images are divided into black and white images; in the white image, the RKMC algorithm successfully considers the prior tumour probability. The next step is feature extraction, which uses Modified Principal Component Analysis (MPCA) to draw out the most informative aspects of the images. The ELSO algorithm is then applied for optimal feature selection, which is determined by the best fitness values. After that, multi-view image fusion of the multi-modal images derives lower-, middle-, and higher-level image contents. This is done using a deep Convolutional Neural Network (DCNN) and the Tissue-Aware Conditional Generative Adversarial Network (TAcGAN) algorithm, which fuses the multi-view features and relevant image features and is suitable for real-time applications. The ELSO+EDL algorithm gives better results in terms of accuracy and Peak Signal-to-Noise Ratio (PSNR), with lower Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE).
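The three error metrics named above have standard definitions, sketched here in plain Python for flat lists of pixel intensities. The peak value of 255 (8-bit images) and the sample values are assumptions for illustration; the abstract does not say how the authors computed them.

```python
import math


def rmse(reference, estimate):
    """Root mean square error between two equally sized sequences."""
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    error = rmse(reference, estimate)
    if error == 0:
        return math.inf
    return 20 * math.log10(peak / error)


def mape(reference, estimate):
    """Mean absolute percentage error; reference values must be non-zero."""
    ratios = [abs(r - e) / abs(r) for r, e in zip(reference, estimate)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical pixel values, for illustration only.
ref = [100, 200, 50, 150]
est = [110, 190, 50, 150]
print(rmse(ref, est), psnr(ref, est), mape(ref, est))
```

A higher PSNR and lower RMSE and MAPE indicate that the fused image is closer to the reference, which is the direction of improvement the abstract reports.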
Authors: Baek, Ji-Won; Chung, Kyungyong
Kyonggi Univ, Dept Comp Sci, 154-42 Gwanggyosan Ro, Suwon 16227, Gyeonggi Do, South Korea
Kyonggi Univ, Div AI Comp Sci & Engn, 154-42 Gwanggyosan Ro, Suwon 16227, Gyeonggi Do, South Korea
Recently, image analysis research has been actively conducted due to the accumulation of big image data and the development of deep learning. Image data differs from other data in characteristics such as data size, real-time requirements, diversity of image quality, structural complexity, and security issues. In addition, a large amount of data is required to analyze images effectively with deep-learning models. However, in many fields the data that can be collected is limited, so there is a need for meta-learning-based image analysis technology that can train models effectively with a small amount of data. This paper presents a comprehensive survey of meta-learning-based object-tracking techniques. It explores object tracking methods and research that can achieve high performance in data-limited situations, including key challenges and future directions. It provides useful information for researchers in the field and insights into future research directions.
Real-time object detection is significant for industrial and research fields. On edge devices, a giant model struggles to meet the real-time detection requirement, while a lightweight model built from a large number of depth-wise separable convolutions cannot achieve sufficient accuracy. We introduce a new lightweight convolutional technique, GSConv, to lighten the model while maintaining accuracy. GSConv achieves an excellent trade-off between accuracy and speed. Furthermore, we provide a design suggestion based on GSConv, the slim-neck (SNs), to achieve higher computational cost-effectiveness in real-time detectors. The effectiveness of the SNs was robustly demonstrated in over twenty sets of comparative experiments. In particular, the real-time detectors improved by the SNs obtain state-of-the-art results (70.9% AP50 on SODA10M at a speed of ~100 FPS on a Tesla T4) compared with the baselines. Code is available at https://***/alanli1997/slim-neck-by-gsconv.
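The abstract does not describe GSConv's internals. As an assumption based on the general idea of mixing standard and depth-wise convolution outputs, the sketch below shows only a generic ShuffleNet-style channel shuffle, the kind of step that interleaves channels from two branches after concatenation. The channel labels and the function name are hypothetical, and the linked repository may order channels differently.

```python
def channel_shuffle(channels, groups=2):
    """Interleave channels drawn from `groups` equally sized groups.

    Equivalent to viewing the list as (groups, per_group), transposing,
    and flattening, as in a ShuffleNet-style channel shuffle.
    """
    if len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


# Hypothetical labels: "s*" from a standard conv branch,
# "d*" from a depth-wise conv branch, concatenated in that order.
concatenated = ["s0", "s1", "d0", "d1"]
shuffled = channel_shuffle(concatenated)
print(shuffled)
```

After the shuffle, each neighbouring pair of channels contains one channel from each branch, so later layers see information from both.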
ISBN: (print) 9783031510229; 9783031510236
Fires can cause significant harm to both people and the environment. Recently, there has been growing interest in real-time fire and smoke detection to provide practical assistance. Detecting fires in outdoor areas is crucial to safeguard human lives and the environment, especially in situations where traditional smoke detectors alone may not be sufficient. In this work, we propose FIRESTART, which aims to achieve accurate and robust ignition detection for prompt identification of and response to fire incidents. The proposed framework uses a lightweight deep learning architecture and post-processing techniques for fire-starting interval detection. It was evaluated on the ONFIRE dataset and compared with several state-of-the-art methods. The results are encouraging, particularly from the computational and real-time use perspectives.
This paper presents a deep-learning-based system for urban traffic monitoring, focusing on the detection and tracking of motorcycles using embedded hardware, motivated by the high accident rates of this type of vehicle. Different convolutional neural network (CNN) models were evaluated, including MobileNet-v1-SSD, YOLOv5, and Faster R-CNN, implemented on an NVIDIA graphics processing unit (GPU) board, the Jetson Xavier NX. The MobileNet-v1-SSD model stands out for its balance between precision (90%), recall (66%), and latency (~10 ms), making it well suited to real-time applications. Additionally, a tracking algorithm based on optical flow using the Lucas-Kanade method was developed, complemented with logic for creating and deleting identities (IDs), enabling object tracking in dynamic scenarios with partial occlusions. The system includes a methodology for calculating key traffic variables such as speed and direction by correlating pixels with real-world distances through camera calibration. This approach demonstrates the feasibility of developing complex image-processing applications on resource-constrained platforms by leveraging the features of efficient embedded systems such as general-purpose GPUs.
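The step from pixel motion to speed and direction can be sketched as follows. This is a simplified illustration, not the paper's calibration procedure: it assumes a single constant scale factor `meters_per_pixel` (a real calibration would typically account for perspective), and every name and number in it is hypothetical.

```python
import math


def speed_and_direction(p0, p1, meters_per_pixel, fps, frames=1):
    """Estimate speed (km/h) and heading (degrees) from two tracked positions.

    p0, p1: (x, y) pixel positions of the same object, `frames` frames apart.
    The heading is measured from the image x axis; the image y axis points down.
    """
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    distance_m = math.hypot(dx, dy) * meters_per_pixel
    elapsed_s = frames / fps
    speed_kmh = distance_m / elapsed_s * 3.6
    heading_deg = math.degrees(math.atan2(dy, dx))
    return speed_kmh, heading_deg


# Hypothetical track: 50 px displacement over 15 frames at 30 FPS, 5 cm per pixel.
speed, heading = speed_and_direction((100, 200), (130, 240),
                                     meters_per_pixel=0.05, fps=30, frames=15)
print(speed, heading)
```

Averaging over several frames, as in the `frames=15` example, reduces the effect of single-frame tracking jitter on the speed estimate.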
The accurate recognition of defects in the time-of-flight diffraction (TOFD) images of welds is important for improving the capability and efficiency of defect detection. Existing deep-learning-based defect detection methods take a single image as input, without considering that technicians need to observe the image "dynamically" during its evaluation, which results in low accuracy and credibility of the defect detection results. To address these issues, this article combines deep learning techniques with TOFD inspection domain knowledge and proposes a weld defect detection method for TOFD images based on multi-image fusion and feature hybrid enhancement. It comprises three parts: a single-to-multiple image decomposition module based on gain preprocessing, a multi-image feature extraction module, and a weld defect detection module based on feature hybrid enhancement. The method realizes "dynamically changing" feature extraction and target detection of weld defects in TOFD images. It was experimentally verified using TOFD images of welds in large-scale spherical pressure tanks. The method greatly surpassed current state-of-the-art approaches, including You Only Look Once (YOLO) v9, YOLOv10, and Real-Time DEtection TRansformer (RT-DETR), achieving a mean average precision of 82.0%, an average precision for small-size targets of 45.2%, and an average recall for small-size targets of 70.9%. The detection time for a single TOFD image with a resolution of 500 x 1350 pixels is 0.1287 s, satisfying the real-time requirements of weld TOFD inspection in practical engineering applications. The proposed method can also be extended to engineering applications such as intelligent detection of weld defects based on X-ray images.
To address issues such as mainlobe distortion and sidelobe elevation in airborne monostatic radar for mainlobe jamming suppression, this paper proposes a mainlobe jamming suppression algorithm for airborne bistatic ra...
Notwithstanding the tremendous success of deep neural networks in a range of realms, previous studies have shown that these models are exposed to an inherent hazard called adversarial examples: images to which an elaborate perturbation has been maliciously added can deceive a network, which makes the study of countermeasures urgent. However, existing solutions suffer from some weaknesses. For example, the parameters of some processing-based detection methods are usually determined empirically, which might lead to sub-optimal results, and processing applied directly to images might affect the classification of benign samples, increasing false positives. In this paper, we propose a novel imAge-DepenDent noIse reducTION (ADDITION) model based on deep learning for adversarial detection. The ADDITION model adaptively converts the adversarial perturbation in each image to approximately Gaussian noise by injecting image-dependent additional noise, then performs noise reduction to eliminate the adversarial perturbation, and finally detects adversarial examples by examining the classification inconsistency between the input image and its denoised version. The ADDITION model is trained end-to-end on benign samples without any prior knowledge of adversarial attacks, and thus avoids the time-consuming task of generating adversarial examples in practical use. We generate more than 220,000 adversarial examples based on six attack algorithms for evaluation and present state-of-the-art comparisons on three real-world datasets. Extensive experiments demonstrate that our proposed method achieves improved performance in both detection accuracy rate and false positive rate.
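The final detection rule, flagging an input when its label changes after denoising, can be illustrated with a toy sketch. Everything here is a stand-in chosen for illustration: a 1-D signal instead of an image, a median filter instead of the learned noise-injection and noise-reduction network, and a threshold rule instead of a deep classifier.

```python
from statistics import median


def median_denoise(signal, radius=1):
    """Toy denoiser: sliding-window median (stand-in for the learned model)."""
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - radius), min(len(signal), i + radius + 1)
        out.append(median(signal[lo:hi]))
    return out


def toy_classify(signal, threshold=0.5):
    """Toy classifier: class 1 if any value exceeds the threshold, else 0."""
    return int(max(signal) > threshold)


def is_adversarial(signal, classify=toy_classify, denoise=median_denoise):
    """Flag the input when its label differs from that of its denoised version."""
    return classify(signal) != classify(denoise(signal))


# Hypothetical inputs: a broad feature survives denoising, a one-sample spike does not.
benign = [0.1, 0.1, 0.9, 0.9, 0.9, 0.1]
spiked = [0.1, 0.1, 0.1, 0.95, 0.1, 0.1]
print(is_adversarial(benign), is_adversarial(spiked))
```

The design relies on benign inputs keeping their label after denoising; the abstract's concern about false positives corresponds to benign inputs whose label would flip under overly aggressive processing.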