Search results - Inner Mongolia University Library

A lightweight image inpainting model for removing unwanted objects from residential real estate's indoor scenes

MULTIMEDIA TOOLS AND APPLICATIONS, 2024, vol. 83, no. 34, pp. 80389-80410

Authors: Sompoppokasest, Srun; Siriborvornratanakul, Thitirat. Affiliation: Natl Inst Dev Adm Grad Sch Appl Stat 148 SeriThai Rd Bangkok 10240 Thailand

To enhance the appeal of residential real estate listings and captivate online customers, clean and visually convincing indoor scenes are highly desirable. In this research, we introduce an innovative image inpainting model designed to seamlessly replace undesirable elements within images of indoor residential spaces with realistic and coherent alternatives. While Generative Adversarial Networks (GANs) have demonstrated remarkable potential for removing unwanted objects, they can be resource-intensive and face difficulties in consistently producing high-quality outcomes, particularly when unwanted objects are scattered throughout the images. To empower small- and medium-sized businesses with a competitive edge, we present a novel GAN model that is resource-efficient and requires minimal training time using arbitrary mask generation and a novel half-perceptual loss function. Our GAN model achieves compelling results in removing unwanted elements from indoor scenes, demonstrating the capability to train within a single day using a single GPU, all while minimizing the need for extensive post-processing.

Keywords: deep learning; image inpainting; image completion; Generative Adversarial Network

Automatic Quantification of Atmospheric Turbulence Intensity in Space-time Domain

SENSORS, 2025, vol. 25, no. 5, p. 1483

Authors: Gulich, Damian; Tebaldi, Myrian; Sierra-Sosa, Daniel. Affiliations: UNLP CONICET La Plata Ctr Invest Opt CIC RA-1897 La Plata Argentina; Univ Nacl La Plata UNLP Fac Ingn Dept Ciencias Bas RA-1900 La Plata Argentina; Catholic Univ Amer Elect Engn & Comp Sci Washington DC 20064 USA

Quantifying atmospheric turbulence intensity is a challenging task, particularly when assessing real-world scenarios. In this paper, we propose a deep learning method for quantifying atmospheric turbulence intensity based on the space-time domain analysis from videos depicting different turbulence levels. We capture videos of a static image under controlled air turbulence intensities using an inexpensive camera, and then, by slicing these videos in the space-time domain, we extract spatio-temporal representations of the turbulence dynamics. These representations are then fed into a Convolutional Neural Network for classification. This network effectively learns to discriminate between different turbulence regimes based on the spatio-temporal features extracted from a real-world experiment captured in video slices.

Keywords: atmospheric turbulence; deep learning; space-time analysis; video analysis; turbulence intensity quantification
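
The space-time slicing step mentioned in the abstract above can be illustrated with a minimal Python sketch. It assumes a grayscale video stored as a NumPy array of shape (frames, height, width) and takes one fixed image row across all frames; the function name `space_time_slice`, the choice of a horizontal row, and the synthetic data are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def space_time_slice(video: np.ndarray, row: int) -> np.ndarray:
    """Return the (frames, width) image traced by one pixel row over time.

    `video` is assumed to be grayscale with shape (frames, height, width).
    """
    if video.ndim != 3:
        raise ValueError("expected shape (frames, height, width)")
    # Fix the vertical position and keep every frame and every column.
    return video[:, row, :]


# Synthetic stand-in for a captured clip: 16 frames of 8 x 10 pixels.
rng = np.random.default_rng(0)
video = rng.random((16, 8, 10))
slice_xt = space_time_slice(video, row=4)
print(slice_xt.shape)  # (16, 10)
```

Each row of the resulting 2-D array is the same image line at a different time, so temporal fluctuations appear as texture along the vertical axis; an array of this kind is what a 2-D convolutional classifier could take as input.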

Crystal search - feasibility study of a real-time deep learning process for crystallization well images

ACTA CRYSTALLOGRAPHICA A-FOUNDATION AND ADVANCES, 2023, vol. 79, no. 4, pp. 331-338

Authors: Thielmann, Yvonne; Luft, Thorsten; Norbert, Zint A.; Koepke, Juergen. Affiliations: Max Planck Inst Biophys Mol Membrane Biol Max von Laue Str 3 D-60438 Frankfurt Germany; Systrade GmbH Bockenheimer Landstr 47 D-60325 Frankfurt Germany

To avoid the time-consuming and often monotonous task of manual inspection of crystallization plates, a Python-based program to automatically detect crystals in crystallization wells employing deep learning techniques was developed. The program uses manually scored crystallization trials deposited in a database of an in-house crystallization robot as a training set. Since the success rate of such a system is able to catch up with manual inspection by trained persons, it will become an important tool for crystallographers working on biological samples. Four network architectures were compared and the SqueezeNet architecture performed best. In detecting crystals AlexNet accomplished a better result, but with a lower threshold the mean value for crystal detection was improved for SqueezeNet. Two assumptions were made about the imaging rate. With these two extremes it was found that an image processing rate of at least two times, but up to 58 times in the worst case, would be needed to reach the maximum imaging rate according to the deep learning network architecture employed for real-time classification. To avoid high workloads for the control computer of the CrystalMation system, the computing is distributed over several workstations, participating voluntarily, by the grid programming system from the Berkeley Open Infrastructure for Network Computing (BOINC). The outcome of the program is redistributed into the database as automatic real-time scores (ARTscore). These are immediately visible as colored frames around each crystallization well image of the inspection program. In addition, regions of droplets with the highest scoring probability found by the system are also available as images.

Keywords: biocrystallization; high-throughput screening; deep learning; neural network; U-Net; AlexNet; VggNet; ResNet; SqueezeNet; BOINC

Adaptive routing sign transformer framework

SIGNAL IMAGE AND VIDEO PROCESSING, 2023, vol. 17, no. 4, pp. 1715-1722

Authors: Chen, Yuming; Mei, Xue. Affiliation: Nanjing Tech Univ Coll Elect Engn & Control Sci Nanjing Peoples R China

Although deep learning-based continuous sign language translation (CSLT) models have made great progress in recent years, they still face various difficulties and limitations when applied to practical scenarios. To better apply deep learning, we propose the adaptive routing sign transformer framework for CSLT. The adaptive routing strategy addresses the sharp drop in accuracy that occurs when a deep learning model trained in a laboratory scene is applied to a real scene, and the back-end of the model adopts a transformer-style decoder architecture to translate sentences in real time from the spatiotemporal context around the signer. By means of network layer visualization, we demonstrate that the attention mechanism of the model captures the hand and face regions of signers, which are often crucial for semantic analysis of video sign language. In this paper, we also introduce a Chinese sign language corpus of business scenes, which shows sign language communication in a bank, a station, etc., and provides some impetus for further research on video sign language translation. Experiments are carried out on PHOENIX-Weather 2014T (RWTH Aachen University, Germany); the proposed model outperforms the state of the art in inference time and accuracy using only raw RGB as input.

Keywords: Sign language recognition; Domain adaption; Transfer learning; Transformer

A real-time SVM-based hardware accelerator for hyperspectral images classification in FPGA

MICROPROCESSORS AND MICROSYSTEMS, 2024, vol. 104

Authors: Martins, Lucas Amilton; Viel, Felipe; Seman, Laio Oriel; Bezerra, Eduardo Augusto; Zeferino, Cesar Albenes. Affiliations: Univ Vale Itajai UNIVALI Polytech Sch Itajai Brazil; Fed Univ Santa Catarina UFSC Dept Elect Engn Florianopolis Brazil

Hyperspectral imaging can be conceptualized as a three-dimensional dataset of spectral information related to a particular landscape. Generally speaking, these are aerial photographs captured by Earth observation satellites. A useful analogy for a hyperspectral image is one of a cube formed with the image acquired along the X and Y axes and a third dimension of spectral bands of varying wavelengths. Given the wealth of data contained within these images, they have been employed in both civilian and military applications such as terrain recognition, urban development supervision, recognition of rare minerals, and various other objectives. The increased utilization of these images has garnered the interest of researchers striving to create solutions that may enable faster processing of the images via parallel processing. In this context, FPGA technology is an option capable of facilitating the implementation of such a system for observation satellites. This research is situated within this framework and aims to develop an FPGA-synthesized hardware accelerator to facilitate real-time hyperspectral image categorization. By taking this approach, hardware-specific solutions can be implemented for embedded applications that process hyperspectral images and can also be integrated with further image processing steps. The proposed accelerator was constructed based on an advanced algorithmic model, resulting in outcomes consistent with those generated by the software-based solution. The experimental results demonstrate that the engineered accelerator can attain a pixel classification time equal to or less than the pixel acquisition time, thus conforming to the real-time processing criteria concerning classification time. Further, the manufactured accelerator exhibits scalability that can classify distinct datasets with varying classes concurrently while maintaining a uniform logic resource utilization.

Keywords: Remote sensing; Hyperspectral imaging; Machine learning; Classification; Hardware acceleration; FPGA

An Auto Chip Package Surface Defect Detection Based on Deep Learning

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, vol. 73, pp. 1-1

Authors: Cao, Yuan; Ni, Yubin; Zhou, You; Li, Haotian; Huang, Zhao; Yao, Enyi. Affiliations: Hohai Univ Coll Internet Things Engn Changzhou 213022 Peoples R China; Xidian Univ Sch Comp Sci & Technol Xian 710071 Peoples R China; South China Univ Technol Sch Microelect Guangzhou 510641 Peoples R China

Defect detection in chip packaging is a crucial step to ensure product quality and reliability. Traditional methods typically employ image-processing techniques for defect detection during the chip manufacturing process. However, these solutions require manual feature extraction and have limited adaptability to complex scenarios. Thus, deep-learning (DL)-based methods have received widespread attention. Nevertheless, they may fail to achieve the requirements of real-time and high accuracy, and effective datasets are still missing. In this article, we construct a new chip package surface defect detection dataset, which contains 2919 images and four common defect types. To our knowledge, it is the only dataset for simultaneous detection of multiple chips. Also, we propose a real-time chip package surface defect detection method based on the you only look once version 7 (YOLOv7) model to solve the challenge of detecting small targets. In particular, we utilize k-means++ to recluster the anchor frames, merge the convolutional block attention module (CBAM) attention mechanism and receptive field block (RFB) structure, as well as replace traditional nonmaximum suppression (NMS) with our newly proposed confidence propagation cluster (CP-Cluster) to further increase detection accuracy and result confidence. Finally, we evaluate our method by performing many ablation experiments on the dataset we created. The experimental results demonstrate that compared to the original YOLOv7, the proposed method improves the mean average precision@0.5 (mAP@0.5) by 1.39%, the speed of detection by 21.6%, reduces the amount of computation by 17.7%, and the number of parameters by 66.4%, respectively. This proves the superiority and practicality of the proposed method.

Keywords: Feature extraction; deep learning; Production; Packaging; Convolutional neural networks; Transformers; real-time systems; Attention mechanism; chip package; deep learning (DL); surface defect detection; you only look once version 7 (YOLOv7)

Real-Time Multi-Task ADAS Implementation on Reconfigurable Heterogeneous MPSoC Architecture

IEEE ACCESS, 2023, vol. 11, pp. 80741-80760

Authors: Tatar, Guner; Bayar, Salih. Affiliations: Fatih Sultan Mehmet Vakif Univ Dept Elect Elect Engn TR-34445 Istanbul Turkiye; Marmara Univ Dept Elect & Elect Engn TR-34840 Istanbul Turkiye

The rapid adoption of Advanced Driver Assistance Systems (ADAS) in modern vehicles, aiming to elevate driving safety and experience, necessitates the real-time processing of high-definition video data. This requirement brings about considerable computational complexity and memory demands, highlighting a critical research void for a design integrating high FPS throughput with optimal Mean Average Precision (mAP) and Mean Intersection over Union (mIoU). Performance improvement at lower costs, multi-tasking ability on a single hardware platform, and flawless incorporation into memory-constrained devices are also essential for boosting ADAS performance. Addressing these challenges, this study proposes an ADAS multi-task learning hardware-software co-design approach underpinned by the Kria KV260 Multi-Processor System-on-Chip Field Programmable Gate Array (MPSoC-FPGA) platform. The approach facilitates efficient real-time execution of deep learning algorithms specific to ADAS applications. Utilizing the BDD100K+Waymo, KITTI, and CityScapes datasets, our ADAS multi-task learning system endeavours to provide accurate and efficient multi-object detection, segmentation, and lane and drivable area detection in road images. The system deploys a segmentation-based object detection strategy, using a ResNet-18 backbone encoder and a Single Shot Detector architecture, coupled with quantization-aware training to augment inference performance without compromising accuracy. The ADAS multi-task learning offers customization options for various ADAS applications and can be further optimized for increased precision and reduced memory usage. Experimental results showcase the system's capability to perform real-time multi-class object detection, segmentation, line detection, and drivable area detection on road images at approximately 25.4 FPS using a 1920 x 1080p Full HD camera. Impressively, the quantized model has demonstrated a 51% mAP for object detection, 56.62% mIoU for image segmen

Keywords: ADAS; deep learning; deep processing unit; memory allocation; multi-task learning; MPSoC-FPGA architecture; Vitis-AI; quantization aware training

Deep learning-based robust positioning scheme for imaging sonar guided dynamic docking of autonomous underwater vehicle

OCEAN ENGINEERING, 2024, vol. 293

Authors: Wang, Zhao; Xiang, Xianbo; Guan, Xiawei; Pan, Han; Yang, Shaolong; Chen, Hong. Affiliations: Huazhong Univ Sci & Technol Sch Naval Architecture & Ocean Engn 1037 Luoyu Rd Wuhan 430074 Peoples R China; Huazhong Univ Sci & Technol State Key Lab Intelligent Mfg Equipment & Technol Wuhan 430074 Peoples R China; Wuhan Second Ship Design & Res Inst Wuhan 430074 Peoples R China; Shanghai Jiao Tong Univ Sch Aeronaut & Astronaut Shanghai 200240 Peoples R China

In this paper, a deep learning-based underwater positioning scheme is proposed to achieve robust feature tracking of an autonomous underwater vehicle (AUV) in sonar images during dynamic docking. Distorted features and acoustic noise make detecting and tracking the AUV in acoustic images during dynamic docking significantly difficult. To address this, first, a pre-trained You Only Look Once (YOLO) network is applied to detect both body and head features of the AUV. Second, we introduce an Intersection Over Union (IOU) match-based backend which preliminarily filters the erroneous detections of the AUV head based on the rigid relationship between the body and head of the AUV. Subsequently, Simple Online and Realtime Tracking with a deep association metric (DeepSORT) is utilized to achieve track matching of all detection results, including erroneous detections and the real target. Moreover, a scoring mechanism is presented to further remove the unfiltered erroneous detections based on the motion tendency of detection tracks. Experimental results show that the proposed scheme enables real-time and robust feature tracking of the AUV under the interference of feature distortion, reverberation and environmental noise.

Keywords: Autonomous underwater vehicles (AUVs); Dynamic docking; deep learning; Acoustic image; Underwater positioning
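
The Intersection Over Union (IOU) matching idea named in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: boxes are axis-aligned `(x1, y1, x2, y2)` tuples, and the helper `filter_heads` with its `min_iou` threshold is a hypothetical stand-in for the paper's body-head consistency check, whose actual rule is not given in the abstract.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def filter_heads(body, heads, min_iou=0.05):
    """Keep head detections that overlap the body box (hypothetical rule)."""
    return [head for head in heads if iou(body, head) >= min_iou]


body = (0, 0, 10, 4)
heads = [(8, 1, 11, 3), (20, 20, 22, 22)]  # second box is far from the body
kept = filter_heads(body, heads)
print(kept)  # [(8, 1, 11, 3)]
```

A head candidate that does not overlap the detected body at all gets an IOU of zero and is discarded, which is the general sense in which a rigid body-head relationship can rule out spurious detections before tracking.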

QSAM-Net: Rain Streak Removal by Quaternion Neural Network With Self-Attention Module

IEEE TRANSACTIONS ON MULTIMEDIA, 2024, vol. 26, pp. 789-798

Authors: Frants, Vladimir; Agaian, Sos; Panetta, Karen. Affiliations: CUNY Grad Ctr New York NY 10016 USA; CUNY Coll Staten Isl New York NY 10314 USA; Tufts Univ Elect & Comp Engn Dept Medford MA 02155 USA

Real-world images captured in remote sensing, image or video retrieval, and outdoor surveillance are often degraded due to poor weather conditions, such as rain and mist. These conditions introduce artifacts that make visual analysis challenging and limit the performance of high-level computer vision methods. In time-critical applications, it is vital to develop algorithms that automatically remove rain without compromising the quality of the image contents. This article proposes a novel approach called QSAM-Net, a quaternion multi-stage multiscale neural network with a self-attention module. The algorithm requires fewer parameters, by a factor of 3.98, than the real-valued counterpart and state-of-the-art methods while improving the visual quality of the images. The extensive evaluation and benchmarking on synthetic and real-world rainy images demonstrate the effectiveness of QSAM-Net. This feature makes the network suitable for edge devices and applications requiring near real-time performance. Furthermore, the experiments show that the improved visual quality of images also leads to better object detection accuracy and training speed.

Keywords: deep learning; object detection; quaternion image processing; quaternion neural networks; rain removal

A Deep Face Antispoofing System with Hardware Implementation for Real-Time Applications

8th International Conference on Computer Vision and Image Processing (CVIP)

Authors: Umer, Saiyed; Singh, Shubham Sarvadeo; Kangotra, Vikram; Rout, Ranjeet Kumar. Affiliations: Aliah Univ Comp Sci & Engn Kolkata India; Natl Inst Technol Comp Sci & Engn Srinagar Jammu & Kashmir India

ISBN: (print) 9783031581809; 9783031581816

A deep learning-based face anti-spoofing system has been proposed here. This work has been implemented in four segments. Firstly, an image preprocessing task is performed to extract the facial region. Then, the texture analysis of the facial region is performed to compute discriminant features. For this, a robust approach to deep learning techniques is needed, starting with defining some convolutional neural network (CNN) architectures for feature computation, followed by the classification of genuine vs. imposter face liveliness. The motivation of this work is to find both software- and hardware-based solutions to access biometric-based real-time systems through robust and vigorous face-liveness detection techniques. The recognition system's performances are further improved by image acquisition-challenging issues, image augmentation, fine-tuning, transfer learning, and the fusion of various trained CNN models. Finally, the above steps have been embedded in Raspberry Pi devices to build the system for real-time applications. The experimentation with two benchmark databases, NUAA and CASIA Replay-Attack, and comparing the performance with some well-known methods relating to the proposed system area show the proposed system's superiority.

Keywords: Face-liveliness; Anti-spoofing; CNN; Fusion; Raspberry-Pi
