Search Results - Inner Mongolia University Library

37th Conference on Neural Information Processing Systems (NeurIPS)

Authors: Fang, Alex; Kornblith, Simon; Schmidt, Ludwig. Affiliations: Univ Washington, Seattle, WA 98195, USA; Google Res, Brain Team, Mountain View, CA, USA; Univ Washington, AI2, Seattle, WA 98195, USA

ISBN: (print) 9781713899921

Does progress on ImageNet transfer to real-world datasets? We investigate this question by evaluating ImageNet pre-trained models with varying accuracy (57%-83%) on six practical image classification datasets. In particular, we study datasets collected with the goal of solving real-world tasks (e.g., classifying images from camera traps or satellites), as opposed to web-scraped benchmarks collected for comparing models. On multiple datasets, models with higher ImageNet accuracy do not consistently yield performance improvements. For certain tasks, interventions such as data augmentation improve performance even when architectures do not. We hope that future benchmarks will include more diverse datasets to encourage a more comprehensive approach to improving learning algorithms.

Keywords: image classification

Emerging Versatile Context Representation on Modulation Recognition with Vision Transformer

2nd IEEE International Conference on Signal, Information and Data Processing, ICSIDP 2024

Authors: Chen, Baihong; Rao, Bin; Zou, Xiaohai; Wang, Wei. Affiliation: Sun Yat-Sen University, School of Electronics and Communication Engineering, Shenzhen, China

ISBN: (print) 9798331515669

Accurate recognition of intra-pulse modulation patterns is essential for enhancing radar system performance. Traditional recognition algorithms are typically designed for ideal conditions and rely on handcrafted features, so substantial noise interference in complex electromagnetic environments can significantly degrade their performance. Although Transformers have achieved great success in computer vision tasks, existing vision transformers fall short in capturing long-range dependencies. To remedy these flaws, this paper proposes an accurate automatic modulation recognition algorithm based on the vision transformer. First, we use traditional image processing techniques to locate the signal in the time-frequency image and introduce a radial basis function to generate the positional embeddings. This procedure enhances the spatial awareness of our model. Second, 2D patches at dynamic scales are embedded as the input to the encoders. Third, inspired by dilated convolutions, vanilla attention is reformulated in the cascaded encoders. Experiments and ablation studies show that the proposed algorithm performs well across a range of signal-to-noise ratios (SNR). Notably, the classification accuracy reaches up to 93.4% at an SNR of -10 dB and 94.4% at an SNR of 0 dB, which indicates that the proposed algorithm is more robust to noise perturbations than other methods. The present work provides a sound experimental basis for further study of automatic modulation classification, with a view to future field application in electronic warfare systems. © 2024 IEEE.

Keywords: image enhancement

Generalized Single-Image-Based Morphing Attack Detection Using Deep Representations from Vision Transformer

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhang, Haoyu; Ramachandra, Raghavendra; Raja, Kiran; Busch, Christoph. Affiliations: Norwegian Univ Sci & Technol, Trondheim, Norway; Darmstadt Univ Appl Sci, Darmstadt, Germany

ISBN: (print) 9798350365474

Face morphing attacks pose severe threats to Face Recognition Systems (FRS), which are operated in border control and passport issuance use cases. Correspondingly, morphing attack detection (MAD) algorithms are needed to defend against such attacks. MAD approaches must be robust enough to handle unknown attacks in an open-set scenario where attacks can originate from various morphing generation algorithms, post-processing and the diversity of printers/scanners. The problem of generalization is further pronounced when the detection has to be made on a single suspected image. In this paper, we propose a generalized single-image-based MAD (S-MAD) algorithm by learning the encoding from the Vision Transformer (ViT) architecture. Compared to CNN-based architectures, the ViT model has the advantage of integrating local and global information and hence can be suitable for detecting morphing traces that are widely distributed across the face region. Extensive experiments are carried out on face morphing datasets generated using publicly available FRGC face datasets. Several state-of-the-art (SOTA) MAD algorithms, including representative ones that have been publicly evaluated, have been selected and benchmarked against our ViT-based approach. The results demonstrate improved detection performance of the proposed S-MAD method under the inter-dataset testing protocol (when different data is used for training and testing) and comparable performance under the intra-dataset testing protocol (when the same data is used for training and testing).

Keywords: Face Recognition; Morphing Attack Detection

Image Encryption Based on Chen Chaotic System, OpenSSL S-Box and the Fibonacci Q-Matrix

27th IEEE Signal Processing: Algorithms, Architectures, Arrangements, and Applications, SPA 2024

Authors: Lotfy, Rana; Gabr, Mohamed; Mamdouh, Eyad; Aboshousha, Amr; Alexan, Wassim; El-Damak, Dina; Fathy, Abdallah; Mansour, Marvy Badr Monir. Affiliations: German University in Cairo, Faculty of MET, Cairo, Egypt; German University in Cairo, Physics Department, Cairo, Egypt; German University in Cairo, Faculty of IET, Cairo, Egypt; ElSewedy University of Technology - Polytechnic of Egypt, Faculty of Engineering Technology, Department of Electronic Engineering Technology, Cairo, Egypt; The British University in Egypt, Faculty of Engineering, Department of Electrical Engineering, Cairo, Egypt

ISBN: (print) 9788362065486

This paper presents a novel image encryption algorithm that leverages the chaotic properties of the Chen system, the cryptographic strength of OpenSSL, and the mathematical robustness of the Fibonacci Q-Matrix. The proposed method begins by generating an encryption key using the Chen chaotic system, known for its sensitivity to initial conditions and complex dynamic behavior. This key is then utilized in conjunction with a substitution box (S-box) generated through OpenSSL to introduce non-linearity and diffusion into the encryption process. To further enhance security, the resulting image data undergoes a series of multiplications by a large number of Fibonacci Q-Matrices, exploiting their recursive properties for added complexity and confusion. Numerical results demonstrate the proposed algorithm's superior performance in terms of security and efficiency, making it a promising solution for safeguarding digital images against unauthorized access and cryptographic attacks. © 2024 Division of Signal Processing and Electronic Systems, Poznan University of Technology (DSPES PUT).

Keywords: Encryption algorithms
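
The abstract does not spell out how the Q-matrix stage works, so the following is only a minimal sketch of the general idea, not the authors' scheme. It assumes the standard Fibonacci Q-matrix Q = [[1, 1], [1, 0]], whose n-th power is [[F(n+1), F(n)], [F(n), F(n-1)]] with determinant (-1)^n, which makes the transform invertible modulo 256. The function names, the choice of a single power n, and the pairing of consecutive pixels are all assumptions made for illustration.

```python
def q_power(n):
    """Return Q^n = [[F(n+1), F(n)], [F(n), F(n-1)]] for n >= 1, with Q = [[1, 1], [1, 0]]."""
    a, b = 0, 1  # F(0), F(1)
    for _ in range(n - 1):
        a, b = b, a + b  # after the loop: a = F(n-1), b = F(n)
    return ((a + b, b), (b, a))


def q_inverse_mod(n, m=256):
    """Inverse of Q^n modulo m, using det(Q^n) = (-1)^n so no modular division is needed."""
    (p, q), (r, s) = q_power(n)
    sign = 1 if n % 2 == 0 else -1
    # Inverse of [[p, q], [r, s]] with determinant +/-1 is sign * [[s, -q], [-r, p]].
    return ((sign * s % m, -sign * q % m), (-sign * r % m, sign * p % m))


def apply_matrix(mat, pixels, m=256):
    """Multiply consecutive pixel pairs (as column vectors) by a 2x2 matrix modulo m.

    Assumes len(pixels) is even (hypothetical layout: a flattened grayscale image).
    """
    (p, q), (r, s) = mat
    out = []
    for i in range(0, len(pixels), 2):
        x, y = pixels[i], pixels[i + 1]
        out += [(p * x + q * y) % m, (r * x + s * y) % m]
    return out


pixels = [12, 200, 255, 0, 7, 7]                    # hypothetical pixel values
cipher = apply_matrix(q_power(10), pixels)          # diffusion step
restored = apply_matrix(q_inverse_mod(10), cipher)  # inverse step
assert restored == pixels
```

On its own this stage is a linear map and offers no security; in the paper it is combined with a chaos-derived key and an S-box, which this sketch leaves out.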

Application of artificial intelligence algorithm model in fault diagnosis of interactive information

6th International Conference on Image, Video Processing, and Artificial Intelligence, IVPAI 2024

Authors: Jin, Hulin; Jin, Zhiran; Kim, Yong-Guk. Affiliations: International Institute of Engineering Psychology, Denver, CO 80202, United States; Foothill Preparatory School, Temple City, CA 91780, United States; Department of Computer Engineering, Sejong University, Seoul 05006, Republic of Korea

ISBN: (print) 9781510681781

Interactive information fault diagnosis is a new type of fault diagnosis technology that integrates information fusion, artificial intelligence, computer science and other disciplines. It can extract interactive information data from equipment in real time, analyze the characteristics of fault information, and then identify changes and trends in equipment operating status. However, the technology still has shortcomings in practical application: for example, it is difficult to handle large volumes of complex interactive information data and to carry out effective fault diagnosis. Therefore, this paper proposes an intelligent fault diagnosis method based on an improved grey relational degree. The method effectively extracts the interactive information feature data of the equipment and performs correlation analysis of the fault feature data using the grey correlation analysis algorithm, which provides a new approach to equipment fault diagnosis. © 2024 SPIE.

Keywords: Correlation coefficients; Evolutionary algorithms; Matrices; Artificial intelligence; Fuzzy logic algorithms; Data processing; Algorithm development; Feature extraction; Intelligence systems
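
The abstract does not describe the "improved" grey relational degree, so the sketch below shows only the classical form (Deng's grey relational grade) as a point of reference. It assumes that the feature sequences are already normalized to comparable scales and that the distinguishing coefficient is 0.5, a common default. The function name and the example fault patterns are hypothetical.

```python
def grey_relational_grades(reference, candidates, rho=0.5):
    """Classical grey relational grade of each candidate sequence against a reference.

    xi_i(k) = (d_min + rho * d_max) / (d_i(k) + rho * d_max), grade_i = mean over k,
    where d_i(k) = |reference[k] - candidate_i[k]| and d_min, d_max are taken
    over all candidates and all positions.
    """
    deltas = [[abs(r - c) for r, c in zip(reference, cand)] for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # every candidate equals the reference
        return [1.0] * len(candidates)
    return [
        sum((d_min + rho * d_max) / (d + rho * d_max) for d in row) / len(row)
        for row in deltas
    ]


observed = [1.0, 0.8, 0.6]          # hypothetical normalized feature vector
fault_patterns = [
    [1.0, 0.8, 0.6],                # hypothetical pattern A (identical to observed)
    [0.0, 0.2, 0.9],                # hypothetical pattern B
]
grades = grey_relational_grades(observed, fault_patterns)
best = max(range(len(grades)), key=grades.__getitem__)  # index of the closest pattern
```

A diagnosis step would then assign the observed sample to the fault pattern with the highest grade, here pattern A.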

Neural Photo-Finishing

ACM TRANSACTIONS ON GRAPHICS, 2022, Vol. 41, No. 6, pp. 1-15

Authors: Tseng, Ethan; Zhang, Yuxuan; Jebe, Lars; Zhang, Xuaner; Xia, Zhihao; Fan, Yifei; Heide, Felix; Chen, Jiawen. Affiliations: Princeton Univ, Princeton, NJ 08544, USA; Adobe, San Jose, CA, USA

Image processing pipelines are ubiquitous and we rely on them either directly, by filtering or adjusting an image post-capture, or indirectly, as image signal processing (ISP) pipelines on broadly deployed camera systems. Used by artists, photographers, system engineers, and for downstream vision tasks, traditional image processing pipelines feature complex algorithmic branches developed over decades. Recently, image-to-image networks have made great strides in image processing, style transfer, and semantic understanding. The differentiable nature of these networks allows them to fit a large corpus of data; however, they do not allow for intuitive, fine-grained controls that photographers find in modern photo-finishing tools. This work closes that gap and presents an approach to making complex photo-finishing pipelines differentiable, allowing legacy algorithms to be trained akin to neural networks using first-order optimization methods. By concatenating tailored network proxy models of individual processing steps (e.g. white-balance, tone-mapping, color tuning), we can model a non-differentiable reference image finishing pipeline more faithfully than existing proxy image-to-image network models. We validate the method for several diverse applications, including photo and video style transfer, slider regression for commercial camera ISPs, photography-driven neural demosaicking, and adversarial photo-editing.

Keywords: image processing; photo-finishing; raw processing

Smart and Quick Parking Spot Detection

15th International Conference on Computing Communication and Networking Technologies, ICCCNT 2024

Authors: Veeramalla, Santhosh Kumar; Gali, Rama Lakshmi; Ranjeeth, M.; Tara, Saikumar; Hindumathi, V.; Bhargava Kumar, L. Affiliation: BVRIT Hyderabad College of Engineering for Women, Department of ECE, Hyderabad, India

ISBN: (print) 9798350370249

The rapid expansion of urban areas has intensified the challenge of finding parking spaces for drivers. Intelligent parking systems emerge as a crucial solution by providing real-time detection of available spaces. While various methods exist, from basic magnetic sensors to advanced computer vision algorithms, their effectiveness can be hindered by environmental factors. Recognizing the importance of automated systems, this paper focuses on implementing an image processing approach for parking space detection. By strategically placing cameras in parking lots, the system identifies vacant spaces and transforms images to determine availability. Integrating these techniques into mobile applications could further aid drivers in locating parking spots, improving their overall experience. Ultimately, accurate and efficient parking space identification is vital for reducing congestion, enhancing air quality, and improving urban environments. © 2024 IEEE.

Keywords: CCTV; Computer vision; Image processing; Parking availability; Parking guidance; Smart parking; Sustainable cities

An Efficient Implementation of the Threshold-based VNG Demosaicing with Reduced Calculations

2024 Smart Systems Integration Conference and Exhibition

Authors: Alnaser, Yousef; Ravichandran, Anbumani; Langer, Jan. Affiliations: Fraunhofer ENAS, Chemnitz, Germany; TU Chemnitz, Chemnitz, Germany

ISBN: (print) 9798350388787; 9798350388770

Color Filter Arrays (CFA) are essential components of digital cameras and image sensors to capture the color information needed to produce full-color images from only a single image sensor per pixel. Many methods and algorithms have been proposed to recover the missing color information of CFAs. In this work, we use a simplified version of the Threshold-based Variable Number of Gradients algorithm proposed by Chang et al. to estimate the full-color information from Bayer images. We also show that the slight modification to the algorithm does not affect image quality while making it more compatible with hardware. We propose an efficient implementation of the algorithm that reduces the number of calculations per pixel at the cost of increased memory resources. Our implementation targets an image processing pipeline in an FPGA platform which is short on LUT and FF resources but has DSPs and BRAMs to spare. We buffer the absolute differences and average color components to be shared and re-used between neighboring pixels, on two levels: within the same row, and between different rows. The latter strategy reduces the number of absolute differences calculated every cycle from 32 to 4 and average color components from 32 to 6. However, the memory requirements are increased from storing 4 image rows to 18 image rows. We implement the solution on an FPGA using high-level synthesis (HLS) and optimize it to further reduce resources.

Keywords: Demosaicing; Variable Number of Gradients; Bayer image; FPGA

Retrofitting a Legacy Cutlery Washing Machine Using Computer Vision

16th International Conference on Computational Collective Intelligence (ICCCI)

Authors: Fwa, Hua Leong. Affiliation: Singapore Management Univ, 81 Victoria St, Singapore 188065, Singapore

ISBN: (print) 9783031702587; 9783031702594

Industry 4.0, the digitalization of manufacturing, promises to lead to lower costs, efficient processes and even the discovery of new business models. However, many enterprises have huge investments in legacy machines that are not 'smart'. In this study, we therefore designed a cost-efficient solution to retrofit a legacy conveyor belt-based cutlery washing machine with a commodity web camera. We then applied computer vision (using both traditional image processing and deep learning techniques) to infer the speed and utilization of the machine. We detail the algorithms that we designed for computing both speed and utilization. Given the existing operational constraints of our client, frequent re-training of the deep learning model for object detection is not feasible. Thus, we compared the generalizability of the two techniques across 'unseen' cutlery and found traditional image processing to be generalizable across 'unseen' images. Our proposed final solution uses traditional image processing for computation of utilization but a hybrid of traditional image processing and a deep learning model for speed computation, as it is more reliable. Our client has implemented our proposed solution for one conveyor belt-based cutlery washing machine and plans to scale it to multiple conveyor belt-based cutlery washing machines.

Keywords: Industry 4.0; Computer Vision; Deep Learning; Image processing

Image Recognition and Position Technology Based on Super-pixel Fuzzy C-Means Clustering in Industrial Assembly System

3rd International Conference on Image Processing and Intelligent Control, IPIC 2023

Authors: Yuan, Hailiang; Sun, Weitao; Wang, Hailing. Affiliations: Tianjin Vocational College of Mechanics and Electricity, Tianjin, China; Northwestern Polytechnical University, Xi’an, China

ISBN: (print) 9781510668140

Improved fuzzy c-means (FCM) clustering algorithms have been widely used for image recognition and localization. However, in industrial assembly systems, the unsatisfactory pixel merging and segmentation results between local adjacent windows, combined with the differences in the shape, size, and material of parts, as well as variations in lighting conditions, make target image recognition and localization a challenge. Most algorithms struggle to achieve the expected results and have high computational complexity. In this study, we propose a super-resolution-based FCM clustering algorithm that is faster and more accurate for image recognition and localization in industrial assembly systems with irregular part sizes. We first use multiscale morphological gradient operations to obtain high-resolution images. Then, we use the fast FCM clustering algorithm to achieve the recognition and extraction of specific target images. Finally, we use the Sobel operator to determine the target's position. The experimental results demonstrate that the proposed algorithm shows higher accuracy and efficiency in image recognition and localization for industrial assembly systems. © 2023 SPIE.

Keywords: image recognition
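
The abstract's last step uses the Sobel operator to determine the target's position. The following is only a rough sketch of how that could look, assuming the clustering stage has already produced a grayscale image in which the target stands out from the background. The function names, the threshold and the toy image are hypothetical and not taken from the paper.

```python
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(img):
    """Gradient magnitude of a grayscale image (list of rows); border pixels stay 0."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    v = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * v
                    gy += SOBEL_Y[dy][dx] * v
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def edge_bounding_box(mag, threshold):
    """Return (x_min, y_min, x_max, y_max) of pixels above threshold, or None if there are none."""
    pts = [(x, y) for y, row in enumerate(mag) for x, v in enumerate(row) if v > threshold]
    if not pts:
        return None
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy 8x8 image: dark background with a bright 2x2 "part" at rows 3-4, columns 3-4.
image = [[0] * 8 for _ in range(8)]
for y in (3, 4):
    for x in (3, 4):
        image[y][x] = 255

box = edge_bounding_box(sobel_magnitude(image), threshold=100)
center = ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)
```

Because the 3x3 kernel responds one pixel beyond the object, the box of edge responses is one pixel wider than the part on each side, while its center still coincides with the center of the part.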
