Search results - Inner Mongolia University Library

2023 International Conference on Intelligent Systems for Communication, IoT and Security, ICISCoIS 2023

Authors: Sakthimohan, M.; Elizabeth Rani, G.; Navaneethakrishnan, M.; Janani, K.; Nithva, V.; Pranav, R. Affiliations: KGiSL Institute of Technology, Department of ECE, Tamilnadu, Coimbatore, India; KGiSL Institute of Technology, Department of CSBS, Tamilnadu, Coimbatore, India; St. Joseph College of Engineering, Dept of CSE, Tamilnadu, Sriperumbudur, India; KGiSL Institute of Technology, Department of CSE, Tamilnadu, Coimbatore, India

ISBN: (print) 9798350335835

Face detection in digital photos is a critical step in the face recognition process. It is used in biometric recognition systems, search systems, and security systems. Computer vision combines artificial intelligence and machine learning; using computational techniques, it can extract information from images and videos. Many prior studies used various methods and programming languages to create face detection applications. The most crucial aspect of computer vision is object detection, and locating the face in the input image is the primary step. For the Java programming language, the Open Source Computer Vision Library (OpenCV) is a free, open-source library for object detection. The Haar cascade classifier is one of the object detection techniques. By counting the number of pictures in a square form on an image, this technique can easily convert an object. This paper discusses face detection in digital photos using the Haar cascade classifier, and the transformation of images into grayscale using the OpenCV library. The authors report that this methodology improves the accuracy of the results on input photos. © 2023 IEEE.

Keywords: Face recognition
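
The abstract above names the Haar cascade classifier without describing how its features are evaluated. In the standard Viola-Jones formulation, Haar-like features are differences of rectangle sums, computed in constant time from an integral image (summed-area table). The following is a minimal plain-Python sketch of that idea only; the function names are hypothetical, the input is assumed to be a grayscale image stored as a list of lists, and it is neither the OpenCV API nor the paper's implementation.

```python
def integral_image(gray):
    """Summed-area table with an extra zero row and column."""
    h, w = len(gray), len(gray[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Left-half sum minus right-half sum (w is assumed to be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy 4x4 grayscale image: dark left half, bright right half.
gray = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
ii = integral_image(gray)
feature = two_rect_feature(ii, 0, 0, 4, 4)
```

A strongly negative or positive feature value indicates a vertical intensity edge inside the window; a cascade combines many such features with learned thresholds.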

2023 Asia-Pacific Conference on Image Processing, Electronics and Computers, IPEC 2023

The proceedings contain 52 papers. The topics discussed include: improvement of remote sensing image target detection algorithm based on YOLO v5; a study of Chan-Vese model with the introduction of edge information; real-time monitoring algorithm of muscle state based on sEMG signal; lane detection network with direction context; anomaly pixel detection via dual-branch uncertainty metrics; high precision license plate recognition algorithm in open scene; implementation and design of metro process quality inspection system based on image processing technology; the research on remote sensing image change detection based on deep learning; research on aircraft wheel hub pose detection method based on machine vision; lunar dome detection method based on few-shot object detection; and image enhancement algorithm of foggy sky with sky based on sky segmentation.

VPL: Visual Proxy Learning Framework for Zero-Shot Medical Image Diagnosis

2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024

Authors: Liu, Jiaxiang; Hu, Tianxiang; Xiong, Huimin; Du, Jiawei; Feng, Yang; Wu, Jian; Zhou, Joey Tianyi; Liu, Zuozhu. Affiliations: ZJU-Angelalign R&D Center for Intelligence Healthcare, Zhejiang University, China; Angelalign Research Institute, Angelalign Technology Inc., China; Singapore Singapore

ISBN: (print) 9798891761681

Vision-language models like CLIP, utilizing class proxies derived from class name text features, have shown a notable capability in zero-shot medical image diagnosis, which is vital in scenarios with limited disease databases or labeled ***, insufficient medical text precision and the modal disparity between text and vision spaces pose challenges for such *** show analytically and experimentally that enriching medical texts with detailed descriptions can markedly enhance the diagnosis performance, with the granularity and phrasing of these enhancements having a crucial impact on CLIP's understanding of medical images; and learning proxies within the vision domain can effectively circumvent the modal gap *** on our analysis, we propose a medical visual proxy learning framework comprising two key components: a text refinement module that creates high-quality medical text descriptions, and a stable Sinkhorn algorithm for an efficient generation of pseudo labels which further guide the visual proxy *** method elevates the vanilla CLIP inference by supplying meticulously crafted clues to leverage CLIP's existing interpretive power and using the feature of refined texts to bridge the vision-text *** effectiveness and robustness of our method are clearly demonstrated through extensive ***, our method outperforms the state-of-the-art zero-shot medical image diagnosis by a significant margin, ranging from 1.69% to 15.31% on five datasets covering various diseases, confirming its immense potential in zero-shot diagnosis across diverse medical applications. © 2024 Association for Computational Linguistics.

Keywords: Zero-shot learning
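
The abstract above mentions a Sinkhorn algorithm for generating pseudo labels but gives no details. As a rough illustration, the sketch below runs the textbook Sinkhorn-Knopp iteration, which alternately rescales the columns and rows of an exponentiated score matrix so that every image distributes a total mass of 1 and every class receives an equal share. The balanced-class marginal, the `epsilon` temperature, the toy scores, and all function names are assumptions made for this example; the paper's "stable" variant is not reproduced here.

```python
import math


def sinkhorn_pseudo_labels(scores, epsilon=0.1, n_iters=50):
    """scores[i][k]: similarity of image i to class k (hypothetical input).

    Returns soft assignments whose rows sum to 1 and whose columns each
    sum to roughly n_images / n_classes (a balanced-class assumption).
    """
    n, k = len(scores), len(scores[0])
    # Subtract the global maximum before exponentiating to avoid overflow.
    m = max(max(row) for row in scores)
    q = [[math.exp((s - m) / epsilon) for s in row] for row in scores]
    for _ in range(n_iters):
        # Column step: each class receives total mass n / k.
        for j in range(k):
            col = sum(q[i][j] for i in range(n))
            for i in range(n):
                q[i][j] *= (n / k) / col
        # Row step: each image distributes total mass 1.
        for i in range(n):
            row = sum(q[i])
            for j in range(k):
                q[i][j] /= row
    return q


# Toy scores in which every image slightly prefers class 0.
scores = [
    [0.9, 0.8],
    [0.8, 0.7],
    [0.7, 0.2],
    [0.6, 0.1],
]
soft = sinkhorn_pseudo_labels(scores)
labels = [max(range(len(row)), key=row.__getitem__) for row in soft]
```

A plain per-row argmax would put all four toy images in class 0; the balancing constraint instead moves the two images with the weakest preference for class 0 into class 1.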

Rate-Distortion in Image Coding for Machines

Picture Coding Symposium (PCS)

Authors: Harell, Alon; De Andrade, Anderson; Bajic, Ivan V. Affiliations: Simon Fraser Univ, Sch Engn Sci, Burnaby, BC, Canada

ISBN: (print) 9781665492577

In recent years, there has been a sharp increase in transmission of images to remote servers specifically for the purpose of computer vision. In many applications, such as surveillance, images are mostly transmitted for automated analysis, and rarely seen by humans. Using traditional compression for this scenario has been shown to be inefficient in terms of bit-rate, likely due to the focus on human based distortion metrics. Thus, it is important to create specific image coding methods for joint use by humans and machines. One way to create the machine side of such a codec is to perform feature matching of some intermediate layer in a Deep Neural Network performing the machine task. In this work, we explore the effects of the layer choice used in training a learnable codec for humans and machines. We prove, using the data processing inequality, that matching features from deeper layers is preferable in the sense of rate-distortion. Next, we confirm our findings empirically by re-training an existing model for scalable human-machine coding. In our experiments we show the trade-off between the human and machine sides of such a scalable model, and discuss the benefit of using deeper layers for training in that regard.

Keywords: Image coding; Deep neural networks; Collaborative intelligence; Object detection
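
The abstract above invokes the data processing inequality without stating it. For reference, its standard form is given below, under the assumption that the input image $X$, a shallower-layer feature $Y_1$, and a deeper-layer feature $Y_2$ computed from $Y_1$ form a Markov chain. The symbols are chosen for this note, and this is not the paper's full theorem.

```latex
X \to Y_1 \to Y_2
\quad\Longrightarrow\quad
I(X; Y_2) \le I(X; Y_1)
```

In words, a deeper feature carries no more information about the input than the shallower feature it is computed from. How this translates into a rate-distortion advantage is the paper's own argument and is not reproduced here.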

Sub-Pixel counting based diameter measurement algorithm for industrial machine vision

MEASUREMENT, 2024, Vol. 225

Authors: Poyraz, Ahmet Gokhan; Kacmaz, Mehmet; Gurkan, Hakan; Dirik, Ahmet Emir. Affiliations: Bursa Tech Univ, Dept Elect & Elect Engn, TR-16310 Bursa, Turkiye; Dogu Pres R&D, TR-1610 Bursa, Turkiye; Bursa Uludag Univ, Dept Comp Engn, TR-16120 Bursa, Turkiye

In recent years, there has been a notable surge in the utilization of industrial image processing applications across various sectors, including automotive, medical, and space industries. These applications rely on specialized camera systems and advanced image processing techniques to accurately measure working products with precise tolerances. This research presents a novel fast algorithm for measuring the diameter of a ring, employing a subpixel counting method. The algorithm classifies image pixels into two categories: full pixels and transition pixels. Full pixels reside entirely within the inner region of the workpiece, while transition pixels represent gray pixels that reside at the boundary between the workpiece and its background. To ensure accurate determination of the object area, the proposed method incorporates normalization to account for the contribution of transition pixels alongside full pixels. Subsequently, the circle area equation is employed to calculate the diameter. Moreover, a robust threshold selection method is introduced to effectively distinguish pixels with gray intensities. The experimental setup consists of an industrial camera equipped with telecentric lenses and appropriate illumination. The results demonstrate that the proposed algorithm achieves a 3-10% improvement in accuracy compared to existing approaches. In terms of measuring sensitivity, the operational sensitivity of the proposed methodology is quantified as 1/20th of the pixel size, exhibiting an average uncertainty of 1 μm. Furthermore, the proposed method surpasses existing works by at least 12.5% to 35% in terms of benchmarking computing time.

Keywords: Subpixel; Diameter measurement; Image processing; Industrial machine vision; Radius; O-ring
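
The abstract above describes counting full pixels, adding a normalized contribution from gray transition pixels, and recovering the diameter from the circle area equation. The sketch below illustrates that general idea on a synthetic disk. It assumes known background and object gray levels and a simple linear normalization; the paper's threshold selection method and ring-specific handling are not reproduced, and all names are hypothetical.

```python
import math


def subpixel_diameter(image, bg_level, obj_level, pixel_size=1.0):
    """Estimate a disk diameter from fractional pixel coverage."""
    span = obj_level - bg_level
    area = 0.0
    for row in image:
        for value in row:
            # Full pixels contribute 1; transition pixels contribute a fraction.
            coverage = (value - bg_level) / span
            area += min(1.0, max(0.0, coverage))
    # Circle area equation: A = pi * (d / 2) ** 2, solved for d.
    return 2.0 * math.sqrt(area / math.pi) * pixel_size


def render_disk(size, cx, cy, radius, bg_level=0.0, obj_level=255.0, ss=8):
    """Synthetic test image: gray level proportional to the covered fraction,
    estimated with ss x ss supersampling inside each pixel."""
    image = []
    for y in range(size):
        row = []
        for x in range(size):
            inside = 0
            for sy in range(ss):
                for sx in range(ss):
                    px = x + (sx + 0.5) / ss
                    py = y + (sy + 0.5) / ss
                    if (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2:
                        inside += 1
            row.append(bg_level + (obj_level - bg_level) * inside / ss ** 2)
        image.append(row)
    return image


image = render_disk(size=40, cx=20.3, cy=19.7, radius=12.5)
diameter = subpixel_diameter(image, bg_level=0.0, obj_level=255.0)
```

On this synthetic input the estimate should land close to the true diameter of 25 pixels even though the disk center is not aligned to the pixel grid.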

Spatially Aware Style Transfer

Computational Mathematics and Modeling, 2023, Vol. 34, No. 2, pp. 144-156

Authors: Ustyuzhanin, A.O.; Kitov, V.V. Affiliations: LLC Yandex.Technologies, Moscow, Russian Federation; Laboratory of Artificial Intelligence, Plekhanov Russian University of Economics, Moscow, Russian Federation; Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University, Moscow, Russian Federation

The task of image style transfer is to automatically redraw (using neural networks) an image with some content (for example, a family photo) in the style set by another image (for example, a van Gogh painting), which finds applications in advertising, design, entertainment, and other fields. Common stylization algorithms extract and apply the style evenly, which limits the expressiveness of the stylized result and does not correspond to the real work of artists who use the style differently for different objects in the image, such as a portrait of a person and background objects. The paper proposes an improved style transfer algorithm due to non-uniform styling: style and content images are divided into regions, then each content region is styled with a style from the most appropriate areas of the style image. Improvements of the proposed method compared to existing analogues are shown at a qualitative level, as well as by polling respondents who were required to choose the best stylization among stylizations using different methods in random order. © Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Computer vision; Image processing; Neural network; Style transfer

End-to-end processing of passive thermography sequences from outdoor concrete infrastructure inspection

Thermosense: Thermal Infrared Applications XLV 2023

Authors: Pozzer, Sandra; Ebrahimi, Samira; Refai, Ahmed El; López, Fernando; Maldague, Xavier. Affiliations: Department of Electrical and Computer Engineering, Laval University, 1065 Av. de la Médecine, Quebec City, QC G1V 0A6, Canada; Department of Civil Engineering, Laval University, 1065 Av. de la Médecine, Quebec City, QC G1V 0A6, Canada; TORNGATS, 5635 Rue Rideau, Quebec City, QC G2E 5V9, Canada

ISBN: (print) 9781510661868

This study developed an end-to-end procedure to overcome common issues faced during the analysis of passive infrared thermography (IRT) image sequences from outdoor concrete infrastructure. The processing pipeline includes the automatic pre-processing of raw thermograms, data cleaning and organization, image adjustment, and sequential image registration. One image registration method was implemented, and the results were evaluated using the Euclidean distance metric. Furthermore, the resulting sequences were processed using state-of-the-art signal processing techniques to increase the signature contrast of subsurface defects. The results from outdoor infrared thermography surveys over two academic samples are presented, where one image per minute was taken for 24 hours on structures representative of slabs and columns. By addressing the difficulties encountered during the analysis of passive IRT sequences, our contribution can broaden the spectrum of the application of IRT as a nondestructive testing (NDT) method for the condition assessment of concrete infrastructure. © 2023 SPIE.

Keywords: Thermography (imaging)

Exploring Stochastic Autoregressive Image Modeling for Visual Representation

37th AAAI Conference on Artificial Intelligence (AAAI) / 35th Conference on Innovative Applications of Artificial Intelligence / 13th Symposium on Educational Advances in Artificial Intelligence

Authors: Qi, Yu; Yang, Fan; Zhu, Yousong; Liu, Yufei; Wu, Liwei; Zhao, Rui; Li, Wei. Affiliations: Tsinghua Univ, Beijing, Peoples R China; SenseTime Res, Hong Kong, Peoples R China; Chinese Acad Sci, Inst Automat, Beijing, Peoples R China; Shanghai Jiao Tong Univ, Qing Yuan Res Inst, Shanghai, Peoples R China

ISBN: (print) 9781577358800

Autoregressive language modeling (ALM) has been successfully used in self-supervised pre-training in natural language processing (NLP). However, this paradigm has not achieved comparable results with other self-supervised approaches in computer vision (e.g., contrastive learning, masked image modeling). In this paper, we try to find the reason why autoregressive modeling does not work well on vision tasks. To tackle this problem, we fully analyze the limitations of visual autoregressive methods and propose a novel stochastic autoregressive image modeling method (named SAIM) with two simple designs. First, we serialize the image into patches. Second, we employ a stochastic permutation strategy to generate an effective and robust image context, which is critical for vision tasks. To realize this task, we create a parallel encoder-decoder training process in which the encoder serves a similar role to the standard vision transformer, focusing on learning the whole contextual information, while the decoder predicts the content of the current position, so that the encoder and decoder can reinforce each other. Our method significantly improves the performance of autoregressive image modeling and achieves the best accuracy (83.9%) on the vanilla ViT-Base model among methods using only ImageNet-1K data. Transfer performance in downstream tasks also shows that our model achieves competitive performance. Code is available at https://***/qiy20/SAIM.

Keywords: Stochastic systems
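
The two designs named in the abstract above, serializing the image into patches and sampling a stochastic permutation that defines each patch's context, can be illustrated without any deep learning framework. The sketch below only builds the patch sequence, a random prediction order, and the resulting visibility mask. The names are hypothetical, image dimensions are assumed to be divisible by the patch size, and the encoder-decoder model itself is not shown.

```python
import random


def patchify(image, patch):
    """Split a 2D list into non-overlapping patch x patch blocks, raster order."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            patches.append([row[left:left + patch] for row in image[top:top + patch]])
    return patches


def stochastic_order(n_patches, rng):
    """Sample a random order in which patches are predicted."""
    order = list(range(n_patches))
    rng.shuffle(order)
    return order


def visibility_mask(order):
    """mask[i][j] is True when patch j precedes patch i in the sampled order,
    so patch j may serve as context when predicting patch i."""
    n = len(order)
    rank = [0] * n
    for step, idx in enumerate(order):
        rank[idx] = step
    return [[rank[j] < rank[i] for j in range(n)] for i in range(n)]


# Toy 4x4 image split into four 2x2 patches.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = patchify(image, patch=2)
order = stochastic_order(len(patches), random.Random(0))
mask = visibility_mask(order)
```

Sampling a fresh order for every training example exposes the model to many different contexts for the same patch, in contrast to a fixed raster order.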

EGF: An Improved Edge Detection Model for Low-Resolution Images

2nd IEEE International Conference on Futuristic Technologies, INCOFT 2023

Authors: Deepak Raj, D.M.; Shanmuganathan, Harinee; Geetha, A.; Keerthika, V. Affiliations: Alliance University, Department of Computer Science and Engineering, Banglore, India

ISBN: (print) 9798350308846

Edge detection can benefit many different industries and domains, including computer vision, machine learning, image analysis, remote sensing, thermal imaging, pattern recognition, and medical imaging. The technique of determining the borders between several objects or regions in an image is known as edge detection. The edges of an object in a picture serve as the object's limits and can reveal crucial details about the object's size, shape, and position. Since low-resolution images have low pixel densities or pixel values, which muddy the images, detecting edges in them is demanding work. This paper proposes a novel edge-detection approach called EGF (Extended Gaussian Filter) for low-resolution images. EGF utilizes the basic concept of Gaussian filter to find the edges of images. The objective function of EGF is developed to reduce the noise and pixel differentiation in images. The outcomes show that the suggested strategy outperforms the conventional edge detection technique. © 2023 IEEE.

Keywords: edge detection; Gaussian filter; image processing; machine learning; pre-processing

Ferroelectric photosensor network: an advanced hardware solution to real-time machine vision

NATURE COMMUNICATIONS, 2022, Vol. 13, No. 1, p. 1707

Authors: Cui, Boyuan; Fan, Zhen; Li, Wenjie; Chen, Yihong; Dong, Shuai; Tan, Zhengwei; Cheng, Shengliang; Tian, Bobo; Tao, Ruiqiang; Tian, Guo; Chen, Deyang; Hou, Zhipeng; Qin, Minghui; Zeng, Min; Lu, Xubing; Zhou, Guofu; Gao, Xingsen; Liu, Jun-Ming. Affiliations: South China Normal Univ, South China Acad Adv Optoelect, Inst Adv Mat, Guangzhou 510006, Peoples R China; South China Normal Univ, South China Acad Adv Optoelect, Guangdong Prov Key Lab Opt Informat Mat & Technol, Guangzhou 510006, Peoples R China; East China Normal Univ, Key Lab Polar Mat & Devices, Minist Educ, Shanghai 200241, Peoples R China; South China Normal Univ, Natl Ctr Int Res Green Optoelect, Guangzhou 510006, Peoples R China; Nanjing Univ, Lab Solid State Microstruct, Nanjing 210093, Peoples R China; Nanjing Univ, Innovat Ctr Adv Microstruct, Nanjing 210093, Peoples R China

Robust, fast, and low-power hardware platforms are desirable for the implementation of real-time machine vision. Here the authors develop a computing-in-sensor network using ferroelectric photo sensors with remanent-polarization-controlled photo responsivities. Nowadays the development of machine vision is oriented toward real-time applications such as autonomous driving. This demands a hardware solution with low latency, high energy efficiency, and good reliability. Here, we demonstrate a robust and self-powered in-sensor computing paradigm with a ferroelectric photosensor network (FE-PS-NET). The FE-PS-NET, constituted by ferroelectric photosensors (FE-PSs) with tunable photoresponsivities, is capable of simultaneously capturing and processing images. In each FE-PS, self-powered photovoltaic responses, modulated by remanent polarization of an epitaxial ferroelectric Pb(Zr0.2Ti0.8)O3 layer, show not only multiple nonvolatile levels but also sign reversibility, enabling the representation of a signed weight in a single device and hence reducing the hardware overhead for network construction. With multiple FE-PSs wired together, the FE-PS-NET acts on its own as an artificial neural network. In situ multiply-accumulate operation between an input image and a stored photoresponsivity matrix is demonstrated in the FE-PS-NET. Moreover, the FE-PS-NET is faultlessly competent for real-time image processing functionalities, including binary classification between 'X' and 'T' patterns with 100% accuracy and edge detection for an arrow sign with an F-Measure of 1 (under 365 nm ultraviolet light). This study highlights the great potential of ferroelectric photovoltaics as the hardware basis of real-time machine vision.

Keywords: Ferroelectrics and multiferroics; Information storage
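
The multiply-accumulate operation between an input image and a stored matrix of signed photoresponsivities, as described in the abstract above, can be mimicked in software as an elementwise product followed by a sum. The sketch below uses 3x3 binary 'X' and 'T' patterns and hand-picked signed weights. The pattern size, the weight values, and the sign-based decision rule are all illustrative assumptions and are not taken from the paper.

```python
# Hypothetical 3x3 binary light patterns (1 = illuminated pixel).
X_PATTERN = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 0, 1],
]
T_PATTERN = [
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]


def mac(image, responsivity):
    """Multiply-accumulate: sum of light intensity times signed responsivity,
    standing in for the summed photocurrent of the wired sensors."""
    return sum(
        p * r
        for image_row, weight_row in zip(image, responsivity)
        for p, r in zip(image_row, weight_row)
    )


# Illustrative signed weights: +1 where only 'X' is lit, -1 where only 'T' is lit.
WEIGHTS = [
    [x - t for x, t in zip(x_row, t_row)]
    for x_row, t_row in zip(X_PATTERN, T_PATTERN)
]


def classify(image):
    """Positive summed output is read as 'X', otherwise 'T'."""
    return "X" if mac(image, WEIGHTS) > 0 else "T"
```

Because each weight can be positive or negative, a single matrix suffices for the two-class decision, which mirrors the abstract's point that signed weights in one device reduce hardware overhead.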
