At present, and increasingly so in the future, much of the captured visual content will not be seen by humans. Instead, it will be used for automated machine-vision analytics and may require only occasional human viewing. Examples of such applications include traffic monitoring, visual surveillance, autonomous navigation, and industrial machine vision. To address such requirements, we develop an end-to-end learned image codec whose latent space is designed to support scalability from simpler to more complicated tasks. The simplest task is assigned to a subset of the latent space (the base layer), while more complicated tasks make use of additional subsets of the latent space, i.e., both the base and enhancement layer(s). For the experiments, we establish a 2-layer and a 3-layer model, each of which offers input reconstruction for human vision plus machine-vision task(s), and compare them with relevant benchmarks. The experiments show that our scalable codecs offer 37%-80% bitrate savings on machine-vision tasks compared to the best alternatives, while being comparable to state-of-the-art image codecs in terms of input reconstruction.
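The layered latent split the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's codec: the channel counts, the 2-layer split point, and the tensor shapes are all assumptions chosen for clarity.

```python
import numpy as np

# Stand-in for an encoder's latent tensor: (channels, height, width).
# 192 channels and a 64-channel base layer are illustrative assumptions.
latent = np.random.randn(192, 16, 16)

BASE_CHANNELS = 64  # base layer: serves the simplest machine-vision task


def base_layer(z):
    """Subset of the latent space decoded for the machine-vision task."""
    return z[:BASE_CHANNELS]


def full_latent(z):
    """Base plus enhancement layer(s), used for input reconstruction."""
    return z


machine_input = base_layer(latent)   # fewer channels -> fewer bits
human_input = full_latent(latent)    # all channels -> full reconstruction
```

The bitrate savings come from transmitting only the base-layer subset when no human viewing is needed.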
Signal capture is at the forefront of perceiving and understanding the environment; thus, imaging plays a pivotal role in mobile vision. Recent unprecedented progress in artificial intelligence (AI) has shown great potential in the development of advanced mobile platforms with new imaging devices. Traditional imaging systems based on the "capturing images first and processing afterward" mechanism cannot meet this explosive demand. On the other hand, computational imaging (CI) systems are designed to capture high-dimensional data in an encoded manner to provide more information for mobile vision systems. Thanks to AI, CI can now be used in real-life systems by integrating deep learning algorithms into the mobile vision platform to achieve a closed loop of intelligent acquisition, processing, and decision-making, thus leading to the next revolution of mobile vision. Starting from the history of mobile vision using digital cameras, this work first introduces the advancement of CI in diverse applications and then conducts a comprehensive review of current research topics combining CI and AI. Although new-generation mobile platforms, represented by smart mobile phones, have deeply integrated CI and AI for better image acquisition and processing, most mobile vision platforms, such as self-driving cars and drones, only loosely connect CI and AI, and are calling for a closer integration. Motivated by this fact, at the end of this work, we propose some potential technologies and disciplines that aid the deep integration of CI and AI and shed light on new directions in the future generation of mobile vision platforms.
Background: Fermented foods are products processed through microbial fermentation and are widely appreciated by consumers around the world for their unique flavors. With advancements in industrial technology and increasing consumer demand, modern techniques are being progressively integrated into the production and quality control of fermented foods to enhance production efficiency and product quality. Among these innovations, computer vision technology stands out as particularly impactful. Scope and approach: This paper provides an overview of the applications of computer vision in the field of fermented foods, focusing on its technical algorithms and applications within the food industry. It outlines the specific uses of computer vision technology across different types of fermented foods and discusses the relevant techniques employed. Finally, this review highlights the transformative potential of adaptive learning and multimodal fusion in addressing current limitations of computer vision for fermented food monitoring. Key findings and conclusions: The adoption of computer vision technology has significantly improved both the efficiency and accuracy of quality control processes in fermented food production. Through non-contact real-time monitoring, researchers can quickly identify the dynamic changes in microorganisms and related parameter indicators during fermentation and evaluate their impact on food quality. These technologies have not only boosted the efficiency of fermented food production but have also enhanced control over product flavor and safety assessments. Despite ongoing challenges in technology implementation and data analysis, the continuous advancements in deep learning and image-processing technologies are expected to increase the impact of computer vision in the field of fermented foods, driving sustainable industry development.
Facial recognition is a widely used process that aims to detect and verify an individual's identity. This technique is employed in various applications, such as image and video analysis, surveillance, and security...
The use of machine vision and deep learning for intelligent industrial inspection has become increasingly important in automating production processes. Although machine-vision approaches are used for industrial inspection, deep learning-based defect segmentation has not been widely studied. While state-of-the-art segmentation methods are often tuned for a specific purpose, extending them to unknown sets or other datasets, such as defect segmentation datasets, requires further analysis. In addition, recent contributions and improvements in image segmentation methods have not been extensively investigated for defect segmentation. To address these problems, we conducted a comparative experimental study of several recent state-of-the-art deep learning-based segmentation methods for steel surface defect segmentation and evaluated them on the basis of segmentation performance, processing time, and computational complexity using two public datasets, NEU-Seg and Severstal Steel Defect Detection (SSDD). In addition, we proposed and trained a hybrid transformer-based encoder with a CNN-based decoder head and achieved state-of-the-art results: a Dice score of 95.22% (NEU-Seg) and 95.55% (SSDD).
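The Dice score used to rank the segmentation methods above is a standard overlap metric between a predicted mask and the ground truth. A minimal sketch (binary masks; the smoothing term `eps` is a common convention, not a detail taken from this study):

```python
import numpy as np


def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary masks:
    2 * |pred & target| / (|pred| + |target|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)


pred = np.array([[1, 1, 0], [0, 1, 0]])  # toy predicted defect mask
gt = np.array([[1, 0, 0], [0, 1, 1]])    # toy ground-truth mask
print(round(dice_score(pred, gt), 3))    # 2*2/(3+3) -> 0.667
```

A Dice score of 95.22% therefore means near-complete overlap between predicted and annotated defect regions.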
ISBN:
(Print) 9783031434143; 9783031434150
The vision transformer is a model that breaks down each image into a sequence of tokens with a fixed length and processes them similarly to words in natural language processing. Although increasing the number of tokens typically results in better performance, it also leads to a considerable increase in computational cost. Motivated by the saying "A picture is worth a thousand words," we propose an innovative approach to accelerate the ViT model by shortening long images. Specifically, we introduce a method for adaptively assigning token length for each image at test time to accelerate inference speed. First, we train a Resizable-ViT (ReViT) model capable of processing input with diverse token lengths. Next, we extract token-length labels from ReViT that indicate the minimum number of tokens required to achieve accurate predictions. We then use these labels to train a lightweight Token-Length Assigner (TLA) that allocates the optimal token length for each image during inference. The TLA enables ReViT to process images with the minimum sufficient number of tokens, reducing token numbers in the ViT model and improving inference speed. Our approach is general and compatible with modern vision transformer architectures, significantly reducing computational costs. We verified the effectiveness of our methods on multiple representative ViT models on image classification and action recognition.
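The inference loop the abstract describes can be sketched as below. Both functions are stand-in stubs, not the trained ReViT or TLA: the supported token lengths and the complexity thresholds are illustrative assumptions.

```python
SUPPORTED_LENGTHS = [49, 98, 196]  # assumed token-length options of ReViT


def token_length_assigner(difficulty):
    """Stub TLA: map a per-image 'difficulty' score to the smallest
    token length expected to still yield a correct prediction."""
    if difficulty < 0.3:
        return SUPPORTED_LENGTHS[0]
    if difficulty < 0.7:
        return SUPPORTED_LENGTHS[1]
    return SUPPORTED_LENGTHS[2]


def revit_forward(image, num_tokens):
    """Stub for a Resizable-ViT forward pass at a given token count."""
    return f"prediction@{num_tokens} tokens"


# Easy images run at 49 tokens; only hard ones pay for 196.
for difficulty in (0.1, 0.5, 0.9):
    n = token_length_assigner(difficulty)
    print(revit_forward("img", n))
```

The speedup comes from most images needing far fewer tokens than the fixed-length worst case.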
Automated authoring enables simplified deployment of applications and services for complex use cases, especially in the field of machine learning. This paper presents the development and implementation of a specialized authoring tool that can be used for computer vision applications, enabling automated creation of machine learning services. The proposed authoring tool realizes a microservices architecture to facilitate the conversion and deployment of machine learning inference services, especially in image classification and object detection use cases. The authoring process addresses the interoperability issues commonly faced in machine learning frameworks, leveraging the Open Neural Network Exchange (ONNX) for model conversion into a standardized format. By encapsulating machine learning tools in containerized applications, this authoring tool offers a modular solution that can be easily adapted to various industrial applications. The developed authoring tool integrates the common machine learning frameworks PyTorch and TensorFlow, coupled with DevOps methodologies such as CI/CD, ensuring a robust, maintainable, and user-friendly system that meets the growing needs of machine learning use cases in manufacturing.
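The authoring flow (framework model in, ONNX model out, packaged service out) can be sketched as below. Both functions are illustrative stubs, not the tool's actual API: in practice the conversion step would call a real exporter such as `torch.onnx.export` or `tf2onnx`, and the packaging step would build a container image.

```python
def convert_to_onnx(model_path, framework):
    """Stand-in for ONNX export (e.g. via torch.onnx.export / tf2onnx).
    Raises on frameworks the tool does not integrate."""
    if framework not in ("pytorch", "tensorflow"):
        raise ValueError(f"unsupported framework: {framework}")
    return model_path.rsplit(".", 1)[0] + ".onnx"


def build_service(onnx_path, task):
    """Stand-in for packaging the ONNX model as a containerized
    inference microservice."""
    return {"model": onnx_path, "task": task, "runtime": "onnxruntime"}


svc = build_service(convert_to_onnx("classifier.pt", "pytorch"),
                    task="image-classification")
print(svc["model"])  # classifier.onnx
```

Standardizing on ONNX is what lets one deployment path serve both PyTorch- and TensorFlow-trained models.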
ISBN:
(Print) 9781510666931; 9781510666948
Remote sensing scene classification has been extensively studied for its critical roles in geological survey, oil exploration, traffic management, earthquake prediction, wildfire monitoring, and intelligence monitoring. In the past, machine learning (ML) methods for this task mainly used backbones pretrained in a supervised learning (SL) manner. As Masked Image Modeling (MIM), a self-supervised learning (SSL) technique, has been shown to be a better way to learn visual feature representations, it presents a new opportunity for improving ML performance on the scene classification task. This research explores the potential of MIM-pretrained backbones on four well-known classification datasets: Merced, AID, NWPU-RESISC45, and Optimal-31. Compared to published benchmarks, we show that MIM-pretrained Vision Transformer (ViT) backbones outperform other alternatives (by up to 18% in top-1 accuracy) and that the MIM technique learns better feature representations than its supervised learning counterparts (by up to 5% in top-1 accuracy). Moreover, we show that general-purpose MIM-pretrained ViTs can achieve performance competitive with the specially designed yet complicated Transformer for Remote Sensing (TRS) framework. Our experimental results also provide a performance baseline for future studies.
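The patch masking at the heart of MIM pretraining can be sketched as below: hide a random subset of image patches and train the model to reconstruct them from the visible ones. The 14x14 patch grid and 75% mask ratio are common MAE-style defaults, not settings taken from this study.

```python
import numpy as np

rng = np.random.default_rng(0)
num_patches, mask_ratio = 196, 0.75  # 14x14 patch grid, assumed ratio

# Randomly split patch indices into masked (to reconstruct) and visible
# (fed to the encoder) sets.
num_masked = int(num_patches * mask_ratio)
perm = rng.permutation(num_patches)
masked_idx, visible_idx = perm[:num_masked], perm[num_masked:]

mask = np.zeros(num_patches, dtype=bool)
mask[masked_idx] = True
print(mask.sum(), (~mask).sum())  # 147 masked, 49 visible
```

Reconstructing 147 hidden patches from only 49 visible ones is the pretext task that forces the backbone to learn transferable visual features.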
ISBN:
(Print) 9784885523434
This paper proposes image style transfer with shape preservation for gaze estimation. While several shape-preservation constraints have been proposed, we present additional constraints using (i) dense pixelwise correspondences between the original image and its style-transferred counterpart and (ii) task-driven learning that uses the gaze estimation error to directly improve gaze direction estimation. A variety of experiments against other SOTA methods on publicly available datasets, together with ablation studies, validate the effectiveness of our method.
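A shape-preservation constraint of the kind described can be sketched as a per-pixel penalty between the original image and its style-transferred version. This is a simplified proxy, not the paper's constraint: the paper uses dense pixelwise correspondences, while the stand-in below just compares intensities at the same locations.

```python
import numpy as np


def shape_preservation_loss(original, transferred):
    """Mean absolute per-pixel difference as a stand-in
    shape-preservation penalty (lower = shape better preserved)."""
    return np.abs(original - transferred).mean()


orig = np.array([[0.0, 1.0], [1.0, 0.0]])  # toy 2x2 grayscale image
styl = np.array([[0.1, 0.9], [1.0, 0.2]])  # its style-transferred version
print(round(shape_preservation_loss(orig, styl), 3))  # 0.1
```

In training, such a term is added to the style-transfer objective so that stylization cannot move eye contours, which would otherwise corrupt gaze labels.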
In recent years, the rapid development of computer vision and artificial intelligence has significantly advanced agricultural applications, particularly in the quality detection and grading of navel oranges. This revi...