As a ubiquitous manipulation tool, optical tweezers are widely used in biochemistry and applied physics, enabling the investigation of a wide range of microscopic and nanoscopic particles. In recent years, digital image-processing techniques for improving target particle observation have diversified, leading to the development of numerous automatic tasks. These techniques were developed in response to the need for multi-particle manipulation and feature detection. Here we describe how digital image processing can be used to enhance the capabilities of optical manipulation. In particular, cutting-edge image-processing techniques that build on developments in artificial intelligence are making optical trapping more widely accessible and enabling automatic manipulation of microscopic and nanoscopic particles.
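The feature-detection step this abstract alludes to typically begins by locating trapped particles in each video frame. A minimal sketch, assuming a simple thresholding-plus-connected-components approach (real tweezer-tracking pipelines use more robust methods such as radial symmetry fitting or learned detectors):

```python
import numpy as np

def detect_particles(img, thresh):
    """Label bright 4-connected regions and return their centroids.

    Illustrative particle localization for a single video frame;
    the threshold and connectivity choices are assumptions, not
    taken from any specific tweezer system.
    """
    mask = img > thresh
    labels = np.zeros(img.shape, dtype=int)
    current = 0
    centroids = []
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            if mask[i, j] and labels[i, j] == 0:
                current += 1
                labels[i, j] = current
                stack, pix = [(i, j)], []
                while stack:  # flood fill one connected region
                    y, x = stack.pop()
                    pix.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < img.shape[0] and 0 <= nx < img.shape[1]
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = current
                            stack.append((ny, nx))
                centroids.append(np.array(pix, dtype=float).mean(axis=0))
    return centroids

# Synthetic frame with two bright "particles"
frame = np.zeros((32, 32))
frame[5:8, 5:8] = 1.0
frame[20:24, 18:22] = 1.0
cents = detect_particles(frame, 0.5)  # two (row, col) centroids
```

Centroids from repeated frames can then drive automatic repositioning of the traps toward each detected particle.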
Artificial vision systems will be essential in intelligent machine-vision applications such as autonomous vehicles, bionic eyes, and humanoid robot eyes. However, conventional digital electronics in these systems face limitations in system complexity, processing speed, and energy consumption. These challenges have been addressed by biomimetic approaches utilizing optoelectronic synapses inspired by the biological synapses in the eye. Nanomaterials can confine photogenerated charge carriers within nano-sized regions and thus offer significant potential for optoelectronic synapses to perform in-sensor image-processing tasks, such as classifying static multicolor images and detecting dynamic object movements. We introduce recent developments in optoelectronic synapses, focusing on the use of photosensitive nanomaterials. We also explore applications of these synapses in recognizing static and dynamic optical information. Finally, we suggest future directions for research on optoelectronic synapses to implement neuromorphic artificial vision.
Visual content is increasingly being processed by machines for various automated content analysis tasks instead of being consumed by humans. Despite the existence of several compression methods tailored for machine tasks, few consider real-world scenarios with multiple tasks. In this paper, we aim to address this gap by proposing a task-switchable pre-processor that optimizes input images specifically for machine consumption prior to encoding by an off-the-shelf codec designed for human consumption. The proposed task-switchable pre-processor adeptly maintains relevant semantic information based on the specific characteristics of different downstream tasks, while effectively suppressing irrelevant information to reduce bitrate. To enhance the processing of semantic information for diverse tasks, we leverage pre-extracted semantic features to modulate the pixel-to-pixel mapping within the pre-processor. By switching between different modulations, multiple tasks can be seamlessly incorporated into the system. Extensive experiments demonstrate the practicality and simplicity of our approach. It significantly reduces the number of parameters required for handling multiple tasks while still delivering impressive performance. Our method showcases the potential to achieve efficient and effective compression for machine-vision tasks, supporting the evolving demands of real-world applications.
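The "modulate the pixel-to-pixel mapping" idea can be pictured as feature-wise scaling and shifting selected per task. A minimal sketch, assuming a FiLM-style modulation (the paper's exact scheme may differ, and the parameter values below are purely illustrative):

```python
import numpy as np

def film_modulate(features, gamma, beta):
    """Feature-wise linear modulation: scale and shift each channel.

    Hypothetical stand-in for semantic-feature-driven modulation
    inside the pre-processor; gamma/beta would in practice be
    predicted from pre-extracted semantic features.
    """
    return gamma[None, None, :] * features + beta[None, None, :]

# One (gamma, beta) pair per downstream task, switched at inference.
task_params = {
    "detection":    (np.array([1.5, 0.5]), np.array([0.0, 0.1])),
    "segmentation": (np.array([0.8, 1.2]), np.array([0.2, 0.0])),
}

x = np.ones((4, 4, 2))            # H x W x C intermediate feature map
g, b = task_params["detection"]
y = film_modulate(x, g, b)        # task-conditioned features
```

Switching tasks then only swaps the small modulation parameters rather than the whole pre-processor, which is consistent with the parameter savings the abstract claims.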
In the machine-vision-based online monitoring of the flotation process, froth images acquired in real time are subject to color distortion and excessive bright spots caused by inconsistent illumination, which hinders the effectiveness of image analysis and further online measurement of operating performance indicators. Current image-processing methods struggle to correct color distortion and remove excess bright spots in froth images simultaneously. Therefore, in this article, an illumination domain signal-guided unsupervised generative adversarial network (IDS-GUGAN) is proposed for illumination consistency processing of flotation froth images. First, considering the varying effects of inconsistent illumination on froth images, the illumination domain signal-guided image generation (IDS-GIG) mechanism, based on the theory of unsupervised disentangled representation learning, is designed to achieve adaptive correction of froth images with varying degrees of distortion. Moreover, a novel lightweight double-closed-loop network architecture is introduced to support unsupervised learning from unpaired froth images and improve computational efficiency, which makes the proposed approach highly suitable for industrial applications. Comprehensive experiments on a real tungsten cleaner flotation process dataset and two public benchmark datasets related to image illumination processing tasks consistently endorse the superiority of IDS-GUGAN.
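The "double-closed-loop" training on unpaired images is in the family of cycle-consistency objectives: an image mapped to the consistent-illumination domain and back should recover itself. A toy sketch of that loss, with trivial stand-in generators (the paper's actual architecture is far richer):

```python
import numpy as np

def cycle_consistency_loss(x, G, F):
    """L1 cycle loss: x -> G(x) -> F(G(x)) should reconstruct x.

    G would map distorted froth images toward consistent
    illumination and F would map back; both are assumed toy
    functions here, not the paper's networks.
    """
    return np.abs(F(G(x)) - x).mean()

# Toy "generators": a brightness shift and its exact inverse.
G = lambda x: x + 0.3
F = lambda x: x - 0.3

x = np.random.rand(8, 8)           # stand-in froth image patch
loss = cycle_consistency_loss(x, G, F)   # near zero for inverse pair
```

Because the loop closes on the input itself, no paired (distorted, corrected) froth images are needed, matching the unsupervised setting described above.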
At present, and increasingly so in the future, much of the captured visual content will not be seen by humans. Instead, it will be used for automated machine-vision analytics and may require occasional human viewing. Examples of such applications include traffic monitoring, visual surveillance, autonomous navigation, and industrial machine vision. To address such requirements, we develop an end-to-end learned image codec whose latent space is designed to support scalability from simpler to more complicated tasks. The simplest task is assigned to a subset of the latent space (the base layer), while more complicated tasks make use of additional subsets of the latent space, i.e., both the base and enhancement layer(s). For the experiments, we establish a 2-layer and a 3-layer model, each of which offers input reconstruction for human vision plus machine-vision task(s), and compare them with relevant benchmarks. The experiments show that our scalable codecs offer 37%-80% bitrate savings on machine-vision tasks compared to the best alternatives, while being comparable to state-of-the-art image codecs in terms of input reconstruction.
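The layered-latent idea can be illustrated by partitioning one latent vector into nested subsets, where each decoder consumes only the subsets up to its task's layer. A minimal sketch, with illustrative split sizes (the actual dimensions and layer assignments are learned in the paper):

```python
import numpy as np

latent = np.random.rand(64)   # stand-in for the full learned latent

# Base layer: first 16 dims serve the simplest machine-vision task;
# enhancement layers add further subsets for harder tasks and for
# full input reconstruction. All split sizes are assumptions.
base = latent[:16]
enh1 = latent[16:40]
enh2 = latent[40:]

def decode_for_task(level):
    """Return the latent subset a decoder at `level` would consume."""
    if level == 0:                                  # machine task only
        return base
    elif level == 1:                                # harder machine task
        return np.concatenate([base, enh1])
    return np.concatenate([base, enh1, enh2])       # human viewing
```

A machine-only receiver then transmits and decodes just the base layer, which is where the reported bitrate savings on machine-vision tasks come from.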
ISBN (print): 9798350350494; 9798350350500
In recent years, weakly supervised semantic segmentation using image-level labels as supervision has received significant attention in the field of computer vision. Most existing methods have addressed the challenges arising from the lack of spatial information in these labels by focusing on facilitating supervised learning through the generation of pseudolabels from class activation maps (CAMs). Due to the localized pattern detection of Convolutional Neural Networks (CNNs), CAMs often emphasize only the most discriminative parts of an object, making it challenging to accurately distinguish foreground objects from each other and from the background. Recent studies have shown that Vision Transformer (ViT) features, due to their global view, are more effective in capturing the scene layout than CNNs. However, the use of hierarchical ViTs has not been extensively explored in this field. This work explores the use of the Swin Transformer by proposing "SWTformer" to enhance the accuracy of the initial seed CAMs by bringing local and global views together. SWTformer-v1 generates class probabilities and CAMs using only the patch tokens as features. SWTformer-v2 incorporates a multi-scale feature fusion mechanism to extract additional information and utilizes a background-aware mechanism to generate more accurate localization maps with improved cross-object discrimination. In experiments on the Pascal VOC 2012 dataset, SWTformer-v1 achieves 0.98% mAP higher localization accuracy, outperforming state-of-the-art models. It also generates initial localization maps that are on average 0.82% mIoU higher than those of other methods, depending only on the classification network. SWTformer-v2 further improves the accuracy of the generated seed CAMs by 5.32% mIoU, further proving the effectiveness of the local-to-global view provided by the Swin Transformer. Code available at: https://***/RozhanAhmadi/SWTformer
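The seed maps this abstract builds on are standard CAMs: each class's classifier weights form a weighted sum over the backbone's feature maps. A minimal sketch of that computation, with random stand-in shapes and values:

```python
import numpy as np

def class_activation_map(features, weights, cls):
    """CAM for class `cls`: classifier-weighted sum of feature maps.

    Standard CAM formulation; the feature maps and classifier
    weights here are random placeholders, not trained values.
    """
    # features: C x H x W, weights: num_classes x C
    cam = np.tensordot(weights[cls], features, axes=([0], [0]))
    cam = np.maximum(cam, 0)               # ReLU: keep positive evidence
    return cam / (cam.max() + 1e-8)        # normalize to [0, 1]

feats = np.random.rand(8, 14, 14)   # channel-wise feature maps (C, H, W)
w = np.random.rand(20, 8)           # final classifier weights (classes, C)
cam = class_activation_map(feats, w, cls=3)   # 14 x 14 seed map
```

With a CNN backbone, `feats` carries mostly local evidence, which is why the CAM highlights only discriminative parts; swapping in hierarchical transformer features, as SWTformer does, injects the global view that improves the seeds.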
Photoadaptive synaptic devices enable in-sensor processing of complex illumination scenes, while second-order adaptive synaptic plasticity improves learning efficiency by modifying the learning rate in a given environment. The integration of both adaptations in one phototransistor device will provide opportunities for developing highly efficient machine-vision systems. Here, a dually adaptable organic heterojunction transistor is reported as a working unit in such a system, facilitating precise contrast enhancement and improving the convergence rate under harsh lighting conditions. The photoadaptive threshold sliding originates from the bidirectional photoconductivity caused by the light intensity-dependent photogating effect. Metaplasticity is successfully implemented owing to the combination of ambipolar behavior and the charge trapping effect. By utilizing the transistor array in a machine-vision system, details and edges can be highlighted in 0.4% low-contrast images, and a high recognition accuracy of 93.8% is achieved with a convergence rate improved by about 5 times. These results open a strategy to fully implement metaplasticity in optoelectronic devices and suggest their vision-processing applications in complex lighting scenes. Organic heterojunction transistors are designed to integrate light intensity-adaptive threshold sliding and second-order adaptive metaplasticity. The unique dual adaptability enables the highlighting of 0.4% low-contrast images, and efficient recognition can be achieved benefiting from the learning rate changes in the backpropagation process.
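Metaplasticity, the "second-order" plasticity the abstract describes, means the learning rate itself adapts to recent activity. A toy software analogue, in which consistently same-signed recent updates damp the effective learning rate (all constants are illustrative; the device realizes a related effect through charge trapping rather than this formula):

```python
import numpy as np

def metaplastic_lr(base_lr, history, decay=0.5):
    """Toy metaplasticity: adapt the learning rate from update history.

    If recent weight updates all push in the same direction, the
    effective rate is reduced to stabilize learning; mixed-direction
    updates leave it near the base value. Purely illustrative.
    """
    consistency = abs(np.mean(np.sign(history)))   # in [0, 1]
    return base_lr * (1.0 - decay * consistency)

# Updates all in one direction -> the learning rate is damped.
lr_stable = metaplastic_lr(0.1, [1.0, 0.8, 1.2])
# Mixed-direction updates -> the learning rate stays at the base value.
lr_mixed = metaplastic_lr(0.1, [1.0, -1.0, 1.0, -1.0])
```

In a backpropagation loop, such state-dependent learning rates are what the abstract credits for the roughly 5x faster convergence.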
Image captioning generates a textual description from the corresponding input image with the help of computer vision and natural language processing. In recent years, deep learning approaches have shown promise in image captioning. This research introduces a novel image captioning architecture comprising a dual self-attention fused encoder-decoder framework. The VGG16 Hybrid Places 1365 (V16HP1365) encoder captures diverse visual features from images, enhancing the quality of image representations. In this article, the Gated Recurrent Unit (GRU) is used as a decoder for word-level language modeling. Additionally, the dual self-attention network embedded in the architecture captures contextual image information to provide accurate content descriptions and relationship understanding. Experimental evaluations on the COCO dataset showcase superior performance, surpassing existing methods in terms of captioning quality metrics. This approach holds potential for applications such as aiding the visually impaired and advancing content retrieval. Future work aims to extend the model to support multilingual captioning.
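The GRU decoder mentioned above generates the caption one word at a time through the standard gated recurrence. A single decoding step can be sketched as follows (biases omitted; the weights are random stand-ins, not the paper's trained parameters):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Wr, Wh, Uz, Ur, Uh):
    """One GRU step (standard formulation, biases omitted).

    x: embedded previous word (+ attended image context in a real
    captioner); h: previous hidden state. Weights are placeholders.
    """
    z = sigmoid(Wz @ x + Uz @ h)            # update gate
    r = sigmoid(Wr @ x + Ur @ h)            # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))  # candidate state
    return (1 - z) * h + z * h_tilde

rng = np.random.default_rng(0)
d, n = 6, 4                                  # input / hidden sizes (toy)
W = [rng.normal(size=(n, d)) * 0.1 for _ in range(3)]
U = [rng.normal(size=(n, n)) * 0.1 for _ in range(3)]
h = np.zeros(n)
x = rng.normal(size=d)                       # e.g. embedded <start> token
h = gru_step(x, h, W[0], W[1], W[2], U[0], U[1], U[2])
```

At each step the new hidden state would be projected to vocabulary logits to pick the next word, with the dual self-attention features feeding into `x`.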
This paper introduces a high dynamic range pixel for early vision processing. Early vision is the first stage to subsequently extract semantic information for image processing or video analytics. This paper proposes t...
This study presents a novel approach for pH estimation in buffer solutions using images of solutions prepared with Hibiscus sabdariffa L. as a natural pH indicator. The images of the solutions, each displaying distinctive colours indicative of their pH levels, were transformed into standardized 200x200-pixel images through the application of image-processing techniques. Following this, a pH prediction model was constructed using the Adaptive Boosting regressor algorithm. The pH values of the training data were distributed irregularly between 0 and 14. The models were trained with 94 pictures and 1880 experimental values. In addition, a reliable pre-processing stage, built with image-processing techniques, was incorporated into the model, allowing test data to be obtained in any desired environment. The training and test data were cleaned of noise parameters that negatively affect the prediction results. A smartphone application based on the model has been developed and made available to everyone. This innovative methodology bridges the gap between traditional pH measurement techniques and computer vision, offering a more accessible and eco-friendly means of pH assessment. The practical applications of this research extend to various fields, including environmental monitoring, agriculture, and educational settings.
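The core regression idea, mapping the colour of a standardized solution image to a pH value, can be sketched with a simple colour feature and a nearest-neighbour regressor standing in for the paper's AdaBoost model. The training colours below are synthetic, not real hibiscus-indicator measurements:

```python
import numpy as np

def mean_color(image):
    """Mean RGB of a standardized 200x200x3 solution image."""
    return image.reshape(-1, 3).mean(axis=0)

def knn_predict(color, train_colors, train_ph, k=3):
    """Predict pH as the mean pH of the k nearest training colours.

    A deliberately simple stand-in for the paper's AdaBoost
    regressor, illustrating the colour-to-pH regression step.
    """
    d = np.linalg.norm(train_colors - color, axis=1)
    return train_ph[np.argsort(d)[:k]].mean()

# Synthetic training set: colour drifts from red-ish (acidic) to
# green-ish (basic), loosely mimicking an anthocyanin indicator.
ph = np.linspace(0, 14, 15)
colors = np.stack([200 - 10 * ph, 40 + 10 * ph, 60 + 0 * ph], axis=1)

img = np.full((200, 200, 3), [150.0, 90.0, 60.0])   # unknown sample
pred = knn_predict(mean_color(img), colors, ph)     # estimated pH
```

In the actual pipeline, the pre-processing stage would first normalize the photo (crop, resize, de-noise) so the extracted colour is comparable across capture environments.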