Search results - Inner Mongolia University Library

Sequential RGB color imaging with a millimeter-scale monochrome camera with a rolling shutter

APPLIED OPTICS, 2023, vol. 62, no. 17, pp. 4496-4504

Authors: Anspach, Jordan; Dickensheets, David L. Affiliation: Montana State Univ, Elect & Comp Engn Dept, Bozeman, MT 59717, USA

Applications are growing for ultracompact millimeter-scale cameras. For color images, these sensors commonly utilize a Bayer mask, which can perceptibly degrade image resolution and quality, especially for low pixel-count submillimeter sensors. To alleviate this, we built a time-multiplexed RGB LED illumination system synchronized to the rolling shutter of a monochrome camera. The sequential images are processed and displayed as near real-time color video. Experimental comparison with an identical sensor with a Bayer color mask showed significant improvement in the MTF curves and in perceived image clarity. Trade-offs with respect to system complexity and color motion artifacts are discussed.

Keywords: image processing; image quality; real time implementation; Reconstruction algorithms; Spectral imaging; Tunable filters
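
The central step of the record above, merging one monochrome frame per LED color into a single color frame, can be sketched in a few lines. This is only an illustration in plain Python, assuming three already-aligned frames of equal size stored as lists of rows; the function name `merge_sequential_rgb` and the `gains` parameter are hypothetical and not taken from the paper, and the rolling-shutter synchronization itself is not modeled.

```python
def merge_sequential_rgb(red_frame, green_frame, blue_frame, gains=(1.0, 1.0, 1.0)):
    """Stack three monochrome frames (lists of rows) into one RGB image.

    Each frame is assumed to have been captured while only the matching
    LED color was lit. `gains` is a hypothetical per-channel balance factor.
    """
    height, width = len(red_frame), len(red_frame[0])
    for frame in (green_frame, blue_frame):
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all three frames must have the same shape")

    def scale(value, gain):
        # Apply the channel gain and clamp to the 8-bit range.
        return max(0, min(255, round(value * gain)))

    return [
        [
            (
                scale(red_frame[y][x], gains[0]),
                scale(green_frame[y][x], gains[1]),
                scale(blue_frame[y][x], gains[2]),
            )
            for x in range(width)
        ]
        for y in range(height)
    ]


r = [[10, 20], [30, 40]]
g = [[50, 60], [70, 80]]
b = [[90, 100], [110, 120]]
color = merge_sequential_rgb(r, g, b)
```

Because the three frames are exposed at different times, a moving object lands at a different position in each channel, which is the source of the color motion artifacts the abstract mentions.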

Quantum edge detection of medical images using novel enhanced quantum representation and hill entropy approach

SIGNAL IMAGE AND VIDEO PROCESSING, 2024, vol. 18, no. 2, pp. 1803-1819

Authors: Chaduvula, Kavitha; Indira, D. N. V. S. L. S.; Markapudi, Baburao; Kalyanapu, Srinivas. Affiliations: Seshadri Rao Gudlavalleru Engn Coll, Dept Informat Technol, Gudlavalleru 521356, Andhra Pradesh, India; Seshadri Rao Gudlavalleru Engn Coll, Dept Comp Sci & Engn, Gudlavalleru 521356, Andhra Pradesh, India; Seshadri Rao Gudlavalleru Engn Coll, Dept Artificial Intelligence & Data Sci, Gudlavalleru 521356, Andhra Pradesh, India

Cutting-edge medical image analysis, driven by quantum-based techniques, offers automated information extraction from images, revolutionizing health care. Traditional methods are being outpaced by the demand for advanced real-time digital image processing. This article introduces an innovative approach to medical image edge detection based on entropy. In recent years, various quantum representation models have emerged, addressing the complex nature of medical images characterized by dark backgrounds and low contrast. To enhance image quality, the article introduces the novel enhanced quantum representation model, which leverages the colour operations of Caraiman's quantum image representation model to improve the greyscale values of individual pixels. However, the article acknowledges that quantum noise remains a challenge in image processing due to statistical fluctuations in medical imaging. To combat this, the article introduces a neural network-based hybrid filter, comprising neural edge enhancers and bilateral filters. The neural filter acts as a fusion operator, effectively eliminating quantum noise from the output image. Another challenge addressed in this work is the time complexity of edge detection. The article presents a novel methodology for edge extraction based on Hill entropy for medical images, which involves segmenting the image into objects and backgrounds using a threshold value. This method aims to reduce computation time while producing high-quality edge detection. The proposed algorithm is implemented using MATLAB software and evaluated on various images. The results demonstrate the algorithm's effectiveness, with a notably higher peak signal-to-noise ratio of 41.5312%, a lower mean square error of 0.0214%, and an improved contrast-to-noise ratio of 42.59%. These outcomes underscore the algorithm's superior performance in edge detection for medical images, offering a remarkable accuracy of 97.5% compared to traditional methods.

Keywords: Quantum image processing; Edge extraction; Quantum representation model; Neural network; Hill entropy; Hybrid filter; Quantum noise
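
The record above evaluates its results with mean square error (MSE) and peak signal-to-noise ratio (PSNR). For reference, the standard definitions of these two metrics can be sketched as follows, with PSNR expressed in decibels as is conventional. This is a plain-Python illustration assuming 8-bit greyscale images stored as lists of rows; it is not the authors' MATLAB implementation, and the function names are hypothetical.

```python
import math


def mean_squared_error(reference, test):
    """Average squared pixel difference between two equally sized images."""
    total = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for ref_px, test_px in zip(ref_row, test_row):
            total += (ref_px - test_px) ** 2
            count += 1
    return total / count


def psnr_db(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    mse = mean_squared_error(reference, test)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


reference_image = [[0, 0], [0, 0]]
noisy_image = [[0, 0], [0, 10]]
error = mean_squared_error(reference_image, noisy_image)
quality = psnr_db(reference_image, noisy_image)
```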

Image tampering localization network based on multi-class attention and progressive subtraction

SIGNAL IMAGE AND VIDEO PROCESSING, 2025, vol. 19, no. 1, pp. 1-10

Authors: Shao, Yunxue; Dai, Kun; Wang, Lingfeng. Affiliations: Nanjing Tech Univ, Coll Artif Intelligence, Nanjing 211816, Peoples R China; Weiqiao UCAS Sci & Technol Pk, Binzhou Inst Technol, Binzhou 256606, Peoples R China; Beijing Univ Chem Technol, Coll Informat Sci & Technol, Beijing 100029, Peoples R China

Image tamper localization is an important research topic in the field of computer vision, which aims at identifying and localizing human-modified regions in images. In this paper, we propose a new image tampering localization network named MAPS-Net. It combines the advantages of efficient multi-scale attention, shift operation, and progressive subtraction, which not only improves the sensitivity and generalization to novel data tampering behaviors but also significantly reduces the computation time. MAPS-Net consists of an upper and a lower branch: the fake edge-enhancing branch and the interfering factors-weakening branch. The fake edge-enhancing branch uses an efficient multi-scale edge residual module to enhance the expressiveness of the features, while the interfering factors-weakening branch uses progressive subtraction to weaken the interference of image content fluctuations in capturing general tampering behaviors. Finally, the features of both branches are fused with a position attention mechanism via a shift operation to capture the spatial relationships between different views. Experiments conducted on several publicly available datasets show that MAPS-Net outperforms existing mainstream models in both image tampering detection and localization, especially in image tampering localization in real scenes. Code is available at: https://***/dklive1999/MAPS-Net.

Keywords: Image tampering localization; EMER modules; Progressive subtraction; Shift operation

HFR-Video-Based Fingertip Velocimeter for Multifinger Tapping Detection

IEEE SENSORS JOURNAL, 2023, vol. 23, no. 10, pp. 10673-10682

Authors: Wang, Feiyue; Hu, Shaopeng; Shimasaki, Kohei; Ishii, Idaku. Affiliation: Hiroshima Univ, Grad Sch Adv Sci & Engn, Smart Robot Lab, Higashihiroshima 7398527, Japan

In this study, we propose a novel concept of a software-based fingertip velocimeter using high-frame-rate (HFR) video processing that can simultaneously estimate when and where an operator taps with a finger by detecting the high-frequency component that develops when the fingertip actively contacts something. Our software-based fingertip velocimeter can precisely estimate the velocities of multiple fingers through HFR video processing in real time. Digital image correlation (DIC), operating at every frame for sub-pixel-precision velocity estimation, is hybridized with convolutional neural network (CNN)-based object detection, operating at intervals of dozens of frames, to robustly update the fingertip regions of interest (ROIs) during the frame-by-frame DIC operation. We developed a real-time multifinger tapping detection system that can execute the DIC operation on 720 x 540 resolution images at 500 frames/s with CNN-based fingertip detection at 30 frames/s. Several experimental results for finger tapping detection, including virtual keyboard interaction with ten-finger keyboard input, demonstrate the effectiveness of our fingertip velocimeter as a finger tapping interface: it can simultaneously estimate the tapping positions and moments of multiple fingers when finger tapping is performed ten or more times per second.

Keywords: Convolution neural network (CNN)-based fingertip detection; digital image correlation (DIC); fingertip velocimeter; high-speed vision; software sensor

LUMINATE: LINGUISTIC UNDERSTANDING AND MULTI-GRANULARITY INTERACTION FOR VIDEO OBJECT SEGMENTATION

2024 International Conference on Image Processing

Authors: Tekchandani, Rahul; Maheshwari, Ritik; Hambarde, Praful; Tazi, Satya Narayan; Vipparthi, Santosh Kumar; Murala, Subrahmanyam. Affiliations: CVPR Lab, Indian Inst Technol Ropar, Ropar, India; GEC Ajmer, Ajmer, India; CVPR Lab, Trinity Coll Dublin, Dublin, Ireland

ISBN: (print) 9798350349405; 9798350349399

Referring Video Object Segmentation (R-VOS) is a challenging task that involves segmenting objects in a video based on linguistic descriptions. In this paper, we introduce a novel multi-granularity referring video object segmentation framework, termed LUMINATE. The LUMINATE framework introduces a streamlined approach to cross-modal fusion. In LUMINATE, the enhanced interaction between visual and textual modalities begins with cross-attention between the vision encoder's query and the text encoder's key-value pairs, and vice versa. The results are then concatenated with the respective queries of the vision and text encoders, fostering a comprehensive understanding of semantic relationships. The combined features are fed into the Transformer Encoder for further refinement and integration into the segmentation pipeline. Extensive experiments on benchmark datasets, including Ref-DAVIS, demonstrate that our proposed LUMINATE approach achieves better results than state-of-the-art methods in terms of Jaccard and F-measure evaluation metrics. Furthermore, the efficiency of our multi-object R-VOS variant is highlighted, achieving a threefold speed improvement while maintaining satisfactory segmentation performance. The proposed approach contributes to advancing the capabilities of R-VOS models, paving the way for improved multimodal reasoning and real-world applications.

Keywords: Referring video object segmentation; Cross-modal Fusion; Transformer
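
The fusion step described in the record above, cross-attention from one modality's queries to the other modality's key-value pairs followed by concatenation with the queries, can be sketched as below. This is a minimal single-head illustration in plain Python, assuming no learned projection matrices and tiny hand-written feature vectors; the function names are hypothetical, and the actual LUMINATE layers are not specified in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention from one modality's queries to
    another modality's keys and values (no learned projections)."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


def fuse_with_queries(queries, attended):
    """Concatenate each query with its attended result, as the abstract describes."""
    return [list(q) + list(a) for q, a in zip(queries, attended)]


vision_queries = [[1.0, 0.0]]
text_keys = [[1.0, 1.0], [1.0, 1.0]]
text_values = [[0.0, 2.0], [4.0, 6.0]]
attended = cross_attention(vision_queries, text_keys, text_values)
fused = fuse_with_queries(vision_queries, attended)
```

Running the same two functions with the roles of the vision and text features swapped gives the "vice versa" direction mentioned in the abstract.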

Human elbow flexion behaviour recognition based on posture estimation in complex scenes

IET IMAGE PROCESSING, 2023, vol. 17, no. 1, pp. 178-192

Authors: Gong, Faming; Li, Yunjing; Yuan, Xiangbing; Liu, Xin; Gao, Yating. Affiliations: China Univ Petr East China, Coll Comp Sci & Technol, Qingdao 266580, Peoples R China; Sinopec Gtp Offshore Oil Prod Plant, Dongying, Peoples R China

To address the difficulty of recognising smoking and phone-call behaviours against the complex background of construction sites, a method of recognising human elbow flexion behaviour based on posture estimation is proposed. The required human upper body key points are retrained based on AlphaPose to achieve human object localization and key points detection. Then, a mathematical model for human elbow flexion behaviour discrimination (HEFBD model) is proposed based on human key points; it also locates the region of interest for small object detection and reduces the interference of the complex background. A super-resolution image reconstruction method is used to pre-process some blurred images. In addition, YOLOv5s is improved by adding a small object detection layer and integrating a convolutional block attention model to improve the detection performance. The detection precision of this method is improved by 5.6%, and the false detection rate caused by complex background is reduced by 13%, which outperforms other state-of-the-art detection methods and meets the requirement of real-time performance.

Keywords: human upper body key points; feature extraction; posture estimation; detection performance; human key points; video signal processing; human elbow flexion behaviour discrimination; complex background; key points detection; Neural nets; HEFBD model; mathematical model; super-resolution image reconstruction method; Optical, image and video signal processing; Computer vision and image processing techniques; convolutional block attention model; construction sites; convolutional neural nets; smoking; human object localization; object detection; state-of-the-art detection methods; image recognition; making phone calls; pose estimation; object detection layer; human elbow flexion behaviour recognition; biomechanics; image motion analysis; image reconstruction; image resolution; complex scenes; false detection rate; detection precision
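
The abstract above does not spell out the HEFBD model, but one plausible ingredient of any key-point-based elbow flexion test is the angle at the elbow between the upper arm and the forearm. The sketch below computes that angle from three 2D key points in plain Python. The function names and the 60-degree threshold are assumptions made for illustration only; they are not taken from the paper.

```python
import math


def elbow_angle_degrees(shoulder, elbow, wrist):
    """Angle at the elbow, in degrees, between the elbow-to-shoulder and
    elbow-to-wrist vectors. 180 means a straight arm, small values a bent arm."""
    ux, uy = shoulder[0] - elbow[0], shoulder[1] - elbow[1]
    vx, vy = wrist[0] - elbow[0], wrist[1] - elbow[1]
    norm_u = math.hypot(ux, uy)
    norm_v = math.hypot(vx, vy)
    if norm_u == 0 or norm_v == 0:
        raise ValueError("key points must be distinct")
    cos_angle = (ux * vx + uy * vy) / (norm_u * norm_v)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle))


def is_elbow_flexed(shoulder, elbow, wrist, threshold_deg=60.0):
    """Hypothetical decision rule: the arm counts as flexed below the threshold."""
    return elbow_angle_degrees(shoulder, elbow, wrist) < threshold_deg


straight = elbow_angle_degrees((0, 0), (0, 1), (0, 2))
right_angle = elbow_angle_degrees((0, 0), (0, 1), (1, 1))
hand_near_face = is_elbow_flexed((0, 0), (0, 1), (0.1, 0.2))
```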

Efficient 2D DCT architecture based on approximate compressors for image compression with HEVC intra-prediction

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2023, vol. 20, no. 2, p. 22

Authors: Akman, Ali; Cekli, Serap. Affiliations: Istanbul Ticaret Univ, Dept Comp Engn, Istanbul, Turkiye; Istanbul Univ Cerrahpasa, Dept Elect Elect Engn, Istanbul, Turkiye

This study presents the design of a two-dimensional (2D) discrete cosine transform (DCT) architecture to be used with the high-efficiency video coding (HEVC) intra-prediction method in image compression. Since the transform step in HEVC requires a large amount of calculation, and its power consumption is correspondingly high, a novel DCT architecture for HEVC is proposed to reduce this calculation complexity and power consumption. The architecture tolerates calculation errors in the transform steps that can be ignored in the quantization step. For this purpose, approximate 5:3 compressor circuits with different error rates are designed and used instead of addition/subtraction in the DCT architecture. The architecture is designed to support 4 x 4, 8 x 8, 16 x 16 and 32 x 32 transform blocks. It is implemented on an FPGA and experiments are conducted. In these experiments, hardware performance parameters are examined, and it is shown that the use of approximate compressors can provide advantages in power consumption and physical area. The efficiency of the proposed architecture is investigated by performing image compression and video coding tests.

Keywords: HEVC; Intra-prediction; image compression; Approximate compressor; 2D DCT architecture
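
For orientation, the 2D DCT that the record above implements in hardware is separable: a square block X is transformed as Y = C X C^T, where C is the 1D DCT basis matrix. The sketch below shows this in plain Python with an exact floating-point orthonormal DCT-II. It is an assumption-laden illustration only: HEVC itself uses an integer approximation of the DCT, and the paper's approximate 5:3 compressors are not modeled here. All function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(col) for col in zip(*m)]


def dct_2d(block):
    """Separable 2D DCT: transform the columns, then the rows (Y = C X C^T)."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct_2d(coeffs):
    """Inverse transform (X = C^T Y C), valid because C is orthonormal."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


flat_block = [[8.0] * 4 for _ in range(4)]
flat_coeffs = dct_2d(flat_block)

ramp_block = [[float(4 * y + x + 1) for x in range(4)] for y in range(4)]
restored = idct_2d(dct_2d(ramp_block))
```

A flat block puts all of its energy into the single DC coefficient, which is why small errors in the remaining coefficients can often be absorbed by quantization.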

FPGA-based Moving Object Detection Leveraging Frame Difference and Morphological Filtering

2nd International Conference on Computer Vision and Intelligent Technology

Authors: Wang, Yuexin; Jing, Xiaochen; Wang, Wenhao; Zhu, Mengli; Zhu, Dongchen. Affiliations: Chinese Acad Sci, Shanghai Inst Microsyst & Informat Technol, Bion Vis Syst Lab, Shanghai, Peoples R China; Univ Chinese Acad Sci, Beijing, Peoples R China

ISBN: (print) 9798331540050; 9798331540043

Moving object detection plays a significant role in video surveillance. However, existing moving object detection methods often rely on software implementations, which leads to poor real-time performance and high power consumption. The core detection algorithm in this paper is a frame difference method enhanced by morphological filtering. Additionally, we propose an architecture that integrates an FPGA (Field Programmable Gate Array) and an ARM (Advanced RISC Machine) processor, fully leveraging the parallel computing advantages of the FPGA and the high processing efficiency of the ARM. The system utilizes a ZYNQ7000 SoC, coupled with an OV7725 camera for image capture and DDR3 SDRAM for data caching, to address the challenges of high-speed data processing and low power consumption. Experimental results show that the system meets the requirements for high real-time performance and low power consumption, with a frame rate of 85.9375 frames per second and a total power consumption of 1.101 W.

Keywords: FPGA; frame difference method; moving object detection; real-time video processing
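
The detection algorithm named in the record above, frame differencing followed by morphological filtering, can be sketched in software as follows. This plain-Python version assumes greyscale frames stored as lists of rows, a 3 x 3 structuring element, and a morphological opening (erosion then dilation) as the filter; the threshold value and all function names are hypothetical, and the paper's FPGA pipeline is not reproduced.

```python
def frame_difference(previous, current, threshold=25):
    """Binary mask: 1 where the absolute pixel change exceeds the threshold."""
    return [
        [1 if abs(cur - prev) > threshold else 0 for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


def _apply_3x3(mask, combine):
    """Apply `combine` (min or max) over each 3x3 neighbourhood.
    Pixels outside the image are treated as background (0)."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbours = [
                mask[ny][nx] if 0 <= ny < height and 0 <= nx < width else 0
                for ny in (y - 1, y, y + 1)
                for nx in (x - 1, x, x + 1)
            ]
            out[y][x] = combine(neighbours)
    return out


def erode(mask):
    return _apply_3x3(mask, min)


def dilate(mask):
    return _apply_3x3(mask, max)


def detect_motion(previous, current, threshold=25):
    """Frame difference followed by an opening, which removes isolated noise pixels."""
    return dilate(erode(frame_difference(previous, current, threshold)))


background = [[0] * 7 for _ in range(7)]
scene = [[0] * 7 for _ in range(7)]
for row in range(1, 4):
    for col in range(1, 4):
        scene[row][col] = 200  # a 3x3 moving object
scene[5][5] = 200              # a single noisy pixel
raw_mask = frame_difference(background, scene)
clean_mask = detect_motion(background, scene)
```

In this toy scene the opening keeps the 3 x 3 object intact and discards the single noisy pixel that plain differencing would report.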

Enhancing Intra Block Copy Prediction for Plenoptic 2.0 Video Coding under Macropixel Constraints

2024 Conference on Visual Communications and Image Processing

Authors: Vinh Van Duong; Thuc Nguyen Huu; Yim, Jonghoon; Jeon, Byeungwoo. Affiliations: Sungkyunkwan Univ, Dept Elect & Comp Engn, Seoul, South Korea; Samsung Res, Seoul, South Korea

ISBN: (print) 9798331529543; 9798331529550

In this paper we introduce a novel approach to better utilize the intra block copy (IBC) prediction tool in encoding lenslet light field video (LFV) captured using plenoptic 2.0 cameras. Although the IBC tool has been recognized as promising for encoding LFV content, its original design, rooted in encoding conventional video, imposes a fundamental limit, which suggests that slight modifications could better suit the properties of LFV content. Observing the inherently large amount of repetitive image patterns caused by the microlens array (MLA) structure of plenoptic cameras, this paper suggests several techniques that enhance the IBC coding tool itself to encode LFV content more efficiently. Our experimental results demonstrate that the proposed method significantly enhances IBC coding performance when encoding LFV content while concurrently reducing encoding time.

Keywords: Light field; plenoptic video coding; fast motion estimation; video coding; microlens image

Performance comparison of throughput between AVC, HEVC and VVC hardware CABAC decoder

JOURNAL OF REAL-TIME IMAGE PROCESSING, 2023, vol. 20, no. 2, p. 26

Authors: Menasri, Wahiba; Skoudarli, Abdellah. Affiliations: Univ Yahia Fares Medea, Fac Technol, Lab Renewable Energies & Mat, Medea 26000, Algeria; USTHB, Fac Elect & Informat, Lab Image Proc & Radiat, BP 32, Algires, Algeria

This paper presents a performance comparison of throughput between the context-based adaptive binary arithmetic decoding (CABAC) processes adopted in three recent video codecs: advanced video coding (AVC), high efficiency video coding (HEVC), and versatile video coding (VVC). To highlight the performance and the modifications across the three CABAC versions, the three main stages of CABAC decoding, Context Selection and Modeling (CSM), Binary Arithmetic Decoding (BAD) and De-binarization (DBZ), are designed, described in VHDL and implemented on a Field Programmable Gate Array (FPGA) device. Firstly, the most efficient CSM is obtained for CABAC VVC, with a maximum frequency of 183.8 MHz and a low power consumption of 0.346 mW. Secondly, the BAD in RM is modified only in the latest video standard, VVC. The most efficient design of BAD RM is given in the AVC and HEVC version of CABAC, with a maximum frequency of 261.75 MHz. Thirdly, the BAD in BM and TM are the same in the three CABAC versions, with maximum frequencies of 439.657 MHz and 798.861 MHz, respectively. Fourthly, the de-binarization codes are also the same in the three CABAC versions. Consequently, a high frequency of 789.26 MHz is obtained in DBZ, but the resource cost and power consumption are greater than in the CSM and BAD stages. Finally, a high throughput of 178.13 bins/s is given by our proposed design of the VVC CABAC decoder.

Keywords: AVC; VVC; HEVC; CABAC; Throughput; FPGA
