Search Results - Inner Mongolia University Library

A Survey on Self-Supervised Learning: Algorithms, Applications, and Future Trends

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, Vol. 46, No. 12, pp. 9052-9071

Authors: Gui, Jie; Chen, Tuo; Zhang, Jing; Cao, Qiong; Sun, Zhenan; Luo, Hao; Tao, Dacheng. Affiliations: Southeast Univ Sch Cyber Sci & Engn Nanjing 210096 Peoples R China; Univ Sydney Sch Comp Sci Camperdown NSW 2050 Australia; Nanyang Technol Univ Coll Comp & Data Sci Nanyang Ave Singapore 639798 Singapore; JD Explore Acad Beijing 101111 Peoples R China; Chinese Acad Sci Ctr Res Intelligent Percept & Comp Beijing 100190 Peoples R China; Alibaba Grp Hangzhou 310052 Peoples R China

Deep supervised learning algorithms typically require a large volume of labeled data to achieve satisfactory performance. However, the process of collecting and labeling such data can be expensive and time-consuming. Self-supervised learning (SSL), a subset of unsupervised learning, aims to learn discriminative features from unlabeled data without relying on human-annotated labels. SSL has garnered significant attention recently, leading to the development of numerous related algorithms. However, there is a dearth of comprehensive studies that elucidate the connections and evolution of different SSL variants. This paper presents a review of diverse SSL methods, encompassing algorithmic aspects, application domains, three key trends, and open research questions. First, we provide a detailed introduction to the motivations behind most SSL algorithms and compare their commonalities and differences. Second, we explore representative applications of SSL in domains such as image processing, computer vision, and natural language processing. Lastly, we discuss the three primary trends observed in SSL research and highlight the open questions that remain.

Keywords: Self-supervised learning; contrastive learning; generative model; representation learning; transfer learning

Diagnosis of Advanced Diabetic Retinopathy from Fundus Image Using Machine Learning

2024 International Conference on Artificial Intelligence and Quantum Computation-Based Sensor Applications, ICAIQSA 2024

Authors: Mankar, Bhagyashri; Bawankar, Bhushan; Gabhane, Diksha; Wanjari, Mohini. Affiliations: Department of Information Technology Nagpur India J.D. College of Engineering Department of Artificial Intelligence and Machine Learning Nagpur India

ISBN: (print) 9798331517953

Diabetic retinopathy (DR) is an eye disease that mainly affects diabetic patients. Patients who have had diabetes for a long time are at high risk of developing DR as well. If DR is not detected early, the patient may experience vision loss, so early detection of this disease is very important. Micro-aneurysms and exudates are the features that indicate whether a patient has DR. The primary goal of the proposed work is to identify retinal micro-aneurysms and exudates in the retinal fundus image in order to test for diabetic retinopathy. First, the fundus image is pre-processed. Pre-processing techniques include adaptive median filtering, adaptive histogram equalization, and grayscale conversion. Features are then extracted from the pre-processed fundus image. The extracted features serve as input to the SVM classifier. The fundus image is classified as normal or affected using a support vector machine. © 2024 IEEE.

Keywords: Median filters
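
The pre-processing stage named in the abstract above (grayscale conversion, median filtering, histogram equalization) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no parameters, so the plain 3x3 median filter and the global histogram equalization below are simplified stand-ins for the adaptive variants the paper mentions, and the function names `to_gray`, `median_filter3`, `equalize`, and `preprocess` are hypothetical. Feature extraction and the SVM classifier are left out.

```python
from statistics import median


def to_gray(rgb):
    """Convert an RGB image (rows of (r, g, b) tuples, 0-255) to grayscale."""
    # ITU-R BT.601 luma weights, a common choice for grayscale conversion.
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def median_filter3(img):
    """Plain 3x3 median filter (assumption: stand-in for the adaptive variant).

    Border pixels use only the neighbours that exist.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = int(median(window))
    return out


def equalize(img, levels=256):
    """Global histogram equalization (assumption: stand-in for the adaptive variant)."""
    h, w = len(img), len(img[0])
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    # Cumulative distribution of gray levels.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = h * w
    if n == cdf_min:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((cdf[v] - cdf_min) / (n - cdf_min) * (levels - 1))
             for v in row]
            for row in img]


def preprocess(rgb):
    """Grayscale conversion, then median filtering, then equalization."""
    return equalize(median_filter3(to_gray(rgb)))
```

The output of `preprocess` would then go to a feature extractor and a classifier; which features the paper uses is not stated in the abstract, so that stage is not sketched here.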

Bionic model of information processing in the retina

2024 International Conference on Artificial Intelligence, Computer, Data Sciences, and Applications, ACDSA 2024

Authors: Bohlmann, Sabine. Affiliation: Clausthal University of Technology Institute of Computer Science Clausthal-Zellerfeld Germany

ISBN: (print) 9798350394528

Today, bionic models for vision applications are based on the general information pathways, structure and characteristics of the visual system, implemented in intelligent algorithms, mostly AI-based, to improve the resolution, colour and contrast of images. These approaches often use machine learning techniques and neural networks, or simply mathematical pre-processing filters. The major drawback of all these methods is that they require a lot of computing power. But nature cannot afford that kind of processing power just to improve contrast or resolution, or to detect the contours or true colours of an image, and it does not use learning strategies or high-order mathematical equations to detect contours. Nature works very simply and very efficiently: it can only work with potentials and their differences. There is therefore a huge gap between AI-based approaches and image processing in the human visual system. This opens the way for a new bionic model of the general pathways of the retina, using only the simple principles of nature: potentials and their differences. © 2024 IEEE.

Keywords: Image enhancement

A Design of Smart Unmanned Vending Machine for New Retail Based on Binocular Camera and Machine Vision

IEEE CONSUMER ELECTRONICS MAGAZINE, 2022, Vol. 11, No. 4, pp. 21-31

Authors: Liu, Lizheng; Cui, Jianjun; Huan, Yuxiang; Zou, Zhuo; Hu, Xiaoming; Zheng, Lirong. Affiliations: Fudan Univ Shanghai Peoples R China; Fudan Univ Sch Informat Sci & Technol Shanghai Peoples R China; Royal Inst Technol Stockholm Sweden

Smart unmanned vending machines that use machine vision suffer from a sharp decrease in detection accuracy, caused by incomplete image collection of items by a monocular camera in complex environments and by the lack of obvious features when items are densely stacked. In this article, a binocular camera system is designed to effectively solve the problems of distortion and coverage caused by a monocular camera. In addition, an image-stitching algorithm is developed to splice the images captured by the cameras, which relieves the computational burden that the binocular camera places on back-end recognition processing. A new neural network structure, YOLOv3-TinyE, is proposed based on the YOLOv3-tiny model. On a dataset of 21,000 images captured in real scenarios and containing 20 different types of beverages, the comparison experiments show that the YOLOv3-TinyE model achieves a mean average precision of 99.15%, that its inference speed is 2.91 times faster than that of the YOLOv3 model, and that the detection accuracy of the YOLOv3-TinyE model based on binocular vision is higher than that based on monocular vision. The results suggest that the designed method achieves its goal in terms of inference speed and average precision, that is, it is able to satisfy the requirements of real-world applications.

Keywords: Cameras; Image stitching; Servers; Image sensors; Feature extraction; Face recognition; Containers

Adversarial Encoder-Driven Filter for Targeted Image Tagging: A Novel Approach to Visual Content Manipulation

6th IEEE International Conference on Image Processing, Applications and Systems, IPAS 2025

Authors: Mckee, Cole; Flowers, Dominic; Wood, Jesse; Shafer, Ethan. Affiliation: United States Military Academy Department of Electrical Engineering and Computer Science West Point, NY 10996 United States

ISBN: (print) 9798331506520

Computer vision, driven by artificial intelligence, has become pervasive in diverse applications such as self-driving cars and law enforcement. However, the susceptibility of these systems to attacks has raised significant concerns among researchers. This paper addresses the vulnerability of image tagging algorithms, particularly focusing on misclassifications induced by autoencoders. We present experiments conducted on Amazon Rekognition, where we developed a specialized autoencoder to manipulate the latent space, forcing it to align with specific tags. By integrating this manipulated latent space with other images, we demonstrate the ability to increase the confidence of a specific tag on Amazon Rekognition, leading to more false positives of the chosen tag. Our study showcases a practical method to exploit Amazon's Rekognition image tagging algorithm using a black box approach. © 2025 IEEE.

Keywords: Adversarial machine learning

Research on Unmarked Motion Action Recognition Technology Based on Computer Vision

5th Asia-Pacific Conference on Image Processing, Electronics and Computers, IPEC 2024

Authors: Zhao, Lun. Affiliation: Shandong Vocational College of Science and Technology Weifang China

ISBN: (print) 9798350374407

The main purpose of this study is to explore the issues of real-time, accurate, and unmarked recognition of sports movements in recent years. By reviewing the relevant research on machine learning or deep learning for specific sports or target actions based on computer vision image data input, the aim is to provide references for the application of unmarked motion capture technology in the field of sports motion recognition. The research employed a literature review methodology, conducting searches in six databases, namely Web of Science, PubMed, Scopus, Google Scholar, IEEE Xplore, and China National Knowledge Infrastructure (CNKI), covering publications from January 2000 to June 2020. Through Boolean logic operations on the retrieved literature, key information such as first author/publication year, types/targets of motion, participant information, camera parameters, image feature extraction techniques, action recognition algorithms, evaluation methods for action recognition quality, training and validation methods for image data, and performance metrics for action recognition were extracted. After screening, a total of 23 articles were included in the study. The findings revealed that 39% of the studies utilized machine learning algorithms based on support vector machines, while 35% employed deep learning algorithms based on convolutional neural networks. Commonly used evaluation metrics for action recognition quality included classification accuracy, confusion matrix, and displacement error. The development of computer vision motion capture, models, and algorithms demonstrated promising applications in areas such as action technique recognition and sports performance analysis. Traditional machine learning algorithms like support vector machines and principal component analysis remain dominant in action recognition technology; however, in certain scenarios, the performance of deep learning algorithms surpassed that of traditional machine learning methods.

Keywords: Motion capture

PEPPR-DWS on FPGA: Elevating Universal Parallelism and Precision Through Pulse-Enhanced Push-Relabel and Diffusion Wave Search

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2025, Vol. 44, No. 1, pp. 157-171

Authors: Dong, Zehua; Zhang, Boyu; Jiang, Yucheng; Yu, Yu; Li, Han; Mai, Songping. Affiliations: Tsinghua Univ Shenzhen Int Grad Sch Shenzhen 518055 Peoples R China; Peng Cheng Lab Dept Circuit & Syst Shenzhen 518066 Peoples R China

The push-relabel algorithm is recognized as one of the efficient algorithms in the field of graph cut, finding widespread applications in computer vision. While its pixel-level parallel implementations are prevalent, existing methods predominantly rely on checkerboard scheduling, imposing inherent constraints on neighborhood size, limited to four. This limitation compromises both algorithm precision and efficiency, hindering real-time and high-precision applications. To address these issues, this article introduces a novel approach to accelerate push-relabel algorithm implementation on FPGA in a more universal and efficient manner, supporting variable-sized image block operations. First, by introducing the deferred update strategy, we realize the pulse-enhanced parallel push-relabel (PEPPR) algorithm to address data contention and conflict in parallel processing. Second, the simultaneous weighted push method is proposed, further enhancing parallel operations. Lastly, we introduce the efficient diffusion wave search (DWS) algorithm to expedite algorithm convergence and reduce redundancy. While achieving a modest 1.7× acceleration compared to state-of-the-art implementations, the proposed algorithm (PEPPR-DWS) successfully overcomes the inherent limitations of checkerboard scheduling in full pixel-level parallelism. In the test based on Middlebury benchmark v3, the proposed 8-neighborhood implementation exhibits a reduction of error rate by over 1% compared to the typical 4-neighborhood implementation. It provides a versatile and efficient solution for high-precision and real-time applications, holding substantial potential for practical applications.

Keywords: Parallel processing; Field programmable gate arrays; Convergence; Graphics processing units; Scheduling; Optimization; Design automation; Diffusion wave search (DWS); field programmable gate array (FPGA) acceleration; hardware-friendly; pixel-level parallelism; push-relabel; simultaneous weighted push (SWP)
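
For context on the baseline algorithm named in the abstract above, here is a minimal sequential sketch of textbook push-relabel maximum flow in Python. It is not PEPPR-DWS: it has none of the deferred update, simultaneous weighted push, or diffusion wave search steps, and no pixel-level parallelism. The function name `push_relabel_max_flow` and the dense capacity-matrix input are assumptions made for illustration.

```python
from collections import deque


def push_relabel_max_flow(capacity, source, sink):
    """Maximum flow via FIFO push-relabel on a dense capacity matrix."""
    n = len(capacity)
    flow = [[0] * n for _ in range(n)]   # skew-symmetric net flow
    height = [0] * n
    excess = [0] * n
    height[source] = n

    # Saturate every edge leaving the source to create the initial preflow.
    active = deque()
    for v in range(n):
        c = capacity[source][v]
        if c > 0:
            flow[source][v] = c
            flow[v][source] = -c
            excess[v] += c
            excess[source] -= c
            if v != sink:
                active.append(v)

    while active:
        u = active.popleft()
        # Discharge u: push while an admissible edge exists, relabel when stuck.
        while excess[u] > 0:
            pushed = False
            for v in range(n):
                residual = capacity[u][v] - flow[u][v]
                if residual > 0 and height[u] == height[v] + 1:
                    delta = min(excess[u], residual)
                    flow[u][v] += delta
                    flow[v][u] -= delta
                    excess[u] -= delta
                    # v becomes active only if it had no excess before.
                    if v not in (source, sink) and excess[v] == 0:
                        active.append(v)
                    excess[v] += delta
                    pushed = True
                    if excess[u] == 0:
                        break
            if not pushed:
                # Relabel: lift u just above its lowest residual neighbour.
                height[u] = 1 + min(height[v] for v in range(n)
                                    if capacity[u][v] - flow[u][v] > 0)

    # Net flow leaving the source equals the maximum flow value.
    return sum(flow[source][v] for v in range(n))
```

In graph-cut vision problems the vertices are pixels and the edges connect each pixel to its 4 or 8 neighbours, which is the neighbourhood size the abstract discusses. The dense matrix here is only practical for small examples.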

A Novel Intensity-Corrected Blue Channel Compensation and Edge-Preserving Contrast Enhancement Using Laplace Filter and Sigmoid Function for Sand-Dust Image Enhancement

IEEE ACCESS, 2025, Vol. 13, pp. 43127-43144

Authors: Masood, Muhammad Khawaja Kashif; Baro, Enrique Nava; Roth, Pablo Otero. Affiliations: Univ Malaga Inst Ocean Engn Res Malaga 29071 Spain; Qassim Univ Elect Engn Dept Buraydah 52571 Saudi Arabia

Outdoor computer vision systems face significant challenges due to reduced visibility and severe color distortion in the images captured in sand-dust-affected environments. This study aims to improve the visibility of sand-dust-degraded images. To achieve this goal, a novel and effective method is proposed to remove the sand-dust color cast and enhance image visibility. The proposed method combines two essential color model methods to remove the sand-dust color cast and enhance image clarity. In the initial phase, sand-dust removal is achieved using a novel Intensity-corrected blue channel compensation along with white balancing for color adjustment based on the Red-Green-Blue (RGB) color model. In the next phase, a novel Edge-preserving contrast enhancement method is applied to improve the visibility under sand-dust conditions. This method consists of CLAHE, a Gaussian blur filter, a Laplace filter, and the sigmoid function. Using the Hue-Saturation-Value (HSV) color model, CLAHE is applied for contrast enhancement; the Gaussian blur filter removes high-frequency noise, and the Laplace filter enhances edge detection, all targeting the V (value) channel to refine image details, while the sigmoid function adjusts saturation in the Saturation (S) channel, ensuring natural color balance and improved feature visibility. In-depth qualitative and quantitative evaluations are conducted on images with varying levels of sand-dust intensity (weak, moderate, strong, extreme). The proposed method shows superior performance in Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and processing speed, while significantly reducing computational complexity. Compared to the state-of-the-art CNN and all previous methods, our proposed method is efficient for real-time applications with minimal hardware requirements, making it ideal for embedded vision systems. Furthermore, a novel Energy Efficiency Index (EEI) is used to assess computational cost-effectiveness. The ev

Keywords: Image color analysis; Image edge detection; Image restoration; Visualization; Distortion; Computational efficiency; Green products; Lighting; Indexes; Energy efficiency; Sand-dust images; intensity corrected blue channel compensation; contrast enhancement; Laplace filter; sigmoid function; CLAHE; Gaussian blur filter
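
The second phase described in the abstract above works on separate HSV channels: a Laplace filter on V and a sigmoid on S. The Python sketch below illustrates only that channel split. It is not the authors' method: the abstract gives no formulas, so the rescaled logistic curve, the `gain`, `midpoint`, and `strength` parameters, and the function names are all hypothetical, and the CLAHE and Gaussian blur steps as well as the blue channel compensation are omitted.

```python
import colorsys
import math


def _logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_saturation(s, gain=8.0, midpoint=0.5):
    """Map saturation s in [0, 1] through a logistic curve.

    The curve is rescaled so that 0 maps to 0 and 1 maps to 1
    (hypothetical parameters, not taken from the paper).
    """
    lo = _logistic(gain * (0.0 - midpoint))
    hi = _logistic(gain * (1.0 - midpoint))
    return (_logistic(gain * (s - midpoint)) - lo) / (hi - lo)


def laplace_sharpen(v, strength=1.0):
    """Sharpen a 2D value channel by subtracting its 4-neighbour Laplacian.

    Border pixels are left unchanged; results are clamped to [0, 1].
    """
    h, w = len(v), len(v[0])
    out = [row[:] for row in v]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (v[y - 1][x] + v[y + 1][x] + v[y][x - 1] + v[y][x + 1]
                   - 4.0 * v[y][x])
            out[y][x] = min(1.0, max(0.0, v[y][x] - strength * lap))
    return out


def enhance(rgb):
    """rgb: rows of (r, g, b) floats in [0, 1]; returns an image of the same shape."""
    hsv = [[colorsys.rgb_to_hsv(*px) for px in row] for row in rgb]
    # Edge enhancement touches only the V channel.
    v_sharp = laplace_sharpen([[px[2] for px in row] for row in hsv])
    # Saturation adjustment touches only the S channel; hue is untouched.
    return [[colorsys.hsv_to_rgb(h, sigmoid_saturation(s), v_sharp[y][x])
             for x, (h, s, _) in enumerate(row)]
            for y, row in enumerate(hsv)]
```

Keeping hue untouched and editing S and V independently is what lets the saturation curve change color intensity without shifting the colors themselves, which is the property the abstract attributes to working in HSV.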

Analysis of COVID-19 CT Chest Image Classification using Dl4jMlp Classifier and Multilayer Perceptron in WEKA Environment

CURRENT MEDICAL IMAGING, 2024, Vol. 20, No. 1, pp. 1-7

Authors: Sreejith, S.; Ajayan, J.; Reddy, N. V. Uma; Devasenapati, Babu S.; Rebelli, Shashank. Affiliations: New Horizon Coll Engn Bengaluru Karnataka India; SR Univ Warangal Telangana India; CMR Univ Bengaluru Karnataka India

Introduction: In recent years, various deep learning algorithms have exhibited remarkable performance in various data-rich applications, like health care, medical imaging, as well as in computer vision. COVID-19, which is a rapidly spreading virus, has affected people of all ages both socially and economically. Early detection of this virus is therefore important in order to prevent its further spread. Methods: The COVID-19 crisis has also galvanized researchers to adopt various machine learning as well as deep learning techniques in order to combat the pandemic. Lung images can be used in the diagnosis of COVID-19. Results: In this paper, we have analysed the COVID-19 chest CT image classification efficiency using multilayer perceptron with different imaging filters, like edge histogram filter, colour histogram equalization filter, color-layout filter, and Gabor filter in the WEKA environment. Conclusion: The performance of CT image classification has also been compared comprehensively with the deep learning classifier Dl4jMlp. It was observed that the multilayer perceptron with edge histogram filter outperformed other classifiers compared in this paper with 89.6% of correctly classified instances.

Keywords: COVID-19 classification; Computed tomography; Deep learning; Multilayer perceptron; Confusion matrix; WEKA

Comparative Study on Evaluating the Performance of Automated Bacterial Colony Counting with Available APP and Software on Generated Image Dataset

SN Computer Science, 2025, Vol. 6, No. 4, pp. 1-16

Authors: Arora, Prachi; Tewary, Suman; Krishnamurthi, Srinivasan; Kumari, Neelam. Affiliations: School of Computing Indian Institute of Information Technology Una (IIIT-Una) Una 177209 India; Academy of Scientific and Innovative Research (AcSIR) Ghaziabad 201002 India; Thin Film Coating Facility CSIR-Central Scientific Instruments Organisation (CSIR-CSIO) Sector 30-C Chandigarh 160030 India; Materials Science and Sensor Applications CSIR-Central Scientific Instruments Organisation (CSIR-CSIO) Sector 30-C Chandigarh 160030 India; Advanced Materials and Processes Division CSIR-National Metallurgical Laboratory (CSIR-NML) Jamshedpur 831007 India; MTCC-Gene bank CSIR-Institute of Microbial Technology (CSIR-IMTECH) Sector 39-A Chandigarh 160039 India

Recent developments in image analysis and interpretation using computer vision techniques have shown potential for novel applications in microbiology laboratories to support the task of automation, aiming for faster and more reliable detection. Image processing techniques and machine learning models can be valuable tools in the screening process, helping technicians spend less time classifying no-growth results and quickly separating the categories for further analysis. In this context, creating a dataset of different bacterial strain images is a fundamental objective for developing and improving the accuracy of image processing models. Therefore, this manuscript acquired a dataset of water samples with different bacterial strain images on a petri dish following a standardized process with controlled conditions of positioning and lighting. The image acquisition device was also developed with a light-emitting diode (LED) and diffuser as a lighting source and a smartphone camera with 16 MP resolution. In addition, the present manuscript also focuses on comparing the accuracy of the proposed algorithm with the available apps and software using the custom-built imaging device. Hence, the resulting dataset consists of 100 images, which is helpful for researchers working in image processing to develop an algorithm for automated counting of bacterial colonies on petri dishes. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Bacterial colonies; Image datasets; Image processing; Imaging device; Segmentation; Water samples
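
As an illustration of what automated colony counting involves, the following Python sketch counts bright blobs in a grayscale plate image by thresholding and connected-component labeling. It is a generic baseline under stated assumptions, not the algorithm proposed in the paper, which the abstract does not describe: the function name `count_colonies`, the `threshold` and `min_area` values, and the assumption that colonies are brighter than the background are all hypothetical.

```python
from collections import deque


def count_colonies(gray, threshold=128, min_area=2):
    """Count bright 4-connected blobs that cover at least min_area pixels.

    gray: rows of integer intensities (0-255). Pixels at or above
    threshold are treated as colony pixels (assumption).
    """
    h, w = len(gray), len(gray[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y0 in range(h):
        for x0 in range(w):
            if seen[y0][x0] or gray[y0][x0] < threshold:
                continue
            # Breadth-first flood fill over one connected component.
            area, queue = 0, deque([(y0, x0)])
            seen[y0][x0] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and gray[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Small blobs are treated as noise rather than colonies.
            if area >= min_area:
                count += 1
    return count
```

A blob-counting baseline like this undercounts touching colonies, since merged colonies form a single component. That is one reason a labeled image dataset is useful for comparing counting tools.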
