Search Results - Inner Mongolia University Library

7th International Conference on Image and Signal Processing and their Applications, ISPA 2022

Authors: Samia, Bougareche Soraya, Zehani Malika, Mimi Biskra University Dept. of Electrical Engineering Biskra Algeria Mostaganem University Dept. of Electrical Engineering Mostaganem Algeria

ISBN: (print) 9781665480420

Fashion, the way we present ourselves, is primarily visual and has attracted great interest from computer vision researchers. It is generally used to search for fashion products in online shopping malls and to obtain descriptive information about the product. The main objective of our paper is to use deep learning (DL) and machine learning (ML) methods to correctly identify and categorize clothing images. In this work, we used ML algorithms (support vector machines (SVM), K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF)), DL algorithms (Convolutional Neural Network (CNN), AlexNet, GoogleNet, LeNet, LeNet5) and transfer learning using pretrained models (VGG16, MobileNet and ResNet50). We trained and tested our models online using Google Colaboratory with the TensorFlow/Keras and Scikit-Learn libraries, which support deep learning and machine learning in Python. The main metrics used in our study to evaluate the performance of the ML and DL algorithms are accuracy and the confusion matrix. The best result for the ML models was obtained with ANN (88.71%) and the best result for the DL models with the GoogleNet architecture (93.75%). The results showed that the number of epochs and the depth of the network affect the quality of the results. © 2022 IEEE.
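
The two evaluation metrics named in the abstract above, accuracy and the confusion matrix, can be sketched in a few lines of plain Python. This is only an illustration: the function names and the toy labels are hypothetical and are not taken from the paper, which used Scikit-Learn and TensorFlow/Keras.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count predictions per class; rows are true labels, columns are predicted labels."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical labels for a 3-class toy problem (not data from the paper).
y_true = [0, 0, 1, 1, 2, 2, 2, 1]
y_pred = [0, 1, 1, 1, 2, 0, 2, 1]

cm = confusion_matrix(y_true, y_pred, num_classes=3)
acc = accuracy(cm)
print(cm)
print(acc)
```

Off-diagonal entries show which classes are confused with each other, which a single accuracy figure cannot reveal.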

Keywords: image classification

MAFT: An Image Super-Resolution Method Based on Mixed Attention and Feature Transfer

6th Asia Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint International Conference on Web and Big Data (APWeb-WAIM)

Authors: Liu, Xin Li, Jing Cui, Yuanning Zhu, Wei Qian, Luhong Nanjing Univ Aeronaut & Astronaut Coll Artificial Intelligence Coll Comp Sci & Technol Nanjing 211106 Peoples R China Nanjing Univ Nanjing 210023 Peoples R China Kunshan Huaheng Welding Co Ltd Suzhou 215300 Peoples R China

ISBN: (digital) 9783031251986

ISBN: (print) 9783031251979; 9783031251986

Reference-based image super-resolution methods, which enhance the restoration of a low-resolution (LR) image by introducing an additional high-resolution (HR) reference image, have made rapid and remarkable progress in the field of image super-resolution in recent years. Most of the existing methods use an implicit correspondence matching approach to transfer HR features from the reference image (Ref) to the LR image. However, these methods lack further judgment and processing of the HR features from Ref, which limits them in challenging cases. In this paper, we propose an image super-resolution method based on mixed attention and feature transfer (MAFT). First, we obtain the deep features of the LR and Ref images through the encoder network, then extract the transferred features from Ref through the attention network and perform adaptive optimization processing on the features, and finally fuse the transferred features with the LR features to achieve high-quality image reconstruction. Quantitative and qualitative experiments on the CUFED5, Urban100 and Manga109 benchmarks show that MAFT outperforms the state-of-the-art baselines with significant improvements.
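
The general idea of transferring reference features to LR features through attention, then fusing them, can be sketched as follows. This is a minimal illustration using plain dot-product attention and a fixed blend weight; it is not the MAFT architecture, whose mixed attention and adaptive optimization steps are not detailed in the abstract. All names (`transfer_features`, `fuse`, `alpha`) and the toy feature vectors are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def transfer_features(lr_feats, ref_feats):
    """For each LR feature vector, return an attention-weighted mix of Ref feature vectors."""
    transferred = []
    for query in lr_feats:
        weights = softmax([dot(query, key) for key in ref_feats])
        mixed = [
            sum(w * key[d] for w, key in zip(weights, ref_feats))
            for d in range(len(query))
        ]
        transferred.append(mixed)
    return transferred


def fuse(lr_feats, transferred, alpha=0.5):
    """Blend LR and transferred features; alpha is a hypothetical fixed fusion weight."""
    return [
        [(1 - alpha) * a + alpha * b for a, b in zip(lr_vec, tr_vec)]
        for lr_vec, tr_vec in zip(lr_feats, transferred)
    ]


# Toy 2-D feature vectors standing in for encoder outputs (illustrative only).
lr = [[1.0, 0.0], [0.0, 1.0]]
ref = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
fused = fuse(lr, transfer_features(lr, ref))
print(fused)
```

Each LR position draws most heavily on the reference features it correlates with, so the first fused vector stays dominated by its first component.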

Keywords: Computer vision machine learning Super-resolution Attention mechanism

Speech Emotion Recognition Using CNN-LSTM and Vision Transformer

13th International Conference on Innovations in Bio-Inspired Computing and Applications, IBICA 2022, and 12th World Congress on Information and Communication Technologies, WICT 2022

Authors: Kumar, C S Ayush Maharana, Advaith Das Krishnan, Srinath Murali Hanuma, Sannidhi Sri Sai Lal, G. Jyothish Ravi, Vinayakumar Amrita School of Engineering Coimbatore Amrita Vishwa Vidyapeetham Coimbatore India Center for Artificial Intelligence Prince Mohammad Bin Fahd University Khobar Saudi Arabia

ISBN: (print) 9783031274985

The importance of speech emotion recognition has increased as a result of the acceptance of intelligent conversational assistant services. Communication between humans and machines may be improved through emotion recognition and analysis. We propose the application of attention-based deep learning techniques to process and recognize speech emotions. In this paper we examine two major approaches, CNN-LSTM and Mel spectrogram-Vision Transformer based models, and compare them with the existing benchmarks. The experimental results support the feature extraction strategy of deep learning based approaches, which eliminates the need to handpick features for the traditional machine learning (ML) classifiers present in the current literature. A comparative study of CNN-LSTM and Vision Transformers (ViT) has been carried out on the basis of the experimental results. Both models performed similarly, with CNN-LSTM giving an accuracy of 88.50% compared to 85.36% for ViT, surpassing the existing benchmarks and opening up the study of attention and image processing based learning for speech emotion recognition. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: image processing

Recovering Image Information from Speckle Noise by Image Processing

6th International Conference on Machine Vision and Applications, ICMVA 2023

Authors: Nie, Jianlin Hanson, Steen G. Takeda, Mitsuo Wang, Wei Xi'An Technological University Shaanxi Xi'an China DTU Fotonik Department of Photonics Engineering Technical University of Denmark Roskilde DK-4000 Denmark Utsunomiya University Utsunomiya Tochigi Japan School of Engineering and Physical Sciences Heriot-Watt University Edinburgh EH14 4AS United Kingdom

ISBN: (print) 9781450399531

As a kind of noise, speckle seriously affects the imaging quality of optical imaging systems. However, the speckle image carries a large amount of information related to the physical characteristics of the object surface, which can be used as a basis to identify and judge hidden objects. In this paper, speckle noise removal in optical imaging is studied. The average of the squared moduli of the spectra of short-exposure speckle images is derived to recover the amplitude information. At the same time, the cross-spectrum function is used to recover the phase information. We use this method to process the images. Then, a simulation experiment analysis is carried out by varying two aspects: the stacking number and the object. The results show that this method can recover the feature information from the speckle image, thus verifying the feasibility of the method. © 2023 ACM.
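
The two averaged quantities mentioned in the abstract above can be sketched in one dimension: the mean squared modulus of the frame spectra, and a cross-spectrum between neighbouring frequencies. This is an illustrative sketch only. The frames here are circularly shifted copies of a toy object rather than simulated speckle, and the particular cross-spectrum form (frequency `u` against `u + shift`) is an assumption, since the abstract does not state the exact definition used.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def average_power_spectrum(frames):
    """Mean of |F_k(u)|^2 over all frames; carries amplitude information only."""
    spectra = [dft(f) for f in frames]
    n = len(frames[0])
    return [sum(abs(s[u]) ** 2 for s in spectra) / len(spectra) for u in range(n)]


def average_cross_spectrum(frames, shift=1):
    """Mean of F_k(u) * conj(F_k(u + shift)); its argument encodes phase differences."""
    spectra = [dft(f) for f in frames]
    n = len(frames[0])
    return [
        sum(s[u] * s[(u + shift) % n].conjugate() for s in spectra) / len(spectra)
        for u in range(n)
    ]


# Toy object and three circularly shifted "short exposures" (illustrative only).
obj = [0, 1, 3, 1, 0, 0, 0, 0]
frames = [obj[-s:] + obj[:-s] for s in (0, 1, 2)]

aps = average_power_spectrum(frames)
cs = average_cross_spectrum(frames)
print([round(v, 3) for v in aps])
```

Because a shift changes only the phase of a spectrum, the averaged power spectrum of the shifted frames equals the power spectrum of the object itself, which is why phase has to be recovered separately.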

Keywords: image processing

Real-Time Facial Expression Recognition Using Edge AI Accelerators

Author: Smith, Mark Heath University of South Carolina

Degree level: M.S., Master of Science/Master of Surgery

Facial expression recognition is a popular and challenging area of research in machine learning applications. Facial expressions are critical to human communication and allow us to convey complex thoughts and emotions beyond spoken language. The complexity of facial expressions creates a difficult problem for computer vision systems, especially edge computing systems. Current Deep Learning (DL) methods rely on large-scale Convolutional Neural Networks (CNN) which require millions of floating point operations (FLOPS) to accomplish similar image classification tasks. However, on edge and IoT devices, large-scale convolutional models can cause problems due to memory and power limitations. The intent of this work is to propose a neural network architecture inspired by deep CNNs which is tuned for deployment on edge devices and small-form-factor edge AI accelerators. This will be carried out by strategically reducing the size of the network while still achieving good discrimination between classes. Additionally, performance metrics such as latency, accuracy, throughput, and power consumption will be captured and compared with several popular deep CNN models. It is expected that there will be trade-offs between network size and performance when the model is deployed and running model inference on edge AI accelerators such as the Intel Movidius Neural Compute Stick II and the NVIDIA Jetson Nano GPU accelerator. An additional benefit of smaller-scale convolutional models is that they are better suited to be converted into spiking neural networks and deployed on neuromorphic hardware such as the Intel Loihi neuromorphic chip. Furthermore, this work will also examine various image processing techniques across multiple datasets in an effort to increase the performance of the edge-efficient model.

Keywords: Edge computing Energy efficient computing Facial expression recognition image classification machine learning Real time systems

A view of computational models for image segmentation

Annali dell'Universita di Ferrara, 2022, vol. 68, no. 2, pp. 277-294

Authors: Antonelli, Laura De Simone, Valentina di Serafino, Daniela Institute for High Performance Computing and Networking (ICAR) CNR Via Pietro Castellino 111 Napoli 80131 Italy Department of Mathematics and Physics University of Campania “Luigi Vanvitelli” viale Abramo Lincoln 5 Caserta 81100 Italy Department of Mathematics and Applications “R. Caccioppoli” University of Naples Federico II Via Cintia Monte S. Angelo Napoli 80126 Italy

Image segmentation is a central topic in image processing and computer vision and a key issue in many applications, e.g., in medical imaging, microscopy, document analysis and remote sensing. According to human perception, image segmentation is the process of dividing an image into non-overlapping regions. These regions, which may correspond, e.g., to different objects, are fundamental for the correct interpretation and classification of the scene represented by the image. The division into regions is not unique, but it depends on the application, i.e., it must be driven by the final goal of the segmentation and hence by the most significant features with respect to that goal. Thus, image segmentation can be regarded as a strongly ill-posed problem. A classical approach to deal with ill-posedness consists in incorporating in the model a-priori information about the solution, e.g., in the form of penalty terms. In this work we provide a brief overview of basic computational models for image segmentation, focusing on edge-based and region-based variational models, as well as on statistical and machine-learning approaches. We also sketch numerical methods that are applied in computing solutions to these models. In our opinion, our view can help the readers identify suitable classes of methods for solving their specific problems. © 2022, The Author(s).
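
As a rough illustration of the region-based idea mentioned above, the sketch below splits pixel intensities into two regions by alternately updating the two region means and reassigning each pixel to the closer mean. This keeps only the data-fidelity part of a two-region piecewise-constant model; the penalty (regularization) terms that the abstract describes as the remedy for ill-posedness are deliberately omitted, so it should not be read as any specific model from the survey. The function name and the toy data are hypothetical.

```python
def two_phase_segment(pixels, iterations=20):
    """Label each intensity 0 or 1 by alternating mean updates and reassignment."""
    threshold = sum(pixels) / len(pixels)
    labels = [1 if p > threshold else 0 for p in pixels]
    for _ in range(iterations):
        inside = [p for p, lab in zip(pixels, labels) if lab == 1]
        outside = [p for p, lab in zip(pixels, labels) if lab == 0]
        if not inside or not outside:
            break  # one region is empty; nothing left to update
        c1 = sum(inside) / len(inside)
        c0 = sum(outside) / len(outside)
        # Assign each pixel to the region whose mean intensity is closer.
        new_labels = [1 if (p - c1) ** 2 < (p - c0) ** 2 else 0 for p in pixels]
        if new_labels == labels:
            break  # converged
        labels = new_labels
    return labels


# Toy 1-D "image" with a dark and a bright region (illustrative only).
pixels = [0.1, 0.2, 0.15, 0.9, 0.8, 0.85]
labels = two_phase_segment(pixels)
print(labels)
```

Without a penalty term, the result depends only on intensity and ignores spatial coherence, which is one concrete way the unregularized problem falls short.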

Keywords: Ill-posed problems image segmentation machine-learning Numerical optimization

Photonic signal processor based on a Kerr microcomb for real-time video image processing

COMMUNICATIONS ENGINEERING, 2023, vol. 2, no. 1, p. 94

Authors: Tan, Mengxi Xu, Xingyuan Boes, Andreas Corcoran, Bill Nguyen, Thach G. Chu, Sai T. Little, Brent E. Morandotti, Roberto Wu, Jiayang Mitchell, Arnan Moss, David J. Beihang Univ Sch Elect & Informat Engn Beijing 100191 Peoples R China Swinburne Univ Technol Opt Sci Ctr Hawthorn Vic 3122 Australia RMIT Univ Sch Engn Melbourne Vic 3001 Australia Beijing Univ Posts & Telecommun State Key Lab Informat Photon & Opt Commun Beijing 100876 Peoples R China Univ Adelaide Inst Photon & Adv Sensing IPAS Adelaide SA 5005 Australia Univ Adelaide Sch Elect & Elect Engn Adelaide SA 5005 Australia Monash Univ Dept Elect & Comp Syst Engn Clayton Vic 3168 Australia City Univ Hong Kong Dept Phys & Mat Sci Tat Chee Ave Hong Kong Peoples R China Chinese Acad Sci Xian Inst Opt & Precis Mech Xian Peoples R China INRS Energie Materiaux & Telecommun 1650 Blvd Lionel Boulet Varennes J3X 1S2 PQ Canada

Signal processing has become central to many fields, from coherent optical telecommunications, where it is used to compensate signal impairments, to video image processing. Image processing is particularly important for observational astronomy, medical diagnosis, autonomous driving, big data and artificial intelligence. For these applications, signal processing traditionally has mainly been performed electronically. However, these, as well as new applications, particularly those involving real-time video image processing, are creating unprecedented demand for ultrahigh performance, including high bandwidth and reduced energy consumption. Here, we demonstrate a photonic signal processor operating at 17 Terabits/s and use it to process video image signals in real time. The system processes 400,000 video signals concurrently, performing 34 functions simultaneously that are key to object edge detection, edge enhancement and motion blur. As compared with spatial-light devices used for image processing, our system is not only ultra-high speed but highly reconfigurable and programmable, able to perform many different functions without any change to the physical hardware. Our approach is based on an integrated Kerr soliton crystal microcomb, and opens up new avenues for ultrafast robotic vision and machine learning.

Keywords: Frequency combs Solitons

Regional Transformer for Image Super-Resolution

7th International Conference on Machine Vision and Information Technology, CMVIT 2023

Authors: Yang, Sen Yang, Jiahong Xu, Dahong Li, Xi Hunan Normal University China Hunan Normal University Key Laboratory of Sports Intelligence Research China

ISBN: (print) 9781665464857

In the image super-resolution algorithm model, a large receptive field can provide more valuable features, so the Transformer with strong information interaction ability has achieved excellent results in image super-resolution processing applications. However, when the range of the receptive field reaches a certain critical value, the restoration performance of the super-resolution algorithm also reaches a certain critical value, which indicates that unconditionally increasing the receptive field will not continue to improve the restoration performance. At the same time, the larger the receptive field range, the more data the model needs to process, which also seriously increases the computational complexity of the algorithm. In order to exchange information over a wider range more effectively, in this paper, a new type of super-resolution network based on Transformer, namely Regional Transformer, is designed. The key element in the newly designed network structure is the Region Block (RB) with the Boundary Restriction (BR) mechanism. In addition, the paper designs a Boundary Restriction based on coarse-to-fine pipes. This paper conducts a large number of experiments on multiple datasets, and the experiments show that the network structure designed in this paper has a significant improvement in performance. © 2023 IEEE.

Keywords: Restoration

Development of a fusion technique and an algorithm for merging images recorded in the IR and visible spectrum in dust and fog

Conference on Electro-Optical and Infrared Systems - Technology and Applications XIX

Authors: Semenishchev, Evgeny Zelensky, Aleksandr Alepko, Andrey Zhdanova, Marina Voronin, Viacheslav Ilyukhin, Yury Tula State Univ TulSU Lab Cognit Technol & Simulat Syst 92 Sq Lenina Tula 300012 Tula Russia Moscow State Tech Univ STANKIN Ctr Cognit Technol & Machine Vis 1a Vadkovsky Moscow 127055 Russia

ISBN: (print) 9781510655461

The article proposes a fusion technique and an algorithm for combining images recorded in the IR and visible spectrum in relation to the problem of processing products by robotic complexes in dust and fog. Primary data processing is based on multi-criteria processing with complex data analysis and cross-change of the filtration coefficient for different types of data. The search for base points is based on the technique of reducing the range of clusters (image simplification) and searching for transition boundaries by determining the slope of the function in local areas. To evaluate effectiveness, pairs of test images obtained by sensors with resolutions of 1024x768 (8 bit, color image, visible range) and 640x480 (8 bit, color, IR image) are used. Images of simple shapes are used as the analyzed objects.

Keywords: image fusion machine vision preprocessing IR noise robotic complexes

S²NeRF: Neural Radiance Fields Training with Sparse Points and Sparse Views

17th International Conference on Intelligent Robotics and Applications

Authors: Zhang, Zhihong Wang, Wenjun Qi, Dexin Mei, Xuesong Xi An Jiao Tong Univ Sch Mech Engn Xian 710049 Shaanxi Peoples R China State Key Lab Mfg Syst Engn Xian 710049 Shaanxi Peoples R China Shaanxi Key Lab Intelligent Robots Xian 710049 Shaanxi Peoples R China

ISBN: (print) 9789819607730; 9789819607747

Neural volume rendering methods, especially NeRF, have demonstrated remarkable performance in novel view synthesis. However, NeRF relies solely on image data and lacks explicit geometric information, necessitating a large number of posed images and a computationally intensive ray sampling strategy to learn accurate scene representations. This poses challenges and may result in incomplete or locally optimal scene geometry when views are sparse or incomplete, as the limited views may not provide sufficient constraints to determine a unique geometry solution for complex scenes. Meanwhile, sparse point clouds provide an attractive source of scene information, especially for geometry, to complement images in neural scene representations, particularly when input views are sparse. To overcome these limitations, we propose S²NeRF, a novel Neural Radiance Field that simultaneously incorporates features from both point clouds and images for volume rendering. Specifically, S²NeRF extracts patch-wise point features from point clouds and raywise image features from adjacent views. Then the scene feature volume is constructed by implicitly fusing these point and image features through self-attention. Finally, the volume feature is utilized to render novel views of the scene. Experimental results on the challenging TartanAir dataset demonstrate that, thanks to the integration of feature volume from point clouds and images, S²NeRF achieves state-of-the-art performance in novel view synthesis.
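
The volume rendering step that NeRF-style methods rely on can be sketched as follows: samples along a ray are composited using their densities and colors. This shows only the standard emission-absorption quadrature, not the point and image feature fusion that the abstract describes; the function name and the sample values are illustrative assumptions.

```python
import math


def render_ray(densities, colors, deltas):
    """Composite samples along one ray, front to back.

    densities: non-negative volume density at each sample
    colors:    RGB triple at each sample
    deltas:    distance between consecutive samples
    Returns the rendered RGB and the per-sample compositing weights.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed so far
    rgb = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(densities, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for c in range(3):
            rgb[c] += weight * color[c]
        transmittance *= 1.0 - alpha
    return rgb, weights


# Toy ray: empty space, then a dense red surface, then a green sample behind it.
densities = [0.0, 50.0, 50.0]
colors = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
deltas = [0.5, 0.5, 0.5]

rgb, weights = render_ray(densities, colors, deltas)
print(rgb, weights)
```

The dense red sample absorbs almost all remaining transmittance, so the green sample behind it contributes next to nothing, which is how occlusion emerges from the weights.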

Keywords: machine Learning Computer vision Volume Rendering
