Search results - Inner Mongolia University Library

Robust automatic crater detection at all latitudes on Mars with deep-learning

PLANETARY AND SPACE SCIENCE, 2025, Vol. 260

Authors: Martinez, L.; Andrieu, F.; Schmidt, F.; Talbot, H.; Bentley, M. S. Affiliations: Univ Paris Saclay CNRS GEOPS F-91405 Orsay France; Inst Univ France IUF Paris France; CentraleSupelec Paris Saclay Univ Inria Gif Sur Yvette France; European Space Astron Ctr Camino Bajo CastilloUrb Villafranca CastilloVill Madrid 28692 Spain

Understanding the distribution and characteristics of impact craters on planetary surfaces is essential for unraveling geological processes and the evolution of celestial bodies. Several machine learning and AI-based approaches have been proposed to detect craters on planetary surface images automatically. However, designing a robust tool for an entire complex planet such as Mars is still an open problem. This article presents a novel approach using the Faster Region-based Convolutional Neural Network (Faster R-CNN) for such detection. The proposed method involves pre-processing, training and crater detection steps, which are specifically designed for robustness with regard to latitude and complex geomorphological features. The objectives of this study are (i) to be robust at all latitudes and (ii) for crater diameters >= 1 km, and (iii) to propose an open-source and reusable algorithm that (iv) needs only an image to run. Extensive experiments on high-resolution planetary imagery demonstrate excellent performance, with an average precision AP(50) > 0.82 under an intersection-over-union criterion IoU >= 0.5, irrespective of crater scale. For mid and high latitudes (higher than 48 degrees north and south), performance decreases to AP(50) of approximately 0.7, which is still better than the current state of the art. The loss of performance is mostly due to strong shadowing effects. Our results also highlight the versatility and potential of our robust model for automating the analysis of craters across different celestial bodies. The automated crater detection tool presented in this article is publicly available as open source and holds great promise for future scientific research and space exploration missions.

Keywords: Detection algorithm; Impact craters; Computer vision; Artificial intelligence
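
The abstract above scores detections with an intersection-over-union criterion IoU >= 0.5. As a rough illustration of that criterion only, the following Python sketch computes IoU for two axis-aligned bounding boxes. The box format `(x_min, y_min, x_max, y_max)` and the function names `iou` and `is_true_positive` are assumptions made for this example; they are not taken from the authors' tool.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption
    of the sketch, not something stated in the abstract.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A detection counts as a match when IoU reaches the threshold."""
    return iou(pred_box, truth_box) >= threshold
```

AP(50) is then the average precision obtained when matches are decided with `threshold=0.5`; the precision-recall integration itself is not shown here.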


Deep Learning-Based Depth Map Defect Removal for Industrial Applications


Conference on Optical Metrology and Inspection for Industrial Applications X

Authors: Voronin, V.; Gapon, N.; Zhdanova, M.; Tokareva, O.; Khamidullin, I.; Semenishchev, E. Affiliations: Moscow State Univ Technol STANKIN Ctr Cognit Technol & Machine Vis Moscow Russia; Don State Tech Univ Rostov Na Donu Russia

ISBN: (print) 9781510667877; 9781510667884

A system for determining the distance from the robot to the scene is useful for object tracking, and 3-D reconstructions may be desired for many manufacturing and robotic tasks. While the robot is processing materials (welding parts, milling, drilling, etc.), fragments of material fall on the camera installed on the robot. These fragments introduce spurious information into the depth map and create new lost areas, which leads to incorrect determination of object sizes. Erroneous distance estimates in turn produce wrong regions in the depth map and reduce the accuracy of motion trajectory planning. We present an approach combining defect detection and depth reconstruction algorithms. The first step, image defect detection, is based on a convolutional auto-encoder (U-Net). The second step is depth map reconstruction using a spatial reconstruction based on a geometric model with contour and texture analysis. We apply contour restoration and texture synthesis for image reconstruction. A method is proposed for restoring the boundaries of objects in an image by constructing a composite curve from cubic splines. Our technique outperforms state-of-the-art methods quantitatively in reconstruction accuracy on the RGB-D benchmark for evaluating manufacturing vision systems.

Keywords: depth map; defect detection; image reconstruction; manufacturing; deep learning; U-Net


Beyond images: data visualization through headline analysis in historical newspaper with computer vision


2023 International Workshop on Signal Processing and Machine Learning, WSPML 2023

Authors: Yip, Michael Kin-Fu; Lum, Vincent Wai-Yip. Affiliation: Digital Initiatives The Chinese University of Hong Kong Library Hong Kong

ISBN: (print) 9781510671928

This paper introduces an innovative method that combines computer vision and deep learning to extract headlines from a historical newspaper. Through the illustrations from historical newspapers, one of our goals is to use these extracted headlines to support digital humanities. The research goes beyond traditional image analysis by exploring how new digital technologies can facilitate the understanding of newspaper content by visualizing it through time and place. The experimental results reveal that our recommended approaches, which involve Optical Character Recognition (OCR) with scraping and deep learning object detection models, can successfully obtain the information required for more advanced analytics. Because of its distinctive historical and humanities value, we chose "The Hongkong News" from the Hong Kong Early Tabloid Newspaper collection to illustrate the efficacy of our methodology. In addition, we constructed several visualization applications to demonstrate the viability of our suggested approaches. © 2023 SPIE. All rights reserved.

Keywords: Geographic information systems


Efficient Hardware Architectures for Accelerating Deep Neural Networks: Survey


IEEE ACCESS, 2022, Vol. 10, pp. 131788-131828

Authors: Dhilleswararao, Pudi; Boppu, Srinivas; Manikandan, M. Sabarimalai; Cenkeramaddi, Linga Reddy. Affiliations: Indian Inst Technol Bhubaneswar Sch Elect Sci Bhubaneswar 752050 India; Indian Inst Technol Bhubaneswar Sch Elect Sci Bhubaneswar 678557 India; Univ Agder Dept ICT N-4879 Grimstad Norway

In the modern-day era of technology, a paradigm shift has been witnessed in the areas involving applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL). Specifically, Deep Neural Networks (DNNs) have emerged as a popular field of interest in most AI applications such as computer vision, image and video processing, robotics, etc. In the context of developed digital technologies and the availability of authentic data and data handling infrastructure, DNNs have been a credible choice for solving more complex real-life problems. The performance and accuracy of a DNN are far better than human intelligence in certain situations. However, it is noteworthy that DNNs are computationally very demanding in terms of the resources and time needed to handle these computations. Furthermore, general-purpose architectures like CPUs have issues in handling such computationally intensive algorithms. Therefore, the research community has invested considerable interest and effort in specialized hardware architectures such as Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), and Coarse Grained Reconfigurable Array (CGRA) for the effective implementation of computationally intensive algorithms. This paper brings forward the various research works on the development and deployment of DNNs using the aforementioned specialized hardware architectures and embedded AI accelerators. The review gives a detailed description of the specialized hardware-based accelerators used in the training and/or inference of DNNs. A comparative study of the various accelerators discussed, based on factors like power, area, and throughput, is also made. Finally, future research and development directions, such as future trends in DNN implementation on specialized hardware accelerators, are discussed. This review article is intended to guide hardware architects to accelerate and improve the effe

Keywords: machine learning; field programmable gate array (FPGA); deep neural networks (DNN); deep learning (DL); application specific integrated circuits (ASIC); artificial intelligence (AI); central processing unit (CPU); graphics processing unit (GPU); hardware accelerators


Classification Of Solid Waste Using Computer Vision Techniques


26th IEEE Signal Processing: Algorithms, Architectures, Arrangements, and Applications, SPA 2023

Authors: Akdemir, Burak; Aytac, Eyup Enes; Tosun, Erkani Mert; Yuksel, Seniha Esen. Affiliation: Hacettepe University Department of Electrical and Electronics Engineering Ankara Turkey

ISBN: (print) 9798350304985

In recent years, waste management and the need for recycling have gained more importance than ever due to the increase in population. For this reason, making recycling easily applicable and available at reasonable cost is crucial. Mainstream recycling methods rely on human effort, so there is room for automation. The step of recycling with the most potential for improvement is classification, in which solid wastes are sorted by material. In this work, an alternative to the classification step in mainstream recycling is proposed. The proposed method includes an active heating unit that consists of 3 thermal lamps and a FLIR T420 thermal camera. Objects to be classified by material are heated using the heating unit, then left to cool passively. The process is observed using the thermal camera. The images obtained are then used in machine learning methods for classification. The proposed method aims to automate the classification step in recycling and to make it cheaper and easily applicable. The accuracy of the system is calculated on a dataset collected in our lab. © 2023 Division of Signal Processing and Electronic Systems, Poznan University of Technology (DSPES PUT).

Keywords: Computer vision


Age detection by optimizing the structure of layers and neurons in the neural network


JOURNAL OF OPTICS-INDIA, 2024, Vol. 53, No. 2, pp. 1186-1202

Authors: Jiang, Zhenghong; Zhou, Chunrong. Affiliation: Chongqing Vocat Coll Transportat Sch Big Data Jiangjin 402247 Chongqing Peoples R China

Age detection is a fundamental task in computer vision with numerous applications, from targeted advertising to security systems. This paper proposes a robust approach for age estimation based on local binary patterns to extract features associated with face images. Accurately predicting people's ages from facial images requires overcoming challenges such as changes in lighting conditions, poses, and facial expressions. The proposed method uses a combination of feature extraction, feature selection, and machine learning algorithms, which we call the Hybrid method. First, facial landmarks are detected to determine the key points of the face and enable the extraction of the corresponding facial features. These features are then fed into a feature selection algorithm to identify the most distinctive ones, reducing dimensionality and increasing model efficiency. To evaluate the proposed approach, extensive experiments are conducted on benchmark datasets, including different age groups and ethnicities. The results show the effectiveness of the proposed method in achieving high accuracy and robustness in age estimation. As shown in the calculation results, the detection rate and accuracy of the Hybrid method are better than those of competing methods. For the Hybrid method, the mean absolute error is 4.94 years, with a standard deviation of 4.74 years. In terms of mean absolute error, this age estimation method is superior to other methods that have been presented to date. The proposed method has a final sensitivity of 97.2%, an accuracy of 96.8%, and a precision of 99.1%. In addition, the specifications of the implementation system state that the program can be executed in about 3.5 s, which is a suitable speed for estimating the age of people based on photographs of their faces.

Keywords: ANN; Hybrid method; Age detection; image processing
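
The abstract above reports a mean absolute error together with sensitivity, accuracy, and precision, but does not say how the three classification figures were derived for an age-estimation task. The Python sketch below shows only the standard textbook definitions, assuming a binary confusion matrix; the function names and the example numbers in the usage note are hypothetical and are not the paper's data.

```python
def mean_absolute_error(true_ages, predicted_ages):
    """Average of |true - predicted| over all samples, in years."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


def binary_metrics(tp, fp, tn, fn):
    """Standard definitions from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives/negatives. Applying them to
    age estimation (e.g. per age group) is an assumption of this sketch.
    """
    return {
        "sensitivity": tp / (tp + fn),            # share of real positives found
        "precision": tp / (tp + fp),              # share of predicted positives that are right
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

For instance, `mean_absolute_error([20, 30, 40], [22, 27, 40])` averages the absolute errors 2, 3 and 0 years.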


Automatic Image Caption Generation Using ResNet & Torch Vision


4th International Conference on Machine Learning, Image Processing, Network Security and Data Sciences (MIND)

Authors: Verma, Vijeta; Saritha, Sri Khetwat; Jain, Sweta. Affiliation: Maulana Azad Natl Inst Technol Bhopal India

ISBN: (print) 9783031243660; 9783031243677

Image captioning is a task in which a textual description is generated that illustrates the action performed in the image. It is one of the most complicated research areas where only the machine learning approach can intervene. In image captioning, a system should be intelligent enough to understand the semantic knowledge needed to recognize the objects present in the image and the situation that evolves with them. In the proposed work, an image captioning system has been built using ResNet along with a CNN and an RNN. The CNN is used as an encoder and the RNN is used as a decoder. The system is able to infer the situation precisely on the MSCOCO benchmark. The model has been trained with ResNet152, which effectively utilizes the layers and minimizes the computational time. ResNet skips convolutional layers, which solves the gradient exploding problem; this mechanism is known as a skip connection. The system obtained better Bilingual Evaluation Understudy (BLEU), METEOR, CIDEr, and ROUGE scores than the previously implemented model. The BLEU score has been evaluated with four parameters, B1, B2, B3 and B4, which are 0.57, 0.404, 0.279 and 0.191 respectively. METEOR, CIDEr and ROUGE are 0.195, 0.396 and 0.6 respectively. The training samples have been prepared by reducing the size of the image and enhancing the brightness with Pillow. The system also uses the Torchvision library to improve the model's prediction of the situation.

Keywords: image captioning; ResNet; CNN; RNN; MSCOCO; BLEU; machine learning
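
The B1 to B4 figures in the abstract above are BLEU scores built on n-gram overlap for n = 1 to 4. The Python sketch below illustrates only the clipped ("modified") n-gram precision that BLEU is based on, for a single reference caption. It omits the brevity penalty and the geometric averaging of the full metric, and the function names are chosen for this example; it is not the evaluation code used in the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-token windows of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate caption against one reference.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference, so repeating a word cannot inflate the score.
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0
```

With the candidate `"the the the".split()` and the reference `"the cat".split()`, the unigram precision is 1/3 rather than 1, because the repeated word is clipped to its single occurrence in the reference.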


Software to Assist Visually Impaired People During the Craps Game Using Machine Learning on Python Platform


2nd International Conference on Smart Technologies, Systems and Applications (SmartTech-IC)

Authors: Hernandez Diaz, Nicolas; Penaloza, Yersica C.; Yuliana Rios, Y.; Magre Colorado, Luz A. Affiliations: Univ Tecnol Bolivar Parque Ind & Tecnol Carlos Velez Pombo Cartagena De Indias Colombia; Univ Pamplona Km 1 Via Bucaramanga Pamplona Colombia

ISBN: (print) 9783030991708; 9783030991692

Pattern recognition is a prominent area of research in computer vision, where different methods have been proposed in the last 50 years. This work presents the development of a Python API that identifies the result of two six-sided dice used in the game called "Craps" in a non-controlled environment, to help visually impaired people. The software is structured in four stages. The first is capturing images through a device with a digital camera connected to the web via an IP address. The second stage is processing the captured image; it is necessary to establish a standard image size and to resize and equalize the digitized image. The third stage segments the object of study using artificial vision techniques to identify the result of the dice after they are thrown. Finally, the fourth stage interprets the result and plays it through a speaker. The expected result is a system that integrates the four stages mentioned above through an intuitive and accessible low-cost Python API, mainly aimed at visually impaired people.

Keywords: Craps game; Visually impaired people; Non controlled environment; Python API; Artificial vision techniques; image processing


3D Object Reconstruction with Deep Learning


13th IFIP TC 12 International Conference on Intelligent Information Processing, IIP 2024

Authors: Aremu, Stephen S.; Taherkhani, Aboozar; Liu, Chang; Yang, Shengxiang. Affiliations: School of Computer Science and Informatics De Montfort University Leicester United Kingdom; Digital Factory Department Shenyang Institute of Automation Chinese Academy of Sciences Shenyang 110016 China

ISBN: (print) 9783031579189

Recent advancements and breakthroughs in deep learning have accelerated development in the field of computer vision. Following the great success recorded in 2D object perception and detection, a lot of progress has also been made in 3D object reconstruction. Since humans can infer and relate to the 3D world from just a single-view 2D image of an object, it is necessary to train computers to think in 3D to achieve some key applications of computer vision. The use of deep learning in 3D object reconstruction from single-view images is rapidly evolving and recording significant results. In this research, we explore Facebook's well-known hybrid approach called Mesh R-CNN, which combines voxel generation and triangular mesh reconstruction to generate the 3D mesh structure of an object from a 2D single-view image. Although Mesh R-CNN achieved the reconstruction of objects with varying geometry and topology, the mesh quality was affected by topological errors like self-intersection, causing non-smooth and rough mesh generation. In this research, Mesh R-CNN with Laplacian Smoothing (Mesh R-CNN-LS) is proposed, which uses the Laplacian smoothing and regularization algorithm to refine the non-smooth and rough mesh. The proposed Mesh R-CNN-LS helps to constrain the triangular deformation and generate a better and smoother 3D mesh. The proposed Mesh R-CNN-LS was compared with the original Mesh R-CNN on the Pix3D dataset and showed better performance in terms of the loss and average precision score. © IFIP International Federation for Information Processing 2024.

Keywords: Computer vision
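
The abstract above refines a rough mesh with Laplacian smoothing. As a rough illustration of the general idea, the Python sketch below moves every vertex of a triangle mesh part of the way toward the average of its neighbours (the uniform "umbrella" form of Laplacian smoothing). The function name, the step size `lam`, and the list-based mesh format are assumptions of this example; this is not the Mesh R-CNN-LS implementation, which applies smoothing and regularization inside a learned pipeline.

```python
def laplacian_smooth(vertices, faces, lam=0.5, iterations=1):
    """Uniform Laplacian smoothing of a triangle mesh.

    vertices: list of (x, y, z) tuples; faces: list of (i, j, k) vertex indices.
    lam in (0, 1] controls how far each vertex moves toward its neighbour centroid.
    """
    # Build vertex adjacency from the triangle faces.
    neighbors = [set() for _ in vertices]
    for a, b, c in faces:
        neighbors[a].update((b, c))
        neighbors[b].update((a, c))
        neighbors[c].update((a, b))

    current = [tuple(v) for v in vertices]
    for _ in range(iterations):
        updated = []
        for i, v in enumerate(current):
            if not neighbors[i]:
                updated.append(v)  # isolated vertex: nothing to average
                continue
            k = len(neighbors[i])
            centroid = tuple(sum(current[j][d] for j in neighbors[i]) / k for d in range(3))
            # Move toward the centroid; all vertices are updated simultaneously.
            updated.append(tuple(v[d] + lam * (centroid[d] - v[d]) for d in range(3)))
        current = updated
    return current
```

A vertex that sticks out of an otherwise flat patch is pulled back toward the plane of its neighbours, which is the effect that reduces roughness; repeated iterations also shrink the mesh, one reason practical pipelines pair smoothing with other constraints.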


A General Protocol to Probe Large Vision Models for 3D Physical Understanding


38th Conference on Neural Information Processing Systems, NeurIPS 2024

Authors: Zhan, Guanqi; Zheng, Chuanxia; Xie, Weidi; Zisserman, Andrew. Affiliations: VGG University of Oxford United Kingdom; SAI Shanghai Jiao Tong University China

Our objective in this paper is to probe large vision models to determine to what extent they 'understand' different physical properties of the 3D scene depicted in an image. To this end, we make the following contributions: (i) We introduce a general and lightweight protocol to evaluate whether features of an off-the-shelf large vision model encode a number of physical 'properties' of the 3D scene, by training discriminative classifiers on the features for these properties. The probes are applied on datasets of real images with annotations for the property. (ii) We apply this protocol to properties covering scene geometry, scene material, support relations, lighting, and view-dependent measures, and large vision models including CLIP, DINOv1, DINOv2, VQGAN, Stable Diffusion. (iii) We find that features from Stable Diffusion and DINOv2 are good for discriminative learning of a number of properties, including scene geometry, support relations, shadows and depth, but less performant for occlusion and material, while outperforming DINOv1, CLIP and VQGAN for all properties. (iv) It is observed that different time steps of Stable Diffusion features, as well as different transformer layers of DINO/CLIP/VQGAN, are good at different properties, unlocking potential applications of 3D physical understanding. Our project page is https://***/~vgg/research/phy-sd/. © 2024 Neural information processing systems foundation. All rights reserved.

