Understanding the distribution and characteristics of impact craters on planetary surfaces is essential for unraveling geological processes and the evolution of celestial bodies. Several machine learning and AI-based approaches have been proposed to detect craters on planetary surface images automatically. However, designing a robust tool for an entire complex planet such as Mars is still an open problem. This article presents a novel approach that uses the Faster Region-based Convolutional Neural Network (Faster R-CNN) for such detection. The proposed method comprises pre-processing, training, and crater detection steps, which are specifically designed for robustness to latitude and complex geomorphological features. The objectives of this study are to (i) be robust at all latitudes, (ii) handle craters with diameters >= 1 km, (iii) provide an open-source and reusable algorithm, and (iv) require only an image to run. Extensive experiments on high-resolution planetary imagery demonstrate excellent performance, with an average precision AP(50) > 0.82 under an intersection over union criterion IoU >= 0.5, irrespective of crater scale. At mid and high latitudes (above 48 degrees north and south), performance decreases to AP(50) ~ 0.7, which is still better than the current state of the art. The loss of performance is mostly due to strong shadowing effects. Our results also highlight the versatility and potential of the model for automating the analysis of craters across different celestial bodies. The automated crater detection tool presented in this article is publicly available as open source and holds great promise for future scientific research and space exploration missions.
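The IoU >= 0.5 criterion behind AP(50) can be illustrated with a minimal sketch. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max); the function names and the box format are illustrative assumptions, not taken from the article's tool.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A detection counts as correct when its IoU with the truth meets the threshold."""
    return iou(pred_box, truth_box) >= threshold
```

A predicted crater box that covers the true box only loosely fails the test, which is why AP(50) penalizes both missed craters and poorly localized ones.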
ISBN (print): 9781510667877; 9781510667884
A system for determining the distance from a robot to the scene is useful for object tracking, and 3-D reconstruction is needed for many manufacturing and robotic tasks. While the robot is processing materials (welding, milling, drilling, etc.), fragments of material fall on the camera installed on the robot. These fragments introduce spurious information into the depth map and create new lost areas, which leads to incorrect estimates of object sizes. The resulting erroneous distance values corrupt sections of the depth map and reduce the accuracy of trajectory planning. We present an approach that combines defect detection and depth reconstruction algorithms. The first step, image defect detection, is based on a convolutional auto-encoder (U-Net). The second step is depth map reconstruction using a spatial reconstruction based on a geometric model with contour and texture analysis. We apply contour restoration and texture synthesis for image reconstruction. We propose a method for restoring object boundaries in an image by constructing a composite curve from cubic splines. Our technique quantitatively outperforms state-of-the-art methods in reconstruction accuracy on the RGB-D benchmark for evaluating manufacturing vision systems.
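The idea of restoring a missing boundary section with a cubic curve can be sketched as follows. The abstract does not specify which spline construction is used, so this sketch substitutes a Catmull-Rom segment (one common interpolating cubic) as an assumption; the function names are hypothetical.

```python
def catmull_rom_point(p0, p1, p2, p3, t):
    """Point on the Catmull-Rom cubic between p1 (t=0) and p2 (t=1).

    p0 and p3 are the neighbouring contour points that shape the tangents.
    """
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2 * b + (c - a) * t
               + (2 * a - 5 * b + 4 * c - d) * t2
               + (-a + 3 * b - 3 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def bridge_gap(before, after, samples=5):
    """Fill a contour gap with `samples` interior points.

    `before` holds the last two known points ahead of the gap and
    `after` the first two known points beyond it.
    """
    p0, p1 = before
    p2, p3 = after
    return [
        catmull_rom_point(p0, p1, p2, p3, i / (samples + 1))
        for i in range(1, samples + 1)
    ]
```

Chaining such segments along every damaged stretch of a contour yields a composite curve. The choice of tangents here is only one option, and a different spline family would change the restored shape.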
This paper introduces an innovative method that combines computer vision and deep learning to extract headlines from a historical newspaper. Through the illustrations from historical newspapers, one of our goals is to...
In the modern era of technology, a paradigm shift has been witnessed in areas involving applications of Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL). Specifically, Deep Neural Networks (DNNs) have emerged as a popular field of interest in most AI applications such as computer vision, image and video processing, and robotics. Given mature digital technologies and the availability of authentic data and data handling infrastructure, DNNs have become a credible choice for solving more complex real-life problems. In certain situations, the performance and accuracy of a DNN exceed those of humans. However, DNNs are computationally demanding in terms of the resources and time needed to handle these computations. Furthermore, general-purpose architectures like CPUs struggle with such computationally intensive algorithms. Therefore, the research community has invested considerable interest and effort in specialized hardware architectures such as the Graphics Processing Unit (GPU), Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), and Coarse Grained Reconfigurable Array (CGRA) for the effective implementation of computationally intensive algorithms. This paper surveys the research on the development and deployment of DNNs using these specialized hardware architectures and embedded AI accelerators. The review describes in detail the specialized hardware-based accelerators used in the training and/or inference of DNNs. A comparative study of the accelerators discussed is also made, based on factors such as power, area, and throughput. Finally, future research and development directions, such as future trends in DNN implementation on specialized hardware accelerators, are discussed. This review article is intended to guide hardware architects to accelerate and improve the effe
In recent years, waste management and the need for recycling have become more important than ever due to the increase in population. For this reason, making recycling easily applicable and available with reasonable...
Age detection is a fundamental task in computer vision with numerous applications, from targeted advertising to security systems. This paper proposes a robust approach to age estimation that uses local binary patterns to extract features from face images. Accurately predicting people's ages from facial images requires overcoming challenges such as changes in lighting conditions, poses, and facial expressions. The proposed method combines feature extraction, feature selection, and machine learning algorithms; we call it the Hybrid method. First, facial landmarks are detected to determine the key points of the face and enable the extraction of the corresponding facial features. These features are then fed into a feature selection algorithm that identifies the most distinctive ones, reducing dimensionality and increasing model efficiency. To evaluate the proposed approach, extensive experiments are conducted on benchmark datasets covering different age groups and ethnicities. The results show that the proposed method achieves high accuracy and robustness in age estimation, and that its detection rate and accuracy are better than those of competing methods. For the Hybrid method, the mean absolute error is 4.94 years, with a standard deviation of 4.74 years. In terms of mean absolute error, this age estimation method is superior to other methods presented to date. The proposed method has a final sensitivity of 97.2%, an accuracy of 96.8%, and a precision of 99.1%. In addition, according to the specifications of the implementation system, the program executes in about 3.5 s, which is a suitable speed for estimating the age of people from their face photographs.
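The local binary pattern features mentioned above can be sketched in their basic 3x3 form. This is a minimal illustration, assuming a grayscale image stored as a list of rows, an 8-neighbour ring, and a clockwise bit order starting at the top-left pixel; the paper's exact LBP variant and parameters are not stated, so all of these are assumptions.

```python
# Clockwise 8-neighbour offsets, starting at the top-left pixel (assumed order).
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit LBP code of the interior pixel at row r, column c.

    A neighbour contributes a 1 bit when it is at least as bright as the centre.
    """
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

In a pipeline like the one described, histograms of this kind (typically computed per facial region) would form the feature vector passed to feature selection. Because each code depends only on brightness comparisons with the centre pixel, it is unaffected by uniform brightness shifts, which helps with lighting changes.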
ISBN (print): 9783031243660; 9783031243677
Image captioning is the task of generating a textual description that illustrates the action performed in an image. It is one of the most complicated research areas, and one where machine learning approaches are essential. An image captioning system should be intelligent enough to capture the semantic knowledge needed to recognize the objects present in the image and the situation that evolves around them. In the proposed work, an image captioning system is built using ResNet along with a CNN and an RNN. The CNN is used as an encoder and the RNN as a decoder. The system is able to infer the situation precisely on the MSCOCO benchmark. The model is trained with ResNet152, which uses its layers effectively and minimizes computation time. ResNet uses skip connections that bypass convolutional layers, which mitigates the vanishing-gradient problem in deep networks. The system achieves better Bilingual Evaluation Understudy (BLEU), METEOR, CIDEr, and ROUGE scores than the previously implemented model. The BLEU scores B1, B2, B3, and B4 are 0.57, 0.404, 0.279, and 0.191, respectively. The METEOR, CIDEr, and ROUGE scores are 0.195, 0.396, and 0.6, respectively. Training is improved by reducing the image size and enhancing the brightness with Pillow. The system also uses the torchvision library to improve the model's prediction of the situation.
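The BLEU scores reported above rest on clipped n-gram precision. The following is a minimal single-reference sketch of that computation with a brevity penalty; it is not the evaluation script used in the paper, which would normally use several reference captions per image and corpus-level statistics.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited at most
    as many times as it occurs in the reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Cumulative BLEU-N against one reference (no smoothing)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty discourages captions shorter than the reference.
    if len(candidate) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(candidate))
    return bp * math.exp(log_mean)
```

Calling `bleu(caption, reference, max_n=k)` for k = 1 to 4 corresponds to the B1 to B4 style of reporting. Scores normally fall as k grows because longer n-grams are harder to match, consistent with the decreasing values quoted in the abstract.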
ISBN (print): 9783030991708; 9783030991692
Pattern recognition is a prominent area of research in computer vision, where different methods have been proposed over the last 50 years. This work presents the development of a Python API that identifies the result of two six-sided dice, as used in the game "Craps", in an uncontrolled environment to help visually impaired people. The software is structured in four stages. The first stage captures images through a device with a digital camera connected to the web via an IP address. The second stage processes the captured image: a standard image size is established, and the digitized image is resized and equalized. The third stage segments the object of study using artificial vision techniques to identify the result of the dice after they are thrown. Finally, the fourth stage interprets the result and plays it through a speaker. The expected result is a system that integrates these four stages through an intuitive, accessible, low-cost Python API aimed mainly at visually impaired people.
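The third stage can be illustrated with a small sketch that counts pips in an already binarized image. The abstract does not say which segmentation technique is used, so connected-component counting with 4-connectivity is an assumption here, and `count_pips` is a hypothetical name.

```python
from collections import deque


def count_pips(mask):
    """Count 4-connected blobs of truthy pixels in a binary mask (list of rows).

    Each blob is assumed to correspond to one pip on a die face.
    """
    if not mask or not mask[0]:
        return 0
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1
                # Breadth-first flood fill marks every pixel of this blob.
                queue = deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count
```

In an uncontrolled environment, a real system would also need to filter blobs by size and shape so that noise and shadows are not counted as pips, and to separate the two dice before summing their values.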
Recent advancements and breakthroughs in deep learning have accelerated the rapid development in the field of computer vision. Having recorded a huge success in 2D object perception and detection, a lot of progress ha...
Our objective in this paper is to probe large vision models to determine to what extent they 'understand' different physical properties of the 3D scene depicted in an image. To this end, we make the following ...