Online car-hailing is a product of the "Internet+" era. As a new mode of intelligent urban transportation, it breaks the old pattern of interests and the franchise-based barriers of the taxi industry. In recent year...
Smart device components come in a variety of shapes, sizes and textures. Performing high-throughput inspection on all six surfaces of a component is challenging due to tolerances of product dimensi...
ISBN: 9781665485296 (print)
Recognising Bonsai plants is a challenging task because most people, especially beginners, know little about Bonsai. Newcomers to the field may find it hard to identify the species of a Bonsai because species resemble one another in shape, colour and other traits. Misidentifying the species may result in damage to the plant, since different species of Bonsai require different care. Information about a Bonsai therefore needs to be accessible once its species has been recognised. The aim of this project is to develop a system that recognises three species of Bonsai from their leaves: 1) Adenium, 2) Red Japanese Maple and 3) Natal Plum. The project followed the Rapid Application Development (RAD) model, which has four phases: 1) Planning, 2) Design, 3) Implementation and 4) Finalization. In the pre-processing phase, colour moments and the Gray-Level Co-occurrence Matrix (GLCM) were used to extract features from the leaf. The species were classified using a Support Vector Machine-based approach, and the system recognised the species of Bonsai with an accuracy of 98.2%.
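The two feature families named in the abstract above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the single horizontal pixel offset, and the choice of contrast and energy as GLCM statistics are assumptions made for illustration. The skewness follows the common convention of taking the signed cube root of the third central moment.

```python
import math


def colour_moments(channel):
    """Return (mean, standard deviation, skewness) of one colour channel.

    `channel` is a flat, non-empty list of pixel intensities.
    """
    n = len(channel)
    mean = sum(channel) / n
    variance = sum((v - mean) ** 2 for v in channel) / n
    third = sum((v - mean) ** 3 for v in channel) / n
    std = math.sqrt(variance)
    # Signed cube root keeps the skewness in the same units as the pixels.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew


def glcm(image, levels, dx=1, dy=0):
    """Normalised gray-level co-occurrence matrix for the offset (dx, dy).

    `image` is a list of rows of integer gray levels in [0, levels).
    Entry [i][j] is the fraction of pixel pairs whose first pixel has
    level i and whose neighbour at the given offset has level j.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[c / total for c in row] for row in counts]


def glcm_contrast(p):
    """Contrast: sum over (i, j) of (i - j)^2 * p[i][j]."""
    return sum((i - j) ** 2 * v
               for i, row in enumerate(p)
               for j, v in enumerate(row))


def glcm_energy(p):
    """Energy (angular second moment): sum of squared matrix entries."""
    return sum(v * v for row in p for v in row)
```

In a pipeline of this kind, the moments of each colour channel and a few GLCM statistics would typically be concatenated into one feature vector per leaf image before being passed to a classifier such as an SVM.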
In lossy image compression, the objective is to achieve minimal signal distortion while compressing images to a specified bit rate. The increasing demand for visual analysis applications, particularly in classification tasks, has emphasized the significance of considering semantic distortion in compressed images. To bridge the gap between image compression and visual analysis, we propose a Rate-Distortion-Classification (RDC) model for lossy image compression, offering a unified framework to optimize the trade-off between rate, distortion, and classification accuracy. The RDC model is extensively analyzed both statistically on a multi-distribution source and experimentally on the widely used MNIST dataset. The findings reveal that, under certain conditions, the RDC model exhibits desirable properties, including monotonically non-increasing and convex functions. This work provides insights into the development of human-machine friendly compression methods and Video Coding for Machines (VCM) approaches, paving the way for end-to-end image compression techniques in real-world applications. © 2023 Elsevier Inc. All rights reserved.
ISBN: 9798350355413 (digital)
ISBN: 9798350355420 (print)
Fisheye cameras, with their ultra-wide-angle field of view, can capture a larger scene in a single shot compared to traditional lenses. This capability makes them highly valuable in areas such as security surveillance, panoramic photography, autonomous driving, and robot navigation. However, building an efficient and accurate visual system is crucial for realizing these applications. This paper proposes an improved algorithm based on fisheye image correction and stitching. First, a fast fisheye image distortion correction method is introduced to eliminate the severe distortion caused by the fisheye lens. Then, the o-FAST algorithm is used for feature point detection, ensuring rotational invariance of the feature points, and the KNN algorithm is employed for initial matching. On this basis, the PROSAC algorithm is applied to remove incorrect matches and compute the homography matrix between images. Finally, seamless image stitching is achieved using a gradual weighted fusion method. The research results demonstrate that this method significantly improves both image quality and computational efficiency, making it effective for generating and processing panoramic images. Experimental results further confirm that the method enhances stitching accuracy while also increasing processing speed, showcasing its high practical value.
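The final fusion step described above can be sketched as a linear cross-fade over the overlap region. This is a minimal illustration under stated assumptions, not the paper's method: the two images are assumed to be already corrected and aligned so that they overlap by a known number of columns, and the function names `blend_rows` and `blend_images` are hypothetical.

```python
def blend_rows(left_row, right_row, overlap):
    """Stitch two aligned image rows that share `overlap` pixels.

    The last `overlap` pixels of `left_row` and the first `overlap`
    pixels of `right_row` are assumed to show the same scene points.
    Inside the overlap the weight of the left image falls linearly
    while the weight of the right image rises, which avoids a seam.
    """
    if not 0 < overlap <= min(len(left_row), len(right_row)):
        raise ValueError("overlap must be between 1 and the shorter row length")
    out = list(left_row[:-overlap])           # left-only part
    for k in range(overlap):
        w_right = (k + 1) / (overlap + 1)     # rises towards 1
        w_left = 1.0 - w_right                # falls towards 0
        out.append(w_left * left_row[-overlap + k] + w_right * right_row[k])
    out.extend(right_row[overlap:])           # right-only part
    return out


def blend_images(left, right, overlap):
    """Apply the row-wise cross-fade to two images with equal row counts."""
    if len(left) != len(right):
        raise ValueError("images must have the same number of rows")
    return [blend_rows(l, r, overlap) for l, r in zip(left, right)]
```

For two rows of width 4 that overlap by 2 pixels, the stitched row has width 6, and the blended pixels move gradually from the left image's intensity to the right image's.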
ISBN: 9798350374407 (digital)
ISBN: 9798350374414 (print)
The main purpose of this study is to explore real-time, accurate, markerless recognition of sports movements as addressed in recent years. By reviewing research that applies machine learning or deep learning to specific sports or target actions using computer vision image data as input, the study aims to provide a reference for applying markerless motion capture technology to sports motion recognition. The research employed a literature review methodology, searching six databases, namely Web of Science, PubMed, Scopus, Google Scholar, IEEE Xplore, and China National Knowledge Infrastructure (CNKI), for publications from January 2000 to June 2020. Using Boolean logic operations on the retrieved literature, key information was extracted, including first author/publication year, types/targets of motion, participant information, camera parameters, image feature extraction techniques, action recognition algorithms, evaluation methods for action recognition quality, training and validation methods for image data, and performance metrics for action recognition. After screening, a total of 23 articles were included in the study. The findings revealed that 39% of the studies used machine learning algorithms based on support vector machines, while 35% employed deep learning algorithms based on convolutional neural networks. Commonly used evaluation metrics for action recognition quality included classification accuracy, confusion matrix, and displacement error. The development of computer vision motion capture, models, and algorithms showed promising applications in areas such as action technique recognition and sports performance analysis. Traditional machine learning algorithms such as support vector machines and principal component analysis remain dominant in action recognition technology; in certain scenarios, however, deep learning algorithms outperformed traditional machine learning methods.
ISBN: 9781665433266 (print)
Various libraries make it possible to perform computer vision tasks, e.g., object recognition, on almost every mobile device with computing capability. On modern smartphones, such tasks are compute-intensive, energy-hungry computations that run on the GPU or on the dedicated Machine Learning (ML) processor embedded in the device. Task offloading is a strategy for moving compute-intensive tasks, and hence their energy consumption, to external computers in the edge network or in the cloud. In this paper, we report an experimental study that measures, under different mobile computer vision set-ups, the energy reduction obtained when image processing inference is moved to an edge node, and whether real-time requirements can still be met. In particular, our experiments show that offloading the task (in our case, real-time object recognition) to a possible next-to-the-user node saves about 70% of battery consumption while maintaining the same frame rate (fps) that local processing can achieve.
ISBN: 9798400709418 (print)
In the era of digitization and big data, the world is inundated with an ever-growing volume of visual content, be it images or videos. As organizations strive to harness the potential of these multimedia data sources, there is an increasing need for advanced image processing techniques that can automate the analysis and extraction of valuable information. Amazon Web Services (AWS) Rekognition emerges as a powerful solution in this landscape, offering a comprehensive system for image and video analysis through the lens of machine learning and computer vision. This paper delves into the realm of image processing using AWS Rekognition, unveiling the transformative capabilities of this cloud-based service and its applications in various domains. As we embark on this journey, we will explore the principles, methodologies, and real-world implications of leveraging AWS Rekognition for image analysis.
In various fields such as medical imaging, object detection, and video surveillance, multi-view natural language query systems use image data to provide a more comprehensive perspective, allowing users to query and obtain information intuitively. Because methods based on hard-coded matching rules lack a deep understanding of natural language, their query results often do not match the user's intent and struggle to meet practical application needs. This article therefore introduces machine vision algorithms for optimization and improvement. It first discusses a system architecture of four modules: data input and preprocessing, visual feature extraction, natural language understanding and matching, and result generation and feedback. It then analyzes the application of machine vision technology in the system through two calculation formulas, grayscale conversion and binarization, and briefly discusses natural language processing technology. Subsequently, a context understanding module is added to construct a multi-view natural language query system based on machine vision. Finally, two sets of simulation experiments lead to the following conclusion: compared with traditional methods, the overall average improvement in image recognition accuracy indicators is about 14.3%, while the overall average improvement in response speed indicators is about 26.5%. The system can effectively process images from different perspectives and match them with natural language queries.
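The abstract names grayscale conversion and binarization but does not reproduce its formulas, so the sketch below uses common defaults as assumptions: the ITU-R BT.601 luma weights (0.299, 0.587, 0.114) for grayscale conversion and a fixed global threshold for binarization. The function names and the default threshold of 128 are illustrative only.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of gray levels.

    Uses the BT.601 luma weights; this weighting is an assumption,
    since the source does not state which formula it applies.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def binarize(gray_image, threshold=128):
    """Map each gray level to 255 if it reaches `threshold`, else to 0."""
    return [[255 if v >= threshold else 0 for v in row]
            for row in gray_image]
```

Chaining the two, `binarize(to_grayscale(image))` yields a black-and-white mask, which is a typical preprocessing step before visual feature extraction.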
ISBN: 9798331506520 (digital)
ISBN: 9798331506537 (print)
The integration of human-robot interaction (HRI) technologies with industrial automation has become increasingly essential for enhancing productivity and safety in manufacturing environments. In this paper, we propose a novel approach to address these challenges by using stereo vision and gesture control in cooperative robotic cells. Our system enables seamless authentication of operators and real-time verification of task execution, ensuring compliance with established protocols and safety *** features of our system include its gesture-based operation with gesture recognition algorithms, allowing operators to interact with robotic systems intuitively and efficiently. By leveraging stereo vision, our system accurately tracks the operators’ movement within the workspace, facilitating precise task execution and object *** present a detailed description of our system architecture, experimental configuration, and real-world performance assessment. Our results demonstrate the effectiveness and feasibility of our approach in enhancing operational efficiency, ensuring quality, and improving the overall user experience in industrial automation.