In recent years, with the development of artificial intelligence technology, intelligent robots are more and more widely used in many fields. In this paper, an intelligent patrol wheeled robot based on image recogniti...
Binocular stereo vision is a commonly applied computer vision technique with a wide range of applications in 3D scene perception. However, binocular stereo matching algorithms are computationally intensive and complicated. In addition, some traditional platforms cannot meet the dual requirements of real-time performance and energy efficiency. In this paper, we propose a hardware/software co-design approach on an FPGA (Field Programmable Gate Array) to overcome these limitations. Based on the characteristics of binocular stereo vision, we modularize the system functions to achieve hardware/software partitioning. This accelerates data processing on the FPGA while data control runs simultaneously on the ARM (Advanced RISC Machine) cores. The parallelism of the FPGA allows for a fully pipelined design, synchronized to a single system clock, in which multiple stereo processing components run simultaneously, thus improving the processing speed. Furthermore, to minimize hardware costs, the collected images and data are compressed prior to matching, and the precision is subsequently enhanced during post-processing. The proposed system was evaluated on the PYNQ-Z2 development board, with experimental results showing high real-time performance and low power consumption at a 100 MHz clock frequency. Compared with existing designs, the simple yet flexible system demonstrated a higher image processing speed and lower hardware resource overhead (and thus lower power consumption). The average error rate of the BM (block matching) algorithm was also improved, particularly given the limited hardware resources of the PYNQ-Z2. The proposed system has been released as open source on GitHub.
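To make the matching step in the abstract above concrete, the following is a minimal software sketch of block matching with a sum-of-absolute-differences (SAD) cost. It is not the paper's FPGA pipeline: the function name `sad_block_match`, its parameters, and the plain-list image representation are assumptions chosen for illustration, and the compression and post-processing stages mentioned in the abstract are omitted. The nested loops also hint at why the algorithm is costly in software and why it parallelizes well in hardware.

```python
def sad_block_match(left, right, max_disp, half_win=1):
    """Return a disparity map (list of rows) using SAD block matching.

    left, right: rectified grayscale images as equal-sized lists of rows.
    max_disp:    largest disparity, in pixels, to search (assumed parameter).
    half_win:    half the window width; the window is (2*half_win + 1) squared.
    Border pixels that cannot hold a full window keep disparity 0.
    """
    height, width = len(left), len(left[0])
    disparity = [[0] * width for _ in range(height)]
    for y in range(half_win, height - half_win):
        for x in range(half_win, width - half_win):
            best_cost, best_d = None, 0
            # A left pixel at column x is expected at column x - d on the right,
            # so d is capped to keep the shifted window inside the image.
            for d in range(0, min(max_disp, x - half_win) + 1):
                cost = 0
                for dy in range(-half_win, half_win + 1):
                    for dx in range(-half_win, half_win + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x + dx - d])
                # Keep the smallest disparity among equally good candidates.
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disparity[y][x] = best_d
    return disparity
```

Each window cost is independent of every other, which is the property a pipelined hardware design can exploit by evaluating many candidate disparities in parallel.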
In this paper, an online monitoring system of welding quality based on machine vision and machine learning was proposed. A high-speed CCD camera was used to monitor the tail end of the molten pool, and the remove smal...
The neuromorphic vision system (NvS) equipped with optoelectronic synapses integrates perception, storage, and processing and is expected to address the issues of traditional machine vision. However, owing to their lack of stereo vision, existing NvSs focus on 2D image processing, which makes it difficult to solve problems such as spatial cognition errors and low-precision interpretation. Consequently, inspired by the human visual system, an NvS with stereo vision is developed to achieve 3D object recognition, based on a prepared ReS2 optoelectronic synapse with an ultralow power consumption of 12.12 fJ. This device exhibits excellent optical synaptic plasticity derived from the persistent photoconductivity effect. As the cornerstone for 3D vision, color planar information is successfully discriminated and stored in situ at the sensor end, benefiting from the device's wavelength-dependent plasticity in the visible region. Importantly, the dependence of the channel conductance on the target distance is experimentally revealed, implying that the structural information of the object can be directly captured and stored by the synapse. The 3D image of the object is successfully reconstructed via fusion of its planar and depth images. As a result, the proposed 3D-NvS based on ReS2 synapses achieves a recognition accuracy of 97.0% for 3D objects, which is much higher than that for 2D objects (32.6%), demonstrating its strong ability to prevent 2D-photo spoofing in applications such as face payment and entrance guard systems.
Author: Muniraj, Inbarasan
LiFE Lab, Alliance School of Applied Engineering, Alliance University, Bengaluru 562106, Karnataka, India
Artificial intelligence techniques, such as machine learning (ML) and deep learning (DL), are now widely used in various vision-based applications. Here, we summarize some of the most recent advances in Computational ...
Image-to-image translation is the process of transforming an image from one domain to another, where the goal is to learn the mapping between an input image and an output image. This task has been generally performed ...
Machine learning has become the state-of-the-art technique for many tasks, including computer vision, natural language processing, and speech processing. However, the unique challenges posed by machine learning suggest that incorporating user knowledge into the system can be beneficial. Integrating human domain knowledge is also intended to promote the automation of machine learning. We see human-in-the-loop as an increasingly important area for future research because the knowledge learned by machine learning cannot replace human domain knowledge. Human-in-the-loop aims to train an accurate prediction model at minimum cost by integrating human knowledge and experience. Humans can provide training data for machine learning applications and, with the help of machine-based approaches, directly accomplish tasks in the pipeline that are hard for computers. In this paper, we survey existing works on human-in-the-loop from a data perspective and classify them into three categories with a progressive relationship: (1) work on improving model performance through data processing, (2) work on improving model performance through interventional model training, and (3) the design of system-independent human-in-the-loop. Using this categorization, we summarize the major approaches in the field along with their technical strengths and weaknesses, and we provide a simple classification and discussion for natural language processing, computer vision, and other areas. In addition, we outline some open challenges and opportunities. This survey intends to provide a high-level summary of human-in-the-loop and to motivate interested readers to consider approaches for designing effective human-in-the-loop solutions.
Keywords: Human-in-the-loop; Machine learning; Deep learning; Data processing; Computer vision; Natural language processing
(C) 2022 Elsevier B.V. All rights reserved.
ISBN: (print) 9783031686160; 9783031686177
Recent years have seen the emergence of computer vision, a subfield of artificial intelligence (AI), as a technology that has the potential to revolutionize agricultural practices. It may affect many agricultural practices and crop management techniques because computer vision can examine images and identify patterns in data. This article provides a summary of the uses of computer vision in agriculture and of the consequences such applications have had. Challenges in precision agriculture, disease diagnosis, crop monitoring, and yield computation may be overcome with computer vision technologies such as image recognition, object detection, and pattern analysis, and the article investigates the ways in which these techniques are useful. It also investigates the benefits, drawbacks, and possible future applications of computer vision in agriculture, with a particular emphasis on the potential for the sector to improve its productivity, sustainability, and profitability. According to this in-depth analysis, computer vision has revolutionized the agricultural industry and contributed significantly to economic growth. To evaluate the financial impact that computer vision has had on the agricultural industry, this study looks at a wide range of academic papers, publications, and reports. The findings highlight the advancements, benefits, challenges, and future opportunities presented by computer vision technology in the areas of crop monitoring, precision farming, animal management, and harvesting. According to the evaluation, computer vision has the potential to improve farming in terms of productivity, resource allocation, cost reduction, and sustainable practices.
The Internet of Things (IoT) provides a collaborative infrastructure that connects smart devices with cloud-edge healthcare applications, medical devices, wearable biosensors, etc. Meanwhile, crowd counting, a computer vision approach, is an emerging topic for detecting objects with static or dynamic mobility in IoT environments. Smart crowd counting enables pattern recognition for many intelligent applications such as microbiology, surveillance, healthcare systems, crowdedness estimation, and other environmental case studies. Because capturing systems in IoT environments are complicated, crowd counting methods can influence the performance of object detection in critical case studies that use Artificial Intelligence (AI)-based approaches such as machine learning, deep learning, collaborative learning, fuzzy logic, and meta-heuristic algorithms. This paper provides a new comprehensive technical analysis of existing AI-based crowd counting approaches in healthcare and medical systems, biotechnology, and IoT environments. It also discusses the existing case studies, analyzing their technical aspects and the algorithms applied to enhance pattern prediction factors. Finally, some new research directions, challenges, and open issues are presented.
ISBN: (print) 9798350318920; 9798350318937
Sensing 3D objects is critical when 2D object recognition is not accessible. A robot pre-trained on a large point-cloud dataset will encounter unseen classes of 3D objects after deployment. Therefore, the robot should be able to learn continuously in real-world scenarios. Few-shot class-incremental learning (FSCIL) requires the model to learn continually from a few new examples without forgetting past classes. However, FSCIL carries an implicit but strong assumption that the base and incremental classes follow the same distribution. In this paper, we focus on cross-domain FSCIL for point-cloud recognition. We decompose catastrophic forgetting into base-class forgetting and incremental-class forgetting and alleviate them separately. We use the base model to discriminate between base samples and new samples by treating base samples as in-distribution samples and new objects as out-of-distribution samples. We retain the base model to avoid catastrophic forgetting of base classes and train an extra domain-specific module on all new samples to adapt to new classes. At inference, we first determine whether the sample belongs to the base classes or the new classes. Once routed at the model level, test samples are passed to the corresponding model for class-level classification. To better mitigate the forgetting of new classes, we adopt soft-label and hard-label replay together. Extensive experiments on synthetic-to-real incremental 3D datasets show that our proposed method balances performance between base and new objects and outperforms previous state-of-the-art methods.
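The two-stage inference described in the abstract above (model-level routing, then class-level classification) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the in-distribution versus out-of-distribution decision is made, so a maximum-softmax-probability threshold is used here as a stand-in, and `route_and_classify`, `base_model`, `new_module`, and `threshold` are hypothetical names rather than anything from the paper.

```python
import math


def softmax(logits):
    """Convert a list of logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_and_classify(sample, base_model, new_module, threshold=0.7):
    """Route a sample to the base model or the new-class module, then classify.

    base_model, new_module: hypothetical callables mapping a sample to logits.
    threshold: assumed confidence cut-off. A sample whose top base-model
               probability reaches it is treated as in-distribution (base).
    Returns a (branch, class_index) pair, where class_index is local to
    the branch that handled the sample.
    """
    base_probs = softmax(base_model(sample))
    confidence = max(base_probs)
    if confidence >= threshold:
        # Confident base prediction: treat as a base-class sample.
        return "base", base_probs.index(confidence)
    # Otherwise treat as out-of-distribution and defer to the new module.
    new_probs = softmax(new_module(sample))
    return "new", new_probs.index(max(new_probs))
```

Keeping the base model untouched and sending only low-confidence samples to a separate module mirrors the abstract's idea of handling base-class and incremental-class forgetting separately.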