Computer applications have shifted considerably from single data processing to machine learning in recent years, owing to the accessibility and availability of massive volumes of data obtained through the internet and various other sources. Machine learning automates human assistance by training an algorithm on relevant data. Supervised, unsupervised, and reinforcement learning are the three fundamental categories of machine learning techniques. In this paper, we discuss the different learning styles used in the fields of computer vision, deep learning, neural networks, and machine learning. Some of the most recent applications of machine learning in computer vision include object identification, object classification, and extracting usable information from images, graphic documents, and videos. Machine learning techniques frequently used in computer vision to date include zero-shot learning, active learning, contrastive learning, self-supervised learning, life-long learning, semi-supervised learning, ensemble learning, sequential learning, and multi-view learning. There is a lack of systematic reviews covering all learning styles. This paper presents a literature analysis of how different machine learning styles evolved in the field of Artificial Intelligence (AI) for computer vision. This research examines and evaluates machine learning applications in computer vision and future forecasting. This paper will be helpful for researchers working with learning styles, as it gives a deep insight into future directions.
Visual segmentation seeks to partition images, video frames, or point clouds into multiple segments or groups. This technique has numerous real-world applications, such as autonomous driving, image editing, robot sensing, and medical analysis. Over the past decade, deep learning-based methods have made remarkable strides in this area. Recently, transformers, a type of neural network based on self-attention originally designed for natural language processing, have considerably surpassed previous convolutional or recurrent approaches in various vision processing tasks. Specifically, vision transformers offer robust, unified, and even simpler solutions for various segmentation tasks. This survey provides a thorough overview of transformer-based visual segmentation, summarizing recent advancements. We first review the background, encompassing problem definitions, datasets, and prior convolutional methods. Next, we summarize a meta-architecture that unifies all recent transformer-based approaches. Based on this meta-architecture, we examine various method designs, including modifications to the meta-architecture and associated applications. We also present several specific subfields, including 3D point cloud segmentation, foundation model tuning, domain-aware segmentation, efficient segmentation, and medical segmentation. Additionally, we compile and re-evaluate the reviewed methods on several well-established datasets. Finally, we identify open challenges in this field and propose directions for future research.
The combination of machine vision and grinding robots can be visualized as a collaboration between human eyes and limbs that achieves a deep integration of external perception and execution. This combination gives the grinding robot more operability and flexibility, enabling it to better serve the purpose of replacing humans with machines. In response to the demand for flexible grinding of titanium surface edges raised by a titanium manufacturer, this paper conducts an in-depth study of a prototype vision-guided grinding robot system and related applications. First, this study analyzes the shortcomings of the existing robotic regrinding process and improves that process by introducing machine vision technology. Next, this study uses machine vision and image processing algorithms to achieve high-quality recognition and high-precision positioning of metal surface edges. Then, the D-H parameter model of the regrinding robot is established, and the regrinding trajectory is planned and simulated using the position information of the identified regrinding edges. Finally, the simulation-validated grinding trajectory is loaded into the grinding robot, and the effectiveness of the proposed scheme is verified by actual grinding experiments.
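The D-H (Denavit-Hartenberg) modeling step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the classic (standard) D-H convention, in which each joint contributes the transform Rz(theta) * Tz(d) * Tx(a) * Rx(alpha); the abstract does not say which convention the authors use or give the robot's parameter table, so the two-link planar arm below and all function names are hypothetical.

```python
import math


def dh_transform(theta, d, a, alpha):
    """Homogeneous 4x4 transform for one joint (classic D-H convention, assumed)."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul4(A, B):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def tool_position(dh_rows):
    """Chain the per-joint transforms and return the tool (x, y, z) position."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, d, a, alpha in dh_rows:
        T = matmul4(T, dh_transform(theta, d, a, alpha))
    return (T[0][3], T[1][3], T[2][3])


# Hypothetical two-link planar arm with unit link lengths (illustration only,
# not the grinding robot from the paper). Each row is (theta, d, a, alpha).
arm = [(math.pi / 2, 0.0, 1.0, 0.0), (0.0, 0.0, 1.0, 0.0)]
x, y, z = tool_position(arm)
print(round(x, 6), round(y, 6), round(z, 6))
```

A trajectory planner of the kind the abstract describes would typically solve the inverse problem, mapping edge positions found by the vision system to joint angles; this sketch covers only the forward direction.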
Electron density plays an important role in the study of wave propagation and is known to be associated with the index of refraction and radiation belt diffusion coefficients. The primary objective of our investigation is to explore the possibility of implementing an onboard signal processing algorithm that automatically obtains electron densities from the upper hybrid resonance traces of wave spectrograms for future missions. U-Net, developed for biomedical image segmentation, has been adapted as our deep learning architecture, and its results are compared with those extracted by a more traditional semi-automated method. As a product, electron densities and cyclotron frequencies for the entire DSX mission between 2019 and 2021 are acquired for further analysis and applications. Because space measurements are limited, a synthetic image generator based on data statistics and randomization is proposed as an initial step toward a generative adversarial network, in the hope of providing unlimited realistic data sources for advanced machine learning. Plain Language Summary: Electron density is the most important fundamental plasma parameter; however, it is very difficult to measure directly in situ because of spacecraft potential. A convolutional neural network (CNN), developed to recognize features in biomedical images, has been adapted to pull out the resonance traces from space wave receivers, automatically specifying densities along satellite orbits. This paper compares computer vision based on a CNN with human vision based on a semi-automated extraction. With additional development and refinement, our proof-of-concept study may be matured to a level suitable for incorporation into onboard signal processing units, reducing human labor and human-in-the-loop operational errors during future space missions.
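As a rough illustration of how a density follows from an upper hybrid resonance trace, the sketch below uses the standard cold-plasma relations f_uh^2 = f_pe^2 + f_ce^2 and omega_pe^2 = n_e e^2 / (eps0 m_e). The abstract does not describe the authors' exact conversion, so the function name and the sample frequencies are assumptions made for illustration.

```python
import math

# Physical constants in SI units.
ELECTRON_MASS = 9.1093837015e-31        # kg
ELEMENTARY_CHARGE = 1.602176634e-19     # C
VACUUM_PERMITTIVITY = 8.8541878128e-12  # F/m


def electron_density_from_uhr(f_uh_hz, f_ce_hz):
    """Electron density in cm^-3 from upper hybrid and cyclotron frequencies in Hz.

    Assumes the cold-plasma relation f_uh^2 = f_pe^2 + f_ce^2.
    """
    if f_uh_hz <= f_ce_hz:
        raise ValueError("upper hybrid frequency must exceed cyclotron frequency")
    f_pe_sq = f_uh_hz ** 2 - f_ce_hz ** 2          # plasma frequency squared
    omega_pe_sq = (2.0 * math.pi) ** 2 * f_pe_sq   # angular frequency squared
    n_per_m3 = (omega_pe_sq * VACUUM_PERMITTIVITY * ELECTRON_MASS
                / ELEMENTARY_CHARGE ** 2)
    return n_per_m3 * 1e-6                         # m^-3 -> cm^-3


# Hypothetical trace point: f_uh = 100 kHz, f_ce = 30 kHz (not mission data).
print(electron_density_from_uhr(100e3, 30e3))
```

In a pipeline like the one the abstract describes, the segmentation network would supply the f_uh trace for each spectrogram column, and a conversion of this kind would be applied point by point along the orbit.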
Despite massive development in aerial robotics, precise and autonomous landing in various conditions is still challenging. This process is affected by many factors, such as terrain shape, weather conditions, and the presence of obstacles. This paper describes a deep learning-accelerated image processing pipeline for accurate detection of the landing pad and estimation of the UAV's pose relative to it. Moreover, the system provides increased safety and robustness by implementing human presence detection and error estimation for both landing target detection and pose computation. Human presence and landing pad location are determined by estimating the presence probability via segmentation. This is followed by a regression algorithm for the landing pad keypoint locations, which, in addition to coordinates, provides the uncertainty of presence for each defined landing pad landmark. To perform these tasks, a set of lightweight neural network models was selected and evaluated. The resulting measurements of the system's performance and accuracy are presented for each component individually and for the whole processing pipeline. The measurements are performed using onboard embedded UAV hardware and confirm that the method can provide accurate, low-latency feedback for safe landing support.
Transformer models have achieved outstanding results on a variety of language tasks, such as text classification, machine translation, and question answering. This success in the field of Natural Language Processing (NLP) has sparked interest in the computer vision community in applying these models to vision and multi-modal learning tasks. However, visual data has a unique structure, which requires rethinking network designs and training methods. As a result, Transformer models and their variations have been successfully used for image recognition, object detection, segmentation, image super-resolution, video understanding, image generation, text-image synthesis, and visual question answering, among other applications.
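The core operation that Transformer models share across language and vision is scaled dot-product attention. The sketch below is a minimal single-head version in plain Python, with no learned projections, batching, or masking; treating each row as an image-patch embedding is an assumption made for illustration, not a description of any specific model in the survey.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append([
            sum(w * v[c] for w, v in zip(weights, values))
            for c in range(len(values[0]))
        ])
    return outputs


# Hypothetical tokens: two patches with identical keys receive equal
# attention, so the output is the mean of their value vectors.
out = attention([[1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]], [[0.0, 2.0], [4.0, 6.0]])
print(out)
```

In a vision setting the queries, keys, and values would normally come from learned linear projections of patch embeddings, which is one place where the network redesign mentioned in the abstract comes into play.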
Lensless imagers based on diffusers or encoding masks enable high-dimensional imaging from a single-shot measurement and have been used in various applications. However, extracting further image information, such as edges, requires conventional post-processing filtering operations after the original object images have been reconstructed in the diffuser imaging system. Here, we present the concept of a temporal compressive edge detection method based on a lensless diffuser camera, which can directly recover a time sequence of edge images of a moving object from a single-shot measurement, without further post-processing steps. Our approach provides higher image quality during edge detection compared with the "conventional post-processing method." We demonstrate the effectiveness of this approach by both numerical simulation and experiments. The proof-of-concept approach can be further developed with other image post-processing operations or versatile computer vision assignments toward task-oriented intelligent lensless imaging systems.
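For context, the kind of conventional post-processing filter that the abstract contrasts its method against can be sketched as a Sobel gradient-magnitude operator applied to an already reconstructed image. The abstract does not say which filter the authors used as their baseline, so the choice of Sobel here is an assumption, and the tiny step-edge image is synthetic.

```python
import math

# 3x3 Sobel kernels for horizontal and vertical intensity gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(img):
    """Gradient magnitude of a 2D image (nested lists); border pixels stay 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = img[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            out[y][x] = math.hypot(gx, gy)
    return out


# Synthetic 5x6 image with a vertical step edge between columns 2 and 3.
image = [[0.0, 0.0, 0.0, 1.0, 1.0, 1.0] for _ in range(5)]
edges = sobel_magnitude(image)
print(edges[2])
```

The method in the abstract avoids this two-stage route by recovering edge images directly from the single-shot measurement.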
Artificial Intelligence (AI) combined with image processing has shown significant improvements through new techniques such as Machine Learning (ML) models. This paper introduces the key methods and algorithms used for drone image processing. We discuss the benefits and limitations of using ML models instead of classical techniques. Our goal is to classify, categorize, and describe the methods that are used in realistic settings across diverse application domains. We conducted a systematic literature review in which the systems presented in the papers were analysed based on their domain, task, technology, and efficiency. By extensively reviewing the existing literature, we identified key themes and trends that emerged across the various research questions. The overall findings of the research emphasise the potential of AI and drone imagery in numerous fields. However, the review also uncovered several challenges that need attention, such as issues related to data quality and the requirement for more advanced AI algorithms. The paper outlines significant innovations in the field and offers recommendations for future research directions. By highlighting cross-disciplinary insights, it delves into methodological approaches, exploring commonalities in AI algorithms and UAV technologies.
The burgeoning fields of the Internet of Things (IoT) and artificial intelligence (AI) have escalated the demands on image sensing technologies, necessitating advancements in sensor efficiency and functionality. Traditional image sensors, structured on von Neumann architectures with discrete processing units, face challenges such as high power consumption, latency, and escalated hardware costs. In this work, we introduce a unique approach through the development of a double-ended photosensor based on the quasi-one-dimensional nanowire Nb3Se12I. The sensor not only replicates the adaptive behavior of biological vision systems but also effectively manages the decreased sensitivity triggered by intense light stimuli. The integration of the photothermoelectric and bolometric effects allows the device to operate in a self-powered mode, offering broadband detectivity ranging from visible (405 nm) to midwave infrared (4060 nm). Additionally, the quasi-one-dimensional structure enables an angle-dependent response to polarized light with a polarization ratio of 1.83. Our findings suggest that the biomimetic vision adaptive sensor based on Nb3Se12I could effectively enhance the capabilities of smart optical sensors and machine vision systems.
Retinal fundus imaging plays a crucial role in the diagnosis of ophthalmic diseases such as glaucoma, a significant cause of vision loss worldwide. Accurate detection of glaucoma using image processing, machine learni...