Traditional machine learning, mainly supervised learning, follows the assumptions of closed-world learning, i.e., for each testing class, a training class is available. However, such machine learning models fail to identify classes that were not available at training time; these classes are referred to as unseen classes. Open-world machine learning (OWML) is a novel technique that deals with unseen classes. Although OWML has been around for a few years and many significant research works have been carried out in this domain, there is no comprehensive survey of the characteristics, applications, and impact of OWML on the major research areas. In this article, we aim to capture the different dimensions of OWML with respect to other traditional machine learning models. We thoroughly analyze the existing literature and provide a novel taxonomy of OWML considering its two major application domains: computer vision and natural language processing. We list the available software packages and open datasets in OWML for future researchers. Finally, the article concludes with a set of research gaps, open challenges, and future directions.
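As a minimal illustration of the closed-world vs. open-world distinction (a generic sketch, not the method of any paper surveyed here), a closed-world classifier can be adapted to flag unseen classes by rejecting low-confidence predictions; the threshold value and the rejection rule below are assumptions for illustration only.

```python
import numpy as np

def open_world_predict(probs: np.ndarray, threshold: float = 0.7) -> np.ndarray:
    """Assign a known class only when the classifier is confident.

    probs: (n_samples, n_known_classes) softmax scores from a
           closed-world classifier.
    Returns class indices, with -1 marking "unseen class" rejections.
    """
    best = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    # Instances whose top score falls below the threshold are treated
    # as belonging to classes not seen during training.
    return np.where(conf >= threshold, best, -1)

# Example: two confident predictions and one instance rejected as unseen.
scores = np.array([[0.90, 0.05, 0.05],
                   [0.10, 0.80, 0.10],
                   [0.40, 0.35, 0.25]])
print(open_world_predict(scores))   # -> [ 0  1 -1]
```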
Models based on the transformer architecture have seen widespread application across fields such as natural language processing (NLP), computer vision, and robotics, with large language models (LLMs) like ChatGPT revolutionizing machine understanding of human language and demonstrating impressive memory capacity and reproduction capabilities. Traditional machine learning algorithms struggle with catastrophic forgetting, which is detrimental to the diverse and generalized abilities required for robotic deployment. This article investigates the receptance weighted key value (RWKV) framework, known for its efficient and effective sequence modeling, and its integration with the decision transformer (DT) and experience replay architectures. It focuses on potential performance enhancements in sequence decision-making and lifelong robotic learning tasks. We introduce the decision-RWKV (DRWKV) model and conduct extensive experiments using the D4RL database within the OpenAI Gym environment and on the D'Claw platform to assess the DRWKV model's performance in single-task tests and lifelong learning scenarios, showing its ability to handle multiple subtasks efficiently. The code for all algorithms, training, and image rendering in this study is available online (open source).
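For readers unfamiliar with the sequence-decision setup shared by DT-style models such as DRWKV, the sketch below shows the return-to-go conditioning that these models rely on; the toy trajectory and the token layout are illustrative assumptions, not the paper's actual implementation or data.

```python
import numpy as np

def returns_to_go(rewards: np.ndarray) -> np.ndarray:
    """Suffix sums of rewards: the target-return signal that
    decision-transformer-style sequence models condition on."""
    return np.cumsum(rewards[::-1])[::-1].copy()

# Toy trajectory (states, actions, and rewards are made up for illustration).
states  = np.random.randn(5, 3)            # 5 steps, 3-dim observations
actions = np.random.randn(5, 1)            # 1-dim actions
rewards = np.array([1.0, 0.0, 2.0, 0.0, 1.0])

rtg = returns_to_go(rewards)               # -> [4. 3. 3. 1. 1.]
# Each input token at step t pairs the remaining return with the current
# state and the previous action; the sequence model predicts a_t next.
tokens = [(float(rtg[t]), states[t], actions[t - 1] if t > 0 else None)
          for t in range(len(rewards))]
print(rtg)
```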
Recent advancements in signal processing and computational power have revolutionized computer vision applications in diverse industries such as agriculture, food processing, biomedical, and the military. These developments are propelling efforts to automate processes and enhance efficiency. Notably, computational techniques are replacing labor-intensive manual methods for assessing the maturity indices of fruits and vegetables during critical growth stages. This review paper focuses on recent advancements in computer vision techniques specifically applied to determine the maturity indices of fruits and vegetables within the food processing sector. It highlights successful applications of Nuclear Magnetic Resonance (NMR), Near-Infrared Spectroscopy (NIR), thermal imaging, and image scanning. By examining these techniques, their underlying principles, and practical feasibility, it offers valuable insights into their effectiveness and potential widespread adoption. Additionally, integrating biosensors and AI techniques further improves accuracy and efficiency in maturity index determination. In summary, this review underscores the significant role of computational techniques in advancing maturity index assessment and provides insights into their principles and effective utilization. Looking ahead, the future of computer vision techniques holds immense potential. Collaborative efforts among experts from various fields will be crucial to address challenges, ensure standardization, and safeguard data privacy. Embracing these advancements can lead to sustainable practices, optimized resource management, and progress across industries. Highlights: 1. Recent advancements in signal processing and computation drive interest in computer vision across industries. 2. The review focuses on non-destructive methods for fruits and vegetables. 3. Computational techniques replace manual methods for maturity index determination. 4. The principles of the techniques are highlighted, along with their successful applications.
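As a concrete, hedged sketch of what computational maturity-index determination can look like in its simplest image-based form (this is not one of the reviewed modalities such as NMR or NIR, and the thresholds and red/green heuristic are assumptions chosen purely for illustration):

```python
import numpy as np

def ripeness_index(rgb: np.ndarray) -> float:
    """Naive colour-based maturity score in [0, 1]: the fraction of
    pixels whose red channel clearly dominates green, a common cue for
    ripening in tomatoes and similar produce.

    rgb: (H, W, 3) uint8 image array.
    """
    r = rgb[..., 0].astype(float)
    g = rgb[..., 1].astype(float)
    ripe_pixels = (r > 1.3 * g) & (r > 60)   # thresholds are arbitrary
    return float(ripe_pixels.mean())

# Synthetic example: half the image "ripe" red, half "unripe" green.
img = np.zeros((10, 10, 3), dtype=np.uint8)
img[:, :5] = (200, 40, 30)   # red half
img[:, 5:] = (60, 180, 40)   # green half
print(ripeness_index(img))   # -> 0.5
```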
ISBN (print): 9798350377040; 9798350377033
With the continuous progress of image processing and machine vision technology, the demand for efficient and real-time processing is becoming more and more prominent, especially in the field of high-noise image processing. In this study, an adaptive Gaussian filtering algorithm is proposed and implemented on an FPGA, with the aim of improving the computational efficiency and real-time performance of the image processing system. Compared with a traditional fixed-weight filter, this algorithm can dynamically adjust the filtering parameters according to different noise environments, effectively balancing noise suppression and image detail retention. We coded the algorithm in the Verilog hardware description language and verified it on the PYNQ-Z2 FPGA platform. The experimental results show that the adaptive algorithm outperforms the fixed-weight filtering method, especially in terms of noise suppression and detail preservation. Meanwhile, the FPGA implementation reduces filtering latency and optimizes resource consumption, making it well suited for real-time applications. This study demonstrates the promise of FPGA-based adaptive filtering for medical imaging, remote sensing, and intelligent surveillance, which have stringent requirements for high-performance and high-efficiency processing, and provides a new hardware solution for real-time, high-quality image processing in constrained environments.
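To make the idea of adaptivity concrete, the following is a floating-point software sketch of variance-guided Gaussian filtering (not the paper's fixed-point Verilog pipeline; the blending rule, window size, and sigma are assumptions): smooth strongly in flat, noise-dominated regions and preserve the original signal where local detail is high.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, uniform_filter

def adaptive_gaussian(img: np.ndarray, sigma: float = 2.0, win: int = 5) -> np.ndarray:
    """Blend a Gaussian-smoothed image with the original according to
    local variance: flat regions receive strong smoothing while
    detailed regions keep more of the original signal."""
    img = img.astype(np.float64)
    smoothed = gaussian_filter(img, sigma)
    local_mean = uniform_filter(img, win)
    local_var = np.maximum(uniform_filter(img ** 2, win) - local_mean ** 2, 0.0)
    # Detail weight in [0, 1]: higher local variance -> keep the original.
    w = local_var / (local_var + local_var.mean() + 1e-8)
    return w * img + (1.0 - w) * smoothed

noisy = np.random.rand(64, 64) * 255.0
print(adaptive_gaussian(noisy).shape)   # (64, 64)
```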
Ever since ancient times, earthquakes have been a major threat to civil infrastructure and the safety of human beings. The majority of casualties in earthquake disasters are caused by damaged civil infrastructure rather than by the earthquake itself. Therefore, efficient and accurate post-earthquake assessment of structural damage conditions is an urgent need for human society. Traditional approaches to post-earthquake structural assessment rely heavily on field investigation by experienced experts, yet this is inevitably subjective and inefficient. Structural response data are also used to assess damage; however, this requires sensor networks mounted in advance and is not intuitive. As many types of structural damage states are visible, computer vision-based post-earthquake structural assessment has attracted great attention among engineers and scholars. With the development of image acquisition sensors, computing resources, and deep learning algorithms, deep learning-based post-earthquake structural assessment has gradually shown potential in dealing with image acquisition and processing tasks. This paper comprehensively reviews state-of-the-art studies of deep learning-based post-earthquake structural assessment in recent years. Conventional image processing and machine learning-based structural assessment are presented briefly. The workflow of the methodology for computer vision and deep learning-based post-earthquake structural assessment is introduced. Then, applications of assessment for multiple civil infrastructures are presented in detail. Finally, the challenges of current studies are summarized for reference in future works to improve the efficiency, robustness, and accuracy in this field.
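Many of the pipelines reviewed share a common image-classification step; the sketch below shows a generic transfer-learning setup for that step (assuming a recent torchvision; the damage taxonomy, backbone choice, and random batch are illustrative assumptions, not the reviewed methods themselves).

```python
import torch
import torch.nn as nn
from torchvision import models

# Assumed damage categories; surveyed papers use various taxonomies.
DAMAGE_CLASSES = ["no damage", "minor", "moderate", "severe", "collapse"]

# Reuse an ImageNet-style backbone and retrain only the final layer on
# labelled post-earthquake photographs.
backbone = models.resnet18(weights=None)   # load pretrained weights in practice
backbone.fc = nn.Linear(backbone.fc.in_features, len(DAMAGE_CLASSES))

images = torch.randn(4, 3, 224, 224)       # a dummy batch standing in for photos
logits = backbone(images)
print(logits.shape)                        # torch.Size([4, 5])
```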
ISBN (print): 9798350344868; 9798350344851
This paper addresses two key limitations in existing image signal processing (ISP) approaches: suboptimal performance in low-light conditions and the lack of trainability in traditional ISP methods. To tackle these issues, we propose a novel, trainable ISP framework that combines the strengths of traditional ISP techniques with an advanced Multi-Scale Retinex (MSR) algorithm for night-time enhancement. Our method consists of three primary components: an ISP-based Luminance Harmonization layer to initially optimize luminance levels in RAW data, a deep learning-based MSR layer for nuanced decomposition of image components, and a specialized enhancement layer for precise, region-specific luminance enhancement and color denoising. The proposed approach is validated through rigorous experiments on machine vision benchmarks and objective visual quality indicators. Our results demonstrate not only a significant improvement over existing methods but also robust adaptability under diverse lighting conditions. This work offers a versatile ISP framework with promising applications beyond its immediate scope.
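For context on the MSR component, here is a minimal sketch of the classic (non-trainable) multi-scale Retinex operation that the paper's learned MSR layer builds on; the scales, equal weighting, and single-channel input are conventional assumptions rather than the paper's parameters.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def multiscale_retinex(img: np.ndarray, sigmas=(15, 80, 250), eps=1.0) -> np.ndarray:
    """Classic multi-scale Retinex: subtract log-illumination estimates
    obtained by Gaussian blurring at several scales, then average.

    img: single-channel float image with values in [0, 255].
    """
    img = img.astype(np.float64) + eps
    msr = np.zeros_like(img)
    for sigma in sigmas:
        illumination = gaussian_filter(img, sigma) + eps
        msr += np.log(img) - np.log(illumination)
    return msr / len(sigmas)

dark = np.random.rand(128, 128) * 30.0    # a synthetic low-light frame
enhanced = multiscale_retinex(dark)
print(enhanced.min(), enhanced.max())
```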
We consider a variational approach to the problem of structure + texture decomposition (also known as cartoon + texture decomposition). As usual for many variational problems in image analysis and processing, the energy we minimize consists of two terms: a data-fitting term and a regularization term. The main feature of our approach consists of choosing parameters in the regularization term adaptively. Namely, the regularization term is given by a weighted $p(\cdot)$-Dirichlet-based energy $\int_\Omega a(x)\,|\nabla u|^{p(x)}\,dx$, where the weight and exponent functions are determined from an analysis of the spectral content of the image curvature. Our numerical experiments, both qualitative and quantitative, suggest that the proposed approach delivers better results than state-of-the-art methods for extracting the structure from textured and mosaic images, as well as competitive results on image enhancement problems.
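Written out, one representative form of the full energy described above is sketched below; the quadratic data-fitting term and its weight $\lambda$ are assumptions added for illustration, since the abstract only specifies the regularization term:

\[
E(u) \;=\; \frac{\lambda}{2}\int_\Omega \bigl(f(x)-u(x)\bigr)^2\,dx \;+\; \int_\Omega a(x)\,\lvert\nabla u(x)\rvert^{p(x)}\,dx ,
\]

where $f$ is the input image, $u$ is the extracted structure (cartoon) component, the texture is the residual $v = f - u$, and $a(x)$ and $p(x)$ are chosen adaptively from the spectral content of the image curvature.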
To meet the needs of teaching and practical applications in machine vision technology, a virtual reality-based machine vision experimental platform has been designed and developed. Unity3D was utilized as the developm...
ISBN (print): 9798350318920; 9798350318937
In today's ever-changing world, the ability of machine learning models to continually learn new data without forgetting previous knowledge is of utmost importance. However, in the scenario of few-shot class-incremental learning (FSCIL), where models have limited access to new instances, this task becomes even more challenging. Current methods use prototypes as a replacement for classifiers, where the cosine similarity of instances to these prototypes is used for prediction. However, we have identified that the embedding space created by using the ReLU activation function is incomplete and crowded for future classes. To address this issue, we propose the Expanding Hyperspherical Space (EHS) method for FSCIL. In EHS, we utilize an odd-symmetric activation function to ensure the completeness and symmetry of the embedding space. Additionally, we specify a region for base classes and reserve space for unseen future classes, which increases the distance between class distributions. Pseudo-instances are also used to enable the model to anticipate possible upcoming samples. During inference, we rectify the confidence scores to prevent bias towards base classes. We conducted experiments on benchmark datasets such as CIFAR100 and miniImageNet, which demonstrate that our proposed method achieves state-of-the-art performance.
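The prototype-based prediction that FSCIL methods rely on is easy to state in code; the sketch below shows the generic cosine-similarity rule described above (the random features and prototypes are placeholders, and EHS's own activation and space-reservation mechanisms are not modeled here).

```python
import numpy as np

def cosine_prototype_predict(features: np.ndarray, prototypes: np.ndarray):
    """Prototype classification: each class is represented by a prototype
    vector, and an instance is assigned to the prototype with the highest
    cosine similarity.

    features:   (n_samples, d) embeddings.
    prototypes: (n_classes, d) class prototypes (e.g., class-mean embeddings).
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    similarity = f @ p.T                      # (n_samples, n_classes)
    return similarity.argmax(axis=1), similarity

# Adding a new class only requires appending its prototype; the embedding
# network itself need not be retrained, which is what makes this attractive
# for class-incremental settings.
feats = np.random.randn(6, 64)
protos = np.random.randn(10, 64)              # 10 classes seen so far
labels, _ = cosine_prototype_predict(feats, protos)
print(labels)
```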
Feature compression has attracted much attention in recent years due to its promising applications in scenarios where features are transmitted and analyzed by machine vision. However, existing research mainly focuses on coarse-grained features extracted from recognition tasks such as classification and detection, neglecting fine-grained features extracted from identification tasks. In this paper, we make a pioneering attempt to study fine-grained feature compression in the context of identification tasks. Our main focus is on the distortion metric, given its critical importance in optimizing the performance of a compression network. We begin by reviewing the instance-level metrics in the existing literature, highlighting their oversight of inter-feature relationships. Inter-feature relationships are especially important for identification tasks, which involve similarity comparison among different identities. To address this problem, we propose to consider inter-feature relationships from the perspective of identity information. Specifically, we propose an identity-level metric incorporating both intra-identity similarity and inter-identity discriminability. The intra-identity similarity constraint aims to cluster features from the same identity, while the inter-identity discriminability constraint ensures that features from different identities deviate from each other. We implement the identity-level metric on four different feature compression networks designed based on feature characteristics. Experimental results show the effectiveness of the proposed identity-level metric on person re-identification and face verification tasks.
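To illustrate the flavor of such an identity-level distortion term, the following is a rough sketch combining an intra-identity similarity term with a margin-based inter-identity discriminability term; the margin value and the exact formulation are assumptions, not the paper's definition.

```python
import numpy as np

def identity_level_metric(feats: np.ndarray, ids: np.ndarray, margin: float = 0.3) -> float:
    """Sketch of an identity-level distortion: features of the same
    identity should stay close to their identity centre (intra term),
    while centres of different identities should stay dissimilar
    (inter term, hinged at a cosine-similarity margin)."""
    feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    centers, intra = [], []
    for i in np.unique(ids):
        group = feats[ids == i]
        c = group.mean(axis=0)
        centers.append(c)
        intra.append(1.0 - (group @ c) / np.linalg.norm(c))   # 1 - cosine to centre
    centers = np.stack(centers)
    centers /= np.linalg.norm(centers, axis=1, keepdims=True)
    sim = centers @ centers.T
    np.fill_diagonal(sim, -1.0)
    # Penalise pairs of different identities whose centres are too similar.
    inter = np.maximum(0.0, sim - margin)
    return float(np.mean(np.concatenate(intra))) + float(inter.mean())

feats = np.random.randn(8, 32)
ids = np.array([0, 0, 1, 1, 2, 2, 3, 3])
print(identity_level_metric(feats, ids))
```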