The development of automated methods capable of detecting and localizing actions is crucial for a variety of applications, ranging from surveillance and autonomous driving to content moderation. This thesis focuses on creating action detection methods that deliver robust performance. The robustness of these methods rests on two fundamental elements: the detection of atomic actions and the capacity for compositional understanding. Atomic actions are those that are identifiable from a single image or a short video. In this research, we developed methods to detect and localize such actions that achieve state-of-the-art performance. The key strength of these methods lies in their ability to refine visual features both spatially and semantically, enabling precise identification of action-specific regions. For scalability, we further developed a multi-branch network to recognize new compositions of objects and actions. Our design ensures that each branch learns decoupled features, allowing the network to transfer previously learned concepts to identify new compositions. Extensive experiments on benchmark datasets demonstrate that this approach outperforms existing methods by a good margin. Further, correctly identifying the attributes of the objects participating in actions helps to detect unknown compositions. We therefore created a network that uses spatially localized learning to correctly associate objects and attributes. This network achieves state-of-the-art performance in object-attribute association on cluttered scenes. The methods developed in this thesis perform robust action detection at scale and can serve as a base for numerous future applications.
In recent years, the country has proposed the strategic development goal of 'Made in China 2025', and the intelligent manufacturing industry has gradually received national attention. Intelligent robots rely o...
ISBN (print): 9781665485296
Because the wiper arm has an uneven, curved surface, researchers have had difficulty illuminating it evenly for appearance defect detection using machine vision. As a result, some defects, especially those located at the edge of the region of interest (ROI), were missed. In this paper, the ROI was widened by stitching two sequential images together using Laplacian pyramids. A genetic algorithm was then used to enhance the important features of the defects, using the best fitness value, parent mating, crossover and mutation. The algorithm reduced the effect of uneven illumination through repeated regeneration. The resulting image was converted to binary for defect identification, and defects were localized according to their contours. Experimental results showed 90.5% accuracy.
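The Laplacian-pyramid stitching step mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the paper works on 2D images, and the `[1, 2, 1]/4` blur kernel, the linear-interpolation upsampling, the number of levels, and all function names below are assumptions chosen for brevity.

```python
def downsample(x):
    """Blur with a [1, 2, 1]/4 kernel (edges clamped), then keep every other sample."""
    n = len(x)
    blurred = [(x[max(i - 1, 0)] + 2 * x[i] + x[min(i + 1, n - 1)]) / 4
               for i in range(n)]
    return blurred[::2]


def upsample(x, n):
    """Linearly interpolate a coarse signal back to length n."""
    out = []
    for i in range(n):
        j = i // 2
        if i % 2 == 0 or j + 1 >= len(x):
            out.append(x[j])
        else:
            out.append((x[j] + x[j + 1]) / 2)
    return out


def gaussian_pyramid(x, levels):
    """Successively blurred and decimated copies of x (finest first)."""
    pyr = [list(x)]
    for _ in range(levels):
        pyr.append(downsample(pyr[-1]))
    return pyr


def laplacian_pyramid(x, levels):
    """Band-pass detail at each level, plus the coarsest low-pass residual."""
    pyr, cur = [], list(x)
    for _ in range(levels):
        small = downsample(cur)
        up = upsample(small, len(cur))
        pyr.append([c - u for c, u in zip(cur, up)])
        cur = small
    pyr.append(cur)
    return pyr


def collapse(pyr):
    """Invert laplacian_pyramid by upsampling and adding the detail back."""
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):
        up = upsample(cur, len(lap))
        cur = [d + u for d, u in zip(lap, up)]
    return cur


def blend(a, b, mask, levels=3):
    """Blend a (where mask is 1) with b (where mask is 0), level by level."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    gm = gaussian_pyramid(mask, levels)
    mixed = [[m * p + (1 - m) * q for p, q, m in zip(pa, pb, pm)]
             for pa, pb, pm in zip(la, lb, gm)]
    return collapse(mixed)


# Two overlapping "scanlines" with different brightness, joined at the middle.
left = [0.0] * 64
right = [1.0] * 64
mask = [1.0] * 32 + [0.0] * 32
stitched = blend(left, right, mask)
```

Blending each frequency band with a progressively smoother mask is what lets the seam fade gradually instead of appearing as a hard step, which is the usual motivation for pyramid-based stitching.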
ISBN (digital): 9798331506520
ISBN (print): 9798331506537
Recent studies point to an accuracy gap between humans and Artificial Neural Network (ANN) models when classifying blurred images, with humans outperforming ANNs. To bridge this gap, we introduce a spectral channel-based range-constrained entropy merit function, from which we devise a zero-phase, circularly symmetric blind deblurring method. We apply it as a pre-processing step for image classification and test it using pre-trained classification models and images blurred by Gaussian kernels. We compare our method to state-of-the-art restoration methods and show that it outperforms them, effectively bridging the machine-human gap for most models and blur levels. Our results also rank higher than the competitors in no-reference and full-reference image quality metrics. Notwithstanding the limitation to zero-phase blur, this work shows that, for image pre-processing aimed at visual tasks, it may be advantageous to use merit functions based on vision science and information theory, rather than on the expected error to the latent image.
The traffic density on roads has been increasing rapidly for the past few decades, which has in turn been reflected in the increase in traffic violations and accidents. Official reports from various governments and pr...
Image denoising remains a key research problem because of its potential role as a pre-processing component in image processing, computer vision, and machine learning tasks. Of the available approaches for image denoising, those inspired by anisotropic diffusion processes have been a center of discussion for decades. Despite the efforts and promising results achieved by diffusion-inspired denoising methods, we noted insufficient attention to the design of energy functionals for anisotropic diffusion equations. Most researchers take heuristic approaches to designing diffusivity functionals, a practice that cannot provide mathematical explanations of why their approaches work. The current research presents a strictly convex and Lipschitz energy functional that guarantees a unique solution for an evolutionary process. Based on this functional, we derive an anisotropic diffusion equation for image denoising applications. Experimental results show that an algorithm corresponding to the proposed equation is computationally efficient, and generates informative and visually appealing images with competitive values of peak signal-to-noise ratio and structural similarity. Guided by the compelling properties of our energy functional, we provide additional insight into the quality of the results. Implementation code and test datasets for the proposed approach are publicly accessible at the MATLAB File Exchange (https://www.mathworks.com/matlabcentral/fileexchange/160108-lipschitz-diffusion-inspired-energy-functional).
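For readers unfamiliar with diffusion-based denoising, the following minimal sketch shows the general idea on a 1D signal. It deliberately uses the classic Perona-Malik diffusivity g(s) = 1 / (1 + (s / kappa)^2), which is exactly the kind of heuristic choice the abstract contrasts itself with; it is not the Lipschitz energy functional proposed in the paper. The function name, `kappa`, `dt`, and the iteration count are illustrative assumptions.

```python
def perona_malik_1d(signal, n_iter=30, kappa=0.1, dt=0.2):
    """Explicit, flux-form Perona-Malik diffusion with zero-flux boundaries.

    Small differences (noise) diffuse quickly because g is close to 1;
    large differences (edges) barely diffuse because g is close to 0.
    """
    u = list(signal)
    n = len(u)
    for _ in range(n_iter):
        # Flux between each pair of neighbouring samples i and i + 1.
        flux = []
        for i in range(n - 1):
            grad = u[i + 1] - u[i]
            g = 1.0 / (1.0 + (grad / kappa) ** 2)
            flux.append(g * grad)
        # Each sample gains the flux from its right and loses it to its left.
        new_u = []
        for i in range(n):
            from_right = flux[i] if i < n - 1 else 0.0
            to_left = flux[i - 1] if i > 0 else 0.0
            new_u.append(u[i] + dt * (from_right - to_left))
        u = new_u
    return u


# A step edge corrupted by a small alternating perturbation.
noisy = [(0.0 if i < 20 else 1.0) + 0.05 * (-1) ** i for i in range(40)]
denoised = perona_malik_1d(noisy)
```

Writing the update in flux form means the total intensity is conserved up to rounding, and keeping `dt` below 0.5 keeps this explicit 1D scheme stable.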
A system for determining the distance from the robot to the scene is useful for object tracking, and 3-D reconstruction may be desired for many manufacturing and robotic tasks. While the robot is processing material...
We propose a novel Dispersion Minimisation framework for event-based vision model estimation, with applications to optical flow and high-speed motion estimation. The framework extends previous event-based motion compensation algorithms by avoiding computing an optimisation score based on an explicit image-based representation, which provides three main benefits: i) The framework can be extended to perform incremental estimation, i.e., on an event-by-event basis. ii) Besides purely visual transformations in 2D, the framework can readily use additional information, e.g., by augmenting the events with depth, to estimate the parameters of motion models in higher dimensional spaces. iii) The optimisation complexity only depends on the number of events. We achieve this by modelling the event alignment according to candidate parameters and minimising the resultant dispersion, which is computed by a family of suitable entropy-based measures. Data whitening is also proposed as a simple and effective pre-processing step that makes the accuracy of the framework, and of other event-based motion-compensation methods, more robust. The framework is evaluated on several challenging motion estimation problems, including 6-DOF transformation, rotational motion, and optical flow estimation, achieving state-of-the-art performance.
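The idea of aligning events under candidate parameters and minimising their dispersion can be sketched in a toy 1D setting. This is only an illustration under strong assumptions: events are `(x, t)` pairs from points moving at one constant velocity, the dispersion is a negative Gaussian pairwise potential standing in for the paper's family of entropy-based measures, and a plain grid search replaces the paper's optimiser. All names and parameter values are hypothetical.

```python
import math


def dispersion(points, sigma=0.1):
    """Negative Gaussian pairwise potential: lower means better aligned."""
    total = 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = points[i] - points[j]
            total += math.exp(-d * d / (2.0 * sigma * sigma))
    return -total


def estimate_velocity(events, candidates, sigma=0.1):
    """Grid search for the velocity whose warp x - v * t minimises dispersion."""
    best_v, best_score = None, float("inf")
    for v in candidates:
        # Warp every event back to a common reference time t = 0.
        warped = [x - v * t for x, t in events]
        score = dispersion(warped, sigma)
        if score < best_score:
            best_v, best_score = v, score
    return best_v


# Synthetic events from two points moving at 2.0 units per second.
TRUE_V = 2.0
events = [(x0 + TRUE_V * k * 0.1, k * 0.1) for x0 in (0.0, 5.0) for k in range(10)]
candidates = [i * 0.25 for i in range(17)]  # 0.0, 0.25, ..., 4.0
estimate = estimate_velocity(events, candidates)
```

The score is computed directly on the warped event coordinates, with no intermediate image of warped events, which mirrors the abstract's point about avoiding an explicit image-based representation.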
ISBN (print): 9783031288159
The proceedings contain 14 papers. The special focus in this conference is on Context-Aware Systems and Applications. The topics include: Prediction of Chaotic Time Series Based on LSTM, Autoencoder and Chaos Theory; An Approach to Selecting Students Taking Provincial and National Excellent Student Exams; Safe Interaction Between Human and Robot Using Vision Technique; Application of the Image Processing Technique for Powerline Robot; Collaborative Recommendation with Energy Distance Correlation; Blockchain Model in Industrial Pangasius Farming; Multiple-Criteria Rating Recommendation with Ordered Weighted Averaging Aggregation Operators; A Survey of On-Chip Hybrid Interconnect for Multicore Architectures; A Framework for Brain-Computer Interfaces Closed-Loop Communication Systems; Identification of Abnormal Cucumber Leaves Image Based on Recurrent Residual U-Net and Support Vector Machine Techniques; Lung Lesion Images Classification Based on Deep Learning Model and Adaboost Techniques; Balltree Similarity: A Novel Space Partition Approach for Collaborative Recommender Systems.
In recent years, there has been a remarkable increase in interest and challenges in image processing and pattern recognition, specifically in the context of air writing. This exciting research area has significant pot...