Domain generalization (DG) is a fundamental yet challenging topic in machine learning. Recently, the remarkable zero-shot capabilities of the large pre-trained vision-language model (e.g., CLIP) have made it popular f...
ISBN: 0819413208 (print)
Object recognition is one of the central problems of computer vision. A catalogue of model objects is described as a set of features such as edges and surfaces. The same features are extracted from the scene and matched against the models to recognize objects. Edges and surfaces extracted from scenes are often noisy and imperfect. This paper describes algorithms for improving low-level edge and surface features. Existing edge extraction algorithms are applied to the intensity image to obtain edge features. Initial edges are traced by following the direction of the current contour, and are then improved by using the corresponding depth and intensity information to make decisions at branch points. Surface fitting routines are applied to the range image to obtain planar surface patches. A region-growing algorithm is developed that starts with a coarse segmentation and uses quadric surface fitting to iteratively merge adjacent regions into quadric surfaces based on approximate orthogonal distance regression. The surface information obtained is returned to the edge extraction routine to detect and remove false edges. This process repeats until no more merging or edge improvement can take place. Both synthetic images (with Gaussian noise) and real images of scenes containing multiple objects have been tested using the merging criteria. The results are encouraging.
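The merge step of such a region-growing scheme can be illustrated with a much simpler stand-in. The sketch below is not the paper's method: it assumes planar rather than quadric surfaces, uses ordinary least squares on vertical (z) residuals rather than approximate orthogonal distance regression, and the function names and the tolerance `tol` are hypothetical. Two adjacent regions are merged only if a single surface fits their union well.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c; returns (a, b, c).

    Solves the 3x3 normal equations with Cramer's rule. Assumes the
    points are not collinear in (x, y), otherwise the system is singular.
    """
    sxx = sxy = sx = syy = sy = sxz = syz = sz = 0.0
    for x, y, z in points:
        sxx += x * x; sxy += x * y; sx += x
        syy += y * y; sy += y
        sxz += x * z; syz += y * z; sz += z
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    r = [sxz, syz, sz]
    d = det3(m)
    coeffs = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = r[i]          # replace column k with the right-hand side
        coeffs.append(det3(mk) / d)
    return tuple(coeffs)

def rms_residual(points, plane):
    """Root-mean-square vertical distance of the points from the plane."""
    a, b, c = plane
    return math.sqrt(sum((z - (a * x + b * y + c)) ** 2
                         for x, y, z in points) / len(points))

def should_merge(region_a, region_b, tol):
    """Merge two regions if one plane fits their union within tol (assumed criterion)."""
    union = region_a + region_b
    return rms_residual(union, fit_plane(union)) < tol

# Two coplanar patches (z = 1) and one tilted patch (z = 2x - 5).
flat_a = [(x, y, 1.0) for x in range(0, 3) for y in range(3)]
flat_b = [(x, y, 1.0) for x in range(3, 6) for y in range(3)]
tilted = [(x, y, 2.0 * x - 5.0) for x in range(3, 6) for y in range(3)]

merge_flat = should_merge(flat_a, flat_b, tol=0.1)    # coplanar patches
merge_tilt = should_merge(flat_a, tilted, tol=0.1)    # patches meeting at a crease
```

In a full region-growing loop this test would be applied repeatedly to pairs of adjacent regions until no pair passes it.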
ISBN: 0819439835 (print)
Traditional image processing techniques require considerable computational effort because the data for each pixel are computed sequentially and all information passes through an A/D converter. The delay accumulated in this process is unacceptable in real-time image processing because of the high information flow handled in typical vision tasks (e.g. automatic industrial inspection, vision problems in robotics, pattern analysis). A massively parallel architecture working with analog signals avoids these problems. This is the basic idea of Cellular Neural Networks (CNNs): an array of analogic dynamic processors whose cells interact directly within a finite local neighborhood. The local connectivity of CNNs allows them to be realized as VLSI chips that can operate at very high speed and complexity. CNN architectures implemented as VLSI chips now show extremely high speed compared with traditional digital image processing tools. The proliferation of increasingly sophisticated CNN architectures, and the growing effort to build practical systems based on CNN chips, make it important to develop analog algorithms that perform complex image processing tasks in many different fields, e.g. industrial applications, robotic systems and pattern recognition. The objective of this work is to build a learning machine capable of finding solutions to complex image processing tasks with CNNs. First, a general machine for automatic analog algorithm design, independent of the problem to be solved, is created; this is accomplished through an evolutionary strategy that extends genetic programming. Second, this work introduces a suite of sub-mechanisms that increase the power of genetic programming and help reduce the enormous search space so that the search becomes productive. Some concepts in this section are related to AI theory, so that this work lies at the intersection of AI and Imag
ISBN: 9789811391552; 9789811391545 (print)
The application of mobile robots in autonomous navigation has contributed to the development of exploration tasks for the recognition of unknown environments. Different obstacle-avoidance methodologies have been implemented in mobile robots; this research, however, introduces a novel approach to path planning for an unmanned ground vehicle (UGV) that uses a drone camera to obtain an aerial view, from which image processing algorithms recognize ground features in order to detect and locate obstacles in a given environment. After aerial recognition, a global planner based on the Rapidly-exploring Random Tree Star (RRT*) algorithm is executed; Dubins curves are the method used in this case to handle non-holonomic robots. The study also examines how compute time is affected by the number of RRT* iterations, the step size between tree nodes, and the number of obstacles placed in the environment. This project is the initial part of a larger research effort on a Collaborative Aerial-Ground Robotic System.
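The three parameters studied (iteration count, step size, number of obstacles) are easiest to see in a bare-bones planner. The sketch below is a plain RRT, not RRT*: it omits the rewiring step and the Dubins-curve steering described in the abstract, treats obstacles as circles, and checks collisions only at the new node rather than along the whole edge. All names, the 10 x 10 workspace and the obstacle layout are assumptions made for illustration.

```python
import math
import random

def collides(p, obstacles):
    """True if point p lies inside any circular obstacle (ox, oy, radius)."""
    return any(math.dist(p, (ox, oy)) <= r for ox, oy, r in obstacles)

def rrt(start, goal, obstacles, step=1.0, iters=2000, size=10.0, seed=0):
    """Grow a tree from start; return (path or None, number of tree nodes)."""
    rng = random.Random(seed)
    nodes = [start]
    parents = [None]
    for _ in range(iters):
        sample = (rng.uniform(0.0, size), rng.uniform(0.0, size))
        # Nearest existing node to the random sample (linear scan).
        i = min(range(len(nodes)), key=lambda k: math.dist(nodes[k], sample))
        d = math.dist(nodes[i], sample)
        if d == 0.0:
            continue
        # Steer from the nearest node toward the sample by at most `step`.
        t = min(1.0, step / d)
        new = (nodes[i][0] + t * (sample[0] - nodes[i][0]),
               nodes[i][1] + t * (sample[1] - nodes[i][1]))
        if collides(new, obstacles):
            continue
        nodes.append(new)
        parents.append(i)
        # Stop as soon as the goal is within one step of the new node.
        if math.dist(new, goal) <= step:
            path, k = [goal], len(nodes) - 1
            while k is not None:
                path.append(nodes[k])
                k = parents[k]
            return path[::-1], len(nodes)
    return None, len(nodes)

obstacles = [(5.0, 5.0, 1.5), (3.0, 7.0, 1.0)]
path, n_nodes = rrt((1.0, 1.0), (9.0, 9.0), obstacles, step=1.0, iters=2000, seed=0)
```

Because every iteration scans all existing nodes and every candidate is tested against every obstacle, the running time of this sketch grows with the iteration count, with smaller step sizes (more nodes are needed to reach the goal) and with the number of obstacles.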
An image recognition algorithm based on ensemble learning algorithm and convolution neural network structure (ELA-CNN) is proposed to solve the problem that a single convolution neural network (CNN) classifier may be ...
ISBN: 9781713899921 (print)
Quantizing activations, weights, and gradients to 4 bits is a promising way to accelerate neural network training. However, existing 4-bit training methods require custom numerical formats that are not supported by contemporary hardware. In this work, we propose a training method for transformers in which all matrix multiplications are implemented with INT4 arithmetic. Training at ultra-low INT4 precision is challenging. To achieve this, we carefully analyze the specific structures of activations and gradients in transformers and propose dedicated quantizers for them. For forward propagation, we identify the challenge of outliers and propose a Hadamard quantizer to suppress them. For backpropagation, we leverage the structural sparsity of gradients by proposing bit splitting and leverage score sampling techniques to quantize gradients accurately. Our algorithm achieves competitive accuracy on a wide range of tasks including natural language understanding, machine translation, and image classification. Unlike previous 4-bit training methods, our algorithm can be implemented on the current generation of GPUs. Our prototypical linear operator implementation is up to 2.2 times faster than the FP16 counterparts and speeds up training by 17.8% on average for sufficiently large models. Our code is available at https://***/xijiu9/Train_Transformers_with_INT4.
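Why a Hadamard rotation helps with outliers can be seen on a single vector. The sketch below is only an illustration of the general idea, not the authors' quantizer: it assumes a symmetric per-vector INT4 scheme (`scale = max|x| / 7`), a normalized Sylvester Hadamard matrix, and hypothetical function names. One large entry forces a coarse scale that rounds every small entry to zero; rotating first spreads the outlier across all coordinates so the scale shrinks.

```python
import math

def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n must be a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def rotate(x):
    """Multiply x by the normalized Hadamard matrix H / sqrt(n).

    The normalized Sylvester matrix is symmetric and orthogonal,
    so applying rotate twice returns the original vector.
    """
    n = len(x)
    h = hadamard(n)
    s = 1.0 / math.sqrt(n)
    return [s * sum(h[i][j] * x[j] for j in range(n)) for i in range(n)]

def fake_quant_int4(x):
    """Quantize to integers in [-8, 7] with one shared scale, then dequantize."""
    scale = max(abs(v) for v in x) / 7.0 or 1.0
    return [max(-8, min(7, round(v / scale))) * scale for v in x]

def sq_error(a, b):
    """Sum of squared differences between two vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

# One outlier (8.0) among small entries.
x = [0.5, -0.5, 0.5, 8.0, -0.5, 0.5, -0.5, 0.5]

# Direct quantization: the outlier sets the scale, small entries round to zero.
err_direct = sq_error(x, fake_quant_int4(x))

# Rotate, quantize in the rotated basis, rotate back.
err_hadamard = sq_error(x, rotate(fake_quant_int4(rotate(x))))

print(f"direct: {err_direct:.3f}  hadamard: {err_hadamard:.3f}")
```

Because the rotation is orthogonal, the error measured after rotating back equals the error made in the rotated basis, so the comparison between the two numbers is fair.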
The intelligent pulse diagnosis robot is a research direction in intelligent medical treatment; many researchers in China and abroad have devoted themselves to promoting the modernization of pulse diagnosis with the devel...
ISBN: 0819440094 (print)
The availability of new digital detector technologies and high-speed computer processing has led to the development of CAD (computer-aided diagnostic) tools that assist radiologists in detecting and characterizing mammographic lesions. To meet the challenge of developing and implementing computationally intensive algorithms, it is desirable to develop reusable components that can execute in a distributed environment. It is well known that the Common Object Request Broker Architecture (CORBA) provides an open solution for distributed computing. We have implemented a hybrid component model consisting of a CORBA server and a Contract Net Protocol (CNP) algorithm for distributing tasks to multiple computers for enhanced processing. Support classes were developed to wrap algorithms written in C so that they operate within the distributed framework. CORBA provides communication between agents on different computers and computer platforms, and the CNP algorithm is used to select the "optimal" computer for performing a task. We have evaluated this framework with CAD processing applied to digitized mammograms by transparently scheduling and distributing multiple tasks on three server computers. We achieved a significant reduction in processing times compared to processing on a single computer.
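The selection step of a Contract Net Protocol can be sketched in a few lines. The abstract does not say what the bids contain, so the sketch below assumes each computer bids its estimated completion time and the manager awards the task to the lowest bid; the `Worker` class, its `speed` and `queued` fields, and the task costs are all hypothetical, and the CORBA transport is left out entirely.

```python
from dataclasses import dataclass

@dataclass
class Worker:
    """A contractor node; fields are illustrative assumptions."""
    name: str
    speed: float          # work units processed per second
    queued: float = 0.0   # work units already accepted

    def bid(self, cost):
        """Estimated completion time if this task were awarded here."""
        return (self.queued + cost) / self.speed

    def award(self, cost):
        """Accept the task and add it to the local queue."""
        self.queued += cost

def contract_net(tasks, workers):
    """For each task: announce it, collect bids, award it to the lowest bidder."""
    assignment = {}
    for task, cost in tasks:
        bids = {w.name: w.bid(cost) for w in workers}      # call for proposals
        winner = min(workers, key=lambda w: bids[w.name])  # manager picks best bid
        winner.award(cost)
        assignment[task] = winner.name
    return assignment

workers = [Worker("A", 2.0), Worker("B", 1.0), Worker("C", 1.5)]
tasks = [("t1", 6.0), ("t2", 3.0), ("t3", 3.0), ("t4", 2.0)]
assignment = contract_net(tasks, workers)
print(assignment)  # {'t1': 'A', 't2': 'C', 't3': 'B', 't4': 'C'}
```

Since each bid accounts for work already queued, a fast but busy computer can lose a task to a slower idle one, which is what spreads the load across the servers.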
ISBN: 9780889866027 (print)
A new design that connects many small feed-forward neural networks to form a large, recurrent, image association processor is presented. The resulting recurrent neural network uses local image information to form global associations. Just as a single bit of information and its complement hold each other in place in an SR flip-flop, two arbitrary reciprocal images can hold each other in place in an image processor. The new T-MAP design consists of a pair of two-dimensional processor grids. Each processing element has only local connections, but in unison they form an orderly interconnection pattern analogous to the recurrent bit connections in an SR flip-flop. The central nervous system contains many topological maps that mathematically resemble images. Consequently, algorithms that can store and recall image associations are interesting from both a scientific and an engineering point of view. An array of these association processors is similar to the collection of Brodmann areas in the neocortex. We also present a new, physiologically realistic mechanism for controlling the processor array.
In industrial collaborative robotics, operators and robots perform complex tasks working together without physical barriers. Under this premise, a flexible, robust and fast interaction system between the robot and the workers is a necessity. Human beings use voice and gestures to achieve natural interaction. Given the noise and poor lighting usually present in workshops, combining both communication channels can make the interaction more robust. This research work presents a solution to define, set up and run a flexible and robust gesture interaction system for integration into collaborative robotics applications. (C) 2018 The Authors. Published by Elsevier B.V. Peer-review under responsibility of the scientific committee of the 51st CIRP Conference on Manufacturing Systems.