According to Marr's computational vision theory [1], the first processing stage in computational vision is to analyze the information content of an image at different scales. Multiscale signal decomposition is thus needed. T...
ISBN: 953184061X (print)
This paper presents a framework for dynamic 3D shape reconstruction from multi-viewpoint images using a deformable mesh model. With our method, we can obtain the 3D shape and 3D motion of the object simultaneously. We represent the shape by a surface mesh model and the motion by translations of its vertices, i.e., deformation. Thus the global and local topological structure of the mesh is preserved from frame to frame, which helps us to analyse the motion of the object, to compress the 3D data, and so on. Our model deforms its shape so as to satisfy several constraints. This constraint-based deformation provides a computational framework for integrating several reconstruction cues, such as surface texture, silhouette, and motion flow, observed in multi-viewpoint images.
This paper presents a solution to the problem of manipulation control: target identification and grasping. The proposed controller is designed for a real platform in combination with a monocular vision system. The objective of the controller is to learn an optimal policy to reach and grasp a spherical object of known size, randomly placed in the environment. To accomplish this, the task is treated as a reinforcement learning problem in which the controller learns the situation-action mapping by trial and error. The optimal policy is found using the Q-Learning algorithm, a model-free reinforcement learning technique, which rewards actions that move the arm closer to the target. The vision system uses geometrical computation to simplify the segmentation of the moving target (a spherical object) and to estimate the target parameters. To shorten the learning time, the knowledge acquired in simulation is ported to the real platform, a PUMA 560 industrial robot manipulator. Experimental results demonstrate the effectiveness of the adaptive controller, which relies on direct perception of the environment and does not require an explicit global target position.
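The Q-Learning idea in this abstract can be illustrated with a minimal sketch. The abstract does not specify the state space, action set, or reward values, so everything below is an assumption made for illustration: a one-dimensional "arm" position, two actions (step left or right), a reward of +1 for moving closer to the target and -1 otherwise, and hypothetical function names `train_q_table` and `greedy_action`.

```python
import random


def train_q_table(n_states=10, target=7, episodes=2000,
                  alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning on a toy 1-D reaching task (illustrative assumption)."""
    rng = random.Random(seed)
    moves = (-1, +1)                      # action 0: step left, action 1: step right
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        state = rng.randrange(n_states)   # random start, like a randomly placed arm
        for _ in range(4 * n_states):     # cap the episode length
            if state == target:
                break
            # Epsilon-greedy action selection (trial and error).
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt = min(max(state + moves[action], 0), n_states - 1)
            # Reward actions that move the arm closer to the target.
            closer = abs(nxt - target) < abs(state - target)
            reward = 1.0 if closer else -1.0
            future = 0.0 if nxt == target else max(q[nxt])
            # Model-free Q-learning update.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = nxt
    return q


def greedy_action(q, state):
    """Return the move (-1 or +1) preferred by the learned Q-table."""
    return -1 if q[state][0] > q[state][1] else +1


if __name__ == "__main__":
    table = train_q_table()
    print([greedy_action(table, s) for s in range(10)])
```

In this toy setting the learned greedy policy steps toward the target from either side. A real manipulator would need a much richer state description (for example, the target parameters estimated by the vision system), which this sketch does not attempt to model.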
ISBN: 0780378660 (print)
This paper describes a lip extraction method that can extract the lip region from the varying lip shapes that occur during speech, using only one template image. The proposed method is invariant to whether the mouth is open or closed and whether teeth are visible, and it achieves high speed and high extraction accuracy by taking the characteristics of the lips into account. The method performs template matching using a genetic algorithm. Furthermore, it exploits the colour of the lips and the characteristics of lip shape variation during speech. The effectiveness of the method is demonstrated by computer simulations that use only one template for each test subject and, as the search object, the varying lip shapes during the utterance of vowels. These simulations indicate that the method can extract the varying lip shape during speech using only one template, and that high speed and high extraction accuracy are obtained for every vowel.
ISBN: 9781728198354 (print)
The purpose of the medical image segmentation task is to delineate different organs or lesion regions in the image, which is an important aid for intelligent clinical diagnosis. Recent approaches fail to obtain reliable attention, are computationally intensive, and do not exploit the relationships between different samples. We effectively marry convolution and the Transformer to build MCTE for medical image segmentation. The proposed MCTE is an end-to-end network based on U-Net that learns three types of attention in parallel: local attention, learned with convolutions along the channel and spatial dimensions; global attention, learned with the lower computational cost of the Swin Transformer; and external attention, learned with two shared memories storing information from all medical images. Extensive experimental results on the ACDC and Synapse datasets, which are widely used for evaluating medical image segmentation methods, demonstrate that our proposed method outperforms the compared baseline.
ISBN: 9781665442152; 9781665442145 (print)
Jupyter Notebook [1] is an open source, interactive computing platform widely used in the scientific computing and artificial intelligence community [2], [3], [4], [5]. The popularity of the platform is a consequence of the generated single notebook document combining source code, markdown, and visualizations (Fig. 1). This makes the platform ideal for tasks such as data analysis and scientific image processing, where repeatability and transparency of analysis tasks are just as important as functionality and performance. However, the obligatory use of code is an obstacle to acceptance of the platform in scientific communities where programming is not generally taught in the curriculum. Consequently, many experimental communities rely on manual image processing using graphical user interfaces [6], [7], [8]. The obvious disadvantages are the lack of repeatability, transparency, and precision in image processing and data analysis tasks. To solve these issues, we propose to extend Jupyter Notebook with visual programming cells. In each visual programming cell, users create the program by assembling graphical nodes that represent computational instructions, and the textual program is automatically generated and executed by the environment. Cells will support version-control-aware serialization and deserialization. The core innovation of our proposed work lies in a change of workflow and the adoption of a Jupyter-based workflow in experimental communities that have no culture of working with source code. The system can be adapted to multiple applications and domains by integrating new node types. We hereby present an early version of the system and provide one use case from microscopy image processing to demonstrate the integration of existing non-Python software.
ISBN: 9781479915972 (print)
The proceedings contain 186 papers. The topics discussed include: a cloud-based architecture for big-data analytics in smart grid: a proposal; optical character recognition for scene text detection, mining and recognition; domain knowledge enriched framework for restricted domain question answering system; a novel approach to link semantic gap between images and tags via probabilistic ranking; web based security with LOPass user authentication protocol in mobile application; double ended speech enabled system in Indian travel & tourism industry; particle swarm optimization based parameter optimization technique in medical information hiding; application of image processing for a bubble column reactor; a modern avatar of Julius Ceasar and Vigenere cipher; parallel image segmentation using multi-threading and k-means algorithm; an ICMP based secondary cache approach for the detection and prevention of ARP poisoning; and digital image watermarking using fractional Fourier transform via image compression.
ISBN: 9780769530734 (print)
The cellular neural/nonlinear network (CNN) is a powerful tool for image and video signal processing and for robotic and biological vision. In this paper, the Selected Objects Extraction (SOE) CNN is generalized to the Directional Extraction (DE) CNN, which enhances the capabilities of CNNs and improves their efficiency. Based on an analytical approach, a theorem for designing robust templates for DE CNNs is established; it provides parameter inequalities that determine the parameter intervals for implementing the corresponding functions. Several examples illustrate the effectiveness of the theorem for directionally extracting selected objects in binary images.
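For orientation, the templates mentioned in this abstract are usually understood in terms of the standard Chua-Yang cell dynamics, written below. The abstract itself does not state the model, so this form and the symbols (feedback template $a_{kl}$, control template $b_{kl}$, threshold $z$, neighbourhood $N_r(i,j)$) are assumptions based on the conventional CNN formulation, not on the paper.

```latex
\begin{align}
  \dot{x}_{ij}(t) &= -x_{ij}(t)
      + \sum_{(k,l) \in N_r(i,j)} a_{kl}\, y_{kl}(t)
      + \sum_{(k,l) \in N_r(i,j)} b_{kl}\, u_{kl}
      + z, \\
  y_{ij}(t) &= \tfrac{1}{2}\bigl( \lvert x_{ij}(t) + 1 \rvert - \lvert x_{ij}(t) - 1 \rvert \bigr).
\end{align}
```

Here $x_{ij}$ is the state of cell $(i,j)$, $u_{ij}$ its input pixel, and $y_{ij}$ its saturated output. Under this reading, a robust template design theorem gives inequalities on $a_{kl}$, $b_{kl}$, and $z$ that guarantee the desired output for every admissible input pattern.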
ISBN: 9781479983391 (print)
Learning the non-linear image upscaling process has previously been treated as a simple regression problem, where various models have been used to describe the correlations between high-resolution (HR) and low-resolution (LR) images/patches. In this paper, we present a multitask learning framework based on a deep neural network for image super-resolution, in which we jointly consider the image super-resolution process and the image degeneration process. By sharing parameters between these two highly related tasks, the proposed framework can effectively improve the resulting neural-network-based mapping model between HR and LR image patches. Experimental results demonstrate clear visual improvement and high computational efficiency, especially at large magnification factors.
The use of a multi-beam antenna in a passive millimeter-wave imaging system reduces the imaging time, but this system structure leads to striping in the acquired image. By analyzing the characteristics of passive mil...