ISBN: 9781538652183 (print)
Automatically describing the content of an image is a challenging task in computer vision that connects machine learning and natural language processing. In this paper, we present a framework, based on modeling image context, to generate natural sentences describing an image. It consists of two parts: relation modeling and description generation. By modeling the mapping from image spatial context to the logical relationship between objects, the former is trained to maximize the likelihood of the target linguistic phrase describing the relationship between objects given the training image. By taking advantage of the syntactic-tree-based method, the latter uses the predicted relationships as key ingredients to facilitate image description generation within a tree-growth process. We conduct extensive experimental evaluations on the MS COCO dataset. Our framework outperforms the state-of-the-art methods. The results demonstrate that our framework provides robust and significant improvements for relationship prediction between objects and for image description generation.
ISBN: 9789811021046; 9789811021039 (print)
In an absolute characterization trial, there are no substitutes for the final subtyping of coal quality independent of chemical analysis. Petrology is a specialty that deals with understanding the essential characteristics of coal through appropriate chemical, morphological, or porosity analysis. Conventional analysis of coal by petrologists is subject to various shortcomings, such as inter-observer variation during screen analysis, variation due to the use of different machines, long analysis times, dependence on highly skilled operators, and operator fatigue. In chemical analysis, the use of conventional analyzers makes the characterization process expensive. Thus, image analysis serves as an impressive automated characterization procedure for subtyping coal according to its textural, morphological, color, and other features. Coal characterization is necessary for the proper utilization of coal in power generation, steel, and several manufacturing industries. In this paper, we therefore attempt to devise a methodology for automated characterization and sub-classification of different grades of coal samples using image processing and computational intelligence techniques.
Microstructural parameters are important for analyzing the chemistry and performance of solid oxide fuel cells (SOFCs). In this paper, a particle swarm optimization algorithm is used to improve the fuzzy C-means clustering algorithm for segmenting optical microscopy (OM) images of the YSZ/Ni anode of an SOFC. Particle swarm optimization adaptively searches for the initial cluster centers, which helps to avoid local optima and preserve more image detail. The experimental results show that the proposed method improves the segmentation accuracy of the images. It also accurately segments the three phases of the SOFC and provides effective segmentation results for computing the microstructural parameters.
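The clustering step can be illustrated with a minimal sketch of standard fuzzy C-means on one-dimensional gray levels. It is not the authors' implementation: the particle swarm search is not reproduced, and the hypothetical `init_centers` argument simply stands in for whatever initial centers that search would return. The function names, the fuzzifier `m = 2`, and the fixed iteration count are all assumptions made for illustration.

```python
def fcm_memberships(values, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update: u[i][j] is the degree to which
    sample i belongs to cluster j. Each row sums to 1."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for x in values:
        # eps avoids division by zero when a sample sits on a center
        dists = [abs(x - c) + eps for c in centers]
        row = [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]
        memberships.append(row)
    return memberships


def fcm_update_centers(values, memberships, m=2.0):
    """Standard FCM center update: weighted mean with weights u**m."""
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        centers.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
    return centers


def fcm(values, init_centers, m=2.0, iters=50):
    """Alternate the two updates for a fixed number of iterations.
    init_centers is a placeholder for centers found by another search
    (for example, particle swarm optimization)."""
    centers = list(init_centers)
    for _ in range(iters):
        u = fcm_memberships(values, centers, m)
        centers = fcm_update_centers(values, u, m)
    return centers, fcm_memberships(values, centers, m)


# Toy gray levels forming three well-separated groups
pixels = [10, 12, 11, 120, 125, 118, 240, 250, 245]
centers, u = fcm(pixels, init_centers=[0.0, 128.0, 255.0])
labels = [row.index(max(row)) for row in u]
```

Because plain fuzzy C-means only refines the centers it is given, a poor starting point can leave it in a local optimum, which is the motivation the abstract gives for searching the initial centers first.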
ISBN: 9789811021046; 9789811021039 (print)
Visual bag-of-words models have been applied in the recent past for content-based image retrieval. In this paper, we propose a novel assignment model of visual words for representing an image patch. In particular, a vector is used to represent an image patch, with its elements denoting the affinities of the patch to a set of closest/most influential visual words. We also introduce a dissimilarity measure, consisting of two terms, for comparing a pair of image patches. The first term captures the difference in the affinities of the two patches to their common set of influential visual words. The second term counts the visual words that influence only one of the two patches and penalizes the measure accordingly. Experimental results on the publicly available COIL-100 image database clearly demonstrate the superior performance of the proposed content-based image retrieval (CBIR) method over some similar existing approaches.
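The two-term structure of the dissimilarity measure can be sketched as follows. The abstract does not give the actual formulas, so everything concrete here is an assumption: the Gaussian affinity with width `sigma`, the choice of the `k` nearest words as the "influential" set, the absolute-difference first term, and the constant `penalty` per unshared word are hypothetical stand-ins.

```python
import math


def affinity_vector(patch, codebook, k=2, sigma=1.0):
    """Map word index -> affinity for the k closest visual words.
    Affinities use a Gaussian kernel (an assumption) and sum to 1."""
    nearest = sorted((math.dist(patch, w), idx) for idx, w in enumerate(codebook))[:k]
    d_min = nearest[0][0]
    # Subtracting d_min**2 keeps the largest weight at 1 and avoids underflow
    weights = {
        idx: math.exp(-(d * d - d_min * d_min) / (2.0 * sigma * sigma))
        for d, idx in nearest
    }
    total = sum(weights.values())
    return {idx: w / total for idx, w in weights.items()}


def dissimilarity(a, b, penalty=0.5):
    """Two-term measure between two affinity vectors.
    Term 1: affinity differences over the words both patches share.
    Term 2: fixed penalty for every word that influences only one patch."""
    common = set(a) & set(b)
    only_one = set(a) ^ set(b)
    term1 = sum(abs(a[i] - b[i]) for i in common)
    term2 = penalty * len(only_one)
    return term1 + term2


# Toy 2-D "descriptors" and a four-word codebook
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
p = affinity_vector((0.1, 0.0), codebook)
q = affinity_vector((5.0, 5.1), codebook)
score = dissimilarity(p, q)
```

In this toy case the two patches share no influential words, so the score comes entirely from the second term.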
ISBN: 9783319745213 (digital)
ISBN: 9783319745213; 9783319745206 (print)
Representing local image patches is a key step in many applications of computer vision, and real-time image processing always requires fast and effective description methods. Motivated by the fact that quantization compresses information while preserving primary structures, in this paper we propose to use vector quantization (VQ) to build local patch descriptors. In contrast to conventional approaches that compress floating-point features with VQ, we produce local integer descriptors very quickly, based directly on simple quantization methods. Experimental results on a publicly available dataset show that the proposed method is efficient both to build and to match. It achieves performance comparable to some typical floating-point and binary descriptors such as SIFT and BRIEF, offering a novel approach to fast local image representation other than the bit tests introduced in BRIEF.
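As a rough illustration of the idea of integer descriptors, the sketch below applies a uniform scalar quantizer to each feature value and compares descriptors with an L1 distance. This is a simplified stand-in, not the quantization scheme of the paper, which the abstract does not specify; the function names, the number of `levels`, and the value range `[lo, hi]` are assumptions.

```python
def quantize(features, levels=4, lo=0.0, hi=1.0):
    """Uniformly quantize each float in [lo, hi] to an integer in
    0 .. levels-1. Values outside the range are clamped."""
    step = (hi - lo) / levels
    out = []
    for f in features:
        q = int((f - lo) / step)
        out.append(min(max(q, 0), levels - 1))
    return out


def l1_distance(a, b):
    """Integer-only matching cost between two quantized descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Two toy float descriptors that differ slightly in one element
d1 = quantize([0.0, 0.3, 0.55, 1.0])
d2 = quantize([0.0, 0.3, 0.80, 1.0])
cost = l1_distance(d1, d2)
```

Matching then involves only small-integer arithmetic, which is one plausible reason such descriptors are cheap both to build and to compare.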
ISBN: 9789811021046; 9789811021039 (print)
The problem of searching for a digital image in a very large database is called content-based image retrieval (CBIR). Texture represents spatial or statistical repetition in pixel intensity and orientation. The formation of abnormal cells within the brain is called a brain tumor. In this paper, we develop a texture feature extraction approach for MRI brain tumor image retrieval. It has two parts, namely feature extraction and classification. First, texture features are extracted using techniques such as the curvelet transform, the contourlet transform, and the Local Ternary Pattern (LTP). Second, supervised learning algorithms such as the Deep Neural Network (DNN) and the Extreme Learning Machine (ELM) are used to classify the brain tumor images. The experiment is performed on a collection of 1000 brain tumor images with different modalities and orientations. Experimental results reveal that the contourlet transform performs better than the curvelet transform and the local ternary pattern.
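Of the three texture features named above, the Local Ternary Pattern is the simplest to show in a few lines. The sketch below computes the usual upper and lower binary codes for a single 3x3 neighborhood; the threshold `t`, the clockwise bit ordering starting at the top-left neighbor, and the function name are assumptions for illustration, and the abstract does not say which LTP variant or parameters were used.

```python
def ltp_codes(patch, t=5):
    """Local Ternary Pattern for one 3x3 patch (list of 3 rows).
    Each neighbor is coded +1 if >= center + t, -1 if <= center - t,
    and 0 otherwise. The ternary code is split into two 8-bit codes:
    'upper' marks the +1 positions and 'lower' marks the -1 positions."""
    center = patch[1][1]
    # Neighbors visited clockwise, starting at the top-left corner
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    upper = 0
    lower = 0
    for bit, (r, c) in enumerate(offsets):
        value = patch[r][c]
        if value >= center + t:
            upper |= 1 << bit
        elif value <= center - t:
            lower |= 1 << bit
    return upper, lower


patch = [
    [78, 99, 50],
    [54, 54, 49],
    [57, 12, 13],
]
upper, lower = ltp_codes(patch, t=5)
```

A full texture descriptor would typically compute these codes at every pixel and build histograms of the upper and lower codes, which then feed the classifier.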
ISBN: 9789811021046; 9789811021039 (print)
Image databases are getting larger and more diverse with the arrival of new imaging devices and advances in technology. Content-based image classification (CBIC) is a method for classifying images from large databases into different categories on the basis of image content. An efficient image representation is an important component of a CBIC system. In this paper, we demonstrate that Self-Organizing Map (SOM)-based clustering can be used to form an efficient representation of an image for a CBIC system. The proposed method first extracts Scale-Invariant Feature Transform (SIFT) features from images. It then uses an SOM to cluster the descriptors and form a Bag of Features (BOF) or Vector of Locally Aggregated Descriptors (VLAD) representation of the image. The performance of the proposed method has been compared with that of systems using k-means clustering to form the VLAD or BOF representations of an image. The classification performance of the proposed method is found to be better in terms of F-measure (FM) value and execution time.
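Once a codebook exists, the BOF and VLAD encodings follow the same pattern regardless of whether the codewords came from an SOM or from k-means. The sketch below assumes the codebook is already given and uses toy 2-D descriptors in place of 128-D SIFT vectors; the function names and the L2 normalization of the VLAD vector are illustrative assumptions, not details taken from the paper.

```python
import math


def nearest(desc, codebook):
    """Index of the codeword closest to the descriptor."""
    return min(range(len(codebook)), key=lambda j: math.dist(desc, codebook[j]))


def bof_histogram(descriptors, codebook):
    """Bag of Features: normalized count of descriptors per codeword."""
    hist = [0] * len(codebook)
    for d in descriptors:
        hist[nearest(d, codebook)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def vlad(descriptors, codebook):
    """VLAD: per-codeword sum of residuals (descriptor - codeword),
    concatenated and L2-normalized."""
    dim = len(codebook[0])
    acc = [[0.0] * dim for _ in codebook]
    for d in descriptors:
        j = nearest(d, codebook)
        for i in range(dim):
            acc[j][i] += d[i] - codebook[j][i]
    flat = [x for row in acc for x in row]
    norm = math.sqrt(sum(x * x for x in flat))
    return [x / norm for x in flat] if norm else flat


# Toy codebook with two codewords (these would be SOM nodes or k-means centers)
codebook = [(0.0, 0.0), (10.0, 10.0)]
descriptors = [(1.0, 0.0), (0.0, 1.0), (9.0, 10.0), (1.0, 1.0)]
hist = bof_histogram(descriptors, codebook)
v = vlad(descriptors, codebook)
```

BOF keeps only how many descriptors fall on each codeword, while VLAD also keeps where they fall relative to it, so its length is the codebook size times the descriptor dimension.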
ISBN: 9789811021046; 9789811021039 (print)
The main objective of this work is to identify persons and classify their gender from their walking styles, using gait sequences with arbitrary walking directions. The human silhouettes are extracted from the given gait sequences using a background subtraction technique. The median value approach is used for the background subtraction. After the silhouettes are extracted, affinity propagation clustering groups the silhouettes with similar views and poses into one cluster. The cluster-based averaged gait image is taken as the feature for each cluster. Sparse reconstruction-based metric learning is used to learn the distance metric. It minimizes the intraclass sparse reconstruction errors and maximizes the interclass reconstruction errors simultaneously. The steps above make up the training phase. Using the metric learned in training and the features extracted from the testing video sequence, sparse reconstruction-based classification is performed to identify the person and classify that person's gender. The accuracy achieved for human identification and gender classification is promising.
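The silhouette extraction step can be sketched with a per-pixel temporal median, which is one common reading of a "median value approach" to background subtraction. Frames are represented as nested lists of gray levels, and the `threshold` value and function names are assumptions; the later clustering and metric learning stages are not covered.

```python
from statistics import median


def median_background(frames):
    """Estimate the static background as the per-pixel median over time."""
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[r][c] for frame in frames) for c in range(cols)]
        for r in range(rows)
    ]


def silhouette(frame, background, threshold=30):
    """Binary mask: 1 where the frame differs from the background by more
    than the threshold, 0 elsewhere."""
    return [
        [1 if abs(p - b) > threshold else 0 for p, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Three toy 2x3 frames: a bright "walker" pixel moves left to right
frames = [
    [[200, 10, 10], [10, 10, 10]],
    [[10, 200, 10], [10, 10, 10]],
    [[10, 10, 200], [10, 10, 10]],
]
background = median_background(frames)
mask = silhouette(frames[1], background)
```

The median works here because a moving subject covers any given pixel in only a minority of frames, so the subject's values are outvoted by the background values.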
ISBN: 9789811021046; 9789811021039 (print)
This paper presents an image hashing technique for content verification using Discrete Wavelet Transform (DWT) approximation features. The proposed technique converts resized RGB color images to L*a*b* color images. The images are then regularized using a Gaussian low-pass filter. A level-2 2D DWT is applied to the L* component of the L*a*b* color image, and the LL2 approximation sub-band image is chosen for feature extraction. The features are extracted by using a sequence of circles on the approximation sub-band image. Finally, a robust binary hash is generated from the extracted features. The experimental results indicate that the hash of the presented technique is invariant to standard content-preserving manipulations and malicious content-altering operations. The Receiver Operating Characteristic (ROC) plots indicate that the presented technique shows strong discriminative and robustness capabilities. In addition, the hash of the proposed technique is shorter in length and key-dependent.
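The role of the LL2 sub-band can be shown with a heavily simplified sketch. It assumes a Haar wavelet (the abstract does not name the wavelet), takes an already prepared luminance channel as input, and replaces the circle-based features and the secret key with a plain mean threshold over the LL2 coefficients. The function names are hypothetical, and the color conversion and Gaussian filtering steps are omitted.

```python
def haar_ll(img):
    """One level of the 2D Haar approximation (LL) sub-band.
    Each output value is the sum of a 2x2 block divided by 2, which is
    the orthonormal Haar low-pass filter applied along rows and columns.
    Assumes even image dimensions."""
    rows, cols = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * r][2 * c] + img[2 * r][2 * c + 1]
             + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 2.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def ll2_hash(luma):
    """Binary hash from the level-2 approximation sub-band: one bit per
    LL2 coefficient, set when the coefficient exceeds the LL2 mean.
    (A stand-in for the circle-based features described above.)"""
    ll2 = haar_ll(haar_ll(luma))
    flat = [v for row in ll2 for v in row]
    mean = sum(flat) / len(flat)
    return [1 if v > mean else 0 for v in flat]


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return sum(x != y for x, y in zip(a, b))


# Toy 8x8 luminance image: dark left half, bright right half
image = [[10] * 4 + [200] * 4 for _ in range(8)]
brighter = [[p + 5 for p in row] for row in image]   # content preserved
flipped = [row[::-1] for row in image]               # content changed

h = ll2_hash(image)
same = hamming(h, ll2_hash(brighter))
different = hamming(h, ll2_hash(flipped))
```

In this toy case a uniform brightness shift leaves the hash unchanged, while swapping the halves flips every bit, which is the kind of behavior a Hamming-distance threshold would then separate.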
Pedestrian detection has been a hot issue in the field of computer vision and image processing in recent years. It has important application value in domains such as unmanned cars and driver assistance systems, but ...