ISBN: (print) 9781424427604
The proceedings contain 19 papers. The topics discussed include: adaptive lambda-enhancement: type I versus type II fuzzy implementation; an automated GA-based fuzzy image enhancement method; an efficient architecture for hardware implementations of image processing algorithms; inter-modality registration of NMRi and histological section images using neural networks regression in Gabor feature space; 2D ultrasound image segmentation using graph cuts and local image features; contextual classification of high-resolution satellite images; a modified fuzzy C-means algorithm with adaptive spatial information for color image segmentation; 3-D reconstruction and automatic fusion of edge maps from different modalities of an object; hybridization of particle swarm optimization with the K-means algorithm for image classification; and metric planar rectification from perspective view via circles.
ISBN: (print) 1424407079
The proceedings contain 76 papers. The topics discussed include: a new vehicle detection approach in traffic jam conditions; non-pixel robot stereo; a novel method to recognize complex dynamic gesture by combining HMM and FNN models; boundary refined texture segmentation based on K-views and datagram methods; single-row superposition-type spherical compound-like eye for pan-tilt motion recovery; bare bones strategy for human detection and tracking; Daubechies complex wavelet transform based moving object tracking; identification of dynamic nonlinear systems using computational intelligence techniques; a new invariant descriptor for shape representation and recognition; a wavelet-fuzzy logic based system to detect and identify electric disturbs; evolution strategies based particle filters for fault detection; and a multi-window stereo vision algorithm with improved performance at object borders.
ISBN: (print) 9798331508418
The proceedings contain 8 papers. The topics discussed include: multi-view autoencoders for fake news detection; identifying school shooter threats through online texts; detecting cyberbullying in Thai memes: a multimodal approach using deep learning; optimizing Chinese-to-English translation using large language models; RIAND: robustness-improved and accelerated neural-deduplication; conceptual in-context learning and chain of concepts: solving complex conceptual problems using large language models; analyzing the cognitive impact of trauma from a metaphorical perspective: a case study on the attempted assassination of Donald Trump; and SUPERB-EP: evaluating encoder pooling techniques in self-supervised learning models for speech classification.
ISBN: (print) 9798331519742
The proceedings contain 12 papers. The topics discussed include: reconstructing weighted social networks after a node deleted with substitute node selection; ConText Mining: complementing topic models with few-shot in-context learning to generate interpretable topics; assessing personalized AI mentoring with large language models in the computing field; filtering hallucinations and omissions in large language models through a cognitive architecture; logical reasoning with LLMs via few-shot prompting and fine-tuning: a case study on turtle soup puzzles; synthetic feature augmentation improves generalization performance of language models; advancing natural language to SQL: a comparative study of open source LLMs on benchmark datasets; and To NER or not to NER? a case study of low-resource deontic modalities in EU legislation.
With the rapid development of artificial intelligence, particularly the rise of deep learning, Explainable Artificial Intelligence has become increasingly important. Among its key techniques, counterfactual explanation plays a crucial role in understanding the decision-making mechanisms of opaque models. However, the high dimensionality and complex feature patterns of image data make generating counterfactuals for images challenging. Prior work has proposed various algorithms based on different assumptions, many of which rely on the existence of appropriate generative models. Some of these assumptions, particularly the assumption that a suitable generative model exists, may be overly stringent. To address this issue, this letter introduces a novel assumption-free image counterfactual generation algorithm, DFO-S, based on Score Matching and gradient-free optimization techniques. The proposed method achieves high-quality counterfactual generation without relying on generative models. Through extensive empirical analysis, we demonstrate the significant superiority of our method in terms of performance.
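As a rough illustration of the gradient-free ingredient only, the sketch below runs a compass (pattern) search, a classic derivative-free method, against a toy two-feature classifier. Everything in it is an assumption made for illustration: the linear `score` function, the hinge-plus-L1 objective, the margin, penalty, and step sizes. It is not DFO-S, and it omits the score-matching component entirely.

```python
def score(x):
    # Hypothetical black-box model: a positive score means class 1.
    return 2.0 * x[0] - x[1] - 0.5


def classify(x):
    return 1 if score(x) > 0 else 0


def make_objective(x_orig, margin=0.1, penalty=10.0):
    # The hinge term pushes the score above `margin` (flipping the class);
    # the L1 term keeps the counterfactual close to the original input.
    def objective(x):
        hinge = max(0.0, margin - score(x))
        distance = sum(abs(a - b) for a, b in zip(x, x_orig))
        return penalty * hinge + distance
    return objective


def compass_search(objective, x0, step=0.25, min_step=1e-3, max_iter=1000):
    # Derivative-free search: probe +/- step along each coordinate,
    # keep any improving move, and halve the step when nothing improves.
    x = list(x0)
    best = objective(x)
    for _ in range(max_iter):
        if step < min_step:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                candidate = list(x)
                candidate[i] += delta
                value = objective(candidate)
                if value < best:
                    x, best, improved = candidate, value, True
        if not improved:
            step /= 2
    return x, best


x_orig = [0.0, 0.5]  # toy input, classified as class 0
x_cf, value = compass_search(make_objective(x_orig), x_orig)
print(classify(x_orig), classify(x_cf), x_cf)
```

The search only ever calls `objective`, never a gradient, which is the property that matters when the model is opaque. For real images the search space is far larger, which is presumably why the letter pairs gradient-free optimization with Score Matching.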
Aerial image classification is crucial across multiple sectors, including environmental monitoring, agriculture, and urban planning. However, processing large-scale aerial imagery efficiently poses challenges in model performance, computational efficiency, and scalability. This research introduces a novel convolutional neural network (CNN) architecture tailored for cactus identification from aerial photographs. The proposed cloud-based pipeline enhances training efficiency through scalable data storage, preprocessing, and distributed training across platforms such as Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. A comparative analysis across these platforms demonstrates the model's computational efficiency, shorter training durations, and cost-effectiveness. The model integrates residual connections and depthwise separable convolutions, achieving 96.7% accuracy on the aerial cactus identification dataset. The results highlight the model's high performance, cost-efficiency, and scalability, making it suitable for real-world aerial image classification tasks. Future work aims to further optimize the model using advanced techniques and extend its application to multi-class classification challenges.
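The efficiency argument for depthwise separable convolutions can be checked with simple arithmetic. The sketch below compares weight counts for a standard convolution and its depthwise separable counterpart. The layer size used (3x3 kernel, 64 input channels, 128 output channels) is an assumed example, not a figure from the paper, and bias terms are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    # Every output channel has its own kernel x kernel x c_in filter.
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    # Depthwise step: one kernel x kernel filter per input channel.
    depthwise = kernel * kernel * c_in
    # Pointwise step: a 1x1 convolution that mixes channels.
    pointwise = c_in * c_out
    return depthwise + pointwise


# Assumed example layer: 3x3 kernel, 64 -> 128 channels.
standard = standard_conv_params(3, 64, 128)         # 73728 weights
separable = depthwise_separable_params(3, 64, 128)  # 8768 weights
print(standard, separable, round(separable / standard, 3))
```

For this assumed layer the separable version needs roughly 12% of the weights, which is the kind of saving that shortens training and lowers cloud compute cost.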
The advancement of deep generation technology has significantly accelerated the growth of artificial intelligence-generated content (AIGC). Within AIGC, AI-generated omnidirectional images (AGOIs) hold considerable promise for applications in virtual reality (VR). However, the quality of AGOIs varies widely, and there has been limited research focused on their quality assessment. In this letter, inspired by the characteristics of the human visual system, we propose a novel viewport-independent blind quality assessment method for AGOIs, termed VI-AGOIQA, which leverages vision-language correspondence. Specifically, to minimize the computational burden associated with viewport-based prediction methods for omnidirectional image quality assessment, a set of image patches is first extracted from AGOIs in Equirectangular Projection (ERP) format. Then, the correspondence between visual and textual inputs is learned using the pre-trained image and text encoders of the Contrastive Language-Image Pre-training (CLIP) model. Finally, a multimodal feature fusion module is applied to predict human visual preferences based on the learned knowledge of visual-language consistency. Extensive experiments conducted on a publicly available database demonstrate the promising performance of the proposed method.
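The patch-extraction step can be pictured with a small sketch. The abstract does not say how patches are sampled, so the non-overlapping grid, the patch size, and the function name `extract_patches` below are all assumptions for illustration; the toy "image" is a 4x8 grid of integers that mimics the 2:1 aspect ratio of an ERP image.

```python
def extract_patches(image, patch_h, patch_w):
    """Split a 2D image (list of rows) into non-overlapping patches.

    Patches are returned in row-major order; any leftover border that
    does not fill a whole patch is dropped (an assumed policy).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for top in range(0, height - patch_h + 1, patch_h):
        for left in range(0, width - patch_w + 1, patch_w):
            patch = [row[left:left + patch_w]
                     for row in image[top:top + patch_h]]
            patches.append(patch)
    return patches


# Toy 4x8 "ERP image" whose pixel value encodes its position.
erp = [[r * 8 + c for c in range(8)] for r in range(4)]
patches = extract_patches(erp, 2, 2)
print(len(patches), patches[0], patches[-1])
```

Working on patches taken directly from the ERP image avoids rendering individual viewports, which is the computational saving the abstract points to.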
Low-light image enhancement is an important task in computer vision, often made challenging by the limitations of image sensors, such as noise, low contrast, and color distortion. These challenges are further exacerbated by the computational demands of processing spatial dependencies under such conditions. We present a novel transformer-based framework that enhances efficiency by utilizing depthwise separable convolutions instead of conventional approaches. Additionally, an original feed-forward network design reduces the computational overhead while maintaining high performance. Experiments show that the method achieves competitive results, providing a practical and effective solution for enhancing images captured in low-light environments.
Deep hashing is one of the most important methods for generating compact feature representations in content-based image retrieval. However, different application scenarios require training different models to fit varying memory and computational budgets. To address this problem, we propose a new scalable deep hashing framework, which generates binary codes of different lengths by adaptive bit selection. Specifically, the proposed framework consists of two alternating steps: bit pool generation and adaptive bit selection. In the first step, a deep feature extraction model is trained to output binary codes by optimizing retrieval performance and bit properties. In the second step, we select informative bits from the generated bit pool with a reinforcement learning algorithm, in which the same retrieval performance and bit properties are used directly to compute the reward. The bit pool can be further updated by fine-tuning the deep feature extraction model with more attention on the selected bits. These two steps are iterated alternately until convergence. Notably, most existing binary hashing methods can be readily integrated into our framework to generate scalable binary codes. Experiments on four public image datasets demonstrate the effectiveness of the proposed framework for image retrieval tasks.
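To make the idea of a bit pool concrete, the sketch below derives shorter codes by keeping a subset of bit positions from longer codes and then ranks database items by Hamming distance. The 8-bit codes and the hand-picked index list are hypothetical; in the paper the selection is learned with reinforcement learning, which this sketch does not attempt to reproduce.

```python
def select_bits(code, indices):
    # Build a shorter code by keeping only the chosen bit positions.
    return [code[i] for i in indices]


def hamming(a, b):
    # Number of positions at which two equal-length codes differ.
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query, database):
    # Database indices sorted from nearest to farthest (ties keep order).
    return sorted(range(len(database)),
                  key=lambda i: hamming(query, database[i]))


# Hypothetical 8-bit pool codes for three database images and one query.
database = [
    [0, 1, 0, 1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 0, 1, 1],
]
query = [0, 1, 0, 1, 1, 0, 1, 1]

selected = [0, 2, 5, 7]  # assumed selection, chosen by hand here
short_db = [select_bits(code, selected) for code in database]
short_query = select_bits(query, selected)

print(rank_by_hamming(query, database))
print(rank_by_hamming(short_query, short_db))
```

In this toy case the 4-bit codes give the same ranking as the full 8-bit codes, which is the behavior a good selection policy aims for: shorter codes with little loss in retrieval quality.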
Computer-aided medical image segmentation assists physicians in locating lesion areas for subsequent diagnosis and treatment. Because targets have irregular shapes and the target and background areas are heavily imbalanced in size, automatic segmentation of medical images is a challenging task. Many CNN-based and Transformer-based models increase the number of network layers or introduce complex modules to improve segmentation accuracy. Given limited computational resources, such large models are not suitable for actual clinical environments. Inspired by the speed, accuracy, and low energy consumption of bio-visual processing, this paper constructs the Ultra-Lightweight Network Inspired by Bio-Visual Interaction (BVI-Net). The Global Pathway simulates the dorsal stream to extract global features rapidly, and the Local Pathway simulates the ventral stream to process local features finely. In addition, a skip connection module integrating a Graph Convolutional Network (GCN) attention mechanism simulates the visual pathway's ability to integrate multi-level features synchronously. Experiments use the International Skin Imaging Collaboration (ISIC) dataset, the Liver Tumor Segmentation (LiTS) dataset, and the Brain Tumor Segmentation Challenge (BraTS) dataset. The proposed BVI-Net requires only 0.026M parameters to achieve excellent performance on these three representative medical image segmentation datasets, with certain advantages over state-of-the-art (SOTA) methods. By integrating biological vision mechanisms with artificial intelligence algorithms, this paper provides new ideas for the construction of biological vision-guided deep learning models and promotes the development of biomimetic computational vision.