Search Results - Inner Mongolia University Library

CLIP-Flow: Decoding images encoded in CLIP space

Computational Visual Media, 2024, Vol. 10, No. 6, pp. 1157-1168

Authors: Hao Ma, Ming Li, Jingyuan Yang, Or Patashnik, Dani Lischinski, Daniel Cohen-Or, Hui Huang. Affiliations: Visual Computing Research Center, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China; Department of Computer Science, Tel Aviv University, Tel Aviv 6997801, Israel; School of Computer Science and Engineering, the Hebrew University of Jerusalem, Jerusalem 91904, Israel

This study introduces CLIP-Flow, a novel network for generating images from a given image or *** effectively utilize the rich semantics contained in both modalities, we designed a semantics-guided methodology for image- and text-to-image *** particular, we adopted Contrastive Language-Image Pretraining (CLIP) as an encoder to extract semantics and StyleGAN as a decoder to generate images from such ***, to bridge the embedding space of CLIP and latent space of StyleGAN, real NVP is employed and modified with activation normalization and invertible *** the images and text in CLIP share the same representation space, text prompts can be fed directly into CLIP-Flow to achieve text-to-image *** conducted extensive experiments on several datasets to validate the effectiveness of the proposed image-to-image synthesis *** addition, we tested on the public dataset Multi-Modal CelebA-HQ, for text-to-image *** validated that our approach can generate high-quality text-matching images, and is comparable with state-of-the-art methods, both qualitatively and quantitatively.

Keywords: image-to-image text-to-image contrastive language-image pretraining (CLIP) flow StyleGAN
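The abstract above names real NVP as the bridge between the CLIP embedding space and the StyleGAN latent space. As a rough illustration only, the following sketch shows a single plain affine coupling layer of the kind real NVP is built from, and checks that it can be inverted exactly. The `scale` and `shift` functions are hypothetical stand-ins for learned networks, and the sketch does not include the activation normalization or other modifications the abstract mentions.

```python
import math

def scale(x1):
    # Hypothetical stand-in for the learned scale network s(x1).
    return [math.tanh(0.5 * v) for v in x1]

def shift(x1):
    # Hypothetical stand-in for the learned translation network t(x1).
    return [0.1 * v + 1.0 for v in x1]

def coupling_forward(x):
    """Affine coupling: x1 passes through unchanged, x2 is scaled and shifted."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = scale(x1), shift(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    log_det = sum(s)  # log|det J| is the sum of the scale outputs
    return x1 + y2, log_det

def coupling_inverse(y):
    """Invert the coupling: recompute s and t from the untouched half."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = scale(y1), shift(y1)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2

x = [0.3, -1.2, 0.8, 2.0]  # even-length toy vector
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)
```

Because the first half of the vector is never modified, the inverse can recompute the same scale and shift values, which is what makes the mapping invertible without inverting the networks themselves.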

Taming diffusion model for exemplar-based image translation

Computational Visual Media, 2024, Vol. 10, No. 6, pp. 1031-1043

Authors: Ma, Hao; Yang, Jingyuan; Huang, Hui. Affiliation: Shenzhen University, Visual Computing Research Center, College of Computer Science and Software Engineering, Shenzhen, China (GRID:grid.263488.3) (ISNI:0000 0001 0472 9649)

Exemplar-based image translation involves converting semantic masks into photorealistic images that adopt the style of a given ***, most existing GAN-based translation methods fail to produce photorealistic *** this study, we propose a new diffusion model-based approach for generating high-quality images that are semantically aligned with the input mask and resemble an exemplar in *** proposed method trains a conditional denoising diffusion probabilistic model (DDPM) with a SPADE module to integrate the semantic *** then used a novel contextual loss and auxiliary color loss to guide the optimization process, resulting in images that were visually pleasing and semantically *** demonstrate that our method outperforms state-of-the-art approaches in terms of both visual quality and quantitative metrics.

Keywords: exemplar image translation denoising diffusion probabilistic model (DDPM)

Few-Shot Classification Based on Feature Enhancement Network

2024 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2024

Authors: Hu, Shasha; Su, Han; Gao, Ruixuan. Affiliations: College of Computer Science, Sichuan Normal University, Chengdu, China; Visual Computing and Visual Reality Key Laboratory of Sichuan Province, Chengdu, China

ISBN: (print) 9781665410205

Few-shot image classification stands as a pivotal task within the realm of computer vision. However, obtaining accurate class prototypes from limited annotated samples is a challenging problem. In recent years, many methods based on prototype networks have shown excellent performance. Nevertheless, existing methods overlook the discriminative semantic information lost due to sample scarcity and the hidden category information in the query set, failing to address the issue of unreliable prototypes generated from limited annotated samples. In this paper, we propose a feature enhancement network for few-shot classification. To improve the accuracy and robustness of few-shot classification models, we first enhance the support set through learning a weight matrix and then align the enhanced support set prototypes with textual semantics. To avoid being influenced by introduced prior noise, we fuse between semantically aligned prototypes and mean prototypes and ultimately utilize query prototypes for dynamic updating to obtain more accurate class prototypes. Extensive experiments demonstrate that our method achieves competitive performance on miniImageNet and tieredImageNet datasets. Furthermore, it exhibits excellent results in cross-domain few-shot classification. © 2024 IEEE.

Keywords: Zero-shot learning
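The abstract above builds on prototype networks and on fusing two kinds of prototypes. The following is a minimal sketch of the generic idea only: mean prototypes from a support set, nearest-prototype classification, and a fusion step. The convex-combination rule in `fuse`, its `alpha` weight, and the toy two-dimensional embeddings are all assumptions made for illustration; the record does not say how the paper performs fusion.

```python
import math

def mean_prototype(embeddings):
    """Class prototype as the element-wise mean of the support embeddings."""
    n = len(embeddings)
    return [sum(col) / n for col in zip(*embeddings)]

def fuse(proto_a, proto_b, alpha=0.5):
    # Hypothetical fusion rule: convex combination weighted by alpha.
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(proto_a, proto_b)]

def classify(query, prototypes):
    """Return the label of the nearest prototype in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Toy support set: two classes, two embeddings each (made-up values).
support = {
    "cat": [[1.0, 0.0], [0.8, 0.2]],
    "dog": [[0.0, 1.0], [0.2, 0.8]],
}
prototypes = {label: mean_prototype(vecs) for label, vecs in support.items()}
label = classify([0.7, 0.1], prototypes)
```

With only a few support samples per class, the mean can be a noisy estimate of the class center, which is the unreliability the abstract sets out to reduce.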

Deep Learning-Based ECG Classification for Arterial Fibrillation Detection

Computers, Materials & Continua, 2024, Vol. 79, No. 6, pp. 4805-4824

Authors: Muhammad Sohail Irshad, Tehreem Masood, Arfan Jaffar, Muhammad Rashid, Sheeraz Akram, Abeer Aljohani. Affiliations: Faculty of Computer Science & Information Technology, The Superior University, Lahore 54000, Pakistan; Intelligent Data Visual Computing Research (IDVCR), Lahore 54000, Pakistan; Department of Computer Science, National University of Technology, Islamabad 45000, Pakistan; Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia; Department of Computer Science, Applied College, Taibah University, Medina 42353, Saudi Arabia

The application of deep learning techniques in the medical field, specifically for Atrial Fibrillation (AFib) detection through Electrocardiogram (ECG) signals, has witnessed significant *** and timely diagnosis increases the patient’s chances of ***, issues like overfitting and inconsistent accuracy across datasets remain *** a quest to address these challenges, a study presents two prominent deep learning architectures, ResNet-50 and DenseNet-121, to evaluate their effectiveness in AFib *** aim was to create a robust detection mechanism that consistently performs *** such as loss, accuracy, precision, sensitivity, and Area Under the Curve (AUC) were utilized for *** findings revealed that ResNet-50 surpassed DenseNet-121 in all evaluated *** demonstrated lower loss rates of 0.0315 and 0.0305, superior accuracy of 98.77% and 98.88%, precision of 98.78% and 98.89%, and sensitivity of 98.76% and 98.86% for training and validation, hinting at its advanced capability for AFib *** insights offer a substantial contribution to the existing literature on deep learning applications for AFib detection from ECG *** comparative performance data assists future researchers in selecting suitable deep-learning architectures for AFib ***, the outcomes of this study are anticipated to stimulate the development of more advanced and efficient ECG-based AFib detection methodologies, for more accurate and early detection of AFib, thereby fostering improved patient care and outcomes.

Keywords: Convolution neural network atrial fibrillation area under curve ECG false positive rate deep learning classification

Facial Image-Based Autism Detection: A Comparative Study of Deep Neural Network Classifiers

Computers, Materials & Continua, 2024, Vol. 78, No. 1, pp. 105-126

Authors: Tayyaba Farhat, Sheeraz Akram, Hatoon S. AlSagri, Zulfiqar Ali, Awais Ahmad, Arfan Jaffar. Affiliations: Faculty of Computer Science and Information Technology, The Superior University, Lahore 54600, Pakistan; Intelligent Data Visual Computing Research (IDVCR), Faculty of Computer Science and Information Technology, The Superior University, Lahore 54600, Pakistan; Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 12571, Saudi Arabia; School of Computer Science and Electronic Engineering (CSEE), University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK

Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by significant challenges in social interaction, communication, and repetitive *** and precise ASD detection is crucial, particularly in regions with limited diagnostic resources like *** study aims to conduct an extensive comparative analysis of various machine learning classifiers for ASD detection using facial images to identify an accurate and cost-effective solution tailored to the local *** research involves experimentation with VGG16 and MobileNet models, exploring different batch sizes, optimizers, and learning rate *** addition, the “Orange” machine learning tool is employed to evaluate classifier performance and automated image processing capabilities are utilized within the *** findings unequivocally establish VGG16 as the most effective classifier with a 5-fold cross-validation ***, VGG16, with a batch size of 2 and the Adam optimizer, trained for 100 epochs, achieves a remarkable validation accuracy of 99% and a testing accuracy of 87%. Furthermore, the model achieves an F1 score of 88%, precision of 85%, and recall of 90% on test *** validate the practical applicability of the VGG16 model with 5-fold cross-validation, the study conducts further testing on a dataset sourced from autism centers in Pakistan, resulting in an accuracy rate of 85%. This reaffirms the model’s suitability for real-world ASD *** research offers valuable insights into classifier performance, emphasizing the potential of machine learning to deliver precise and accessible ASD diagnoses via facial image analysis.

Keywords: Autism Autism Spectrum Disorder (ASD) disease segmentation features optimization deep learning models facial images classification

Dynamic Semi-structured Data Clustering Based on Frequently Changing Structure

2nd International Conference on Cloud Computing, Big Data Application and Software Engineering, CBASE 2023

Authors: Yang, Junren; Li, Wei. Affiliations: Sichuan Normal University, College of Computer Science, Chengdu 610101, China; Sichuan Normal University, Visual Computing and Visual Reality Key Laboratory of Sichuan, Chengdu 610068, China

ISBN: (print) 9798350331448

In order to cluster dynamic semi-structured data documents, a dynamic semi-structured data clustering algorithm based on frequently changing structure is proposed. The algorithm uses dynamic model to store the historical change information of dynamic semi-structured data, builds frequent structure path tree based on the spatial frequent changing structure, uses frequent structure path tree to calculate document similarity, and finally uses DBSCAN algorithm to cluster. Experimental results show that the Jaccard coefficient and Rand index of the proposed method are above 0.99, and the performance consumption and clustering effect are better than the traditional static clustering algorithm. © 2023 IEEE.

Keywords: Data mining
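The abstract above reports a Jaccard coefficient and a Rand index above 0.99. For readers unfamiliar with these measures, here is a small sketch of the common pair-counting definitions, which compare a predicted clustering against a reference labeling. Whether the paper uses exactly these definitions is an assumption, and the toy labels are made up.

```python
from itertools import combinations

def pair_counts(labels_true, labels_pred):
    """Count item pairs by whether each labeling puts them in the same cluster."""
    ss = sd = ds = dd = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            ss += 1  # together in both labelings
        elif same_true:
            sd += 1  # together in the reference only
        elif same_pred:
            ds += 1  # together in the prediction only
        else:
            dd += 1  # apart in both labelings
    return ss, sd, ds, dd

def jaccard_coefficient(labels_true, labels_pred):
    # Undefined (division by zero) when no pair is grouped by either labeling.
    ss, sd, ds, _ = pair_counts(labels_true, labels_pred)
    return ss / (ss + sd + ds)

def rand_index(labels_true, labels_pred):
    ss, sd, ds, dd = pair_counts(labels_true, labels_pred)
    return (ss + dd) / (ss + sd + ds + dd)

truth = [0, 0, 0, 1, 1, 1]
pred = [0, 0, 1, 1, 1, 1]  # one item assigned to the wrong cluster
```

Both measures equal 1 when the two labelings agree on every pair. The Jaccard coefficient ignores pairs that are apart in both labelings, so it is usually the stricter of the two.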

MCANet: Medical Image Segmentation with Multi-scale Cross-axis Attention

Machine Intelligence Research, 2025, Vol. 22, No. 3, pp. 437-451

Authors: Shao, Hao; Zeng, Quansheng; Hou, Qibin; Yang, Jufeng. Affiliation: Tianjin Key Laboratory of Visual Computing and Intelligent Perception, College of Computer Science, Nankai University, Tianjin 300000, China

Efficiently capturing multi-scale local information and building long-range dependencies among pixels are essential for medical image segmentation because of the various sizes and shapes of the lesion regions or organs. In this paper, we propose the multi-scale cross-axis attention (MCA) mechanism to address these challenges through enhanced axial attention. To address the issues of insufficient learning of positional bias and limited long-distance interaction in axial attention caused by the small dataset, we propose using a dual cross-attention mechanism instead of axial attention to enhance global information capture. Meanwhile, to compensate for the lack of explicit attention to local information in axial attention, we use multiple convolutions of strip-shaped kernels with different kernel sizes in each axial attention path, which improves the efficiency of MCA in local information encoding. By integrating MCA into the multi-scale cross-axis attention network (MSCAN) backbone, we develop our network architecture, termed MCANet. With merely 4 M+ parameters, MCANet outperforms previous heavyweight approaches (e.g., swin transformer-based methods) across four challenging tasks: skin lesion segmentation, nuclei segmentation, abdominal multi-organ segmentation, and polyp segmentation. The code is available at https://***/haoshao-nku/medical_seg. © Institute of Automation, Chinese Academy of sciences and Springer-Verlag GmbH Germany, part of Springer Nature 2025.

Keywords: Image segmentation

Multi-Target Distributed Maximum Correntropy Kalman Filter

12th International Conference on Communications and Broadband Networking, ICCBN 2024

Authors: Deng, Xingyu; Han, Hongyu; Zhang, Sheng; Xu, Yin. Affiliations: College of Computer Science and Visual Computing Virtual Reality Key Laboratory of Sichuan Province, Sichuan Normal University, Chengdu 610066, China; The School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, China

ISBN: (print) 9798400717109

Multi-target tracking in sensor networks is a challenging problem, especially in scenarios where sensor observations are limited. Conventional centralized Kalman filters and distributed Kalman filters (DKFs) require each sensor to observe all targets, which is often difficult to achieve in practical distributed applications. With this in mind, and inspired by the promising robustness of the maximum correntropy criterion in non-Gaussian noise environments, this paper proposes a novel multi-target distributed maximum correntropy Kalman filter, named MT-DMCKF. The proposed algorithm can operate effectively in distributed multi-target scenarios and provide accurate and consistent tracking performance even when the number of sensors mounted on targets is limited. Within a single period, the algorithm requires only one exchange of information among sensors, thus reducing communication overhead compared to existing distributed Kalman filters. Simulation experiments validate the effectiveness and robustness of the proposed MT-DMCKF. © 2024 ACM.

Keywords: Wiener filtering
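The abstract above relies on the robustness of the maximum correntropy criterion under non-Gaussian noise. The sketch below illustrates that criterion on its own, using a Gaussian kernel and a simple fixed-point iteration to estimate a scalar location from samples containing one outlier. It is not the MT-DMCKF algorithm; the kernel bandwidth `sigma`, the median starting point, and the sample values are assumptions chosen for illustration.

```python
import math

def gaussian_kernel(e, sigma):
    """Gaussian kernel of an error value; large errors get weights near zero."""
    return math.exp(-(e * e) / (2.0 * sigma * sigma))

def correntropy(x, y, sigma=1.0):
    """Sample estimate of correntropy: mean Gaussian kernel of the errors."""
    return sum(gaussian_kernel(a - b, sigma) for a, b in zip(x, y)) / len(x)

def mcc_location(samples, sigma=1.0, iters=50):
    """Fixed-point estimate of a location that maximizes correntropy with the samples."""
    m = sorted(samples)[len(samples) // 2]  # start from a median-like robust guess
    for _ in range(iters):
        w = [gaussian_kernel(s - m, sigma) for s in samples]
        m = sum(wi * si for wi, si in zip(w, samples)) / sum(w)
    return m

samples = [1.0, 1.1, 0.9, 1.05, 0.95, 50.0]  # last value is an outlier
mean_est = sum(samples) / len(samples)  # least-squares estimate, pulled by the outlier
mcc_est = mcc_location(samples)         # correntropy-based estimate
```

The kernel weight of the outlier is effectively zero, so it barely influences the correntropy-based estimate, whereas the plain mean is dragged far from the bulk of the data.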

Concurrent Charging with Wave Interference

42nd IEEE International Conference on Computer Communications, INFOCOM 2023

Authors: Ma, Yuzhuo; Wu, Die; Ren, Meixuan; Peng, Jian; Yang, Jilin; Liu, Tang. Affiliations: Sichuan Normal University, College of Computer Science, Chengdu 610101, China; Sichuan Normal University, Visual Computing and Virtual Reality Key Lab, Chengdu 610068, China; Sichuan University, College of Computer Science, Chengdu 610065, China

ISBN: (print) 9798350334142

To improve the charging performance, employing multiple wireless chargers to charge sensors concurrently is an effective way. In such charging scenarios, the radio waves radiated from multiple chargers will interfere with each other. Though a few work have realized the wave interference, they do not fully utilize the high power caused by constructive interference while avoiding the negative impacts brought by the destructive interference. In this paper, we aim to investigate the power distribution regularity of concurrent charging and take full advantage of the high power to enhance the charging efficiency. Specifically, we formulate a concurrent charGing utility mAxImizatioN (GAIN) problem and build a practical charging model with wave interference. Further, we propose a concurrent charging scheme, which not only can improve the power of interference enhanced regions by deploying chargers, but also find a set of points with the highest power to locate sensors. Finally, we conduct both simulations and field experiments to evaluate the proposed scheme. The results demonstrate that our scheme outperforms the comparison algorithms by 40.48% on average. © 2023 IEEE.

Keywords: Inductive power transmission

Utilizing the Neglected Back Lobe for Mobile Charging

42nd IEEE International Conference on Computer Communications, INFOCOM 2023

Authors: Ren, Meixuan; Wu, Die; Xue, Jing; Xu, Wenzheng; Peng, Jian; Liu, Tang. Affiliations: Sichuan Normal University, College of Computer Science, Chengdu 610101, China; Sichuan Normal University, Visual Computing and Virtual Reality Key Lab, Chengdu 610068, China; Sichuan University, College of Computer Science, Chengdu 610065, China

ISBN: (print) 9798350334142

Benefitting from the breakthrough of wireless power transfer technology, the lifetime of Wireless Sensor Networks (WSNs) can be significantly prolonged by scheduling a mobile charger (MC) to charge sensors. Compared with omnidirectional charging, the MC equipped with directional antenna can concentrate energy in the intended direction, making charging more efficient. However, all prior arts ignore the considerable energy leakage behind the directional antenna (i.e., back lobe), resulting in energy wasted in vain. To address this issue, we study a fundamental problem of how to utilize the neglected back lobe and schedule the directional MC efficiently. Towards this end, we first build and verify a directional charging model considering both main and back lobes. Then, we focus on jointly optimizing the number of dead sensors and energy usage effectiveness. We achieve these by introducing a scheduling scheme that utilizes both main and back lobes to charge multiple sensors simultaneously. Finally, extensive simulations and field experiments demonstrate that our scheme reduces the number of dead sensors by 49.5% and increases the energy usage effectiveness by 10.2% on average as compared with existing algorithms. © 2023 IEEE.

Keywords: Wireless sensor networks
