Search Results - Inner Mongolia University Library

7th International Conference on Pattern Recognition and Artificial Intelligence, PRAI 2024

Authors: Wang, Mingyang; Lu, Ying; Xu, Quanyuan. Affiliations: College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming, China; Big Data State Forestry Administration on Southwest Forestry University, Key Laboratory of Forestry and Ecological, Kunming, China

ISBN: (print) 9798350350890

Spodoptera frugiperda (fall armyworm, FAW) is a pest that poses a significant threat to global agriculture, with its larvae exhibiting unique morphological characteristics and varying degrees of harm at different instar stages. This study proposes a method that incorporates an improved self-attention mechanism to achieve rapid and accurate identification of FAW larval instars. This method can be seamlessly integrated into convolutional neural networks (CNNs), enhancing recognition performance by focusing on the global features of images. Furthermore, this research has constructed an image dataset of FAW larvae at various developmental stages under field and laboratory conditions to validate the effectiveness of the proposed method. Experimental results demonstrate that our method improves the recognition accuracy across various CNN architectures, achieving accuracy enhancements of 5.49%, 7.38%, 5.79%, and 5.59% on AlexNet, VGG16, ResNet50, and ResNet101, respectively. Our approach provides technicians with a convenient, intelligent, and precise tool to identify FAW larval instar stages. © 2024 IEEE.

Keywords: Invertebrates

Detecting Dependency-Related Sentiment Features for Aspect-Level Sentiment Classification

IEEE Transactions on Affective Computing, 2023, Vol. 14, No. 1, pp. 196-210

Authors: Zhang, Xing; Xu, Jingyun; Cai, Yi; Tan, Xingwei; Zhu, Changxi. Affiliations: South China Univ Technol, Sch Software Engn, Guangzhou 510006, Peoples R China; South China Univ Technol, Key Lab Big Data & Intelligent Robot, Minist Educ, Guangzhou, Peoples R China

Aspect-level sentiment classification aims to determine the sentiment polarity of a sentence toward a given aspect term or aspect category. For sentiment classification toward a given aspect term, some opinions may exist that are not the given aspect term's modifiers because a sentence may contain more than one aspect term. Hence, it is necessary to capture the relevant opinion for a given aspect term. To capture the nearest opinion of the aspect term, researchers have used the relative distance between an aspect term and all other words in a sentence. However, this can be infeasible when the sentence has a complex syntactic structure. In this paper, we introduce dependency relations to detect the dependency-related sentiment feature for the aspect term in the dependency parse tree, and integrate this relationship into the convolutional neural network and bidirectional long short-term memory. Experiments show that the related sentiment features for an aspect term help models discriminate its sentiment polarity. The proposed models achieve state-of-the-art results among neural networks. The code and datasets are released at https://***/LittleSummer114/DW-CNN.

Keywords: Feature extraction; Task analysis; Syntactics; Batteries; Sentiment analysis; Computer architecture; Semantics; Aspect-level-sentiment classification; opinion mining; dependency parse tree; natural language processing; neural networks; convolutional neural network; bidirectional long short-term memory

ViGT: proposal-free video grounding with a learnable token in the transformer

Science China (Information Sciences), 2023, Vol. 66, No. 10, pp. 196-212

Authors: Kun LI; Dan GUO; Meng WANG. Affiliations: School of Computer Science and Information Engineering, Hefei University of Technology; Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education; Intelligent Interconnected Systems Laboratory of Anhui Province; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center

The video grounding (VG) task aims to locate the queried action or event in an untrimmed video based on rich linguistic descriptions. Existing proposal-free methods are trapped in the complex interaction between video and query, overemphasizing cross-modal feature fusion and feature correlation for VG. In this paper, we propose a novel boundary regression paradigm that performs regression token learning in a transformer. Particularly, we present a simple but effective proposal-free framework, namely video grounding transformer (ViGT), which predicts the temporal boundary using a learnable regression token rather than multi-modal or cross-modal features. In ViGT, the benefits of a learnable token are manifested as follows. (1) The token is unrelated to the video or the query and avoids data bias toward the original video and query. (2) The token simultaneously performs global context aggregation from video and query ***, we employed a sharing feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality. Furthermore, we concatenated a learnable regression token [REG] with the video and query features as the input of a vision-language transformer. Finally, we utilized the token [REG] to predict the target moment and visual features to constrain the foreground and background probabilities at each timestamp. The proposed ViGT performed well on three public datasets: ANet-Captions, TACoS, and YouCookⅡ. Extensive ablation studies and qualitative analysis further validated the interpretability of ViGT.

Keywords: video grounding; temporal sentence grounding; boundary regression; token learning; proposal-free

Transverse Velocity Field Measurement of Solar High-resolution Images Based on Unsupervised Deep Learning

Research in Astronomy and Astrophysics, 2025, Vol. 25, No. 3, pp. 236-248

Authors: Zhen-Hong Shang; Long Chen; Zhen-Ping Qiang; Yi Bi; Run-Xin Li. Affiliations: Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology; College of Big Data and Intelligent Engineering, Southwest Forestry University; Yunnan Observatories, Chinese Academy of Sciences

Measuring the transverse velocity field in high-resolution solar images is essential for understanding solar *** paper introduces an innovative unsupervised deep learning optical flow model designed to calculate the transverse velocity field, addressing the challenges of missing optical flow labels and the limited accuracy of velocity field measurements in high-resolution solar *** proposed method converts the transverse velocity field computation problem into an optical flow computation problem, using two forward propagations of features to get rid of the reliance on optical flow ***, it reduces the impact of the "Brightness Consistency" constraint on optical flow accuracy by identifying and handling optical flow *** apply this method to compute the transverse velocity fields of high-resolution solar image sequences from the Hα and TiO bands, observed by the New Vacuum Solar *** experiments with several well-established optical flow methods, including those based on supervised deep learning models, show that our approach outperforms the comparison methods according to key evaluation metrics such as Residual Map Mean, Residual Map Variance, Cross Correlation, and Structural Similarity Index ***, since optical flow captures the fundamental motion information in image sequences, the proposed method can be applied to a variety of research areas, including solar image registration, sequence alignment, image super-resolution, magnetic field calibration, and solar activity *** code is available at https://***/jackie-willianm/Transverse-Velocity-Field-Measurement-of-Solar-High-Resolution-Images.

Keywords: methods: data analysis; Sun: fundamental parameters; techniques: image processing

Dual-Branch Attention Transformer for Visual Commonsense Reasoning

6th International Conference on Frontier Technologies of Information and Computer, ICFTIC 2024

Authors: Ma, Xuerui; Bai, Zongwen; Zhou, Meili; Gao, Yiqun. Affiliations: School of Physics and Electronic Information, Yan'an University; Shaanxi Key Laboratory of Intelligent Processing for Big Energy Data, Yan'an, China

ISBN: (print) 9798331541750

Visual Commonsense Reasoning (VCR) is a challenging task that requires a model to select the correct answer in the context of a given image and question, while also providing a reasonable explanation to justify the chosen answer. In this paper, we propose a novel dual-branch attention network model that effectively models dense intra-modal and inter-modal interactions. Our model utilizes a self-attention mechanism to attend to visual and textual information independently, while employing a bidirectional guided attention mechanism to facilitate effective interaction between different modalities, thus correlating visual information with textual information. By stacking multiple dual-branch attention modules (DBAs), our model is capable of deeply exploring data features and extracting more powerful and informative prior features. By achieving a common focus on visual and textual information, we enhance the subtlety of our understanding and significantly improve our ability to perform visual reasoning. Our comprehensive and qualitative assessments on the public VCR dataset illustrate that our methodology surpasses most models, providing a new solution for Visual Commonsense Reasoning tasks. © 2024 IEEE.

Keywords: dual-branch attention; multimodal; Visual Commonsense Reasoning

LeapGNN: Accelerating Distributed GNN Training Leveraging Feature-Centric Model Migration

23rd USENIX Conference on File and Storage Technologies, FAST 2025

Authors: Chen, Weijian; He, Shuibing; Qu, Haoyang; Zhang, Xuechen. Affiliations: The State Key Laboratory of Blockchain and Data Security, Zhejiang University, China; Zhejiang Lab, China; Institute of Blockchain and Data Security, China; Zhejiang Key Laboratory of Big Data Intelligent Computing, China; Washington State University Vancouver, United States

ISBN: (print) 9781939133458

Distributed training of graph neural networks (GNNs) has become a crucial technique for processing large graphs. Prevalent GNN frameworks are model-centric, necessitating the transfer of massive graph vertex features to GNN models, which leads to a significant communication bottleneck. Recognizing that the model size is often significantly smaller than the feature size, we propose LeapGNN, a feature-centric framework that reverses this paradigm by bringing GNN models to vertex features. To make it truly effective, we first propose a micrograph-based training strategy that leverages a refined structure to enhance locality, combined with the model migration technique, to minimize remote feature retrieval. Then, we devise a feature pre-gathering approach that merges multiple fetch operations into a single one to eliminate redundant feature transmissions. Finally, we employ a micrograph-based merging method that adjusts the number of micrographs for each worker to minimize kernel switches and synchronization overhead. Our experimental results demonstrate that LeapGNN achieves a performance speedup of up to 4.2× compared to the state-of-the-art method, namely P3. © 2025 FAST. All Rights Reserved.
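The feature pre-gathering idea in this abstract (merging several fetch operations into one so that shared vertex features are transmitted once) can be illustrated with a small sketch. This is not the LeapGNN implementation; the function name `pre_gather`, the request layout, and the vertex IDs are hypothetical and chosen only for illustration.

```python
def pre_gather(fetch_requests):
    """Merge several vertex-ID fetch requests into one deduplicated list.

    Hypothetical helper, not part of LeapGNN. Returns the merged ID list
    (in first-seen order) and, for every original request, the positions
    of its vertices inside the merged list.
    """
    merged = []        # each remote vertex ID appears exactly once
    position = {}      # vertex ID -> index in `merged`
    index_maps = []    # per-request lookup into the merged result
    for request in fetch_requests:
        indices = []
        for vertex_id in request:
            if vertex_id not in position:
                position[vertex_id] = len(merged)
                merged.append(vertex_id)
            indices.append(position[vertex_id])
        index_maps.append(indices)
    return merged, index_maps


# Three separate fetches that overlap heavily (made-up vertex IDs).
requests = [[3, 7, 9], [7, 9, 12], [3, 12]]
merged, index_maps = pre_gather(requests)

naive_transfers = sum(len(r) for r in requests)   # one transfer per requested ID
merged_transfers = len(merged)                    # one transfer per distinct ID
```

In this toy case the single merged fetch carries four feature vectors instead of eight, and `index_maps` lets each original request recover its own features from the merged result.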

Keywords: Graph neural networks

BeautifulPrompt: Towards Automatic Prompt Engineering for Text-to-Image Synthesis

2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023

Authors: Cao, Tingfeng; Wang, Chengyu; Liu, Bingyan; Wu, Ziheng; Zhu, Jinhui; Huang, Jun. Affiliations: South China University of Technology, China; Alibaba Group, China; Key Laboratory of Big Data and Intelligent Robot, South China University of Technology, Ministry of Education, China

Recently, diffusion-based deep generative models (e.g., Stable Diffusion) have shown impressive results in text-to-image synthesis. However, current text-to-image models often require multiple passes of prompt engineering by humans in order to produce satisfactory results for real-world applications. We propose BeautifulPrompt, a deep generative model to produce high-quality prompts from very simple raw descriptions, which enables diffusion-based models to generate more beautiful images. In our work, we first fine-tuned the BeautifulPrompt model over collected pairs of low-quality and high-quality prompts. Then, to ensure that our generated prompts can generate more beautiful images, we further propose a Reinforcement Learning with Visual AI Feedback technique to fine-tune our model to maximize the reward values of the generated prompts, where the reward values are calculated based on the PickScore and the Aesthetic Scores. Our results demonstrate that learning from visual AI feedback has the potential to improve the quality of generated prompts and images significantly. We further showcase the integration of BeautifulPrompt into a cloud-native AI platform to provide better text-to-image generation service in the cloud. © 2023 Association for Computational Linguistics.

Keywords: Diffusion

Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

41st International Conference on Machine Learning, ICML 2024

Authors: Feng, Kaituo; Li, Changsheng; Zhang, Xiaolu; Zhou, Jun; Yuan, Ye; Wang, Guoren. Affiliations: Beijing Institute of Technology, China; Ant Group, China; Hebei Province Key Laboratory of Big Data Science and Intelligent Technology, China

Chain-of-thought distillation is a powerful technique for transferring reasoning abilities from large language models (LLMs) to smaller student models. Previous methods typically require the student to mimic the step-by-step rationale produced by LLMs, often facing the following challenges: (i) Tokens within a rationale vary in significance, and treating them equally may fail to accurately mimic keypoint tokens, leading to reasoning errors. (ii) They usually distill knowledge by consistently predicting all the steps in a rationale, which falls short in distinguishing the learning order of step generation. This diverges from the human cognitive progression of starting with easy tasks and advancing to harder ones, resulting in sub-optimal outcomes. To this end, we propose a unified framework, called KPOD, to address these issues. Specifically, we propose a token weighting module utilizing mask learning to encourage accurate mimicry of keypoint tokens by the student during distillation. Besides, we develop an in-rationale progressive distillation strategy, starting with training the student to generate the final reasoning steps and gradually extending to cover the entire rationale. To accomplish this, a weighted token generation loss is proposed to assess step reasoning difficulty, and a value function is devised to schedule the progressive distillation by considering both step difficulty and question diversity. Extensive experiments on four reasoning benchmarks illustrate our KPOD outperforms previous methods by a large margin. Copyright 2024 by the author(s)
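To make the notion of a weighted token generation loss more concrete, here is a minimal sketch of a weighted negative log-likelihood over the tokens of one rationale. It is an illustration only, not the KPOD formulation: the normalization by the weight sum, the function name `weighted_token_loss`, and the example probabilities and weights are all assumptions.

```python
import math


def weighted_token_loss(token_probs, token_weights):
    """Weighted negative log-likelihood over the tokens of one rationale.

    Hypothetical sketch: `token_probs` are the student's probabilities for
    the reference tokens, `token_weights` are per-token importance weights.
    Dividing by the weight sum is an assumed normalization.
    """
    if len(token_probs) != len(token_weights):
        raise ValueError("token_probs and token_weights must have equal length")
    total_weight = sum(token_weights)
    weighted_nll = sum(-w * math.log(p) for p, w in zip(token_probs, token_weights))
    return weighted_nll / total_weight


# Made-up example: the student is unsure about the second (keypoint) token.
probs = [0.9, 0.2, 0.8]
uniform_loss = weighted_token_loss(probs, [1.0, 1.0, 1.0])
keypoint_loss = weighted_token_loss(probs, [0.5, 2.0, 0.5])
```

With uniform weights the loss is the plain mean negative log-likelihood; upweighting the poorly predicted token raises the loss, so mistakes on tokens marked as important are penalized more.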

Keywords: Students

SDGNN: Symmetry-Preserving Dual-Stream Graph Neural Networks

IEEE/CAA Journal of Automatica Sinica, 2024, Vol. 11, No. 7, pp. 1717-1719

Authors: Jiufang Chen; Ye Yuan; Xin Luo. Affiliations: the College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065; the Chongqing Key Laboratory of Big Data and Intelligent Computing, Chongqing Institute of Green and Intelligent Technology, Chinese Academy of Sciences, Chongqing 400714, China; the College of Computer and Information Science, Southwest University, Chongqing 400715, China; IEEE

Dear Editor, This letter proposes a symmetry-preserving dual-stream graph neural network (SDGNN) for precise representation learning to an undirected weighted graph (UWG). Although existing graph neural networks (GNNs) are influential instruments for representation learning to a UWG, they invariably adopt a unique node feature matrix for illustrating the sole node set of a UWG.

Keywords: representation; preserving; undirected

DETECTING MACHINE-GENERATED TEXTS BY MULTI-POPULATION AWARE OPTIMIZATION FOR MAXIMUM MEAN DISCREPANCY

12th International Conference on Learning Representations, ICLR 2024

Authors: Zhang, Shuhai; Song, Yiliao; Yang, Jiahao; Li, Yuanqing; Han, Bo; Tan, Mingkui. Affiliations: South China University of Technology, China; Pazhou Laboratory, China; The University of Adelaide, Australia; Key Laboratory of Big Data and Intelligent Robot, Ministry of Education, China; Department of Computer Science, Hong Kong Baptist University, Hong Kong

Large language models (LLMs) such as ChatGPT have exhibited remarkable performance in generating human-like texts. However, machine-generated texts (MGTs) may carry critical risks, such as plagiarism issues, misleading information, or hallucination issues. Therefore, it is very urgent and important to detect MGTs in many situations. Unfortunately, it is challenging to distinguish MGTs and human-written texts because the distributional discrepancy between them is often very subtle due to the remarkable performance of LLMs. In this paper, we seek to exploit maximum mean discrepancy (MMD) to address this issue in the sense that MMD can well identify distributional discrepancies. However, directly training a detector with MMD using diverse MGTs will incur a significantly increased variance of MMD since MGTs may contain multiple text populations due to various LLMs. This will severely impair MMD's ability to measure the difference between two samples. To tackle this, we propose a novel multi-population aware optimization method for MMD called MMD-MP, which can avoid variance increases and thus improve the stability to measure the distributional discrepancy. Relying on MMD-MP, we develop two methods for paragraph-based and sentence-based detection, respectively. Extensive experiments on various LLMs, e.g., GPT2 and ChatGPT, show superior detection performance of our MMD-MP. The source code is available at https://***/ZSHsh98/MMD-MP. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.
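For readers unfamiliar with maximum mean discrepancy, the following sketch computes the standard biased estimate of squared MMD with a Gaussian kernel between two small samples. It shows plain MMD only, not the MMD-MP optimization proposed in the paper; the feature vectors, the bandwidth value, and the function names are hypothetical.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples xs and ys.

    MMD^2 = mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    """
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Made-up 2-D features standing in for text embeddings.
human_texts = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
machine_texts = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9]]

same_sample = mmd_squared(human_texts, human_texts)
different_samples = mmd_squared(human_texts, machine_texts)
```

The estimate is zero when a sample is compared with itself and grows as the two samples drift apart, which is the property a detector of this kind relies on.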
