Search Results - Inner Mongolia University Library

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zhanxin Gao; Jun Cen; Xiaobin Chang. Affiliations: School of Artificial Intelligence, Sun Yat-sen University, China; Cheng Kar-Shun Robotics Institute, The Hong Kong University of Science and Technology, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China

ISBN: (digital) 9798350353006

ISBN: (print) 9798350353013

Continual learning empowers models to adapt autonomously to the ever-changing environment or data streams without forgetting old knowledge. Prompt-based approaches are built on frozen pre-trained models to learn the task-specific prompts and classifiers efficiently. Existing prompt-based methods are inconsistent between training and testing, limiting their effectiveness. Two types of inconsistency are revealed. Test predictions are made from all classifiers while training only focuses on the current task classifier without holistic alignment, leading to Classifier inconsistency. Prompt inconsistency indicates that the prompt selected during testing may not correspond to the one associated with this task during training. In this paper, we propose a novel prompt-based method, Consistent Prompting (CPrompt), for more aligned training and testing. Specifically, all existing classifiers are exposed to prompt training, resulting in classifier consistency learning. In addition, prompt consistency learning is proposed to enhance prediction robustness and boost prompt selection accuracy. Our Consistent Prompting surpasses its prompt-based counterparts and achieves state-of-the-art performance on multiple continual learning benchmarks. Detailed analysis shows that improvements come from more consistent training and testing. Our code is available at https://***/Zhanxin-Gao/CPrompt.

Keywords: Training; Continuing education; Adaptation models; Computer vision; Accuracy; Limiting; Robustness

EgoExo-Fitness: Towards Egocentric and Exocentric Full-Body Action Understanding

arXiv, 2024

Authors: Li, Yuan-Ming; Huang, Wei-Jin; Wang, An-Lan; Zeng, Ling-An; Meng, Jing-Ke; Zheng, Wei-Shi. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, China; Peng Cheng Laboratory, Shenzhen, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China; School of Artificial Intelligence, Sun Yat-sen University, China

We present EgoExo-Fitness, a new full-body action understanding dataset, featuring fitness sequence videos recorded from synchronized egocentric and fixed exocentric (third-person) cameras. Compared with existing full-body action understanding datasets, EgoExo-Fitness not only contains videos from first-person perspectives, but also provides rich annotations. Specifically, two-level temporal boundaries are provided to localize single action videos along with sub-steps of each action. More importantly, EgoExo-Fitness introduces innovative annotations for interpretable action judgement, including technical keypoint verification, natural language comments on action execution, and action quality scores. Combining all of these, EgoExo-Fitness provides new resources to study egocentric and exocentric full-body action understanding across dimensions of "what", "when", and "how well". To facilitate research on egocentric and exocentric full-body action understanding, we construct benchmarks on a suite of tasks (i.e., action classification, action localization, cross-view sequence verification, cross-view skill determination, and a newly proposed task of guidance-based execution verification), together with detailed analysis. Data and code are available at https://***/iSEE-laboratory/EgoExo-Fitness/tree/main. Copyright © 2024, The Authors. All rights reserved.

Keywords: Video analysis

Collaborative Static and Dynamic Vision-Language Streams for Spatio-Temporal Video Grounding

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Zihang Lin; Chaolei Tan; Jian-Fang Hu; Zhi Jin; Tiancai Ye; Wei-Shi Zheng. Affiliations: Sun Yat-sen University, China; Guangdong Province Key Laboratory of Information Security Technology, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China; Tencent, China

Spatio-Temporal Video Grounding (STVG) aims to localize the target object spatially and temporally according to the given language query. It is a challenging task in which the model should well understand dynamic visual cues (e.g., motions) and static visual cues (e.g., object appearances) in the language description, which requires effective joint modeling of spatiotemporal visual-linguistic dependencies. In this work, we propose a novel framework in which a static vision-language stream and a dynamic vision-language stream are developed to collaboratively reason the target tube. The static stream performs cross-modal understanding in a single frame and learns to attend to the target object spatially according to intra-frame visual cues like object appearances. The dynamic stream models visual-linguistic dependencies across multiple consecutive frames to capture dynamic cues like motions. We further design a novel cross-stream collaborative block between the two streams, which enables the static and dynamic streams to transfer useful and complementary information from each other to achieve collaborative reasoning. Experimental results show the effectiveness of the collaboration of the two streams and our overall framework achieves new state-of-the-art performance on both HCSTVG and VidSTG datasets.

CROSS-SCENE PERSON TRAJECTORY ANOMALY DETECTION BASED ON RE-IDENTIFICATION

2021 IEEE International Conference on Multimedia and Expo, ICME 2021

Authors: Li, Yuanxun; Wu, Ancong; Zheng, Wei-Shi. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, China; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China

ISBN: (print) 9781665438643

In this work, we consider the cross-scene person trajectory anomaly detection problem, which detects anomalous trajectories across multiple nonoverlapping scenes. This problem is highly significant for public security, but it is still underexplored. Since the trajectory is not continuous across nonoverlapping camera views, we make use of person re-identification (re-ID) to associate the same pedestrian in different scenes while mitigating its inaccuracy by a directional probabilistic graph. To better distinguish normal samples from anomalies, we formulate a maximized margin graph autoencoder (MMGAE) model, and the reconstruction error of the MMGAE is regarded as an anomaly indicator for the sample. To verify the effectiveness of our approach, we collected and labeled a new dataset. We also explore the impact of the re-ID performance on the anomaly detection problem and the effect of an inaccurately constructed graph on the MMGAE. © 2021 IEEE

Keywords: Trajectories

A self-assembled nanomedicine for glucose supply interruption-amplified low-temperature photothermal therapy and anti-prometastatic inflammatory processes of triple-negative breast cancer

Aggregate, 2024, Vol. 5, No. 6, pp. 254-271

Authors: Mingcheng Wang; Huixi Yi; Zhixiong Zhan; Zitong Feng; Gang-Gang Yang; Yue Zheng; Dong-Yang Zhang. Affiliations: Guangzhou Municipal and Guangdong Provincial Key Laboratory of Molecular Target & Clinical Pharmacology, the NMPA and State Key Laboratory of Respiratory Disease, the Fifth Affiliated Hospital and School of Pharmaceutical Sciences, Guangzhou Medical University, Guangzhou, China; Key Laboratory of Human-Machine-Intelligence Synergic System, Research Center for Neural Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; School of Chemistry and Chemical Engineering, Anhui University of Technology, Ma’anshan, China; Breast Tumor Center, Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China

The poor prognosis of triple-negative breast cancer (TNBC) results from its high metastasis, whereas inflammation accompanied by excessive reactive oxygen species (ROS) is prone to aggravate tumor *** photothermal therapy (PTT) has extremely high therapeutic efficiency, the crafty tumor cells allow an increase in the expression of heat shock proteins (HSPs) to limit its effect, and PTT-induced inflammation is also thought to be a potential trigger for tumor ***, myricetin, iron ions, and polyvinylpyrrolidone were utilized to develop nanomedicines by self-assembly strategy for the treatment of metastatic *** nanomedicines with marvelous water solubility and dispersion can inhibit glucose transporter 1 and interfere with mitochondrial function to block the energy supply of tumor cells, achieving starvation therapy on TNBC *** with excellent photothermal conversion properties allow down-regulating the expression of HSPs to enhance the effect of ***, the broad spectrum of ROS scavenging ability of nanomedicines successfully attenuates PTT-induced inflammation as well as influences hypoxia-inducible factors-1α/3-phosphoinositide-dependent protein kinase 1 related pathway through glycometabolism inhibition to reduce tumor cell ***, the nanomedicines have negligible side effects and good clinical application prospects, which provides a valuable paradigm for the treatment of metastatic TNBC through glycometabolism interference, anti-inflammation, starvation, and photothermal synergistic therapy.

Keywords: self-assembled; glucose transporter; glycometabolism; photothermal therapy; starvation therapy; metastasis

Predicting game-induced emotions using EEG, data mining and machine learning

Bulletin of the National Research Centre, 2024, Vol. 48, No. 1, pp. 1-10

Authors: Lim, Min Xuan; Teo, Jason. Affiliations: Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia; Creative Advanced Machine Intelligence Research Centre, Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia; Evolutionary Computing Laboratory, Faculty of Computing and Informatics, Universiti Malaysia Sabah, Kota Kinabalu, Malaysia

Emotion is a complex phenomenon that greatly affects human behavior and thinking in daily life. Electroencephalography (EEG), one of the human physiological signals, has been emphasized by most researchers in emotion recognition as its specific properties are closely associated with human emotion. However, the number of human emotion recognition studies using computer games as stimuli is still insufficient as there were no relevant publicly available datasets provided in the past decades. Most of the recent studies using the Gameemo public dataset have not clarified the relationship between the EEG signal’s changes and the emotion elicited using computer games. Thus, this paper is proposed to introduce the use of data mining techniques in investigating the relationships between the frequency changes of EEG signals and the human emotion elicited when playing different kinds of computer games. The data acquisition stage, data pre-processing, data annotation and feature extraction stage were designed and conducted in this paper to obtain and extract the EEG features from the Gameemo dataset. The cross-subject and subject-based experiments were conducted to evaluate the classifiers’ performance. The top 10 association rules generated by the RCAR classifier will be examined to determine the possible relationship between the EEG signal's frequency changes and game-induced emotions. The RCAR classifier constructed for cross-subject experiment achieved highest accuracy, precision, recall and F1-score evaluated with over 90% in classifying the HAPV, HANV and LANV game-induced emotions. The 20 experiment cases’ results from subject-based experiments supported that the SVM classifier could accurately classify the 4 emotion states with a kappa value over 0.62, demonstrating the SVM-based algorithm’s capabilities in precisely determining the emotion label for each participant’s EEG features’ instance. The findings in this study fill the existing gap of game-induced emotion recog

Single-View Scene Point Cloud Human Grasp Generation

arXiv, 2024

Authors: Wang, Yan-Kang; Xing, Chengyi; Wei, Yi-Lin; Wu, Xiao-Ming; Zheng, Wei-Shi. Affiliations: School of Computer Science and Engineering, Sun Yat-sen University, China; Stanford University, Stanford, CA, United States; Key Laboratory of Machine Intelligence and Advanced Computing, Ministry of Education, China

In this work, we explore a novel task of generating human grasps based on single-view scene point clouds, which more accurately mirrors the typical real-world situation of observing objects from a single viewpoint. Due to the incompleteness of object point clouds and the presence of numerous scene points, the generated hand is prone to penetrating into the invisible parts of the object and the model is easily affected by scene points. Thus, we introduce S2HGrasp, a framework composed of two key modules: the Global Perception module that globally perceives partial object point clouds, and the DiffuGrasp module designed to generate high-quality human grasps based on complex inputs that include scene points. Additionally, we introduce the S2HGD dataset, which comprises approximately 99,000 single-object single-view scene point clouds of 1,668 unique objects, each annotated with one human grasp. Our extensive experiments demonstrate that S2HGrasp can not only generate natural human grasps regardless of scene points, but also effectively prevent penetration between the hand and invisible parts of the object. Moreover, our model showcases strong generalization capability when applied to unseen objects. Our code and dataset are available at https://***/iSEE-laboratory/S2HGrasp. Copyright © 2024, The Authors. All rights reserved.

Keywords: machine learning

Shape Optimization of Magnetorheological Brake for Performance Enhancement

2024 IEEE International Conference on Robotics and Biomimetics, ROBIO 2024

Authors: Shi, Qiuyu; Liu, Fajiang; Wu, Xinyu; Gao, Fei. Affiliations: Southern University of Science and Technology, Shenzhen 518005, China; Shenzhen Institute of Advanced Technology, Guangdong Provincial Key Laboratory of Robotics and Intelligent System, Chinese Academy of Sciences, Shenzhen 518055, China; Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen 518055, China

ISBN: (print) 9781665481090

Magnetorheological (MR) rotary brakes leverage the magnetically controllable rheological properties of MR fluids to provide damping torque in lower limb assistance devices. This paper utilizes an optimization algorithm (Whale Optimization Algorithm (WOA)) to optimize the geometric dimension of the MR brake to enhance field-dependent brake torque. First, the mechanical design and optimization process of the MR brake were outlined. Then, the magnetic density of the MR fluid is evaluated using the ANSYS platform. With that, the WOA was exploited to optimize key design parameters across multiple iterations. According to the simulated results, the torque was improved from 17.39 Nm to 25.89 Nm, which increased by 48%. Finally, a prototype with the initial dimension was fabricated and preliminarily tested. © 2024 IEEE.
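
The reported torque gain can be checked with a line of arithmetic. The following is a minimal Python sketch; the function name `relative_increase` and the variable names are illustrative assumptions, and only the two torque values come from the abstract. The result lands between 48% and 49%, which is consistent with the stated 48% if that figure was truncated rather than rounded.

```python
def relative_increase(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Torque values quoted in the abstract, in newton-metres.
initial_torque_nm = 17.39
optimized_torque_nm = 25.89

# Relative improvement of the optimized design over the initial design.
gain_percent = relative_increase(initial_torque_nm, optimized_torque_nm)
print(f"Torque gain: {gain_percent:.1f}%")
```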

Keywords: Shape optimization

Automatic Depression Detection Network Based on Facial Images and Facial Landmarks Feature Fusion

2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024

Authors: Hu, Min; Xu, Lingxiang; Liu, Lei; Wang, Xiaohua; Li, Hongbo; Yang, Jiaoyun. Affiliations: Hefei University of Technology, Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, Hefei, China; Hefei University of Technology, National Smart Eldercare International Science and Technology Cooperation Base, Hefei, China; Hefei University of Technology, Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei, China

ISBN: (print) 9798350386226

Artificial intelligence methods offer objectivity and convenience in automatic depression detection; however, current research often neglects the critical role of facial landmarks. This oversight results in insufficient spatial structure information and a lack of detailed local representation, which fails to capture the nuanced semantic information crucial for identifying depression-related clues. To address these issues, we introduce a novel dual-branch network model comprising the Landmark-Image-Landmark Net (LIL Net) and the Global Context Vision Transformer Net (GCVit Net). Through a dual-stream, multiscale, and cross-fusion strategy, LIL Net is designed to extract original facial image features alongside landmark features, prioritizing the detailed semantic information of potential depression clues. LIL Net employs an innovative LIL Attention approach to jointly learn multiscale features from facial landmarks and images, thereby enhancing the model's ability to capture fine-grained depression-related cues. Furthermore, the Multi-scale Feature Fusion (MSFF) module fuses the obtained multiscale features, augmenting the semantic expression of potential depression clues within facial landmarks via attention mechanisms. Meanwhile, the GCVit Net branch network supplements global information by extracting global facial features. Finally, the features from both branches are concatenated to enhance the accuracy of depression degree predictions. Experimental results demonstrate that our model has superior performance in detecting depression compared to existing methods. We release our code at https://***/xlx777/LIL-Net. © 2024 IEEE.

Keywords: Artificial intelligence

A Novel Hybrid Brain-Computer Interface Based on Integration of Steady-State Visual Stimulation and Rapid Serial Visual Presentation Paradigm

2024 IEEE International Conference on Robotics and Biomimetics, ROBIO 2024

Authors: Xie, Jun; Zhang, Huanqing; Tao, Qing; Fang, Peng; Liu, Junjie; Ge, Zengle; Shao, Yixuan; Ding, Yuhang. Affiliations: School of Mechanical Engineering, Xinjiang University, Urumqi 830017, China; School of Mechanical Engineering, Xi'an Jiaotong University, Xi'an 710049, China; Shenzhen Institute of Advanced Technology, CAS Key Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Engineering Laboratory of Neural Rehabilitation Technology, Shenzhen 518055, China

ISBN: (print) 9781665481090

A brain-computer interface (BCI) is a kind of human-computer interaction that enables communication and control between the human brain and the external environment. Single-modality BCIs suffer from a small command set and difficulty in realizing multi-dimensional control. To address these issues, a hybrid paradigm combining steady-state visual evoked potential (SSVEP) and rapid serial visual presentation (RSVP) was proposed and designed in this study. A 16-target SSVEP-RSVP hybrid BCI (hBCI) was constructed. This novel paradigm aims to achieve fast information transmission, guarantee system performance through efficient visual stimulation, and effectively overcome the crowding effect problem in traditional BCIs. In offline and online experiments, the hBCI system based on the SSVEP-RSVP integration paradigm shows high performance, which provides new insights and technical and theoretical support for applicable hBCIs. © 2024 IEEE.

Keywords: Brain computer interface
