Search Results - Inner Mongolia University Library

32nd ACM International Conference on Multimedia, MM 2024

Authors: Li, Yudong; Hou, Xianxu; Dezhi, Zheng; Shen, Linlin; Zhao, Zhe. Affiliations: School of Computer Science and Software Engineering Shenzhen University Shenzhen China Shenzhen Institute of Artificial Intelligence and Robotics for Society Shenzhen China School of AI and Advanced Computing Xi'an Jiaotong-Liverpool University Shenzhen China Guangdong Provincial Key Laboratory of Intelligent Information Processing Shenzhen University Shenzhen China Tencent AI Lab Beijing China

ISBN: 9798400706868 (print)

While significant progress has been made in multi-modal learning driven by large-scale image-text datasets, there is still a noticeable gap in the availability of such datasets within the facial domain. To facilitate and advance the field of facial representation learning, we present FLIP-80M, a large-scale visual-linguistic dataset comprising over 80 million face images paired with text descriptions. FLIP-80M is constructed by leveraging the large openly available image-text-pair dataset LAION-5B and a mixed-method approach to filter face-related pairs from both visual and linguistic perspectives. Our curation process involves face detection, face caption classification, text de-noising, and synthesis-based image augmentation. As a result, FLIP-80M stands as the largest face-text dataset to date. To evaluate the potential of our dataset, we fine-tune the CLIP model using the proposed FLIP-80M, to create FLIP (Facial Language-Image Pretraining) and assess its representation capabilities across various downstream tasks. Our experiments demonstrate that our FLIP model achieves state-of-the-art results in a range of face analysis tasks, including face parsing, face alignment, and face attribute classification. The dataset and models are available at https://***/ydli-ai/FLIP. © 2024 ACM.

Keywords: Face recognition

Benchmarking Graph Representations and Graph Neural Networks for Multivariate Time Series Classification

arXiv, 2025

Authors: Yang, Wennuo; Wu, Shiling; Zhou, Yuzhi; Luo, Cheng; He, Xilin; Xie, Weicheng; Shen, Linlin; Song, Siyang. Affiliations: Computer Vision Institute School of Computer Science & Software Engineering Shenzhen University China Shenzhen Institute of Artificial Intelligence and Robotics for Society China Guangdong Provincial Key Laboratory of Intelligent Information Processing China HBUG Lab University of Exeter United Kingdom

Multivariate Time Series Classification (MTSC) enables the analysis of complex temporal data, and thus serves as a cornerstone in various real-world applications, ranging from healthcare to finance. Since the relationships among variables in MTS usually contain crucial cues, a large number of graph-based MTSC approaches have been proposed, as the graph topology and edges can explicitly represent relationships among variables (channels), where not only various MTS graph representation learning strategies but also different Graph Neural Networks (GNNs) have been explored. Despite such progress, there is no comprehensive study that fairly benchmarks and investigates the performance of existing widely-used graph representation learning strategies/GNN classifiers across different MTSC tasks. In this paper, we present the first benchmark which systematically investigates the effectiveness of three widely-used node feature definition strategies, four edge feature learning strategies and five GNN architectures, resulting in 60 different variants for graph-based MTSC. These variants are developed and evaluated with a standardized data pipeline and training/validation/testing strategy on 26 widely-used suspensor MTSC datasets. Our experiments highlight that node features significantly influence MTSC performance, while the visualization of edge features illustrates why adaptive edge learning outperforms other edge feature learning methods. The code of the proposed benchmark is publicly available at https://***/CVI-yangwn/*** Codes 68T10 © 2025, CC BY.

Keywords: Graph neural networks

KADEL: Knowledge-Aware Denoising Learning for Commit Message Generation

arXiv, 2024

Authors: Tao, Wei; Zhou, Yucheng; Wang, Yanlin; Zhang, Hongyu; Wang, Haofen; Zhang, Wenqiang. Affiliations: Shanghai Engineering Research Center of AI and Robotics Academy for Engineering and Technology Fudan University Shanghai China State Key Laboratory of Internet of Things for Smart City Department of Computer and Information Science University of Macau China School of Software Engineering Sun Yat-sen University Guangdong Zhuhai 519082 China School of Big Data and Software Engineering Chongqing University Chongqing China College of Design and Innovation Tongji University Shanghai China Engineering Research Center of AI and Robotics Ministry of Education Academy for Engineering and Technology Shanghai Key Lab of Intelligent Information Processing School of Computer Science Fudan University 220 Handan Road Shanghai 200433 China

Commit messages are natural language descriptions of code changes, which are important for software evolution tasks such as code understanding and maintenance. However, previous methods are trained on the entire dataset without considering the fact that a portion of commit messages adhere to good practice (i.e., good-practice commits), while the rest do not. On the basis of our empirical study, we discover that training on good-practice commits significantly contributes to commit message generation. Motivated by this finding, we propose a novel knowledge-aware denoising learning method called KADEL. Considering that good-practice commits constitute only a small proportion of the dataset, we align the remaining training samples with these good-practice commits. To achieve this, we propose a model that learns the commit knowledge by training on good-practice commits. This knowledge model enables supplementing more information for training samples that do not conform to good practice. However, since the supplementary information may contain noise or prediction errors, we propose a dynamic denoising training method. This method combines a distribution-aware confidence function and a dynamic distribution list, which enhances the effectiveness of the training process. Experimental results on the whole MCMD dataset demonstrate that our method overall achieves state-of-the-art performance compared with previous methods. © 2024, CC BY.

Keywords: Sampling

Multi-Objective Optimization of High-Power Fiber Laser Cutting of Thick Mild Steel by Using Response Surface Methodology

SSRN, 2023

Authors: Liu, Yanjie; Yoshigoe, Kenji; Ullah, Farhan; Zhang, Shijin; Zhao, Yue. Affiliations: Shanghai Key Lab of Intelligent Manufacturing and Robotics School of Mechatronic Engineering and Automation Shanghai University Shanghai 200444 China School of Software Northwestern Polytechnical University Xi’an 710129 China Department of Electrical Engineering and Computer Science Embry-Riddle Aeronautical University Daytona Beach FL 32114 United States

High-power continuous-wave fiber laser cutting is a next-generation cutting technology for rapid prototyping and small-scale fabrication. To utilize the advantage offered by this technology in terms of high-quality cutting, the laser cutting process parameters must be optimized. In this study, a separation speed experiment on thick mild steel plates was performed using a 12-kW continuous-wave multimode ytterbium-doped fiber laser, and a cut-through criterion for high-power fiber laser cutting of thick carbon steel plates was proposed. The response surface method was used to model and optimize the cut quality characteristics. Material thickness, laser power, cutting speed, gas pressure, and defocus amount were selected as input factors, whereas kerf taper and material removal rate (MRR) were selected as output quality characteristics. The effects of the various factors on the kerf taper and MRR were investigated using analysis of variance, and second-order regression models were developed for the kerf taper and MRR. Furthermore, a multi-objective optimization method was used to optimize the kerf taper and MRR. The percentage errors between the validation experiments and the multi-objective optimization results were 5.27% and 1.93% for the kerf taper and MRR, respectively. Moreover, the analysis revealed that the material removal mechanism and the kerf taper formation mechanism are related to the melt flow. The findings presented in this paper contribute to the improvement of quality and accuracy when cutting thick plates by using a high-power fiber laser. © 2023, The Authors. All rights reserved.

Keywords: Multiobjective optimization

A bottom-up paradigm for traffic scene graph representation

9th International Conference on Computing and Pattern Recognition, ICCPR 2020

Authors: Zhang, Zhixuan; Zhang, Chi; Liu, Yuehu; Su, Yuanqi; Li, Ping; Zheng, Jinzi. Affiliations: School of Software Engineering Xi'an Jiaotong University Xi'an China Institute of Artificial Intelligence and Robotics Xi'an Jiaotong University Xi'an China Shaanxi Key Lab of Digital Technology and Intelligent Systems Xi'an China School of Computer Science and Technology Xi'an Jiaotong University Xi'an China China Academy of Railway Sciences Corporation Limited Beijing China

ISBN: 9781450387835 (print)

With increasing hardware computing power and model capacity, visual tasks for cognitive scene understanding, such as visual relationship inference, have attracted more attention. The scene graph representation, formed by coupling object, attribute and relationship nodes drawn from different modalities of information, including the original image, foreground things, background stuff and scene attributes, strongly promotes the progress of the research area. In this paper, we address the scene graph representation of traffic scenarios for autonomous driving. It should be noted that the universal representation does not meet the specific needs of cognitive understanding of traffic scenes: on the one hand, there is a lack of fine-grained description of key objects and attributes; on the other hand, there are redundant descriptions of objects and relationships. To tackle these problems, we take advantage of the fine-grained instance-level annotation of the traffic scene, proposing a bottom-up representation paradigm. It makes full use of the hierarchical structure of the traffic scene and the sparsity of element classes. In addition, on the basis of the existing methods, we optimize the relationship list of traffic scene graph representation. Moreover, we improve the scene graph annotation methods, proposing a "ground-vision joint location method" to better describe the spatially-distributed visual knowledge. The case analysis showed that, compared with existing methods, our scene graph paradigm can represent more abundant traffic scene information. © 2020 ACM.

Keywords: Knowledge representation

A Bottom-up Paradigm for Traffic Scene Graph Representation

Proceedings of the 2020 9th International Conference on Computing and Pattern Recognition

Authors: Zhixuan Zhang; Chi Zhang; Yuehu Liu; Yuanqi Su; Ping Li; Jinzi Zheng. Affiliations: School of Software Engineering Xi'an Jiaotong University Xi'an China Institute of Artificial Intelligence and Robotics Xi'an Jiaotong University Xi'an China Institute of Artificial Intelligence and Robotics Xi'an Jiaotong University Xi'an China and Shaanxi Key Lab of Digital Technology and Intelligent Systems Xi'an China School of Computer Science and Technology Xi'an Jiaotong University Xi'an China China Academy of Railway Sciences Corporation Limited Beijing China

ISBN: 9781450387835 (print)

With increasing hardware computing power and model capacity, visual tasks for cognitive scene understanding, such as visual relationship inference, have attracted more attention. The scene graph representation, formed by coupling object, attribute and relationship nodes drawn from different modalities of information, including the original image, foreground things, background stuff and scene attributes, strongly promotes the progress of the research area. In this paper, we address the scene graph representation of traffic scenarios for autonomous driving. It should be noted that the universal representation does not meet the specific needs of cognitive understanding of traffic scenes: on the one hand, there is a lack of fine-grained description of key objects and attributes; on the other hand, there are redundant descriptions of objects and relationships. To tackle these problems, we take advantage of the fine-grained instance-level annotation of the traffic scene, proposing a bottom-up representation paradigm. It makes full use of the hierarchical structure of the traffic scene and the sparsity of element classes. In addition, on the basis of the existing methods, we optimize the relationship list of traffic scene graph representation. Moreover, we improve the scene graph annotation methods, proposing a "ground-vision joint location method" to better describe the spatially-distributed visual knowledge. The case analysis showed that, compared with existing methods, our scene graph paradigm can represent more abundant traffic scene information.

Keywords: visual relationship description; traffic knowledge representation; autonomous driving; scene graph

REFUGE2 CHALLENGE: A TREASURE TROVE FOR MULTI-DIMENSION ANALYSIS AND EVALUATION IN GLAUCOMA SCREENING

arXiv, 2022

Authors: Fang, Huihui; Li, Fei; Wu, Junde; Fu, Huazhu; Sun, Xu; Son, Jaemin; Yu, Shuang; Zhang, Menglu; Yuan, Chenglang; Bian, Cheng; Lei, Baiying; Zhao, Benjian; Xu, Xinxing; Li, Shaohua; Fumero, Francisco; Sigut, José; Almubarak, Haidar; Bazi, Yakoub; Guo, Yuanhao; Zhou, Yating; Baid, Ujjwal; Innani, Shubham; Guo, Tianjiao; Yang, Jie; Orlando, José Ignacio; Bogunović, Hrvoje; Zhang, Xiulan; Xu, Yanwu. Affiliations: The REFUGE2 Challenge Australia State Key Laboratory of Ophthalmology Zhongshan Ophthalmic Center Sun Yat-Sen University Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science Guangzhou China Intelligent Healthcare Unit Baidu Inc. Beijing China The Institute of High Performance Computing Agency for Science Technology and Research Singapore Yatiris Group PLADEMA Institute CONICET UNICEN Tandil Argentina Christian Doppler Lab for Artificial Intelligence in Retina Department of Ophthalmology and Optometry Medical University of Vienna Vienna Austria VUNO Inc Seoul Korea Republic of Tencent HealthCare Tencent Shenzhen China Computer Vision Institute College of Computer Science and Software Engineering of Shenzhen University Shenzhen China School of Biomedical Engineering Health Science Center Shenzhen University China Xiaohe Healthcare ByteDance Guangdong Guangzhou 510000 China School of Biomedical Engineering Shenzhen University China College of Computer Science & Software Engineering Shenzhen University China Department of Computer Science and Systems Engineering Universidad de La Laguna Spain Saudi Electronic University Saudi Arabia King Saud University Saudi Arabia Institute of Automation Chinese Academy of Sciences Beijing China University of Chinese Academy of Sciences Beijing China SGGS Institute of Engineering and Technology India Institute of Medical Robotics Shanghai Jiao Tong University China Institute of Image Processing and Pattern Recognition Shanghai Jiao Tong University China

With the rapid development of artificial intelligence (AI) in medical image processing, deep learning in color fundus photography (CFP) analysis is also evolving. Although there are some open-source, labeled datasets of CFPs in the ophthalmology community, large-scale datasets for screening only have labels of disease categories, and datasets with annotations of fundus structures are usually small in size. In addition, labeling standards are not uniform across datasets, and there is no clear information on the acquisition device. Here we release a multi-annotation, multi-quality, and multi-device color fundus image dataset for glaucoma analysis, built on an original challenge: the Retinal Fundus Glaucoma Challenge 2nd Edition (REFUGE2). The REFUGE2 dataset contains 2000 color fundus images with annotations of glaucoma classification, optic disc/cup segmentation, as well as fovea localization. Meanwhile, the REFUGE2 challenge sets three sub-tasks of automatic glaucoma diagnosis and fundus structure analysis and provides an online evaluation framework. Based on the characteristics of multi-device and multi-quality data, some methods with strong generalization are provided in the challenge to make the predictions more robust. This shows that REFUGE2 brings attention to the characteristics of real-world multi-domain data, bridging the gap between scientific research and clinical application. © 2022, CC BY-NC-ND.

Keywords: Color

An Iterative Unsupervised Person Search Algorithm on Natural Scene Images

Chinese Automation Congress (CAC)

Authors: Sisi Cao; Yuehu Liu. Affiliations: School of Software Engineering Xi’an Jiaotong University Xi’an China Key Lab of Digital Technology and Intelligent System of Shaanxi Province Institute of Artificial Intelligence and Robotics (IAIR) Xi’an Jiaotong University Xi’an China

Person search is a challenging task because person detection and re-identification have different annotation requirements. In general, person search methods use supervised person re-identification methods, for which abundant identity labels on the bounding boxes are essential. However, most person images are unlabeled in real-world scenarios, and it is impractical to annotate abundant fine-grained labels for unlabeled images. The existing supervised methods are therefore not appropriate for real-world scenarios. In this paper, we propose an unsupervised learning method for person search, which contains two parts: unsupervised person detection and unsupervised person re-identification. The experimental results on two well-known datasets, CUHK-SYSU and PRW, indicate that the proposed method achieves competitive performance compared with state-of-the-art unsupervised methods. Note that the proposed method has greater practical significance even though its results are not as good as those of general supervised methods.

Keywords: Training; Clustering algorithms; Noise measurement; Task analysis; Search methods; Feature extraction; Unsupervised learning

Reports of the AAAI 2010 fall symposia

Authors: Azevedo, Roger; Biswas, Gautam; Bohus, Dan; Carmichael, Ted; Finlayson, Mark A.; Hadzikadic, Mirsad; Havasi, Catherine; Horvitz, Eric; Kanda, Takayuki; Koyejo, Oluwasanmi; Lawless, William F.; Lenat, Doug; Meneguzzi, Felipe; Mutlu, Bilge; Oh, Jean; Pirrone, Roberto; Raux, Antoine; Sofge, Donald A.; Sukthankar, Gita; Van Durme, Benjamin; Yorke-Smith, Neil. Affiliations: Department of Educational and Counseling Psychology McGill University Canada Center for Intelligent Systems Vanderbilt University United States Microsoft Research Redmond United States Department of Software and Information Systems University of North Carolina Charlotte United States Complex Systems Institute United States Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology United States Media Lab. Massachusetts Institute of Technology United States ATR Intelligent Robotics and Communication Laboratories Kyoto Japan Department of Electrical and Computer Engineering University of Texas Austin United States Department of Mathematics Paine College Augusta GA United States Cycorp Austin TX United States Robotics Institute Carnegie Mellon University United States Department of Computer Sciences University of Wisconsin Madison United States University of Palermo Italy Honda Research Institute United States Navy Center for Applied Research in Artificial Intelligence Naval Research Laboratory Washington DC United States Department of Electrical Engineering and Computer Science University of Central Florida United States Department of Computer Science Johns Hopkins University United States American University of Beirut Lebanon SRI International's Artificial Intelligence Center United States

The Association for the Advancement of Artificial Intelligence (AAAI) presented the 2010 Fall Symposium Series on November 11-13, 2010. The eight symposia were Cognitive and Metacognitive Educational Systems; Commonsense Knowledge; Complex Adaptive Systems: Resilience, Robustness, and Evolvability; Computational Models of Narrative; Dialog with Robots; Manifold Learning and Its Applications; Proactive Assistant Agents; and Quantum Informatics for Cognitive, Social, and Semantic Processes. Cognitive and Metacognitive Educational Systems aimed to provide a comprehensive definition of metacognitive educational systems that is inclusive of the theoretical, architectural, and educational aspects of this field. The AAAI Commonsense Knowledge Fall Symposium had the goal of bringing together the diverse elements of this community whose work benefits from or contributes to the representation of general knowledge about the world. One of the specific goals of the Proactive Assistant Agents symposium was to gather researchers from various assistant agent projects to share their wisdom in retrospect.

Keywords: Semantics

Building domain independent ontology for web 2.0

8th IEEE International Conference on Computer and Information Technology Workshops, CIT Workshops 2008

Authors: Enkhbold, Nyamsuren; Choi, Ho-Jin. Affiliations: Intelligent Software Engineering and Robotics Lab. Information and Communications University Daejeon Korea Republic of

ISBN: 9780769533391 (print)

In this paper, we propose an information integration approach based on a minimalist-upper ontology that can be applied among Web 2.0 applications. Instead of following conventional ontology engineering principles, we propose to use collective intelligence, one of the major driving forces of Web 2.0, to develop a minimalist-upper ontology that can be shared among more specific domain ontologies, thereby facilitating information exchange and integration. © 2008 IEEE. DOI 10.1109/CIT.2008.Workshops.103.

Keywords: Ontology
