Search Results - Inner Mongolia University Library

26th International Conference on Computer Supported Cooperative Work in Design, CSCWD 2023

Authors: Xiao, Boao; Wu, Siyuan; He, Xin; Dou, Wanchun. Affiliations: Nanjing University, State Key Laboratory for Novel Software Technology, China; Nanjing University of Posts and Telecommunications, School of Computer Science, China

ISBN: (print) 9798350331684

Image embedding, being a fundamental task in computer vision, plays a crucial role in various downstream tasks such as image retrieval. Widely adopted in e-commerce and social media collaboration, image retrieval benefits greatly from representations learned by the embedding model. However, conventional embedding models are often trained on a single domain, leading to inadequate performance in the multi-domain scenario. To address this challenge, we introduce a generalized image embedding model designed for multi-domain image retrieval. The proposed method employs a contrastively learned Vision Transformer and a carefully crafted training scheme to enhance domain generalization capability. Our theoretical analysis and experimental results, conducted on a large-scale, real-world multi-domain image retrieval dataset, demonstrate the superiority of the proposed method over existing embedding models in terms of both accuracy and domain generalization capability. © 2023 IEEE.

Keywords: Image retrieval

Cooperative Air-Ground Instant Delivery by UAVs and Crowdsourced Taxis

40th IEEE International Conference on Data Engineering, ICDE 2024

Authors: Gao, Junhui; Wang, Qianru; Zhang, Xin; Shi, Juan; Zhao, Xiang; Han, Qingye; Pan, Yan. Affiliations: School of Computer Science, Northwestern Polytechnical University, China; School of Computer Science and Technology, Xidian University, China; Air Force Engineering University, China; National University of Defense Technology, Laboratory for Big Data and Decision, China; School of Management Science and Real Estate, Chongqing University, China; National University of Defense Technology, National Key Laboratory of Information Systems Engineering, China; National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, China

ISBN: (print) 9798350317152

Instant delivery has become a fundamental service in people's daily lives. Different from the traditional express service, the instant delivery has a strict shipping time constraint after being ordered. However, the labor shortage makes it challenging to realize efficient instant delivery. To tackle the problem, researchers have studied to introduce vehicles (i.e., taxis) or Unmanned Aerial Vehicles (UAVs or drones) into instant delivery tasks. Unfortunately, the delivery detour of taxis and the limited battery of UAVs make it hard to meet the rapidly increasing instant delivery demands. Under this circumstance, this paper proposes an air-ground cooperative instant delivery paradigm to maximize the delivery performance and meanwhile minimize the negative effects on the taxi passengers. Specifically, a data-driven delivery potential-demands-aware cooperative strategy is designed to improve the overall delivery performance of both UAVs and taxis as well as the taxi passengers' experience. The experimental results show that the proposed method improves the delivery number by 30.1% and 114.5% compared to the taxi-based and UAV-based instant delivery respectively, and shortens the delivery time by 35.7% compared to the taxi-based instant delivery. © 2024 IEEE.

Keywords: Unmanned aerial vehicles (UAV)

LINEAR LOG-NORMAL ATTENTION WITH UNBIASED CONCENTRATION

arXiv, 2023

Authors: Nahshan, Yury; Kampeas, Joseph; Haleva, Emir. Affiliation: Distributed and Parallel Software Lab, Huawei Technologies, United States

Transformer models have achieved remarkable results in a wide range of applications. However, their scalability is hampered by the quadratic time and memory complexity of the self-attention mechanism concerning the sequence length. This limitation poses a substantial obstacle when dealing with long documents or high-resolution images. In this work, we study the self-attention mechanism by analyzing the distribution of the attention matrix and its concentration ability. Furthermore, we propose instruments to measure these quantities and introduce a novel self-attention mechanism, Linear Log-Normal Attention, designed to emulate the distribution and concentration behavior of the original self-attention. Our experimental results on popular natural language benchmarks reveal that our proposed Linear Log-Normal Attention outperforms other linearized attention alternatives, offering a promising avenue for enhancing the scalability of transformer models. © 2023, CC BY.

Keywords: Scalability

Experimental Analysis of Large-scale Learnable Vector Storage Compression

50th International Conference on Very Large Data Bases, VLDB 2024

Authors: Zhang, Hailin; Zhao, Penghao; Miao, Xupeng; Shao, Yingxia; Liu, Zirui; Yang, Tong; Cui, Bin. Affiliations: Peking University, School of Computer Science and Key Lab of High Confidence Software Technologies, China; Carnegie Mellon University, United States; Beijing University of Posts and Telecommunications, China; Institute of Computational Social Science, Peking University, Qingdao, China

Learnable embedding vector is one of the most important applications in machine learning, and is widely used in various database-related domains. However, the high dimensionality of sparse data in recommendation tasks and the huge volume of corpus in retrieval-related tasks lead to a large memory consumption of the embedding table, which poses a great challenge to the training and deployment of models. Recent research has proposed various methods to compress the embeddings at the cost of a slight decrease in model quality or the introduction of other overheads. Nevertheless, the relative performance of these methods remains unclear. Existing experimental comparisons only cover a subset of these methods and focus on limited metrics. In this paper, we perform a comprehensive comparative analysis and experimental evaluation of embedding compression. We introduce a new taxonomy that categorizes these techniques based on their characteristics and methodologies, and further develop a modular benchmarking framework that integrates 14 representative methods. Under a uniform test environment, our benchmark fairly evaluates each approach, presents their strengths and weaknesses under different memory budgets, and recommends the best method based on the use case. In addition to providing useful guidelines, our study also uncovers the limitations of current methods and suggests potential directions for future research. © 2023, VLDB Endowment. All rights reserved.

Keywords: Embeddings

LaserGuider: A Laser Based Physical Backdoor Attack against Deep Neural Networks

arXiv, 2024

Authors: Xu, Yongjie; Chen, Guangke; Song, Fu; Chen, Yuqi. Affiliations: ShanghaiTech University, China; Pengcheng Laboratory, China; Key Laboratory of System Software, Chinese Academy of Sciences, State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Science, China; Nanjing Institute of Software Technology, China

Backdoor attacks embed hidden associations between triggers and targets in deep neural networks (DNNs), causing them to predict the target when a trigger is present while maintaining normal behavior otherwise. Physical backdoor attacks, which use physical objects as triggers, are feasible but lack remote control, temporal stealthiness, flexibility, and mobility. To overcome these limitations, in this work, we propose a new type of backdoor triggers utilizing lasers that feature long-distance transmission and instant-imaging properties. Based on the laser-based backdoor triggers, we present a physical backdoor attack, called LaserGuider, which possesses remote control ability and achieves high temporal stealthiness, flexibility, and mobility. We also introduce a systematic approach to optimize laser parameters for improving attack effectiveness. Our evaluation on traffic sign recognition DNNs, critical in autonomous vehicles, demonstrates that LaserGuider with three different laser-based triggers achieves over 90% attack success rate with negligible impact on normal inputs. Additionally, we release LaserMark, the first dataset of real-world traffic signs stamped with physical laser spots, to support further research in backdoor attacks and defenses. Copyright © 2024, The Authors. All rights reserved.

Keywords: Deep neural networks

False Positive Detection for Text-Based Person Retrieval

21st Pacific Rim International Conference on Artificial Intelligence, PRICAI 2024

Authors: Cao, Yan; Lu, Jun. Affiliations: College of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; Jiaxiang Industrial Technology Research Institute of Heilongjiang University, Shandong, Jining 272400, China; Key Laboratory of Database and Parallel Computing of Heilongjiang Province, Heilongjiang University, Harbin 150080, China

ISBN: (print) 9789819601189

Text-based person retrieval (TBPR) is a challenging topic in cross-modal retrieval tasks, aiming to query corresponding person images based on textual descriptions. This task is complicated by noisy correspondences between images and text due to incorrect text annotations and low-quality images. Although many robust noise learning methods have been proposed with satisfactory performance, they tend to directly filter out a portion of the noise in the dataset, which decreases the ability to distinguish the noise as the number of training epochs increases. This paper proposes a noise-robust learning strategy based on false positive detection (FPD). Precisely, FPD consists of two main components: 1) The key-value library (KVL) module utilizes a set of clean samples, assigns confidence weights to each training sample during the training process, and adaptively adjusts the contribution of each sample to reduce the accumulation of noise. 2) To improve the performance of TBPR, this paper designs a local token-based selection method, which utilizes the contribution of each local token to the weights in the self-attention matrix and selects the local tokens with vital information as global representations. Extensive experiments are conducted on three datasets (CUHK-PEDES, ICFG-PEDES, and RSTPReID) to evaluate the performance and robustness of FPD under noisy conditions. The experimental results show that the FPD has better results at high noise ratios, and all the metrics outperform the state-of-the-art method under the RSTPReID dataset with a noise ratio of 80%, where Rank-1 outperforms the best method by 2.28%, 1.64% for mAP, and 1.92% for mINP. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Image annotation

GOAL: Generalized Jointly Sparse Linear Discriminant Regression for Feature Extraction

IEEE Transactions on Artificial Intelligence, 2024, vol. 5, no. 10, pp. 4959-4971

Authors: Lu, Haoquan; Lai, Zhihui; Zhang, Junhong; Yu, Zhuozhen; Wen, Jiajun. Affiliations: Shenzhen University, Computer Vision Institute, College of Computer Science and Software Engineering, Guangdong Provincial Key Laboratory of Intelligent Information Processing, Shenzhen 518060, China; Peking University, Shenzhen Graduate School, School of Electronic and Computer Engineering, Shenzhen 518055, China

Ridge regression (RR)-based methods aim to obtain a low-dimensional subspace for feature extraction. However, the subspace's dimensionality does not exceed the number of data categories, hence compromising its capability of feature representation. Moreover, these methods with L2-norm metric and regularization cannot extract highly robust features from data with corruption. To address these problems, in this article, we propose generalized jointly sparse linear discriminant regression (GOAL), a novel regression method based on joint L2,1-norm and capped-L2-norm, which can integrate sparsity, locality, and discriminability into one model to learn a full-rank robust feature extractor. The sparsely selected discriminative features are robust enough to characterize the decision boundary between classes. Locality is related to manifold structure and Laplacian smoothing, which can enhance the robustness of the model. By using the multinorm metric and regularization regression framework, the proposed method obtains the projection with joint sparsity and guarantees that the rank of the projection matrix will not be limited by the number of classes. An iterative algorithm is proposed to compute the optimal solution. Complexity analysis and proofs of convergence are also given in the article. Experiments on well-known datasets demonstrate our model's superiority and generalization ability. © 2020 IEEE.

Keywords: Data mining

Unveiling Context-Related Anomalies: Knowledge Graph Empowered Decoupling of Scene and Action for Human-Related Video Anomaly Detection

IEEE Transactions on Circuits and Systems for Video Technology, 2025

Authors: Chen, Chenglizhao; Liu, Xinyu; Song, Mengke; Li, Luming; Yuan, Shaojiang; Yu, Xu; Pang, Shanchen. Affiliations: Qingdao Institute of Software, College of Computer Science and Technology, China; Shandong Key Laboratory of Intelligent Oil & Gas Industrial Software, China; State Key Laboratory of Chemical Safety, China

Video anomaly detection methods are mainly classified into two categories based on their primary feature types: appearance-based and action-based. Appearance-based methods rely on low-level visual features like color, texture, and shape, learning patterns specific to training scenes. While effective in familiar settings, they struggle with unknown or altered scenes due to poor generalization and limited understanding of action-scene relationships. In contrast, action-based methods focus on detecting action anomalies but often overlook contextual scene associations, leading to misjudgments (e.g., running on a street being deemed normal without considering scene context). To overcome these limitations, we propose a novel decoupling-based anomaly detection architecture (DecoAD). Its core lies in the decoupling and interweaving of scenes and actions, enabling explicit modeling of their complex relationships. By reconstructing these interactions using knowledge graphs, DecoAD achieves a deeper understanding of behaviors and contexts. This design ensures strong performance in both known and unknown scenarios, significantly enhancing generalization. To evaluate its effectiveness in dynamic scenes and its ability to handle scene-related anomalies, we introduce UFSR, the first video anomaly detection dataset featuring dynamic scenes and scene-related anomalies. DecoAD supports fully-supervised, weakly-supervised, and unsupervised settings, improving AUC on UBnormal by 1.1%, 3.1%, and 2.1% in fully-supervised, weakly-supervised, and unsupervised settings, and on UFSR by 1.2% and 8.2% in weakly-supervised and unsupervised settings. The source code and datasets are available at: https://***/liuxy3366/DecoAD. © 1991-2012 IEEE.

Keywords: Video analysis

Edge computing service deployment and task offloading based on multi-task high-dimensional multi-objective optimization

arXiv, 2023

Authors: Guo, Yanheng; Zhang, Yan; Wu, Linjie; Li, Mengxia; Cai, Xingjuan; Chen, Jinjun. Affiliations: Shanxi Key Laboratory of Big Data Analysis and Parallel Computing, Taiyuan University of Science and Technology, Shanxi, Taiyuan 030024, China; State Key Laboratory for Novel Software Technology, Nanjing University, China; Department of Computer Science and Software Engineering, Swinburne University of Technology, Melbourne, Australia

The Mobile Edge Computing (MEC) system located close to the client allows mobile smart devices to offload their computations onto edge servers, enabling them to benefit from low-latency computing services. Both cloud service providers and users seek more comprehensive solutions, necessitating judicious decisions in service deployment and task offloading while balancing multiple objectives. This study investigates service deployment and task offloading challenges in a multi-user environment, framing them as a multi-task high-dimensional multi-objective optimization (MT-HD-MOO) problem within an edge environment. To ensure stable service provisioning, beyond considering latency, energy consumption, and cost as deployment objectives, network reliability is also incorporated. Furthermore, to promote equitable usage of edge servers, load balancing is introduced as a fourth task offloading objective, in addition to latency, energy consumption, and cost. Additionally, this paper designs a MT-HD-MOO algorithm based on a multi-selection strategy to address this model and its solution. By employing diverse selection strategies, an environment selection strategy pool is established to enhance population diversity within the high-dimensional objective space. Ultimately, the algorithm’s effectiveness is verified through simulation experiments. © 2023, CC BY-NC-ND.

Keywords: Multiobjective optimization

A novel overlapping minimization SMOTE algorithm for imbalanced classification

Frontiers of Information Technology & Electronic Engineering, 2024, vol. 25, no. 9, pp. 1266-1281

Authors: Yulin HE; Xuan LU; Philippe FOURNIER-VIGER; Joshua Zhexue HUANG. Affiliations: Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen 518107, China; College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China

The synthetic minority oversampling technique (SMOTE) is a popular algorithm to reduce the impact of class imbalance in building classifiers, and has received several enhancements over the past 20 years. SMOTE and its variants synthesize a number of minority-class sample points in the original sample space to alleviate the adverse effects of class imbalance. This approach works well in many cases, but problems arise when synthetic sample points are generated in overlapping areas between different classes, which further complicates classifier training. To address this issue, this paper proposes a novel generalization-oriented rather than imputation-oriented minority-class sample point generation algorithm, named overlapping minimization SMOTE (OM-SMOTE). This algorithm is designed specifically for binary imbalanced classification problems. OM-SMOTE first maps the original sample points into a new sample space by balancing sample encoding and classifier generalization. Then, OM-SMOTE employs a set of sophisticated minority-class sample point imputation rules to generate synthetic sample points that are as far as possible from overlapping areas between classes. Extensive experiments have been conducted on 32 imbalanced datasets to validate the effectiveness of OM-SMOTE. Results show that using OM-SMOTE to generate synthetic minority-class sample points leads to better classifier training performances for the naive Bayes, support vector machine, decision tree, and logistic regression classifiers than the 11 state-of-the-art SMOTE-based imputation algorithms. This demonstrates that OM-SMOTE is a viable approach for supporting the training of high-quality classifiers for imbalanced classification. The implementation of OM-SMOTE is shared publicly on the GitHub platform at https://***/luxuan123123/OM-SMOTE/.

Keywords: Imbalanced classification; Synthetic minority oversampling technique (SMOTE); Majority-class sample point; Minority-class sample point; Generalization capability; Overlapping minimization
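
The baseline SMOTE step mentioned in the abstract above (synthesizing minority-class points in the original sample space) can be illustrated with a minimal sketch. It shows classic SMOTE interpolation only, not OM-SMOTE, whose mapping and imputation rules are not described in this record. The function name `smote_sketch`, its parameters, and the sample points are hypothetical and chosen for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Classic SMOTE-style interpolation (illustrative, not OM-SMOTE).

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority-class neighbours.
    Assumes `minority` holds at least two points of equal dimension.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical two-dimensional minority-class sample points.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority_points, n_new=5)
```

Because every synthetic point is a convex combination of two minority points, it can fall inside a region where the classes overlap, which is the problem the abstract says OM-SMOTE is designed to reduce.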
