Search Results - Inner Mongolia University Library

arXiv, 2024

Authors: Zhu, Hongze; Xie, Guoyang; Hou, Chengbin; Dai, Tao; Gao, Can; Wang, Jinbao; Shen, Linlin. Affiliations: National Engineering Laboratory for Big Data System Computing Technology Shenzhen University Shenzhen China; Department of Computer Science City University of Hong Kong Hong Kong; Department of Intelligent Manufacturing CATL Ningde China; Fuzhou Fuyao Institute for Advanced Study Fuyao University of Science and Technology Fuzhou China; College of Computer Science and Software Engineering Shenzhen University Shenzhen China; Guangdong Provincial Key Laboratory of Intelligent Information Processing Shenzhen China; Shenzhen University Shenzhen China; Shenzhen Institute of Artificial Intelligence and Robotics for Society Shenzhen China

High-resolution point cloud (HRPCD) anomaly detection (AD) plays a critical role in precision machining and high-end equipment manufacturing. Although many 3D-AD methods have been proposed recently, they still cannot meet the requirements of the HRPCD-AD task. There are several challenges: i) It is difficult to directly capture HRPCD information due to the large number of points at the sample level; ii) The advanced transformer-based methods usually obtain anisotropic features, leading to degradation of the representation; iii) The proportion of abnormal areas is very small, which makes them difficult to characterize. To address these challenges, we propose a novel group-level feature-based network, called Group3AD, which has a significantly efficient representation ability. First, we design an Intercluster Uniformity Network (IUN) to present the mapping of different groups in the feature space as several clusters, and obtain a more uniform distribution between clusters representing different parts of the point clouds in the feature space. Then, an Intracluster Alignment Network (IAN) is designed to encourage groups within the cluster to be distributed tightly in the feature space. In addition, we propose an Adaptive Group-Center Selection (AGCS) based on geometric information to improve the pixel density of potential anomalous regions during inference. The experimental results verify the effectiveness of our proposed Group3AD, which surpasses Reg3D-AD by a margin of 5% in terms of object-level AUROC on Real3DAD. We provide the code and supplementary information on our website: https://***/M-3LAB/Group3AD. Copyright © 2024, The Authors. All rights reserved.
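For readers unfamiliar with the object-level AUROC figure quoted in the abstract, the metric can be read as the probability that a randomly chosen anomalous object receives a higher anomaly score than a randomly chosen normal one. The following is a minimal sketch of that definition only, assuming one scalar score and one binary label per object. The function name `auroc` and the sample values are illustrative and are not taken from the Group3AD code.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUROC: fraction of (anomalous, normal) pairs ranked correctly.

    labels use 1 for anomalous objects and 0 for normal objects.
    Tied scores count as half a correct pair.
    """
    anomalous = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not anomalous or not normal:
        raise ValueError("AUROC needs at least one sample of each class")

    correct = 0.0
    for a in anomalous:
        for n in normal:
            if a > n:
                correct += 1.0
            elif a == n:
                correct += 0.5
    return correct / (len(anomalous) * len(normal))


if __name__ == "__main__":
    # Hypothetical per-object anomaly scores and ground-truth labels.
    example_scores = [0.1, 0.4, 0.35, 0.8]
    example_labels = [0, 0, 1, 1]
    print(auroc(example_scores, example_labels))
```

The quadratic pairwise loop is chosen for clarity; a rank-based implementation gives the same value on large test sets at lower cost.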

Keywords: Anomaly detection


Depth-Aware Multi-Modal Fusion for Generalized Zero-Shot Learning


IEEE International Conference on Industrial Informatics (INDIN)

Authors: Weipeng Cao; Xuyang Yao; Zhiwu Xu; Yinghui Pan; Yixuan Sun; Dachuan Li; Bohua Qiu; Muheng Wei. Affiliations: Guangdong Laboratory of Artificial Intelligence and Digital Economy (Shenzhen) Shenzhen China; National Engineering Laboratory for Big Data System Computing Technology Shenzhen University Shenzhen China; College of Computer Science and Software Engineering Shenzhen University Shenzhen China; Stony Brook University New York United States; Research Institute of Trustworthy Autonomous Systems Southern University of Science and Technology Shenzhen China; Department of Computer Science and Engineering Southern University of Science and Technology Shenzhen China; ZhenDui Industry Artificial Intelligence Co. Ltd Shenzhen China; Department of Automation Shanghai Jiao Tong University Shanghai China; Key Laboratory of System Control and Information Processing Ministry of Education of China Shanghai China

ISBN (digital): 9798331527471

ISBN (print): 9798331527488

Realizing Generalized Zero-Shot Learning (GZSL) based on large models is emerging as a prevailing trend. However, most existing methods merely regard large models as black boxes, solely leveraging the features output by the final layer while disregarding potential performance enhancements from other layers. Indeed, numerous researchers have visually depicted variations in the features learned across different layers of neural networks. Motivated by this observation, we propose a Vision Transformer (ViT)-based GZSL method named Depth-Aware Multi-Modal ViT (DAM2ViT), which exploits multi-level features of ViT. DAM2ViT incorporates a multi-modal interaction block to align semantic information of categories across multiple layers, thereby augmenting the model's capacity to learn associations between visual and semantic spaces. Extensive experiments conducted on three benchmark datasets (i.e., CUB, SUN, AWA2) have showcased that DAM2ViT achieves competitive results compared to state-of-the-art methods.

Keywords: Visualization; Adaptation models; Semantics; Zero shot learning; Neural networks; Termination of employment; Transformers; Market research; Sun; Optimization


A transactional-behavior-based hierarchical gated network for credit card fraud detection


IEEE/CAA Journal of Automatica Sinica, 2025

Authors: Xie, Yu; Zhou, MengChu; Liu, Guanjun; Wei, Lifei; Zhu, Honghao; De Meo, Pasquale. Affiliations: College of Information Engineering Shanghai Maritime University Shanghai 201306 China; School of Information and Electronic Engineering Zhejiang Gongshang University Hangzhou 310018 China; Helen and John C. Hartmann Department of Electrical and Computer Engineering New Jersey Institute of Technology Newark NJ 07102 United States; Key Laboratory of Embedded System and Service Computing Ministry of Education Department of Computer Science Tongji University Shanghai 201804 China; College of Computer Science and Information Engineering Bengbu University Bengbu 233030 China; Department of Ancient and Modern Civilizations University of Messina Messina 98166 Italy

The task of detecting fraud in credit card transactions is crucial to ensure the security and stability of a financial system, as well as to enforce customer confidence in digital payment systems. Historically, credit card companies have used rule-based approaches to detect fraudulent transactions, but these have proven inadequate due to the complexity of fraud strategies and have been replaced by much more powerful solutions based on machine learning or deep learning algorithms. Despite significant progress, the current approaches to fraud detection suffer from a number of limitations: for example, it is unclear whether some transaction features are more effective than others in discriminating fraudulent transactions, and they often neglect possible correlations among transactions, even though they could reveal illicit behaviour. In this paper, we propose a novel credit card fraud detection (CCFD) method based on a transactional-behaviour-based hierarchical gated network. First, we introduce a feature-oriented extraction module capable of identifying key features from original transactions; such analysis is effective in revealing the behavioural characteristics of fraudsters. Second, we design a transaction-oriented extraction module capable of capturing the correlation between users' historical and current transactional behaviour. Such information is crucial for revealing users' sequential behaviour patterns. Our approach, called the transactional-behaviour-based hierarchical gated network model (TbHGN), extracts two types of new transactional features, which are then combined in a feature interaction module to learn the final transactional representations used for CCFD. Extensive experiments on a real-world credit card transaction dataset show an increase in average F1 of between 1.42% and 6.53% and an improvement in average AUC of between 0.63% and 2.78% over the state of the art. © 2025 Institute of Electrical and Electronics Engineers Inc. All right

Keywords: Deep learning


DL-SLOT: Dynamic LiDAR SLAM and object tracking based on collaborative graph optimization


arXiv, 2022

Authors: Tian, Xuebo; Zhu, Zhongyang; Zhao, Junqiao; Tian, Gengxuan; Ye, Chen. Affiliations: Department of Computer Science and Technology School of Electronics and Information Engineering Tongji University Shanghai China; The Key Laboratory of Embedded System and Service Computing Ministry of Education Tongji University Shanghai China; Institute of Intelligent Vehicles Tongji University Shanghai China

Ego-pose estimation and dynamic object tracking are two critical problems for autonomous driving systems. The solutions to these problems are generally based on their respective assumptions, i.e., the static world assumption for simultaneous localization and mapping (SLAM) and the accurate ego-pose assumption for object tracking. However, these assumptions are challenging to hold in dynamic road scenarios, where SLAM and object tracking become closely correlated. Therefore, we propose DL-SLOT, a dynamic LiDAR SLAM and object tracking method, to simultaneously address these two coupled problems. This method integrates the state estimations of both the autonomous vehicle and the stationary and dynamic objects in the environment into a unified optimization framework. First, we used object detection to identify all points belonging to potentially dynamic objects. Subsequently, a LiDAR odometry was conducted using the filtered point cloud. Simultaneously, we proposed a sliding window-based object association method that accurately associates objects according to the historical trajectories of tracked objects. The ego-states and those of the stationary and dynamic objects are integrated into the sliding window-based collaborative graph optimization. The stationary objects are subsequently restored from the potentially dynamic object set. Finally, a global pose-graph is implemented to eliminate the accumulated error. Experiments on KITTI datasets demonstrate that our method achieves better accuracy than SLAM and object tracking baseline methods. This confirms that solving SLAM and object tracking simultaneously is mutually advantageous, dramatically improving the robustness and accuracy of SLAM and object tracking in dynamic road scenarios. Copyright © 2022, The Authors. All rights reserved.

Keywords: Optical radar


PUGAN: Physical Model-Guided Underwater Image Enhancement Using GAN with Dual-Discriminators


arXiv, 2023

Authors: Cong, Runmin; Yang, Wenyu; Zhang, Wei; Li, Chongyi; Guo, Chun-Le; Huang, Qingming; Kwong, Sam. Affiliations: Institute of Information Science Beijing Jiaotong University Beijing 100044 China; School of Control Science and Engineering Shandong University Jinan 250061 China; Key Laboratory of Machine Intelligence and System Control Ministry of Education Jinan 250061 China; Beijing Key Laboratory of Advanced Information Science and Network Technology Beijing 100044 China; College of Computer Science Nankai University Tianjin 300350 China; School of Computer Science and Technology University of Chinese Academy of Sciences Beijing 101408 China; Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China; Peng Cheng Laboratory Shenzhen 518055 China; Department of Computer Science City University of Hong Kong Hong Kong; City University of Hong Kong Shenzhen Research Institute Shenzhen 51800 China

Due to the light absorption and scattering induced by the water medium, underwater images usually suffer from some degradation problems, such as low contrast, color distortion, and blurring details, which aggravate the difficulty of downstream underwater understanding tasks. Therefore, how to obtain clear and visually pleasant images has become a common concern of people, and the task of underwater image enhancement (UIE) has also emerged as the times require. Among existing UIE methods, Generative Adversarial Networks (GANs) based methods perform well in visual aesthetics, while the physical model-based methods have better scene adaptability. Inheriting the advantages of the above two types of models, we propose a physical model-guided GAN model for UIE in this paper, referred to as PUGAN. The entire network is under the GAN architecture. On the one hand, we design a Parameters Estimation subnetwork (Par-subnet) to learn the parameters for physical model inversion, and use the generated color enhancement image as auxiliary information for the Two-Stream Interaction Enhancement subnetwork (TSIE-subnet). Meanwhile, we design a Degradation Quantization (DQ) module in TSIE-subnet to quantize scene degradation, thereby achieving reinforcing enhancement of key regions. On the other hand, we design the Dual-Discriminators for the style-content adversarial constraint, promoting the authenticity and visual aesthetics of the results. Extensive experiments on three benchmark datasets demonstrate that our PUGAN outperforms state-of-the-art methods in both qualitative and quantitative metrics. The code and results can be found from the link of https://***/proj_***. © 2023, CC BY-NC-SA.

Keywords: Discriminators


DL-SLOT: Dynamic Lidar SLAM and Object Tracking Based On Graph Optimization


arXiv, 2022

Authors: Tian, Xuebo; Zhao, Junqiao; Ye, Chen. Affiliations: Department of Computer Science and Technology School of Electronics and Information Engineering Tongji University Shanghai China; The Key Laboratory of Embedded System and Service Computing Ministry of Education Tongji University Shanghai China; Institute of Intelligent Vehicles Tongji University Shanghai China

Ego-pose estimation and dynamic object tracking are two key issues in an autonomous driving system. Two assumptions are often made for them, i.e. the static world assumption of simultaneous localization and mapping (SLAM) and the exact ego-pose assumption of object tracking, respectively. However, these assumptions are difficult to hold in highly dynamic road scenarios where SLAM and object tracking become correlated and mutually beneficial. In this paper, DL-SLOT, a dynamic Lidar SLAM and object tracking method is proposed. This method integrates the state estimations of both the ego vehicle and the static and dynamic objects in the environment into a unified optimization framework, to realize SLAM and object tracking (SLOT) simultaneously. Firstly, we implement object detection to remove all the points that belong to potential dynamic objects. Then, LiDAR odometry is conducted using the filtered point cloud. At the same time, detected objects are associated with the history object trajectories based on the time-series information in a sliding window. The states of the static and dynamic objects and ego vehicle in the sliding window are integrated into a unified local optimization framework. We perform SLAM and object tracking simultaneously in this framework, which significantly improves the robustness and accuracy of SLAM in highly dynamic road scenarios and the accuracy of objects' states estimation. Experiments on public datasets have shown that our method achieves better accuracy than A-LOAM. Copyright © 2022, The Authors. All rights reserved.
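The first step described in this abstract, removing the points that belong to potentially dynamic objects before LiDAR odometry, can be illustrated with a small sketch. It assumes the detector returns axis-aligned 3D boxes given as (min corner, max corner); real detectors typically output oriented boxes, and the name `remove_points_in_boxes` and all sample values are hypothetical rather than taken from DL-SLOT.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Box = Tuple[Point, Point]  # (min corner, max corner), axis-aligned (assumption)


def point_in_box(point: Point, box: Box) -> bool:
    """Return True if the point lies inside or on the boundary of the box."""
    lower, upper = box
    return all(lo <= value <= hi for value, lo, hi in zip(point, lower, upper))


def remove_points_in_boxes(points: Sequence[Point], boxes: Sequence[Box]) -> List[Point]:
    """Keep only the points that fall outside every detected object box."""
    return [p for p in points if not any(point_in_box(p, b) for b in boxes)]


if __name__ == "__main__":
    # Hypothetical scan: two background points and one point on a detected car.
    scan = [(0.0, 0.0, 0.0), (5.0, 5.0, 0.0), (10.0, 0.0, 1.0)]
    detections = [((4.0, 4.0, -1.0), (6.0, 6.0, 1.0))]
    filtered = remove_points_in_boxes(scan, detections)
    print(filtered)  # the filtered cloud would then be passed to the odometry step
```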

Keywords: Object detection


Size-Invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection


arXiv, 2024

Authors: Li, Feiran; Xu, Qianqian; Bao, Shilong; Yang, Zhiyong; Cong, Runmin; Cao, Xiaochun; Huang, Qingming. Affiliations: Institute of Information Engineering Chinese Academy of Sciences Beijing China; School of Cyber Security University of Chinese Academy of Sciences Beijing China; Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Academy of Sciences Beijing China; School of Computer Science and Technology University of Chinese Academy of Sciences Beijing China; Institute of Information Science Beijing Jiaotong University Beijing China; School of Control Science and Engineering Shandong University Jinan China; Key Laboratory of Machine Intelligence and System Control Ministry of Education Jinan China; School of Cyber Science and Tech. Sun Yat-Sen University Shenzhen Campus China; Key Laboratory of Big Data Mining and Knowledge Management Chinese Academy of Sciences Beijing China

This paper explores the size-invariance of evaluation metrics in Salient Object Detection (SOD), especially when multiple targets of diverse sizes co-exist in the same image. We observe that current metrics are size-sensitive, where larger objects are focused, and smaller ones tend to be ignored. We argue that the evaluation should be size-invariant because bias based on size is unjustified without additional semantic information. In pursuit of this, we propose a generic approach that evaluates each salient object separately and then combines the results, effectively alleviating the imbalance. We further develop an optimization framework tailored to this goal, achieving considerable improvements in detecting objects of different sizes. Theoretically, we provide evidence supporting the validity of our new metrics and present the generalization analysis of SOD. Extensive experiments demonstrate the effectiveness of our method. The code is available at https://***/Ferry-Li/SI-SOD. Copyright © 2024, The Authors. All rights reserved.
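The size sensitivity this abstract describes, and the idea of scoring each salient object separately before combining, can be seen in a toy example. The sketch below is a simplification under stated assumptions: it uses mean absolute error, takes one binary mask per object, and ignores background pixels in the per-object score. The function names and data are hypothetical and do not reproduce the metrics defined in the paper or in the SI-SOD code.

```python
from typing import List, Sequence

Grid = List[List[float]]  # 2D map: saliency in [0, 1] or a binary mask


def plain_mae(pred: Grid, gt: Grid) -> float:
    """Mean absolute error over every pixel, so large objects dominate."""
    total, count = 0.0, 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            total += abs(p - g)
            count += 1
    return total / count


def per_object_mae(pred: Grid, object_masks: Sequence[Grid]) -> float:
    """Average of per-object errors, so each object weighs the same."""
    errors = []
    for mask in object_masks:
        total, count = 0.0, 0
        for pred_row, mask_row in zip(pred, mask):
            for p, m in zip(pred_row, mask_row):
                if m:  # pixel belongs to this object, whose target value is 1
                    total += abs(p - 1.0)
                    count += 1
        if count:
            errors.append(total / count)
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # One-row image: an 8-pixel object found perfectly, a 2-pixel object missed.
    pred = [[1.0] * 8 + [0.0] * 4]
    gt = [[1.0] * 8 + [0.0] * 2 + [1.0] * 2]
    large = [[1.0] * 8 + [0.0] * 4]
    small = [[0.0] * 10 + [1.0] * 2]
    print(plain_mae(pred, gt))             # 2 wrong pixels out of 12
    print(per_object_mae(pred, [large, small]))  # one of two objects fully missed
```

In this toy case the pixel-wise error stays low even though one of the two objects is missed entirely, while the per-object average reports that half of the objects were lost.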

Keywords: Object detection


Generalized Visual Quality Assessment of GAN-Generated Face Images


arXiv, 2022

Authors: Tian, Yu; Ni, Zhangkai; Chen, Baoliang; Wang, Shiqi; Wang, Hanli; Kwong, Sam. Affiliations: The Department of Computer Science City University of Hong Kong 999077 Hong Kong; The Department of Computer Science & Technology Tongji University Shanghai 200092 China; The Department of Computer Science & Technology Key Laboratory of Embedded System and Service Computing Ministry of Education Shanghai Institute of Intelligent Science and Technology Tongji University Shanghai 200092 China; The City University of Hong Kong Shenzhen Research Institute Shenzhen 518057 China

Recent years have witnessed the dramatically increased interest in face generation with generative adversarial networks (GANs). A number of successful GAN algorithms have been developed to produce vivid face images towards different application scenarios. However, little work has been dedicated to automatic quality assessment of such GAN-generated face images (GFIs), even less have been devoted to generalized and robust quality assessment of GFIs generated with unseen GAN model. Herein, we make the first attempt to study the subjective and objective quality towards generalized quality assessment of GFIs. More specifically, we establish a large-scale database consisting of GFIs from four GAN algorithms, the pseudo labels from image quality assessment (IQA) measures, as well as the human opinion scores via subjective testing. Subsequently, we develop a quality assessment model that is able to deliver accurate quality predictions for GFIs from both available and unseen GAN algorithms based on meta-learning. In particular, to learn shared knowledge from GFIs pairs that are born of limited GAN algorithms, we develop the convolutional block attention (CBA) and facial attributes-based analysis (ABA) modules, ensuring that the learned knowledge tends to be consistent with human visual perception. Extensive experiments exhibit that the proposed model achieves better performance compared with the state-of-the-art IQA models, and is capable of retaining the effectiveness when evaluating GFIs from the unseen GAN algorithms. Copyright © 2022, The Authors. All rights reserved.

Keywords: Generative adversarial networks


Generalized Face Forgery Detection via Adaptive Learning for Pre-trained Vision Transformer


arXiv, 2023

Authors: Luo, Anwei; Cai, Rizhao; Kong, Chenqi; Ju, Yakun; Kang, Xiangui; Huang, Jiwu; Kot, Alex C. Affiliations: The School of Information Technology Jiangxi University of Finance and Economics Nanchang 330013 China; The School of Computer Science and Engineering Sun Yat-Sen University Guangzhou 510006 China; Lab. School of Electrical and Electronic Engineering Nanyang Technology University Singapore; The Guangdong Key Laboratory of Intelligent Information Processing National Engineering Laboratory for Big Data System Computing Technology Shenzhen University Shenzhen 518060 China; The China-Singapore International Joint Research Institute Singapore

With the rapid progress of generative models, the current challenge in face forgery detection is how to effectively detect realistic manipulated faces from different unseen domains. Though previous studies show that pre-trained Vision Transformer (ViT) based models can achieve some promising results after full fine-tuning on the Deepfake dataset, their generalization performance is still unsatisfactory. One possible reason is that fully fine-tuned ViT-based models may disrupt the pre-trained features [1], [2] and overfit to some data-specific patterns [3]. To alleviate this issue, we present a Forgery-aware Adaptive Vision Transformer (FA-ViT) under the adaptive learning paradigm, where the parameters in the pre-trained ViT are kept fixed while the designed adaptive modules are optimized to capture forgery features. Specifically, a global adaptive module is designed to model long-range interactions among input tokens, which takes advantage of the self-attention mechanism to mine global forgery clues. To further explore essential local forgery clues, a local adaptive module is proposed to expose local inconsistencies by enhancing the local contextual association. In addition, we introduce a fine-grained adaptive learning module that emphasizes the common compact representation of genuine faces through relationship learning in fine-grained pairs, driving these proposed adaptive modules to be aware of fine-grained forgery-aware information. Extensive experiments demonstrate that our FA-ViT achieves state-of-the-art results in the cross-dataset evaluation and enhances the robustness against unseen perturbations. In particular, FA-ViT achieves 93.83% and 78.32% AUC scores on the Celeb-DF and DFDC datasets in the cross-dataset evaluation. The code and trained model have been released at: https://***/LoveSiameseCat/FAViT. Copyright © 2023, The Authors. All rights reserved.

Keywords: Contrastive Learning


Ultra-Fast Mini License Plate Recognition System Based-on Vision Processing Unit


Proceedings of the 2020 2nd International Conference on Big-data Service and Intelligent Computation

Authors: Junhui Wang; Shuangyin Ren; Jiezhong He; Xiaolan Ji; Diqing Huang. Affiliations: College of Computer National University of Defense Technology and State Key Laboratory of Mathematical Engineering and Advanced Computing China; National Key Laboratory of Science and Technology on Information System Security Institute of Systems Engineering Academy of Military Science; College of Computer National University of Defense Technology

ISBN (print): 9781450388399

As more embedded environments require license plate recognition systems, recognizing car plates at high speed and accuracy with low energy consumption has become an important and challenging problem. In this paper, we propose an ultra-Fast miNi (FaNi) license plate recognition (LPR) system. The FaNi system is divided into a training sub-system and an inference sub-system. The former is used to obtain offline features; the latter is then deployed online to recognize license numbers at nearly real-time speed. The inference system comprises the vision processing unit (VPU) and the display unit, both of which are implemented in hardware logic. Experiments show that the FaNi system achieves high accuracy and high speed at low resource cost.

Keywords: VPU; Plate Recognition system; FPGA

