Search results - Inner Mongolia University Library

DCT-phase statistics for forged IMEI numbers and air ticket detection

Expert Systems with Applications, 2021, vol. 164

Authors: Nandanwar, Lokesh; Shivakumara, Palaiahnakote; Kanchan, Swati; Basavaraja, V.; Guru, D.S.; Pal, Umapada; Lu, Tong; Blumenstein, Michael. Affiliations: Faculty of Computer Science and Information Technology, University of Malaya, Malaysia; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India; Department of Studies in Computer Science, University of Mysore, Karnataka, India; National Key Lab for Novel Software Technology, Nanjing University, China; Sydney, Australia

New tools have been developed to make editing images and documents more flexible and user-friendly, but, unfortunately, they are also being used to manipulate and tamper with information. Examples of such crimes include creating forged International Mobile Equipment Identity (IMEI) numbers, which are embedded on mobile packages and inside smart mobile cases, for illicit activities. Another example is altering the name or date on air tickets to breach security at the airport. This paper presents a new expert system for detecting forged IMEI numbers as well as altered air ticket images. The proposed method derives the phase spectrum using the Discrete Cosine Transform (DCT) to highlight the suspicious regions; it is unlike the phase spectrum from a Fourier transform, which is ineffective due to power spectrum noise. From the phase spectrum, our method extracts phase statistics to study the effect of distortions introduced by forgery operations. This results in feature vectors, which are fed to a Support Vector Machine (SVM) classifier for detection of forged IMEI numbers and air ticket images. Experimental results on our dataset of forged IMEI numbers (which is created by us for this work), on altered air tickets, on benchmark datasets of video caption text (which is tampered text), and on altered receipts of the ICPR 2018 FDC dataset show that the proposed method is robust across different datasets. Furthermore, comparative studies of the proposed method with the existing methods on the same datasets show that the proposed method outperforms the existing methods. The dataset created will be freely available on request to the authors. © 2020 Elsevier Ltd

Keywords: Discrete cosine transforms


Cross Domain Object Detection by Target-Perceived Dual Branch Distillation

arXiv, 2022

Authors: He, Mengzhe; Wang, Yali; Wu, Jiaxi; Wang, Yiru; Li, Hanqing; Li, Bo; Gan, Weihao; Wu, Wei; Qiao, Yu. Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, China; SenseTime Research; University of Chinese Academy of Science, China; Shanghai AI Laboratory, Shanghai, China; Beihang University, China; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, China

Cross domain object detection is a realistic and challenging task in the wild. It suffers from performance degradation due to large shifts in data distributions and the lack of instance-level annotations in the target domain. Existing approaches mainly focus on either of these two difficulties, even though they are closely coupled in cross domain object detection. To solve this problem, we propose a novel Target-perceived Dual-branch Distillation (TDD) framework. By integrating detection branches of both source and target domains in a unified teacher-student learning scheme, it can reduce domain shift and generate reliable supervision effectively. In particular, we first introduce a distinct Target Proposal Perceiver between two domains. It can adaptively enhance the source detector to perceive objects in a target image, by leveraging target proposal contexts from iterative cross-attention. Afterwards, we design a concise Dual Branch Self Distillation strategy for model training, which can progressively integrate complementary object knowledge from different domains via self-distillation in two branches. Finally, we conduct extensive experiments on a number of widely-used scenarios in cross domain object detection. The results show that our TDD significantly outperforms the state-of-the-art methods on all the benchmarks. Our code and model will be available here. Copyright © 2022, The Authors. All rights reserved.

Keywords: Distillation


RBF-Softmax: Learning Deep Representative Prototypes with Radial Basis Function Softmax

16th European Conference on Computer Vision, ECCV 2020

Authors: Zhang, Xiao; Zhao, Rui; Qiao, Yu; Li, Hongsheng. Affiliations: CUHK-SenseTime Joint Lab, The Chinese University of Hong Kong, Hong Kong; SenseTime Research, Hong Kong; ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China

ISBN (digital): 9783030585747

ISBN (print): 9783030585730

Deep neural networks have achieved remarkable successes in learning feature representations for visual classification. However, deep features learned by the softmax cross-entropy loss generally show excessive intra-class variations. We argue that, because the traditional softmax losses aim to optimize only the relative differences between intra-class and inter-class distances (logits), they cannot obtain representative class prototypes (class weights/centers) to regularize intra-class distances, even when the training has converged. Previous efforts mitigate this problem by introducing auxiliary regularization losses. But these modified losses mainly focus on optimizing intra-class compactness, while neglecting to keep reasonable relations between different class prototypes. This leads to weak models and eventually limits their performance. To address this problem, this paper introduces novel Radial Basis Function (RBF) distances to replace the commonly used inner products in the softmax loss function, such that it can adaptively assign losses to regularize the intra-class and inter-class distances by reshaping the relative differences, thus creating more representative class prototypes to improve optimization. The proposed RBF-Softmax loss function not only effectively reduces intra-class distances, stabilizes the training behavior, and preserves ideal relations between prototypes, but also significantly improves the testing performance. Experiments on visual recognition benchmarks including MNIST, CIFAR-10/100, and ImageNet demonstrate that the proposed RBF-Softmax achieves better results than cross-entropy and other state-of-the-art classification losses. The code is at https://***/2han9x1a0release/RBF-Softmax. © 2020, Springer Nature Switzerland AG.
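The idea of replacing inner-product logits with RBF distances can be illustrated with a minimal sketch. The abstract does not give the formula, so everything specific here is an assumption: the squared Euclidean distance, the kernel exp(-d / gamma), the `scale` multiplier, and the function names are hypothetical choices for illustration, not the authors' released implementation.

```python
import math

def rbf_softmax_probs(feature, prototypes, gamma=1.0, scale=10.0):
    """Class probabilities from RBF-kernel logits (assumed form, see lead-in).

    Each logit is scale * exp(-||feature - prototype||^2 / gamma), so it is
    bounded in (0, scale] instead of growing without limit like an inner product.
    """
    logits = []
    for proto in prototypes:
        sq_dist = sum((f - p) ** 2 for f, p in zip(feature, proto))
        logits.append(scale * math.exp(-sq_dist / gamma))
    # Numerically stable softmax over the bounded logits.
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rbf_softmax_loss(feature, prototypes, label, gamma=1.0, scale=10.0):
    """Cross-entropy of the RBF-based probabilities for the true class."""
    probs = rbf_softmax_probs(feature, prototypes, gamma, scale)
    return -math.log(probs[label])

# A feature sitting exactly on prototype 0 and far from prototype 1.
prototypes = [[0.0, 0.0], [3.0, 0.0]]
probs = rbf_softmax_probs([0.0, 0.0], prototypes)
loss_correct = rbf_softmax_loss([0.0, 0.0], prototypes, label=0)
loss_wrong = rbf_softmax_loss([0.0, 0.0], prototypes, label=1)
```

Because the logits are bounded, the loss for a feature that already sits on its prototype cannot be reduced further by inflating feature norms, which is one plausible reading of how such a loss keeps prototypes representative.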

Keywords: Deep neural networks


Detach and Enhance: Learning Disentangled Cross-modal Latent Representation for Efficient Face-Voice Association and Matching


IEEE International Conference on Data Mining (ICDM)

Authors: Zhenning Yu; Xin Liu; Yiu-Ming Cheung; Minghang Zhu; Xing Xu; Nannan Wang; Taihao Li. Affiliations: Dept. of Comput. Sci. & Fujian Key Lab. of Big Data Intelligence and Security, Huaqiao University, Xiamen, China; Zhejiang Lab, Hangzhou, China; Dept. of Comput. Sci. and Institute of Research and Continuing Education, HK Baptist University, Hong Kong SAR, China; Xiamen Key Lab. of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen, China; Dept. of Computer Sci. and Eng., University of Electronic Science and Technology of China, Chengdu, China; State Key Lab. of Integrated Services Networks & School of Telecommun. Eng., Xidian University, Xi’an, China

Many studies in cognitive science have shown that humans often perform face-voice association for various perception tasks, and some recent data mining works have been designed to emulate this ability. Nevertheless, most methods suffer from degraded performance when semantically irrelevant interference factors exist across different modalities. To alleviate this concern, this paper presents an efficient Disentangled Cross-modal Latent Representation (DCLR) method to adaptively detach the discriminative feature attributes and enhance the face-voice association. Specifically, the proposed DCLR framework consists of a two-stage cross-modal disentangling process. First, the former stage employs supervised contrastive learning to pull the representations of face-voice data from the same person closer while pushing the representations of different persons apart. Then, the latter stage freezes all the parameters of the former stage and introduces a multi-layer orthogonal decoupling scheme to learn the disentangled latent representations, while filtering out the modality-dependent irrelevant factors. In addition, a cross-modal reconstruction loss is used to narrow the semantic gap between heterogeneous feature expressions. Through the joint exploitation of the above, the proposed framework can associate face-voice data well to benefit various kinds of cross-modal perception tasks. Extensive experiments verify the superiority of the proposed face-voice association framework and show its competitive performance.
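The first stage described above relies on supervised contrastive learning, which draws same-person face and voice embeddings together and separates those of different people. A minimal sketch of such a loss follows. It uses the general supervised contrastive formulation with cosine similarity and a temperature; the abstract does not specify the exact loss used in DCLR, so the function names, the temperature value, and the toy embeddings are all assumptions for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Average supervised contrastive loss over anchors that have a positive.

    For each anchor, every other sample with the same label (same person,
    face or voice) is a positive; all remaining samples act as contrast terms.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        sims = {j: cosine(embeddings[i], embeddings[j]) / temperature
                for j in range(n) if j != i}
        # Stable log-sum-exp over all non-anchor samples.
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy data: two face/voice pairs whose embeddings coincide within each pair.
embeddings = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
matched = supervised_contrastive_loss(embeddings, [0, 0, 1, 1])
mismatched = supervised_contrastive_loss(embeddings, [0, 1, 0, 1])
```

With identity labels that agree with the embedding geometry the loss is close to zero, and with labels that contradict it the loss is large, which is the gradient signal that moves same-person representations together.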

Keywords: Representation learning; Filtering; Semantics; Interference; Data models; Cognitive science; Data mining


Robust partial Fourier reconstruction for diffusion-weighted imaging using a recurrent convolutional neural network

arXiv, 2021

Authors: Gadjimuradov, Fasil; Benkert, Thomas; Nickel, Marcel Dominik; Maier, Andreas. Affiliations: Pattern Recognition Lab., Department of Computer Science, Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany; Magnetic Resonance Applications Predevelopment, Siemens Healthcare GmbH, Erlangen, Germany

Purpose: To develop an algorithm for robust partial Fourier (PF) reconstruction applicable to diffusion-weighted (DW) images with non-smooth phase variations. Methods: Based on an unrolled proximal splitting algorithm, a neural network architecture is derived which alternates between data consistency operations and regularization implemented by recurrent convolutions. In order to exploit correlations, multiple repetitions of the same slice are jointly reconstructed under consideration of permutation-equivariance. The algorithm is trained on DW liver data of 60 volunteers and evaluated on retrospectively and prospectively sub-sampled data of different anatomies and resolutions. Results: The proposed method is able to significantly outperform conventional PF techniques on retrospectively sub-sampled data in terms of quantitative measures as well as perceptual image quality. In this context, joint reconstruction of repetitions as well as the particular type of recurrent network unrolling are found to be beneficial with respect to reconstruction quality. On prospectively PF-sampled data, the proposed method enables DW imaging with higher signal without sacrificing image resolution or introducing additional artifacts. Alternatively, it can be used to counter the TE increase in acquisitions with higher resolution. Further, generalizability can be shown to prospective brain data exhibiting anatomies and contrasts not present in the training set. Conclusion: This work demonstrates that robust PF reconstruction of DW data is feasible even at strong PF factors in anatomies prone to phase variations. Since the proposed method does not rely on smoothness priors of the phase but uses learned recurrent convolutions instead, artifacts of conventional PF methods can be avoided. © 2021, CC BY-NC-ND.

Keywords: Recurrent neural networks


Learning to Predict Context-Adaptive Convolution for Semantic Segmentation

16th European Conference on Computer Vision, ECCV 2020

Authors: Liu, Jianbo; He, Junjun; Qiao, Yu; Ren, Jimmy S.; Li, Hongsheng. Affiliations: CUHK-SenseTime Joint Laboratory, The Chinese University of Hong Kong, Hong Kong; Shenzhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, China; SenseTime Research, Hong Kong

ISBN (print): 9783030585945

Long-range contextual information is essential for achieving high-performance semantic segmentation. Previous feature re-weighting methods demonstrate that using global context for re-weighting feature channels can effectively improve the accuracy of semantic segmentation. However, a globally shared feature re-weighting vector might not be optimal for regions of different classes in the input image. In this paper, we propose a Context-adaptive Convolution Network (CaC-Net) to predict a spatially-varying feature weighting vector for each spatial location of the semantic feature maps. In CaC-Net, a set of context-adaptive convolution kernels is predicted from the global contextual information in a parameter-efficient manner. When used for convolution with the semantic feature maps, the predicted convolutional kernels can generate the spatially-varying feature weighting factors capturing both global and local contextual information. Comprehensive experimental results show that our CaC-Net achieves superior segmentation performance on three public datasets: PASCAL Context, PASCAL VOC 2012 and ADE20K. © 2020, Springer Nature Switzerland AG.

Keywords: Convolution


Revisiting the Generalization Problem of Low-level Vision Models Through the Lens of Image Deraining

arXiv, 2025

Authors: Hu, Jinfan; You, Zhiyuan; Gu, Jinjin; Zhu, Kaiwen; Xue, Tianfan; Dong, Chao. Affiliations: Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; University of Chinese Academy of Sciences, Beijing 100049, China; The Chinese University of Hong Kong, 999077, Hong Kong; The University of Sydney, NSW 2006, Australia; Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China; Shenzhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China; Shenzhen University of Advanced Technology, Shenzhen 518055, China

Generalization remains a significant challenge for low-level vision models, which often struggle with unseen degradations in real-world scenarios despite their success in controlled benchmarks. In this paper, we revisit the generalization problem in low-level vision models. Image deraining is selected as a case study due to its well-defined and easily decoupled structure, allowing for more effective observation and analysis. Through comprehensive experiments, we reveal that the generalization issue is not primarily due to limited network capacity but rather the failure of existing training strategies, which lead networks to overfit specific degradation patterns. Our findings show that guiding networks to focus on learning the underlying image content, rather than the degradation patterns, is key to improving generalization. We demonstrate that balancing the complexity of background images and degradations in the training data helps networks better fit the image distribution. Furthermore, incorporating content priors from pre-trained generative models significantly enhances generalization. Experiments on both image deraining and image denoising validate the proposed strategies. We believe the insights and solutions will inspire further research and improve the generalization of low-level vision models. Copyright © 2025, The Authors. All rights reserved.

Keywords: Image denoising


Neighbourhood-guided feature reconstruction for occluded person re-identification

arXiv, 2021

Authors: Yu, Shijie; Chen, Dapeng; Zhao, Rui; Chen, Haobin; Qiao, Yu. Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences, China; SenseTime Group Limited; Shanghai AI Lab, Shanghai, China

Person images captured by surveillance cameras are often occluded by various obstacles, which leads to defective feature representation and harms person re-identification (Re-ID) performance. To tackle this challenge, we propose to reconstruct the feature representation of occluded parts by fully exploiting the information of its neighborhood in a gallery image set. Specifically, we first introduce a visible part-based feature by body mask for each person image. Then we identify its neighboring samples using the visible features and reconstruct the representation of the full body by an outlier-removable graph neural network with all the neighboring samples as input. Extensive experiments show that the proposed approach obtains significant improvements. On the large-scale Occluded-DukeMTMC benchmark, our approach achieves 64.2% mAP and 67.6% rank-1 accuracy, which outperforms the state-of-the-art approaches by large margins, i.e., 20.4% and 12.5%, respectively, indicating the effectiveness of our method on the occluded Re-ID problem. Copyright © 2021, The Authors. All rights reserved.

Keywords: Security systems


Machine Learning and Computer Vision Techniques in Continuous Beehive Monitoring Applications: A Survey

arXiv, 2022

Authors: Bilik, Simon; Zemcik, Tomas; Kratochvila, Lukas; Ricanek, Dominik; Richter, Miloslav; Zambanini, Sebastian; Horak, Karel. Affiliations: Department of Control and Instrumentation, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technická 3058/10, Brno 61600, Czech Republic; Computer Vision and Pattern Recognition Laboratory, Department of Computational Engineering, Lappeenranta-Lahti University of Technology LUT, Yliopistonkatu 34, Lappeenranta 53850, Finland; Computer Vision Lab, Institute of Visual Computing & Human-Centered Technology, Faculty of Informatics, TU Wien, Favoritenstr. 9/193-1, Vienna A-1040, Austria

The wide use and availability of machine learning and computer vision techniques allow the development of relatively complex monitoring systems in many domains. Besides the traditional industrial domain, new applications also appear in biology and agriculture, where they may be used to detect infections, parasites and weeds, but also for automated monitoring and early warning systems. This goes hand in hand with the introduction of easily accessible hardware and development kits such as the Arduino or Raspberry Pi families. In this paper, we survey 50 papers focusing on methods of automated beehive monitoring using computer vision techniques, particularly on pollen and Varroa mite detection together with bee traffic monitoring. Such systems could also be used for monitoring honeybee colonies and inspecting their health state, which could potentially identify dangerous states before the situation becomes critical, or help to better plan periodic bee colony inspections and therefore save significant costs. We also include an analysis of the research trends in this application field and outline possible directions of new development. Our paper is also aimed at veterinary and apidology professionals and experts who may not be familiar with machine learning, to introduce them to its capabilities; hence each family of techniques is prefaced by a brief theoretical introduction and motivation related to its base method. We hope that this paper will inspire other scientists to use machine learning techniques for other applications in beehive monitoring. © 2022, CC BY.

Keywords: Computer vision


NTIRE 2023 Image Shadow Removal Challenge Report

2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2023

Authors: Vasluianu, Florin-Alexandru; Seizinger, Tim; Timofte, Radu; Cui, Shuhao; Huang, Junshi; Tian, Shuman; Fan, Mingyuan; Zhang, Jiaqi; Zhu, Li; Wei, Xiaoming; Wei, Xiaolin; Luo, Ziwei; Gustafsson, Fredrik K.; Zhao, Zheng; Sjölund, Jens; Schön, Thomas B.; Dong, Xiaoyi; Zhang, Xi Sheryl; Li, Chenghua; Leng, Cong; Yeo, Woon-Ha; Oh, Wang-Taek; Lee, Yeo-Reum; Ryu, Han-Cheol; Luo, Jinting; Jiang, Chengzhi; Han, Mingyan; Wu, Qi; Lin, Wenjie; Yu, Lei; Li, Xinpeng; Jiang, Ting; Fan, Haoqiang; Liu, Shuaicheng; Xu, Shuning; Song, Binbin; Chen, Xiangyu; Zhang, Shile; Zhou, Jiantao; Zhang, Zhao; Zhao, Suiyi; Zheng, Huan; Gao, Yangcheng; Wei, Yanyan; Wang, Bo; Ren, Jiahuan; Luo, Yan; Kondo, Yuki; Miyata, Riku; Yasue, Fuma; Naruki, Taito; Ukita, Norimichi; Chang, Hua-En; Yang, Hao-Hsiang; Chen, Yi-Chung; Chiang, Yuan-Chun; Huang, Zhi-Kai; Chen, Wei-Ting; Chen, I-Hsiang; Hsieh, Chia-Hsuan; Kuo, Sy-Yen; Xianwei, Li; Fu, Huiyuan; Liu, Chunlin; Ma, Huadong; Fu, Binglan; He, Huiming; Wang, Mengjia; She, Wenxuan; Liu, Yu; Nathan, Sabari; Kansal, Priya; Zhang, Zhongjian; Yang, Huabin; Wang, Yan; Zhang, Yanru; Phutke, Shruti S.; Kulkarni, Ashutosh; Khan, Md Raqib; Murala, Subrahmanyam; Vipparthi, Santosh Kumar; Ye, Heng; Liu, Zixi; Yang, Xingyi; Liu, Songhua; Wu, Yinwei; Jing, Yongcheng; Yu, Qianhao; Zheng, Naishan; Huang, Jie; Long, Yuhang; Yao, Mingde; Zhao, Feng; Zhao, Bowen; Ye, Nan; Shen, Ning; Cao, Yanpeng; Xiong, Tong; Xia, Weiran; Li, Dingwen; Xia, Shuchen. Affiliations: Computer Vision Lab Ifi Caidas University of Würzburg Germany Computer Vision Lab ETH Zürich Switzerland Meituan Group China Department of Information Technology Uppsala University Sweden Institute of Automation Chinese Academy of Sciences Beijing China Nanjing China Maicro Nanjing China Department of Artificial Intelligence Convergence Sahmyook University Seoul Korea Republic of Megvii Technology China University of Electronic Science and Technology of China China University of Macau China China Toyota Technological Institute Japan Graduate Institute of Electronics Engineering National Taiwan University Taiwan Department of Electrical Engineering National Taiwan University Taiwan Graduate Institute of Communication Engineering National Taiwan University Taiwan ServiceNow United States Beijing University of Post and Telecommunication Beijing China Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education China Couger Inc. Computer Vision and Pattern Recognition Lab Indian Institute of Technology Ropar Punjab Rupnagar India Research Institute Singapore National University of Singapore Singapore Research Institute Singapore University of Sydney Australia Brain-Inspired Vision Laboratory Information Science and Technology Institution University of Science and Technology of China China State Key Laboratory of Fluid Power and Mechatronic Systems School of Mechanical Engineering Zhejiang University Hangzhou 310027 China Key Laboratory of Advanced Manufacturing Technology of Zhejiang Province School of Mechanical Engineering Zhejiang University Hangzhou 310027 China South China University of Technology China

ISBN (print): 9798350302493

This work reviews the results of the NTIRE 2023 Challenge on Image Shadow Removal. The described set of solutions was proposed for a novel dataset, which captures a wide range of object-light interactions. It consists of 1200 roughly pixel-aligned pairs of real shadow-free and shadow-affected images, captured in a controlled environment. The data was captured in a white-box setup, using professional equipment for lights and data acquisition sensors. The challenge had 144 registered participants, out of which 19 teams were compared in the final ranking. The proposed solutions extend the work on shadow removal, improving over the performance level of state-of-the-art methods. © 2023 IEEE.

Keywords: Data acquisition
