Search Results - Inner Mongolia University Library

12th International Conference on Learning Representations, ICLR 2024

Authors: Wang, Lean; Yang, Wenkai; Chen, Deli; Zhou, Hao; Lin, Yankai; Meng, Fandong; Zhou, Jie; Sun, Xu

Affiliations: National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University, China; Gaoling School of Artificial Intelligence, Renmin University of China, China; Pattern Recognition Center, WeChat AI, Tencent Inc., China; DeepSeek AI, China

As large language models (LLMs) generate texts with increasing fluency and realism, there is a growing need to identify the source of texts to prevent the abuse of LLMs. Text watermarking techniques have proven reliable in distinguishing whether a text is generated by LLMs by injecting hidden patterns. However, we argue that existing LLM watermarking methods are encoding-inefficient and cannot flexibly meet the diverse information encoding needs (such as encoding model version, generation time, user id, etc.). In this work, we conduct the first systematic study on the topic of Codable Text Watermarking for LLMs (CTWL) that allows text watermarks to carry multi-bit customizable information. First of all, we study the taxonomy of LLM watermarking technologies and give a mathematical formulation for CTWL. Additionally, we provide a comprehensive evaluation system for CTWL: (1) watermarking success rate, (2) robustness against various corruptions, (3) coding rate of payload information, (4) encoding and decoding efficiency, (5) impacts on the quality of the generated text. To meet the requirements of these non-Pareto-improving metrics, we follow the most prominent vocabulary partition-based watermarking direction, and devise an advanced CTWL method named Balance-Marking. The core idea of our method is to use a proxy language model to split the vocabulary into probability-balanced parts, thereby effectively maintaining the quality of the watermarked text. Extensive experimental results show that our method outperforms the baseline under comprehensive evaluation. Our code is available at https://***/lancopku/codable-watermarking-for-llm. © 2024 12th International Conference on Learning Representations, ICLR 2024. All rights reserved.
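The idea of splitting a vocabulary into probability-balanced parts can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the Balance-Marking algorithm itself: the greedy rule, the function name `balanced_partition`, and the toy probabilities standing in for a proxy language model's output are all hypothetical.

```python
def balanced_partition(token_probs):
    """Greedily split tokens into two parts with near-equal probability mass.

    token_probs maps each token to the probability a (hypothetical) proxy
    language model assigns to it for the next position.
    """
    part_a, part_b = [], []
    mass_a = mass_b = 0.0
    # Place high-probability tokens first so smaller ones can even out the split.
    for token, p in sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True):
        if mass_a <= mass_b:
            part_a.append(token)
            mass_a += p
        else:
            part_b.append(token)
            mass_b += p
    return (part_a, mass_a), (part_b, mass_b)


# Toy next-token distribution (assumed values, not taken from any real model).
probs = {"the": 0.40, "a": 0.30, "cat": 0.15, "dog": 0.10, "xylophone": 0.05}
(part_a, mass_a), (part_b, mass_b) = balanced_partition(probs)
print(part_a, round(mass_a, 2))
print(part_b, round(mass_b, 2))
```

When both parts carry roughly half of the probability mass, restricting generation to either part leaves the model with likely tokens to choose from, which is the intuition behind preserving text quality.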

Keywords: Encoding (symbols)

Robust text line detection in equipment nameplate images

2019 IEEE International Conference on Robotics and Biomimetics, ROBIO 2019

Authors: Lai, Jiangyu; Guo, Lanqing; Qiao, Yu; Chen, Xiaolong; Zhang, Zhengfu; Liu, Canping; Li, Ying; Fu, Bin

Affiliations: Guangzhou Power Supply Bureau Co. Ltd., Guangzhou, China; ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, China

ISBN: (print) 9781728163215

Scene text detection for equipment nameplates in the wild is important for equipment inspection robots, since it enables an inspection robot to take specific actions for different equipment. Although text detection in images has achieved great progress in recent years, detection for equipment nameplates faces several challenges, such as extreme illumination and distortion, which significantly decrease detection performance. In this paper, we propose a deep text detection model, Robust Text Line Detection (RTLD), for locating word-level text instances in equipment cards. Specifically, the proposed model first employs a corner detection module to determine the four corner points of each nameplate, and then a carefully designed image transformation module transforms the irregular nameplate region into a rectangular region. Finally, a text detection module is introduced to locate every word-level text instance in the transformed images. We conduct extensive experiments to examine our proposed methods on real equipment nameplate images. Our model achieves 91.2% precision and 92.6% recall on the Equipment Nameplate Dataset. The experimental results demonstrate the effectiveness of our models. © 2019 IEEE.
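Mapping four detected corner points onto a rectangle can be sketched as a perspective (homography) transform. The abstract does not say which transform the authors use, so the homography model, the function names, and the corner coordinates below are assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def homography(src, dst):
    """Return the 8 homography coefficients mapping 4 src points to 4 dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(rows, rhs)


def warp_point(h, x, y):
    """Apply the homography to one point (the ninth coefficient is fixed to 1)."""
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)


# Hypothetical corner detections of a skewed nameplate, clockwise from top-left.
corners = [(12, 8), (205, 30), (190, 140), (5, 120)]
# Target rectangle of 200 x 100 pixels (an arbitrary choice for this sketch).
rect = [(0, 0), (200, 0), (200, 100), (0, 100)]
h = homography(corners, rect)
for corner in corners:
    print(corner, "->", warp_point(h, *corner))
```

In practice every pixel of the output rectangle would be sampled through the inverse mapping; the sketch only checks that the four corners land on the rectangle's corners.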

Keywords: Nameplates

Content and structure based attention for graph node classification

Journal of Intelligent and Fuzzy Systems, 2024, Vol. 46, No. 4, pp. 8329-8343

Authors: Chen, Yong; Xie, Xiao-Zhu; Weng, Wei

Affiliations: College of Computer and Information Engineering, Xiamen University of Technology, Xiamen, China; Fujian Key Laboratory of Pattern Recognition and Image Understanding, Xiamen, China

Graph-structured data is ubiquitous in real-world applications, such as social networks, citation networks, and communication networks. Graph neural networks (GNNs) are the key to processing them. In recent years, graph attention networks (GATs) have been proposed for node classification and have achieved encouraging performance. They focus on the content associated with nodes to evaluate the attention weights, and the rich structure information in the graph is almost ignored. Therefore, we propose a multi-head attention mechanism to fully employ node content and graph structure information. The core idea is to introduce the interactions in the topological structure into the existing GATs. This method can more accurately estimate the attention weights among nodes, thereby improving the convergence of GATs. Moreover, the mechanism is lightweight and efficient, requires no training to learn, can accurately analyze higher-order structural information, and can be strongly interpreted through heatmaps. We name the proposed model content- and structure-based graph attention network (CSGAT). Furthermore, our proposed model achieves state-of-the-art performance on a number of datasets in node classification. The code and data are available at https://***/CroakerShark/CSGAT. © 2024 - IOS Press. All rights reserved.

Keywords: Classification (of information)

Local gradient difference features for classification of 2D-3D natural scene text images

25th International Conference on Pattern Recognition, ICPR 2020

Authors: Nandanwar, Lokesh; Shivakumara, Palaiahnakote; Raghavendra, Ramachandra; Lu, Tong; Pal, Umapada; Lopresti, Daniel; Anuar, Nor Badrul

Affiliations: Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia; Faculty of Information Technology and Electrical Engineering, IIK, NTNU, Norway; National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India; Computer Science and Engineering, Lehigh University, Bethlehem, PA, United States

ISBN: (print) 9781728188089

Methods developed for normal 2D text detection do not work well for text that is rendered using decorative effects, 3D effects, etc. This paper proposes a new method for classification of 2D and 3D natural scene text images so that an appropriate recognition method can be chosen based on the classification results for better performance. The proposed method explores local gradient differences for obtaining candidate pixels, which represent a stroke. To study the spatial distribution of candidate pixels, we propose a measure, called COLD, which is denser for pixels toward the center of strokes and scattered for non-stroke pixels. This observation leads us to introduce mass features for extracting the regular spatial pattern of COLD, which indicates a 2D text image. The extracted features are fed into a Neural Network (NN) for classification. The proposed method is tested on (i) a new dataset introduced in this work, (ii) a second dataset assembled from standard natural scene datasets, and (iii) non-text image datasets, which contain objects rather than text. Experimental results of the proposed method on images with text and non-text show that the proposed method is independent of text. The proposed approach improves text detection and recognition performance significantly after classification. © 2020 IEEE

Keywords: Image classification

Tensor Low-Rank Reconstruction for Semantic Segmentation

16th European Conference on Computer Vision, ECCV 2020

Authors: Chen, Wanli; Zhu, Xinge; Sun, Ruoqi; He, Junjun; Li, Ruiyu; Shen, Xiaoyong; Yu, Bei

Affiliations: The Chinese University of Hong Kong, New Territories, Hong Kong; Shanghai Jiao Tong University, Shanghai, China; ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, China; SmartMore, Shenzhen, China

ISBN: (digital) 9783030585204

ISBN: (print) 9783030585198

Context information plays an indispensable role in the success of semantic segmentation. Recently, non-local self-attention based methods have proved effective for context information collection. Since the desired context consists of spatial-wise and channel-wise attentions, a 3D representation is an appropriate formulation. However, these non-local methods describe 3D context information based on a 2D similarity matrix, where space compression may lead to missing channel-wise attention. An alternative is to model the contextual information directly without compression. However, this effort confronts a fundamental difficulty, namely the high-rank property of context information. In this paper, we propose a new approach to model the 3D context representations, which not only avoids the space compression but also tackles the high-rank difficulty. Here, inspired by tensor canonical-polyadic decomposition theory (i.e., a high-rank tensor can be expressed as a combination of rank-1 tensors), we design a low-rank-to-high-rank context reconstruction framework (i.e., RecoNet). Specifically, we first introduce the tensor generation module (TGM), which generates a number of rank-1 tensors to capture fragments of the context feature. Then we use these rank-1 tensors to recover the high-rank context features through our proposed tensor reconstruction module (TRM). Extensive experiments show that our method achieves state-of-the-art performance on various public datasets. Additionally, our proposed method has more than 100 times less computational cost compared with conventional non-local-based methods. © 2020, Springer Nature Switzerland AG.
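The canonical-polyadic idea mentioned in the abstract, a tensor written as a weighted sum of rank-1 tensors, can be shown with a minimal sketch. It illustrates only the decomposition form, not RecoNet's TGM or TRM modules; the function name, the toy factors, and the feature-map sizes used for the storage comparison are hypothetical.

```python
def reconstruct_from_rank1(factors, weights):
    """Sum weighted rank-1 tensors.

    Each factor is a (channel, height, width) triple of vectors; its rank-1
    tensor has entries c[i] * h[j] * w[k].
    """
    n_c = len(factors[0][0])
    n_h = len(factors[0][1])
    n_w = len(factors[0][2])
    out = [[[0.0] * n_w for _ in range(n_h)] for _ in range(n_c)]
    for lam, (c_vec, h_vec, w_vec) in zip(weights, factors):
        for i, c in enumerate(c_vec):
            for j, h in enumerate(h_vec):
                for k, w in enumerate(w_vec):
                    # One entry of the outer product, scaled by the factor's weight.
                    out[i][j][k] += lam * c * h * w
    return out


# Two toy rank-1 factors for a 2 x 3 x 2 tensor (values chosen arbitrarily).
factors = [
    ([1.0, 2.0], [1.0, 0.0, 1.0], [2.0, 1.0]),
    ([0.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0]),
]
weights = [1.0, 0.5]
context = reconstruct_from_rank1(factors, weights)

# Storage comparison for an assumed 512 x 64 x 64 feature map and 64 factors.
dense_entries = 512 * 64 * 64
factor_entries = 64 * (512 + 64 + 64)
print(context[1][0][0], dense_entries, factor_entries)
```

The storage comparison hints at why working with rank-1 factors can be much cheaper than handling the full tensor, although the actual savings depend on the sizes and the number of factors used.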

Keywords: Tensors

Multi-level thresholding for pupil location in eye-gaze tracking system

2016 International Conference on Machine Learning and Cybernetics, ICMLC 2016

Authors: Chen, Mo-Han; Wen, Jing; Zhu, Yu; Xing, Hao-Yang; Wang, Yi

Affiliations: College of Computer Science, Chongqing University, Chongqing, China; Key Laboratory of Pattern Recognition and Intelligent Information Processing, Institutions of Higher Education of Sichuan Province, Chengdu University, China; Magnetic Resonance Imaging Research Centre, Huaxi Hospital, Sichuan, Chengdu, China

ISBN: (print) 9781509003891

A new pupil location method is proposed for eye-gaze tracking systems. Firstly, input images are enhanced in order to reduce the influence of illumination. Secondly, multiple candidate thresholds are obtained from the valleys of the histogram, yielding different segmentation results. Further, a fusion method based on the overlaps of the segmentation results is proposed to acquire candidate regions, and the eye region is obtained according to the entropy information of the candidate regions. Thirdly, a simple and effective threshold segmentation method based on eye characteristics is employed to obtain the pupil area. Finally, the center of the pupil is acquired by ellipse fitting. Experimental results demonstrate that the proposed method is effective in improving pupil location accuracy. © 2016 IEEE.
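Picking candidate thresholds at histogram valleys can be sketched in a few lines. The abstract gives no details, so the strict local-minimum rule, the "dark pixels at or below the threshold" convention, the intersection used as an overlap, and the toy 8-level image are all assumptions of this sketch.

```python
def histogram(pixels, bins):
    """Count how many pixels fall on each intensity level."""
    counts = [0] * bins
    for value in pixels:
        counts[value] += 1
    return counts


def valley_thresholds(counts):
    """Return interior bins whose count is strictly lower than both neighbours."""
    return [
        i for i in range(1, len(counts) - 1)
        if counts[i - 1] > counts[i] < counts[i + 1]
    ]


def segment(pixels, threshold):
    """Mark pixels at or below the threshold as dark (1), all others as 0."""
    return [1 if value <= threshold else 0 for value in pixels]


# Toy flattened image with 8 intensity levels (hypothetical data).
pixels = [0] * 5 + [1] * 9 + [2] * 4 + [3] * 1 + [4] * 3 + [5] * 8 + [6] * 2 + [7] * 6
counts = histogram(pixels, bins=8)
thresholds = valley_thresholds(counts)
masks = [segment(pixels, t) for t in thresholds]
# Pixels that every candidate segmentation agrees are dark.
overlap = [min(bits) for bits in zip(*masks)]
print(thresholds, sum(overlap))
```

Real image histograms are noisy, so some smoothing before the valley search would normally be needed; that step is omitted here.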

Keywords: Entropy

Word-Wise Handwriting Based Gender Identification Using Multi-Gabor Response Fusion

4th Workshop on Document Analysis and Recognition, DAR 2018, held in conjunction with the 11th Indian Conference on Vision, Graphics, and Image Processing, ICVGIP 2018

Authors: Asadzadeh Kaljahi, Maryam; Vidya Varshini, P.V.; Shivakumara, Palaiahnakote; Pal, Umapada; Lu, Tong; Guru, D.S.

Affiliations: Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia; Vellore Institute of Technology, Vellore, Tamil Nadu, India; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India; National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China; Department of Studies in Computer Science, Manasagangotri, University of Mysuru, Mysore, India

ISBN: (print) 9789811393600

Handwriting-based gender identification at the word level is challenging due to free-style writing, the use of different scripts, and inadequate information. This paper presents a new method based on Multi-Gabor Response (MGR) fusion for gender identification at the word level. It first explores weighted-gradient features for word segmentation from text line images. For each word, the proposed method obtains eight Gabor response images. Then it performs a sliding window operation over the MGR images to smooth the values. For each smoothed MGR image, we perform a fusion operation that chooses the Gabor response value which contributes to the highest peak in the histogram. This process results in a feature matrix, which is fed to a CNN for gender identification. Experimental results on our multi-script dataset (scripts apart from English) and on the benchmark databases IAM, KHATT, and QUWI, which contain handwritten English and Arabic text, show that the proposed method outperforms the existing methods. © Springer Nature Singapore Pte Ltd. 2019.

Keywords: Image segmentation

An improved convex programming model for the inverse problem in intensity-modulated radiation therapy

International Journal of Performability Engineering, 2018, Vol. 14, No. 5, pp. 871-884

Authors: Lan, Yihua; Zhang, Xingang; Zhang, Jianyang; Wang, Yang; Hung, Chih-Cheng

Affiliations: School of Computer and Information Technology, Nanyang Normal University, Nanyang 473061, China; Institute of Image Processing and Pattern Recognition, Nanyang Normal University, Nanyang 473061, China; Radiology Department, Central Hospital of Nanyang, Nanyang 473061, China; Laboratory for Machine Vision and Security Research, College of Computing and Software Engineering, Kennesaw State University - Marietta Campus, 1100 South Marietta Parkway, Marietta, GA 30067-2896, United States

Intensity-modulated radiation therapy (IMRT) is one of the main approaches in cancer treatment because it can guarantee the killing of cancer cells while optimally protecting normal tissue from complications. Inverse planning, which is the core component of the entire IMRT system, is mainly based on accurate mathematical modeling and associated fast solving methods. In inverse planning, fluence map optimization, which considers the multi-leaf collimator (MLC) modulation, is the current research focus. Although the hitting constraint problem with unidirectional leaf-sweeping movement has been solved, our goal is to solve the hitting constraint problem with bidirectional leaf-sweeping movement. In this study, we propose a non-synchronized type to solve the hitting constraint problem with bidirectional leaf-sweeping schemes for IMRT. To solve this problem, a new mathematical model is proposed under the framework of convex programming. The advantage of the convex model is that it avoids the uncertainty and inaccuracy that occur in the non-convex programming solving process. Experimental results for two clinical testing cases show that, under the same total number of monitoring units, the newly proposed model produces a better dose distribution than the total variance and quadratic models. © 2018 Totem Publisher, Inc. All rights reserved.

Keywords: Convex optimization

Deep Audio-visual Learning: A Survey

International Journal of Automation and Computing, 2021, Vol. 18, No. 3, pp. 351-376

Authors: Hao Zhu; Man-Di Luo; Rui Wang; Ai-Hua Zheng; Ran He

Affiliations: Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei 230601, China; Center for Research on Intelligent Perception and Computing (CRIPAC) and National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Artificial Intelligence, University of the Chinese Academy of Sciences, Beijing 100049, China; Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China

Audio-visual learning, aimed at exploiting the relationship between audio and visual modalities, has drawn considerable attention since deep learning started to be used *** tend to leverage these two modalities to improve the performance of previously considered single-modality tasks or address new challenging *** this paper, we provide a comprehensive survey of recent audio-visual learning *** divide the current audio-visual learning tasks into four different subfields: audio-visual separation and localization, audio-visual correspondence learning, audio-visual generation, and audio-visual representation ***-of-the-art methods, as well as the remaining challenges of each subfield, are further ***, we summarize the commonly used datasets and challenges.

Keywords: Deep audio-visual learning; audio-visual separation and localization; correspondence learning; generative models; representation learning

Sustainable Mining in the Era of Artificial Intelligence

IEEE/CAA Journal of Automatica Sinica, 2024, Vol. 11, No. 1, pp. 1-4

Authors: Long Chen; Yuting Xie; Yutong Wang; Shirong Ge; Fei-Yue Wang

Affiliations: IEEE; the State Key Laboratory for Management and Control of Complex Systems at the Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; the National Laboratory of Pattern Recognition at the Institute of Automation and Waytous Ltd., China; the School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou 510275, Guangdong, China; the School of Mechanical Electronic and Information Engineering, China University of Mining and Technology Beijing, Beijing 100083, China; the School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100190, China

The mining sector historically drove the global economy, but at the expense of severe environmental and health repercussions, posing sustainability challenges [1]-[3]. Recent advancements in artificial intelligence (AI) are revolutionizing mining through robotic and data-driven innovations [4]-[7]. While AI offers the mining industry advantages, it is crucial to acknowledge the potential risks associated with its widespread ***-reliance on AI may lead to a loss of human control over mining operations in the future, resulting in unpredictable consequences.

Keywords: Sustainable mining consequences
