Search Results - Inner Mongolia University Library

arXiv, 2022

Authors: Wang, Jiaan; Meng, Fandong; Liang, Yunlong; Zhang, Tingyi; Xu, Jiarong; Li, Zhixu; Zhou, Jie

Affiliations: Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, Shanghai, China; Pattern Recognition Center, WeChat AI, Tencent Inc, China; Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing, China; School of Management, Fudan University, Shanghai, China

Given a document in a source language, cross-lingual summarization (CLS) aims at generating a concise summary in a different target language. Unlike monolingual summarization (MS), naturally occurring source-language documents paired with target-language summaries are rare. To collect large-scale CLS data, existing datasets typically involve translation in their creation. However, the translated text is distinguished from the text originally written in that language, i.e., translationese. In this paper, we first confirm that different approaches of constructing CLS datasets will lead to different degrees of translationese. Then we systematically investigate how translationese affects CLS model evaluation and performance when it appears in source documents or target summaries. In detail, we find that (1) the translationese in documents or summaries of test sets might lead to the discrepancy between human judgment and automatic evaluation; (2) the translationese in training sets would harm model performance in real-world applications; (3) though machine-translated documents involve translationese, they are very useful for building CLS systems on low-resource languages under specific training strategies. Lastly, we give suggestions for future CLS research including dataset and model developments. We hope that our work could let researchers notice the phenomenon of translationese in CLS and take it into account in the future. Copyright © 2022, The Authors. All rights reserved.

Keywords: Large dataset

RBP-Former: Joint Prediction of RNA-protein Binding Sites on Full-length RNA Transcripts for Multiple RBPs

IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Authors: Yichong Li; Xiaojian Liu; Fan Cheng; Xiaoyong Pan; Yang Yang

Affiliations: Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China; Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, China

ISBN (digital): 9798350386226

ISBN (print): 9798350386233

RNA-binding proteins (RBPs) are essential for gene expression, and the complex RNA-protein interaction mechanisms require analysis of global RNA information. Therefore, accurate prediction of RBP binding sites on full-length RNA transcripts is crucial for understanding these mechanisms and their roles in diseases. While machine learning methods can predict RBP binding to RNA fragments, extending this to full-length transcripts presents challenges due to sequence length and data imbalance. In this paper, we introduce RBP-Former, a binding site joint prediction model designed specifically for full-length RNA transcripts that can be used for multiple RBPs. This model processes information at both coarse and fine-grained levels to fully exploit sequence data and its interactions with multiple RBPs. We develop multi-level imbalance learning strategies, achieving favorable results on imbalanced data. Our method outperforms existing methods in predicting binding sites on full-length RNA transcripts for multiple RBPs, demonstrating its effectiveness in handling imbalanced label and sample distributions.

Keywords: Proteins; Protein engineering; Accuracy; RNA; Machine learning; Predictive models; Data models; Gene expression; Bioinformatics; Diseases

Road Segmentation via Iterative Deep Analysis

IEEE International Conference on Robotics and Biomimetics

Authors: Xiang Chen; Yu Qiao

Affiliations: Student at Shenzhen Key Laboratory of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Address: 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, P.R. China; Researcher at Shenzhen Key Laboratory of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Address: 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, P.R. China

ISBN (print): 9781467396769

Nowadays, people are increasingly concerned about the safety of traffic systems. Road segmentation and recognition is a fundamental problem in perceiving traffic environments and serves as the basis for self-driving cars. In this paper, inspired by iterative deep analysis thinking, we propose a novel method that learns powerful features step by step and seeks the optimal precision by balancing local and global information to conduct pixel-level classification for road segmentation. Firstly, we introduce iterative deep analysis thinking, which shows how to design a strong and robust deep model from failure experience. Secondly, we choose a powerful global feature learning network as the basis of a novel framework for our task. Meanwhile, we employ patches and a multi-scale pyramid as input to enhance local feature learning. We conduct experiments on three datasets from the KITTI Vision Benchmark, namely UU, UM, and UMM. The experimental results demonstrate that our proposed method obtains performance comparable with state-of-the-art methods on these datasets.

Keywords: recognition; fundamental; datasets

An Optimizing Parameters and Feature Selection in SVM Based on Improved Cockroach Swarm Optimization

16th International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IIH-MSP 2020 in conjunction with the 13th International Conference on Frontiers of Information Technology, Applications and Tools, FITAT 2020

Authors: Nguyen, Trong-The; Yu, Jie; Nguyen, Thi-Thanh-Tan; Lai, Quoc-Anh; Ngo, Truong-Giang; Dao, Thi-Kien

Affiliations: Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou, China; College of Mechanical and Automotive Engineering, Fujian University of Technology, Fuzhou, China; Information Technology Faculty, Electric Power University, Hanoi, Viet Nam; Department of Pattern Recognition & Image Processing, Institute of Information Technology, Vietnam Academy of Science and Technology, Hanoi, Viet Nam; Faculty of Computer Science and Engineering, Thuyloi University, 175 Tay Son, Dong Da, Hanoi, Viet Nam

ISBN (print): 9789813367562

This study improves a support vector machine (SVM) classifier by optimizing its parameters with an adjusted cockroach swarm optimization (CSO). The classification system design includes data input, preprocessing, and classification. The Relief technique selects the feature subset, and the kernel parameters are used as input for the system. Feature selection and classification of the data sets are carried out by combining ICSO with SVM, with the SVM parameters optimized synchronously. Simulation results of the proposed scheme are compared with those of other systems, e.g., GA-SVM, PSO-SVM, and CSO-SVM, and show that the proposed scheme achieves better classification accuracy than the other algorithms. The proposed scheme is promising for high-dimensional data classification, such as image classification and signal classification. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021.

Keywords: Support vector machines

A glass image classification method based on multi-feature fusion

International Conference on Wavelet Analysis and Pattern Recognition (ICWAPR)

Authors: Liang Zhang; Jing Wen; Sheng-Zhou Xu; Hao-Yang Xing; Yu Zhu; Heng-Xin Chen

Affiliations: College of Computer Science, Chongqing University, Chongqing, China; Key Laboratory of Pattern Recognition and Intelligent Chengdu University, China; School of Computer Science, South-central University For Nationalities, Wuhan, China; Magnetic Resonance Imaging Research Centre, Huaxi Hospital, Chengdu, Sichuan, China; Chongqing University, Chongqing, Sichuan, CN

ISBN (print): 9781509029181

In this work, a new glass classification method is proposed. Firstly, images are enhanced by image preprocessing. Secondly, a series of glass features, including shape and texture features, is proposed. Finally, we employ a simple minimum distance classifier to classify the input glass images. The experimental results show that the proposed method has high classification efficiency and accuracy.

Keywords: Glass; Pattern recognition; Training; Shape; Wavelet analysis; Sorting; Production facilities

Talking Face Generation via Learning Semantic and Temporal Synchronous Landmarks

International Conference on Pattern Recognition

Authors: Aihua Zheng; Feixia Zhu; Hao Zhu; Mandi Luo; Ran He

Affiliations: Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, School of Computer Science and Technology, Anhui University, Hefei, China; Center for Research on Intelligent Perception and Computing (CRIPAC), National Laboratory of Pattern Recognition (NLPR), CASIA, Beijing, China; Center for Excellence in Brain Science and Intelligence Technology, CAS, Beijing, China

ISBN (print): 9781728188089; 9781728188096

Given a speech clip and facial image, the goal of talking face generation is to synthesize a talking face video with accurate mouth synchronization and natural face motion. Recent progress has proven the effectiveness of the landmarks as the intermediate information during talking face generation. However, the large gap between audio and visual modalities makes the prediction of landmarks challenging and limits generation ability. This paper proposes a semantic and temporal synchronous landmark learning method for talking face generation. First, we propose to introduce a word detector to enforce richer semantic information. Then, we propose to preserve the temporal synchronization and consistency between landmarks and audio via the proposed temporal residual loss. Lastly, we employ a U-Net generation network with adaptive reconstruction loss to generate facial images for the predicted landmarks. Experimental results on two benchmark datasets LRW and GRID demonstrate the effectiveness of our model compared to the state-of-the-art methods of talking face generation.

Keywords: Learning systems; Visualization; Face recognition; Semantics; Mouth; Detectors; Benchmark testing

Inconsistency Distillation For Consistency: Enhancing Multi-View Clustering via Mutual Contrastive Teacher-Student Leaning

IEEE International Conference on Data Mining (ICDM)

Authors: Dunqiang Liu; Shu-Juan Peng; Xin Liu; Lei Zhu; Zhen Cui; Taihao Li

Affiliations: Dept. of Comput. Sci. & Fujian Key Lab. of Big Data Intelligence and Security, Huaqiao University, Xiamen, China; Zhejiang Lab, Hangzhou, China; Xiamen Key Lab. of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen, China; Key Lab. of Computer Vision and Machine Learning (Huaqiao University), Fujian Province University, Xiamen, China; School of Information Sci. and Eng., Shandong Normal University, Jinan, China; School of Computer Sci. and Eng., Nanjing University of Science and Technology, Nanjing, China

Multi-view clustering has attracted more attention recently since many real-world data are comprised of different representations or views. Recent multi-view clustering works mainly exploit the instance consistency to obtain the shared representations across different views, and apply a single-view clustering method to perform data partitions. However, these existing methods often ignore the inconsistency of instance associations within the views, which may enlarge the intra-class diversity among the views and therefore degrade the clustering performance. To address this issue, this paper proposes an efficient mutual contrastive teacher-student leaning (MC-TSL) model to enhance the multi-view clustering, which is the first attempt to study the inconsistency distillation for consistency learning. First, the proposed MC-TSL approach exploits a view-specific encoder with two heads, an instance encoding head and a semantic distillation head, respectively, for capturing the consistent and discriminative feature representations. To be specific, the former head exploits a cross-view contrastive learning method to obtain a redundancy-free consistent representation at the instance level, while the latter head designs a mutual teacher-student learning module to capture the intra-view information at semantic level. By training these two heads in an end-to-end manner, the discriminative multi-view embeddings are efficiently obtained and refined by minimizing the weighted sum of the reconstruction loss, contrastive loss and contrast distillation loss. Extensive experiments verify the superiorities of the proposed MC-TSL framework and show its competitive clustering performances.

Keywords: Training; Learning systems; Clustering methods; Semantics; Encoding; Data mining

A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection

arXiv, 2024

Authors: Hao, Chao; Yu, Zitong; Liu, Xin; Xu, Jun; Yue, Huanjing; Yang, Jingyu

Affiliations: The School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China; The School of Computing and Information Technology, Great Bay University, Dongguan 523000, China; The Computer Vision and Pattern Recognition Laboratory, Lappeenranta-Lahti University of Technology LUT, Lappeenranta 53850, Finland; The School of Statistics and Data Science, Nankai University, Tianjin 300072, China

Camouflaged object detection (COD) and salient object detection (SOD) are two distinct yet closely related computer vision tasks widely studied during the past decades. Though they share the same purpose of segmenting an image into binary foreground and background regions, their distinction lies in the fact that COD focuses on concealed objects hidden in the image, while SOD concentrates on the most prominent objects in the image. Previous works achieved good performance by stacking various hand-designed modules and multi-scale features. However, these carefully designed complex networks often performed well on one task but not on the other. In this work, we propose a simple yet effective network (SENet) based on the Vision Transformer (ViT). By employing a simple design of an asymmetric ViT-based encoder-decoder structure, we obtain competitive results on both tasks, exhibiting greater versatility than meticulously crafted networks. Furthermore, to enhance the Transformer's ability to model local information, which is important for pixel-level binary segmentation tasks, we propose a local information capture module (LICM). We also propose a dynamic weighted loss (DW loss) based on Binary Cross-Entropy (BCE) and Intersection over Union (IoU) loss, which guides the network to pay more attention to smaller and more difficult-to-find target objects according to their size. Moreover, we explore the issue of joint training of SOD and COD, and propose a preliminary solution to the conflict in joint training, further improving the performance of SOD. Extensive experiments on multiple benchmark datasets demonstrate the effectiveness of our method. The code is available at https://***/linuxsino/SENet. Copyright © 2024, The Authors. All rights reserved.

Keywords: Object detection

When Face Recognition Meets with Deep Learning: An Evaluation of Convolutional Neural Networks for Face Recognition

International Conference on Computer Vision Workshops (ICCV Workshops)

Authors: Guosheng Hu; Yongxin Yang; Dong Yi; Josef Kittler; William Christmas; Stan Z. Li; Timothy Hospedales

Affiliations: Centre for Vision, Speech and Signal Processing, University of Surrey, UK; (Indicates equal contribution); LEAR team, Inria Grenoble Rhone-Alpes, Montbonnot, France; Electronic Engineering and Computer Science, Queen Mary University of London, UK; Chinese Academy of Sciences, Center for Biometrics and Security Research & National Laboratory of Pattern Recognition, China

Deep learning, in particular Convolutional Neural Network (CNN), has achieved promising results in face recognition recently. However, it remains an open question: why CNNs work well and how to design a 'good' architecture. The existing works tend to focus on reporting CNN architectures that work well for face recognition rather than investigate the reason. In this work, we conduct an extensive evaluation of CNN-based face recognition systems (CNN-FRS) on a common ground to make our work easily reproducible. Specifically, we use public database LFW (Labeled Faces in the Wild) to train CNNs, unlike most existing CNNs trained on private databases. We propose three CNN architectures which are the first reported architectures trained using LFW data. This paper quantitatively compares the architectures of CNNs and evaluates the effect of different implementation choices. We identify several useful properties of CNN-FRS. For instance, the dimensionality of the learned features can be significantly reduced without adverse effect on face recognition accuracy. In addition, a traditional metric learning method exploiting CNN-learned features is evaluated. Experiments show two crucial factors to good CNN-FRS performance are the fusion of multiple CNNs and metric learning. To make our work reproducible, source code and models will be made publicly available.

Keywords: Face recognition; Face; Databases; Measurement; Training; Object recognition; Convolutional codes

TTPP: Temporal transformer with progressive prediction for efficient action anticipation

arXiv, 2020

Authors: Wang, Wen; Peng, Xiaojiang; Su, Yanzhou; Qiao, Yu; Cheng, Jian

Affiliations: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu, Sichuan 611731, China; ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab., Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society

Video action anticipation aims to predict future action categories from observed frames. Current state-of-the-art approaches mainly resort to recurrent neural networks to encode history information into hidden states, and predict future actions from the hidden representations. It is well known that the recurrent pipeline is inefficient in capturing long-term information, which may limit its performance in prediction tasks. To address this problem, this paper proposes a simple yet efficient Temporal Transformer with Progressive Prediction (TTPP) framework, which repurposes a Transformer-style architecture to aggregate observed features, and then leverages a light-weight network to progressively predict future features and actions. Specifically, predicted features along with predicted probabilities are accumulated into the inputs of subsequent prediction. We evaluate our approach on three action datasets, namely TVSeries, THUMOS-14, and TV-Human-Interaction. Additionally, we conduct a comprehensive study of several popular aggregation and prediction strategies. Extensive results show that TTPP not only outperforms the state-of-the-art methods but is also more efficient. Copyright © 2020, The Authors. All rights reserved.

Keywords: Forecasting
