Search Results - Inner Mongolia University Library

arXiv, 2019

Authors: Wan, Ruosi; Xiong, Haoyi; Li, Xingjian; Zhu, Zhanxing; Huan, Jun. Affiliations: Big Data Laboratory, Baidu Inc., Beijing, China; National Engineering Laboratory for Deep Learning Technology and Applications, Beijing, China; School of Mathematical Sciences, Peking University, Beijing, China

Transfer learning has been frequently used to improve deep neural network training by incorporating the weights of pre-trained networks as the starting point of optimization for regularization. While deep transfer learning can usually boost performance with better accuracy and faster convergence, transferring weights from inappropriate networks hurts the training procedure and may lead to even lower accuracy. In this paper, we consider deep transfer learning as minimizing a linear combination of the empirical loss and a regularizer based on pre-trained weights, where the regularizer can keep the training procedure from lowering the empirical loss when the descent directions (e.g., derivatives) of the two terms conflict. Following this view, we propose a novel strategy making regularization-based deep Transfer learning Never Hurt (DTNH) that, for each iteration of the training procedure, computes the derivatives of the two terms separately, then re-estimates a new descent direction that does not hurt the empirical loss minimization while preserving the regularization effects of the pre-trained weights. Extensive experiments have been done using common transfer learning regularizers, such as L2-SP and knowledge distillation, on a wide range of deep transfer learning benchmarks including Caltech, MIT indoor 67, CIFAR-10 and ImageNet. The empirical results show that the proposed descent direction estimation strategy DTNH can always improve the performance of deep transfer learning tasks based on all of the above regularizers, even when transferring pre-trained weights from inappropriate networks. All in all, the DTNH strategy improves on state-of-the-art regularizers with 0.1% to 7% higher accuracy in all experiments. Copyright © 2019, The Authors. All rights reserved.
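The abstract does not say how the new descent direction is re-estimated, so the following is only a minimal sketch of the general idea, assuming a simple projection rule: when the regularizer gradient points against the empirical-loss gradient, its conflicting component is removed before the two are added. The function names and the `alpha` weight are hypothetical, and this is not the authors' DTNH algorithm.

```python
def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def combine_directions(grad_loss, grad_reg, alpha=1.0):
    """Combine the empirical-loss gradient with a weighted regularizer gradient.

    Assumed rule (not taken from the paper): if the weighted regularizer
    gradient has a negative inner product with the loss gradient, project
    out its component along the loss gradient, so the combined direction
    never points against the loss gradient.
    """
    g_reg = [alpha * g for g in grad_reg]
    inner = dot(grad_loss, g_reg)
    norm_sq = dot(grad_loss, grad_loss)
    if inner < 0 and norm_sq > 0:
        scale = inner / norm_sq
        g_reg = [r - scale * l for r, l in zip(g_reg, grad_loss)]
    return [l + r for l, r in zip(grad_loss, g_reg)]


# Conflicting case: a naive sum would give [-1.0, 1.0], which points
# against the loss gradient [1.0, 0.0].
direction = combine_directions([1.0, 0.0], [-2.0, 1.0])
print(direction)  # [1.0, 1.0]
```

With this rule the combined direction keeps a non-negative inner product with the loss gradient, while the part of the regularizer gradient that does not conflict is preserved.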

Keywords: deep neural networks

A multi-objective deep reinforcement learning algorithm for spatio-temporal latency optimization in mobile IoT-enabled edge computing networks

Simulation Modelling Practice and Theory, 2025, vol. 143

Authors: Parisa Khoshvaght; Amir Haider; Amir Masoud Rahmani; Farhad Soleimanian Gharehchopogh; Ferzat Anka; Jan Lansky; Mehdi Hosseinzadeh. Affiliations: Institute of Research and Development, Duy Tan University, Da Nang, Vietnam; School of Engineering & Technology, Duy Tan University, Da Nang, Vietnam; Centre for Research Impact & Outcome, Chitkara University Institute of Engineering and Technology, Chitkara University, Rajpura 140401, Punjab, India; Department of AI and Robotics, Sejong University, Seoul 05006, Republic of Korea; Future Technology Research Center, National Yunlin University of Science and Technology, Yunlin, Taiwan; Department of Computer Engineering, Ur. C., Islamic Azad University, Urmia, Iran; Data Science Application and Research Center (VEBIM), Fatih Sultan Mehmet Vakif University, Istanbul, Türkiye; Department of Computer Science and Mathematics, Faculty of Economic Studies, University of Finance and Administration, Prague, Czech Republic; Pattern Recognition and Machine Learning Laboratory, School of Computing, Gachon University, Seongnam, Republic of Korea

The rapid increase in Mobile Internet of Things (IoT) devices requires novel computational frameworks. These frameworks must meet strict latency and energy efficiency requirements in Edge and Mobile Edge Computing (MEC) systems. Spatio-temporal dynamics, which include the position of edge servers and the timing of task schedules, pose a complex optimization problem. These challenges are further exacerbated by the heterogeneity of IoT workloads and the constraints imposed by device mobility. Balancing computational overhead against communication cost is a further problem. To solve these issues, advanced methods are needed for resource management and dynamic task scheduling in mobile IoT and edge computing environments. In this paper, we propose a multi-objective deep reinforcement learning (DRL) algorithm: a Double deep Q-learning (DDQN) framework enhanced with spatio-temporal mobility prediction, latency-aware task offloading, and energy-constrained IoT device trajectory optimization for federated edge computing networks. DDQN was chosen for its optimization stability and reduced overestimation of Q-values. The framework employs a reward-driven optimization model that dynamically prioritizes latency-sensitive tasks, minimizes task migration overhead, and balances energy efficiency across devices and edge servers. It integrates dynamic resource allocation algorithms to address random task arrival patterns and real-time computational demands. Simulations demonstrate up to a 35% reduction in end-to-end latency, a 28% improvement in energy efficiency, and a 20% decrease in the deadline-miss ratio compared to benchmark algorithms.
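The reduced overestimation mentioned in the abstract comes from the standard Double DQN target, in which the online network selects the next action and the target network evaluates it. The sketch below illustrates only that generic target computation; the function name, argument layout, and `gamma` value are assumptions for illustration and are not taken from the paper's framework.

```python
def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Generic Double DQN bootstrap target for one transition.

    next_q_online: Q-values of the next state from the online network.
    next_q_target: Q-values of the next state from the target network.
    The online network picks the action; the target network scores it.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# The online network prefers action 1, so the target network's value for
# action 1 (0.4) is used rather than its own maximum (2.0).
target = double_dqn_target(1.0, [0.2, 0.9, 0.5], [1.0, 0.4, 2.0], gamma=0.5)
```

A plain DQN target would instead take the maximum over `next_q_target`, which is the source of the overestimation that Double DQN is designed to reduce.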

Relaxed 2-D principal component analysis by Lp norm for face recognition

arXiv, 2019

Authors: Chen, Xiao; Jia, Zhi-Gang; Cai, Yunfeng; Zhao, Mei-Xiang. Affiliations: School of Mathematics and Statistics, Jiangsu Key Laboratory of Education Big Data Science and Engineering, Jiangsu Normal University, Xuzhou 221116, China; Baidu Research, National Engineering Laboratory for Deep Learning Technology and Applications, Beijing 100193, China

A relaxed two-dimensional principal component analysis (R2DPCA) approach is proposed for face recognition. Unlike 2DPCA, 2DPCA-L1 and G2DPCA, R2DPCA uses the label information (if known) of the training samples to calculate a relaxation vector and assigns a weight to each subset of the training data. A new relaxed scatter matrix is defined, and the computed projection axes are able to increase the accuracy of face recognition. The optimal Lp-norms are selected in a reasonable range. Numerical experiments on practical face databases indicate that R2DPCA has high generalization ability and can achieve a higher recognition rate than state-of-the-art methods. Copyright © 2019, The Authors. All rights reserved.

Keywords: Face recognition

A Spatio-Spectral Hybrid Convolutional Architecture for Hyperspectral Document Authentication

International Conference on Document Analysis and Recognition

Authors: Muhammad Jaleed Khan; Khurram Khurshid; Faisal Shafait. Affiliations: Department of Electrical Engineering, Institute of Space Technology (IST), Islamabad, Pakistan; School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad, Pakistan; Deep Learning Laboratory, National Center of Artificial Intelligence (NCAI), Islamabad, Pakistan

Hyperspectral Document Image (HSDI) analysis allows for efficient and accurate differentiation of inks with visually similar color but unique spectral response, which is a crucial step in the authentication of documents. Various HSDI-based ink discrimination methods are available in the current literature; however, more accurate and robust methods are required to empower document authentication. In contrast to earlier ink mismatch detection methods based on spectral features only, we present a novel method based on deep learning that exploits the spectral correlation as well as the spatial context to enhance ink mismatch detection. Spectral responses of the target pixel and its neighboring pixels are organized in an image format and fed to a Convolutional Neural Network (CNN) for classification. The proposed method achieves the highest accuracy among ink mismatch detection methods on the UWA Writing Ink Hyperspectral Images database (WIHSI), which demonstrates the effectiveness of deep learning models employing spatio-spectral hybrid features for document authentication. Detailed experimental analysis for the selection of an appropriate CNN architecture, spatio-spectral data format and training ratio is presented, along with a comparison with previous methods on this subject.
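The abstract describes arranging the spectral responses of a target pixel and its neighbors in an image format, but it does not give the exact layout. As a minimal sketch under stated assumptions, the helper below stacks the spectra of a square neighborhood into a 2-D array with one row per pixel and one column per band; the function name, the 3x3 default neighborhood, and the edge clamping are all hypothetical choices, not the paper's data format.

```python
def spatio_spectral_patch(cube, row, col, radius=1):
    """Build a (neighbors x bands) matrix around one pixel.

    cube is a nested list indexed as cube[row][col][band].
    Pixels outside the image are replaced by the nearest edge pixel
    (an assumed border policy).
    """
    height, width = len(cube), len(cube[0])
    patch = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            r = min(max(row + dr, 0), height - 1)
            c = min(max(col + dc, 0), width - 1)
            patch.append(list(cube[r][c]))
    return patch


# Toy 3x3 image with 2 bands per pixel.
cube = [[[10 * r + c, 10 * r + c + 100] for c in range(3)] for r in range(3)]
patch = spatio_spectral_patch(cube, 1, 1)
print(len(patch), len(patch[0]))  # 9 2
```

Each such matrix could then be treated as a small single-channel image and passed to a CNN classifier; the choice of row ordering here is arbitrary.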

Keywords: Ink; Hyperspectral imaging; Feature extraction; Authentication; Deep learning; Databases; Text analysis

Part-level Car Parsing and Reconstruction from a Single Street View

arXiv, 2018

Authors: Geng, Qichuan; Zhang, Hong; Huang, Xinyu; Wang, Sen; Lu, Feixiang; Cheng, Xinjing; Zhou, Zhong; Yang, Ruigang. Affiliations: Beihang University, Beijing, China; Baidu Research, Beijing, China; National Engineering Laboratory of Deep Learning Technology and Application, China

Part information has been shown to be resistant to occlusions and viewpoint changes, which is beneficial for various vision-related tasks. However, we found very limited work in car pose estimation and reconstruction from street views leveraging the part information. There are two major contributions in this paper. Firstly, we make the first attempt to build a framework to simultaneously estimate shape, translation, orientation, and semantic parts of cars in 3D space from a single street view. As it is labor-intensive to annotate semantic parts on real street views, we propose a specific approach to implicitly transfer part features from synthesized images to real street views. For pose and shape estimation, we propose a novel network structure that utilizes both part features and 3D losses. Secondly, we are the first to construct a high-quality dataset that contains 348 different car models with physical dimensions and part-level annotations based on global and local deformations. Given these models, we further generate 60K synthesized images with randomization of orientation, illumination, occlusion, and texture. Our results demonstrate that our part segmentation performance is significantly improved after applying our implicit transfer approach. Our network for pose and shape estimation achieves the state-of-the-art performance on the ApolloCar3D dataset and outperforms 3D-RCNN and deepMANTA by 12.57 and 8.91 percentage points in terms of mean A3DP-Abs [17, 3, 30]. Copyright © 2018, The Authors. All rights reserved.

Keywords: Model automobiles

DeLS-3D: deep localization and segmentation with a 3D semantic map

arXiv, 2018

Authors: Wang, Peng; Yang, Ruigang; Cao, Binbin; Xu, Wei; Lin, Yuanqing. Affiliations: Baidu Research; National Engineering Laboratory for Deep Learning Technology and Applications

For applications such as augmented reality and autonomous driving, self-localization (camera pose estimation) and scene parsing are crucial technologies. In this paper, we propose a unified framework to tackle these two problems simultaneously. The uniqueness of our design is a sensor fusion scheme that integrates camera videos, motion sensors (GPS/IMU), and a 3D semantic map in order to achieve robustness and efficiency of the system. Specifically, we first obtain an initial coarse camera pose from consumer-grade GPS/IMU, based on which a label map can be rendered from the 3D semantic map. Then, the rendered label map and the RGB image are jointly fed into a pose CNN, yielding a corrected camera pose. In addition, to incorporate temporal information, a multi-layer recurrent neural network (RNN) is further deployed to improve the pose accuracy. Finally, based on the pose from the RNN, we render a new label map, which is fed together with the RGB image into a segment CNN that produces per-pixel semantic labels. In order to validate our approach, we build a dataset with registered 3D point clouds and video camera images. Both the point clouds and the images are semantically labeled. Each video frame has a ground-truth pose from highly accurate motion sensors. We show that, in practice, pose estimation relying solely on images, as in PoseNet [25], may fail due to street view confusion, and that it is important to fuse multiple sensors. Finally, various ablation studies are performed, which demonstrate the effectiveness of the proposed system. In particular, we show that scene parsing and pose estimation are mutually beneficial for achieving a more robust and accurate system. Copyright © 2018, The Authors. All rights reserved.

Keywords: Rendering (computer graphics)

Hyperspectral Image Analysis for Writer Identification using deep learning

Proceedings of the Digital Image Computing: Techniques and Applications (DICTA)

Authors: Ammad Ul Islam; Muhammad Jaleed Khan; Khurram Khurshid; Faisal Shafait. Affiliations: Artificial Intelligence and Computer Vision (iVision) Lab, Institute of Space Technology (IST), Islamabad, Pakistan; School of Electrical Engineering and Computer Science (SEECS), National University of Sciences and Technology (NUST), Islamabad, Pakistan; Deep Learning Laboratory, National Center of Artificial Intelligence (NCAI), Islamabad, Pakistan

Handwriting is a behavioral characteristic of human beings and one of the common idiosyncrasies utilized for litigation purposes. Writer identification is commonly used for forensic examination of questioned and specimen documents. Recent advancements in imaging and machine learning technologies have empowered the development of automated, intelligent and robust writer identification methods. Most of the existing methods, based on human-defined features and color imaging, have limited performance in terms of accuracy and robustness. However, the rich spectral information obtained from hyperspectral imaging (HSI) and suitable spatio-spectral features extracted using deep learning can significantly enhance the performance of writer identification in terms of accuracy and robustness. In this paper, we propose a novel writer identification method in which spectral responses of text pixels in a hyperspectral document image are extracted and fed to a Convolutional Neural Network (CNN) for writer classification. Different CNN architectures, hyperparameters, spatio-spectral formats, train-test ratios and inks are used to evaluate the performance of the proposed system on the UWA Writing Inks Hyperspectral Images (WIHSI) database and to select the most suitable set of parameters for writer identification. The findings of this work have opened a new arena in forensic document analysis for writer identification using HSI and deep learning.

Keywords: Hyperspectral imaging; Feature extraction; Ink; Text analysis; Forensics; Deep learning; Databases

Development of a Chinese Depressed Speech Corpus Based on The Disturbed Effect of Self-Processing

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)

Authors: Xiaoyong Lu; Yanqin Li; Haizhen An; Tao Pan; Renjun Li; Yanbin Hu; Aibao Zhou; Hongwu Yang. Affiliations: Key Laboratory of Behavioral and Mental Health, Lanzhou, China; Engineering Research Center of Gansu Province for Intelligent Information Technology and Application, Lanzhou, China; Lanzhou Resources and Environment Voc-Tech College, Lanzhou, China; National and Provincial Joint Engineering Laboratory of Learning Analysis Technology in Online Education, Lanzhou, China

ISBN (digital): 9781728132488

ISBN (print): 9781728132495

Depression has long been recognized as one of the leading causes of disability and burden worldwide. In psychology, it is well known that the self is not only the cognitive subject but also the core of personality. The high incidence of suicide and pervasive hopelessness in depressed individuals suggests that the self might be abnormal among them. In order to expand the application of depression detection, we employ classical scientific psychology paradigms on abnormalities of self-related processing in depressed individuals to develop a Chinese depressed speech corpus. Eleven depressed individuals and ten healthy subjects, gender-balanced and age-balanced, were recruited to participate in this study. So far we have collected 6 and 2.5 hours of speech data respectively, and preliminary analysis indicates that there are abnormalities in the depressed speech. The study results will provide a new perspective and strategy for further study on the building and application of Chinese speech corpora in depression.

Interactive language acquisition with one-shot visual concept learning through a conversational game

arXiv, 2018

Authors: Zhang, Haichao; Yu, Haonan; Xu, Wei. Affiliations: Baidu Research - Institute of Deep Learning, Sunnyvale, United States; National Engineering Laboratory for Deep Learning Technology and Applications, Beijing, China

Building intelligent agents that can communicate with and learn from humans in natural language is of great value. Supervised language learning is limited by the ability of capturing mainly the statistics of training data, and is hardly adaptive to new scenarios or flexible for acquiring new knowledge without inefficient retraining or catastrophic forgetting. We highlight the perspective that conversational interaction serves as a natural interface both for language learning and for novel knowledge acquisition and propose a joint imitation and reinforcement approach for grounded language learning through an interactive conversational game. The agent trained with this approach is able to actively acquire information by asking questions about novel objects and use the just-learned knowledge in subsequent conversations in a one-shot fashion. Results compared with other methods verified the effectiveness of the proposed approach. Copyright © 2018, The Authors. All rights reserved.

Keywords: Intelligent agents

The Speech Synthesis of Yi Language Based on DNN

Information, Media and engineering (IJCIM), International Joint Conference on

Authors: Xiaolong Bu; Hongwu Yang; Weizhao Zhang. Affiliations: College of Physics and Electronic Engineering, Northwest Normal University, Lanzhou, China; School of Educational Technology, National and Provincial Joint Engineering Laboratory of Learning Analysis Technology in Online Education, Northwest Normal University, Lanzhou, China; College of Physics and Electronic Engineering, Engineering Research Center of Gansu Province for Intelligent Information Technology and Application, Northwest Normal University, Lanzhou, China

ISBN (digital): 9781728155869

ISBN (print): 9781728155876

This paper presents a speech synthesis system based on a Deep Neural Network (DNN) model for the Yi language, a minority language in China. The system is composed of relatively complete Yi text analysis, model training, and speech synthesis modules. In the front end in particular, word segmentation, pause handling, word-to-phoneme conversion and label processing are used to analyze Yi text. We designed the question set for the decision tree used in DNN model training and used the WORLD vocoder for synthesis. The system achieves a relatively good Mean Opinion Score (MOS) of 3.93, as rated by Yi undergraduate evaluators, compared with a MOS of 4.58 for the original speech. To investigate the factors affecting the quality of synthesized Yi speech, this paper also objectively evaluates the performance of different training sets and DNN models. The system successfully synthesizes Yi speech for the first time, and the synthesized speech is relatively good given that this is the only complete speech synthesis system for this minority language.

Keywords: Speech synthesis; Text analysis; Training; Analytical models; Hidden Markov models; Dictionaries
