Search Results - Inner Mongolia University Library

DASVDD: Deep Autoencoding Support Vector Data Descriptor for Anomaly Detection

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, vol. 36, no. 8, pp. 3739-3750

Authors: Hojjati, Hadi; Armanfard, Narges (McGill Univ Dept Elect & Comp Engn Montreal PQ H3A 0C3 Canada; Mila Quebec AI Inst QC H2S Montreal PQ Canada)

One-Class anomaly detection aims to detect anomalies from normal samples using a model trained on normal data. With recent advancements in deep learning, researchers have designed efficient one-class anomaly detection methods. Existing works commonly use neural networks to map the data into a more informative representation and then apply an anomaly detection algorithm. In this paper, we propose a method, DASVDD, that jointly learns the parameters of an autoencoder while minimizing the volume of an enclosing hypersphere on its latent representation. We propose a novel anomaly score that combines the autoencoder's reconstruction error and the distance from the center of the enclosing hypersphere in the latent representation. Minimizing this anomaly score aids us in learning the underlying distribution of the normal class during training. Including the reconstruction error in the anomaly score ensures that DASVDD does not suffer from the hypersphere collapse issue since the DASVDD model does not converge to the trivial solution of mapping all inputs to a constant point in the latent representation. Experimental evaluations on several benchmark datasets show that the proposed method outperforms the commonly used state-of-the-art anomaly detection algorithms while maintaining robust performance across different anomaly classes.

Keywords: Anomaly detection; deep autoencoder; deep learning; support vector data descriptor
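
The DASVDD abstract above describes an anomaly score that combines the autoencoder's reconstruction error with the distance from the hypersphere center in latent space. A minimal sketch of such a score follows. The abstract does not say how the two terms are weighted or how the center is chosen, so the weight `gamma` and the mean-based `latent_center` helper are assumptions made for illustration, not the authors' implementation.

```python
def anomaly_score(x, x_hat, z, center, gamma=1.0):
    """Squared reconstruction error plus gamma times the squared
    distance of the latent code z from the hypersphere center.
    gamma is a hypothetical trade-off weight (an assumption)."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    dist = sum((a - b) ** 2 for a, b in zip(z, center))
    return recon + gamma * dist


def latent_center(latents):
    """Hypothetical choice of center: the mean of the latent codes."""
    n = len(latents)
    return [sum(column) / n for column in zip(*latents)]


# Toy values standing in for encoder/decoder outputs.
center = latent_center([[0.0, 0.0], [2.0, 2.0]])
normal_score = anomaly_score([1.0, 2.0], [1.0, 2.0], [1.0, 1.0], center)
odd_score = anomaly_score([1.0, 2.0], [0.0, 0.0], [3.0, 1.0], center, gamma=0.5)
```

Because the reconstruction term is part of the score, mapping every input to the same latent point no longer drives the score to zero, which is the intuition the abstract gives for avoiding hypersphere collapse.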

Adaptive fish school search optimized resnet for multi-view 3D objects reconstruction

MULTIMEDIA TOOLS AND APPLICATIONS, 2024, vol. 83, no. 32, pp. 77639-77666

Authors: Premalatha, V.; Parveen, Nikhat (Koneru Lakshmaiah Educ Fdn Dept Comp Sci & Engn Guntur Andhra Pradesh India)

Reconstruction of multi-view 3-dimensional images is essential in robotics and computer vision to obtain an accurate 3-dimensional representation of objects by analyzing the 2-dimensional input data. For reconstructing the 3-dimensional image, it is mandatory to analyze the 3-dimensional geometry features from multiple viewpoints. It includes feature extraction and transformation from 2-dimensional features to 3-dimensional volumetric meshes. However, the existing research cannot produce consistent reconstruction results for the same input images with different orders. Therefore, the deep learning-based Residual Network-50 model is developed for 3-dimensional image reconstruction from multi-view images in the present work. The proposed system model comprises a 2-dimensional and 3-dimensional network and a backpropagation layer. From the input image, 2-dimensional features are computed using a 2-dimensional network. Then, the metaheuristic Adaptive School of Fish Optimization is used to improve the neural network's output, determining the optimal weight that gives less classification error. Then, the testing process uses the deep autoencoder, which decodes the output of the training model. Residual Network-50 is used to reconstruct 2-dimensional images to 3-dimensional using single or multi-views. Finally, the experimental analysis is performed in Python. The experiment is performed on the ShapeNet dataset and compared with the existing works. The proposed model yields better accuracy, F-score and Intersection-over-Union values of 99.3%, 0.893 and 0.734, respectively, which is more efficient than other existing models.

Keywords: Object reconstruction; Fish school search optimization; deep autoencoder; Back projection layer; 3D reconstructed images; ResNet-50
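
The abstract above reports F-score and Intersection-over-Union (IoU) values. The sketch below shows how these two metrics can be computed on flattened binary voxel occupancy grids. The abstract does not state the paper's evaluation protocol, so the voxel-level definition used here is an assumption; 3D reconstruction work sometimes computes the F-score on point distances instead.

```python
def iou(pred, truth):
    """Intersection-over-Union of two flattened binary occupancy grids."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return inter / union if union else 1.0


def f_score(pred, truth):
    """Harmonic mean of precision and recall over occupied voxels."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy grids: 1 = occupied voxel, 0 = empty voxel.
predicted = [1, 1, 0, 0, 1]
reference = [1, 0, 0, 1, 1]
toy_iou = iou(predicted, reference)
toy_f = f_score(predicted, reference)
```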

Multi-path long-term vessel trajectories forecasting with probabilistic feature fusion for problem shifting

OCEAN ENGINEERING, 2024, vol. 312

Authors: Spadon, Gabriel; Kumar, Jay; Eden, Derek; van Berkel, Josh; Foster, Tom; Soares, Amilcar; Fablet, Ronan; Matwin, Stan; Pelot, Ronald (Dalhousie Univ Fac Comp Sci Halifax NS Canada; Dalhousie Univ Ind Engn Dept Halifax NS Canada; IMT Atlantique Dept Genie Math & Elect Brest France; Linnaeus Univ Dept Comp Sci & Media Tech Vaxjo Sweden; Polish Acad Sci Inst Comp Sci Warsaw Poland; DHI Water & Environm Inc Ottawa ON Canada)

This paper presents a deep auto-encoder model and a phased framework approach to predict the next 12 h of vessel trajectories using 1 to 3 h of Automatic Identification System (AIS) data as input. The strategy involves fusing spatiotemporal features from AIS messages with probabilistic features engineered from historical AIS data to reduce forecasting uncertainty. The probabilistic features have an F1-Score of approximately 85% and 75% for the vessel route and destination prediction, respectively. Under such circumstances, we achieved an R² Score of over 98% with different layer structures and varying feature combinations; the high R² Score is a natural outcome of the well-defined shipping lanes in the study region. However, our proposal stands out among competing approaches as it demonstrates the capability of complex decision-making during turnings and route selection. Furthermore, we have shown that our model achieves more accurate forecasting with average and median errors of 11 km and 6 km, respectively, a 25% improvement over the current state-of-the-art approaches. The resulting model from this proposal is deployed as part of a broader Decision Support System to safeguard whales by preventing the risk of vessel-whale collisions under the smartWhales initiative and acting on the Gulf of St. Lawrence in Atlantic Canada.

Keywords: Feature fusion; Probabilistic modeling; deep autoencoder; Spatiotemporal forecasting; Phased framework approach; Trajectory reconstruction

Face Hallucination From New Perspective of Non-Linear Learning Compressed Sensing

IEEE ACCESS, 2020, vol. 8, pp. 9434-9440

Authors: Yang, Shuyuan; Hao, Xiaoyang; Liu, Zhi; Yang, Chen; Wang, Min (Xidian Univ Sch Artificial Intelligence Xian 710071 Peoples R China; Xidian Univ Key Lab Radar Signal Proc Xian 710071 Peoples R China)

The past decade has witnessed a prosperity of sparsity-inspired face hallucination methods that use sparse prior and instances to generate High-Resolution (HR) faces. However, they need numerous Low-Resolution (LR) and HR instance pairs and adopt approximate sparse coding, which will bring bias to the recovery and suffer from high computational burden. In this paper we advance a Single Face Image Hallucination (SFIH) method from a new perspective of Non-linear Learning Compressive Sensing (NLCS), which can recover HR faces from a surprisingly small number of HR faces. The nonlinear sparse coding of facial images is explored, and a deep autoencoder (DAE) network is constructed for learning a kernel function from a single HR instance set. SFIH is then reduced to an analytic compressive recovery problem by reformulating linear sparse coding as a nonlinear DAE model. By exploring the nonlinear sparsity in the feature space, NLCS can accurately and rapidly recover HR facial images with large magnification factor and exhibit robustness to LR-HR instance pairs mapping. Some experiments are taken on realizing 3X, 6X, 9X amplification of face images, and the results prove its efficiency and superiority to its counterparts.

Keywords: Face hallucination; nonlinear sparse coding; non-linear learning compressed sensing; deep autoencoder

Unsupervised Deep Clustering With Hard Balanced Constraint: Application in Disciplinary-Focused Student Section Formation

IEEE ACCESS, 2024, vol. 12, pp. 98239-98253

Authors: Chantamunee, Siripinyo; Thamrongrat, Pornpon; Thanathamathee, Putthiporn; Chaisriya, Kannattha; Nizam, Dinna Nina Mohd (Walailak Univ Informat Innovat Ctr Excellence Nakhon Si Thammarat 80160 Thailand; Walailak Univ Sch Engn & Technol Nakhon Si Thammarat 80160 Thailand; Walailak Univ Sch Informat Nakhon Si Thammarat 80160 Thailand; Univ Malaysia Sabah Fac Comp & Informat User Experience Res Grp Labuan 87008 Malaysia)

Effective student group formation is crucial in higher education to foster collaborative learning environments. Grouping students by academic disciplines enhances peer-to-peer interactions and facilitates in-depth discussions on specialized topics. However, due to classroom space and resource constraints, it is challenging to accommodate all students from similar disciplines in one class. This necessitates a grouping method that can ensure a balanced distribution of students across available groups. Traditional K-means clustering, commonly used for this purpose, often results in inconsistent group sizes and fails to guarantee a balanced distribution of group members. Hard balanced clustering, which strictly enforces precise size limits on each cluster, offers a promising alternative for organizing balanced student sections to optimize classroom utilization. Nonetheless, most hard balanced clustering methods are limited in feature learning capability, which can lead to the overlooking of significant data patterns and result in ineffective clustering. To address this limitation, this paper introduces a new unsupervised model, deep Hard Balanced Clustering (DHBC), which integrates hard balanced clustering with a deep learning framework to enhance feature learning. DHBC incorporates a balanced clustering mechanism within the optimization process of an autoencoder architecture. It enhances the generated latent space representation by introducing a joint loss function that combines reconstruction and balanced clustering objectives, ensuring the embedded representation supports a balanced distribution of students. The model optimizes balanced clustering centroids during training. Comparative experiments conducted on real-world student enrollment datasets, evaluated by WCSS scores, demonstrate DHBC's superiority in creating more cohesive and balanced student groups compared to state-of-the-art methods.

Keywords: Clustering algorithms; Linear programming; Feature extraction; Representation learning; Decoding; deep learning; Computer architecture; Encoding; Hard balanced clustering; balanced clustering; deep clustering; deep autoencoder; group formation
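
To make the "hard balanced" constraint in the abstract above concrete, the sketch below assigns points to fixed centroids while capping every cluster at ceil(n / k) members. It is a simple greedy heuristic chosen for illustration and is not the DHBC optimization; the function name `balanced_assign` and the toy data are hypothetical.

```python
import math


def balanced_assign(points, centroids):
    """Greedy capacity-constrained assignment: consider the closest
    (point, centroid) pairs first and skip any cluster that is full."""
    n, k = len(points), len(centroids)
    capacity = math.ceil(n / k)
    pairs = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centroids)
    )
    labels = [None] * n
    sizes = [0] * k
    for _, i, j in pairs:
        if labels[i] is None and sizes[j] < capacity:
            labels[i] = j
            sizes[j] += 1
    return labels


# Four points sit near the first centroid and two near the second.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (9, 9), (9, 8)]
centroids = [(0, 0), (9, 9)]
labels = balanced_assign(points, centroids)
```

A plain nearest-centroid rule would place four points in the first cluster and two in the second; with the capacity cap, the point (1, 1) is moved to the second cluster so that both sections hold three members.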

D³FC: deep feature-extractor discriminative dictionary-learning fuzzy classifier for medical imaging

APPLIED INTELLIGENCE, 2022, vol. 52, no. 7, pp. 7201-7217

Authors: Ghasemi, Majid; Kelarestaghi, Manoochehr; Eshghi, Farshad; Sharifi, Arash (Islamic Azad Univ Dept Comp Engn Farsan Branch Farsan *** Iran; Kharazmi Univ Dept Elect & Comp Engn Tehran *** Iran)

By providing accurate and speedy diagnosis and, in turn, treatment, automated medical image analysis plays a significant role in survival rate improvement. The various inherent uncertainties and complexities make machine-learning-based classification approaches, particularly dictionary-learning-based ones, very promising. This work concerns class-specific fuzzy discriminative dictionary learning using deep features, continuing the evolution path of our machine-learning-based medical image classifiers. In D³FC, a deep autoencoder generates a more relevant, representative, and compact feature set. The distinctive hidden information and the inherent complexity and uncertainty of medical images are addressed using fuzzy-discriminative terms in the optimization function, simultaneously improving the inter-class-representation distance and intra-class-representation similarity. A comprehensive set of experiments on cancer tumor images from three different databases shows that D³FC outperforms related state-of-the-art competitors in accuracy, sensitivity, specificity, precision, convergence speed, and noise resilience. The significance of the experimental results is statistically verified.

Keywords: Class-specific deep autoencoder; Discriminative sparse representation; Fuzzy dictionary learning; Intra- Inter-class distance

Deep Rating and Review Neural Network for Item Recommendation

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, vol. 33, no. 11, pp. 6726-6736

Authors: Xi, Wu-Dong; Huang, Ling; Wang, Chang-Dong; Zheng, Yin-Yu; Lai, Jian-Huang (Sun Yat Sen Univ Sch Comp Sci & Engn Guangzhou 510006 Peoples R China; Guangdong Prov Key Lab Computat Sci Guangzhou 518055 Peoples R China; Minist Educ Key Lab Machine Intelligence & Adv Comp Guangzhou 510006 Peoples R China; South China Agr Univ Coll Math & Informat Guangzhou 510642 Peoples R China; Sun Yat Sen Univ Lingnan Univ Coll Guangzhou 510275 Peoples R China)

To alleviate the sparsity issue, many recommender systems have been proposed to consider the review text as the auxiliary information to improve the recommendation quality. Despite success, they only use the ratings as the ground truth for error backpropagation. However, the rating information can only indicate the users' overall preference for the items, while the review text contains rich information about the users' preferences and the attributes of the items. In real life, reviews with the same rating may have completely opposite semantic information. If only the ratings are used for error backpropagation, the latent factors of these reviews will tend to be consistent, resulting in the loss of a large amount of review information. In this article, we propose a novel deep model termed deep rating and review neural network (DRRNN) for recommendation. Specifically, compared with the existing models that adopt the review text as the auxiliary information, DRRNN additionally considers both the target rating and target review of the given user-item pair as ground truth for error backpropagation in the training stage. Therefore, we can keep more semantic information of the reviews while making rating predictions. Extensive experiments on four publicly available datasets demonstrate the effectiveness of the proposed DRRNN model in terms of rating prediction.

Keywords: Predictive models; Semantics; Recommender systems; Collaboration; Backpropagation; Neural networks; Electronic mail; deep autoencoder; deep neural networks (DNNs); ratings and reviews; recommender systems

Efficient quality enhancement of gastrointestinal endoscopic video by a novel method of color salient bilateral filtering

MULTIMEDIA TOOLS AND APPLICATIONS, 2021, vol. 80, no. 4, pp. 6235-6245

Authors: Das, Apurba; Shylaja, S. S. (PES Univ Comp Sci & Engn Bangalore Karnataka India)

Recent advancements in bio-photonics have enabled physicians to combine techniques such as narrow-band imaging, fluorescence spectroscopy, and optical coherence tomography with visible-spectrum endoscopy video to provide in vivo microscopic tissue characterization in online optical biopsy (Ye et al. 2015; Wang and Van Dam 2004). Despite these advantages, it is challenging for gastroenterologists to retarget the optical biopsy sites during endoscopic examinations because of the degraded quality of endoscopic video, which is corrupted by haze, noise, oversaturated illumination, etc. Enhancing video frames by treating color channels independently produces unintended phantom colors because it ignores the psycho-visual correspondence. To address these issues, we propose a novel algorithm to enhance video with faster performance. The proposed C²D²A (Cross Color Dominant deep autoencoder) uses the strength of (a) bilateral filtering in both the spatial neighborhood domain and the psycho-visual range; (b) a deep autoencoder, which learns salient patterns. The domain-based color sparseness further improves the performance, modulating the classical deep autoencoder into a color dominant deep autoencoder. The work shows promise not only as a generic framework for quality enhancement of video streams but also in addressing performance. The current work in turn improves image and video analytics such as segmentation, detection, and tracking of objects or regions of interest.

Keywords: Endoscopy; deep autoencoder; Color sparseness; Bilateral filtering

Audio-visual feature fusion via deep neural networks for automatic speech recognition

DIGITAL SIGNAL PROCESSING, 2018, vol. 82, pp. 54-63

Authors: Rahmani, Mohammad Hasan; Almasganj, Farshad; Seyyedsalehi, Seyyed Ali (Amirkabir Univ Technol Biomed Engn Dept Hafez Ave Tehran Iran)

The brain-like functionality of artificial neural networks, together with their great performance in various areas of scientific applications, makes them a reliable tool to be employed in Audio-Visual Speech Recognition (AVSR) systems. The applications of such networks in AVSR systems extend from the preliminary stage of feature extraction to the higher levels of information combination and speech modeling. In this paper, some carefully designed deep autoencoders are proposed to produce efficient bimodal features from the audio and visual stream inputs. The basic proposed structure is modified in three successive steps to make better use of the visual information from the Region of Interest (ROI) around the speakers' lips. The performance of the proposed structures is compared to both the unimodal and bimodal baselines in a professional phoneme recognition task, under different noisy audio conditions. This is done by employing a state-of-the-art DNN-HMM hybrid as the speech classifier. In comparison to the MFCC audio-only features, the finally proposed bimodal features yield an average relative reduction in Phoneme Error Rate (PER) of 36.9% over a range of different noisy conditions, and a relative reduction of 19.2% for the clean condition. © 2018 Elsevier Inc. All rights reserved.

Keywords: Audio-visual speech recognition; deep autoencoder; deep neural networks; Feature extraction; Multimodal information processing

An Intelligent Speech Multifeature Recognition Method Based on Deep Machine Learning: A Smart City Application

JOURNAL OF TESTING AND EVALUATION, 2024, vol. 52, no. 3, pp. 1389-1403

Authors: Song, Ye; Yan, Kai (Guangdong Vocat & Tech Coll Posts & Telecom Dept Econ Management 191 Zhongshan Rd Guangzhou 510000 Peoples R China; Guangdong Vocat & Tech Coll Posts & Telecom Sch Mobile Commun 191 Zhongshan Rd Guangzhou 510000 Peoples R China)

Speech recognition suffers from low recognition accuracy because of poor denoising and low endpoint detection accuracy. Therefore, a new intelligent speech multifeature recognition method based on deep machine learning is proposed. In this method, speech signals are digitally processed, a first-order finite impulse response (FIR) high pass digital filter is used to pre-emphasize digital speech signals, and short-term energy and zero crossing rate are combined to detect speech signals to expand endpoints. The detected speech signal is input into the deep autoencoder, and the features of the speech signal are extracted through deep learning. The Gaussian mixture model of deep machine learning is constructed using a continuous distribution hidden Markov model, and the extracted features are input into the model to complete feature recognition. The experimental results show that the proposed method has high endpoint detection accuracy, good denoising effect, and high recognition accuracy, and this method has higher application value.

Keywords: deep machine learning; first-order FIR high pass digital filter; deep autoencoder; endpoint detection; Gaussian mixture model; speech feature recognition
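
The front end described in the abstract above (first-order FIR high-pass pre-emphasis followed by endpoint detection from short-term energy and zero crossing rate) can be sketched as follows. The abstract gives no coefficients, frame sizes, thresholds, or combination rule, so the pre-emphasis coefficient `alpha=0.97`, the frame settings, both thresholds, and the "energy OR zero-crossing" rule are assumptions chosen only to make the example run.

```python
def pre_emphasis(signal, alpha=0.97):
    """First-order FIR high-pass filter: y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def split_frames(signal, size, hop):
    """Cut the signal into fixed-size frames taken every `hop` samples."""
    return [signal[s:s + size] for s in range(0, len(signal) - size + 1, hop)]


def short_time_energy(frame):
    return sum(v * v for v in frame)


def zero_crossing_rate(frame):
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(signal, size=4, hop=4, energy_thr=1.0, zcr_thr=0.5):
    """Return (start, end) sample indices of the active region, or None.
    A frame counts as active if either feature exceeds its threshold."""
    emphasized = pre_emphasis(signal)
    active = [
        i for i, frame in enumerate(split_frames(emphasized, size, hop))
        if short_time_energy(frame) > energy_thr
        or zero_crossing_rate(frame) > zcr_thr
    ]
    if not active:
        return None
    return active[0] * hop, active[-1] * hop + size


# Silence, a short alternating burst, then silence again.
toy_signal = [0.0] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.0] * 4
endpoints = detect_endpoints(toy_signal)
```

In a pipeline like the one the abstract outlines, the samples between the detected endpoints would then be passed to the feature extractor.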
