Skin cancer is a high-incidence cancer that seriously threatens people's lives and health. Early detection, early diagnosis, and early treatment are among the effective ways to increase the survival rate ...
Document layout analysis is an important part of document information processing systems and is essential for many applications such as optical character recognition (OCR), machine translation, information retrieval, and structured data extraction from documents, as well as for digitizing paper documents and classifying and identifying document image regions. Document images contain a wealth of information, and to automatically extract and classify regions of interest, their layout content is analyzed programmatically before subsequent OCR and automatic transcription. However, existing algorithms still have notable limitations due to diverse document layouts, variations in block positions, inter-class and intra-class variation, and background noise. This paper first summarizes traditional algorithms based on run-length smoothing and projection segmentation, deep learning algorithms using recurrent convolutional neural networks and Siamese networks, and recently proposed algorithms that combine traditional and deep learning. The current mainstream algorithms, the datasets commonly used in deep learning experiments, and how to access them are highlighted, and a comparison of representative algorithms on benchmark datasets, together with experimental results showing good robustness, is given. Finally, future research directions are discussed.
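To make the traditional side of the survey concrete, here is a minimal, hedged sketch of projection-profile segmentation, one of the classical techniques mentioned above. It assumes a binarized page as a NumPy array; the `min_height` and `ink_threshold` parameters are illustrative choices, not values taken from the survey.

```python
import numpy as np

def projection_profile_rows(binary_page, min_height=3, ink_threshold=0):
    """Split a binarized page (ink = 1, background = 0) into horizontal
    text bands using its row-wise projection profile."""
    profile = binary_page.sum(axis=1)          # ink count per row
    bands, start = [], None
    for row, ink in enumerate(profile):
        if ink > ink_threshold and start is None:
            start = row                        # entering a text band
        elif ink <= ink_threshold and start is not None:
            if row - start >= min_height:      # drop very thin bands (noise)
                bands.append((start, row))
            start = None
    if start is not None:                      # band touching the page bottom
        bands.append((start, len(profile)))
    return bands

# Toy usage: a 20x30 "page" with two synthetic text bands.
page = np.zeros((20, 30), dtype=np.uint8)
page[2:7, 3:27] = 1
page[11:17, 3:27] = 1
print(projection_profile_rows(page))           # -> [(2, 7), (11, 17)]
```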
With the development of deep learning, speaker recognition systems have shown increasingly better performance. The generalization ability of the models is also an important aspect of performance evaluation. Typically, a baseline system is used to compare against the improved models to demonstrate performance enhancements. However, we cannot determine the differences in learned voiceprint features between the improved models and the baseline system. This paper introduces an improved speaker recognition system based on the ECAPA-TDNN model. It utilizes stable learning to eliminate sample correlation and employs attribution analysis to compare the differences in voiceprint feature learning between the improved and baseline systems. Experimental results demonstrate that stable learning improves the model’s generalization performance and helps it learn better voiceprint features. The effectiveness and generalization capability of the proposed method are verified through experiments on the VoxCeleb, CNCeleb, and LibriSpeech datasets. This work is important for enhancing speaker recognition performance, analyzing differences in voiceprint feature learning, and promoting advancements in the field.
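The abstract does not spell out the stable-learning formulation, so the following is only a rough sketch of one common variant: learning per-sample weights that decorrelate batch features and then using them to reweight the training loss. The `ecapa_tdnn`, `classifier`, and `criterion` names in the usage comment are hypothetical placeholders, not components taken from the paper.

```python
import torch

def decorrelation_weights(features, steps=100, lr=0.1):
    """Learn per-sample weights that reduce pairwise feature correlation in a
    batch, a simplified form of stable-learning sample reweighting.

    features: (batch, dim) tensor of speaker embeddings.
    Returns a (batch,) tensor of non-negative weights with mean 1.
    """
    logits = torch.zeros(features.size(0), requires_grad=True)
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        w = torch.softmax(logits, dim=0) * features.size(0)   # mean weight = 1
        mean = (w.unsqueeze(1) * features).mean(dim=0, keepdim=True)
        centered = features - mean
        cov = (w.unsqueeze(1) * centered).t() @ centered / features.size(0)
        off_diag = cov - torch.diag(torch.diag(cov))
        loss = (off_diag ** 2).sum()                          # penalize correlation
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (torch.softmax(logits, dim=0) * features.size(0)).detach()

# Hypothetical usage: reweight the per-utterance speaker-classification loss.
# emb = ecapa_tdnn(waveforms)                     # (batch, dim) embeddings
# w = decorrelation_weights(emb.detach())
# loss = (w * criterion(classifier(emb), labels)).mean()  # criterion with reduction='none'
```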
ISBN (digital): 9781665476744
ISBN (print): 9781665476751
In this paper, we focus on the task of bilingual dictionary induction for the Chinese-Uyghur language pair. Correlating long-distance linguistic information usually requires cross-lingual information as supervision, which in turn relies on parallel corpora to link seed lexicons, and such parallel corpora are expensive to obtain. Text data for the low-resource Uyghur language are available only in small amounts, and its derivational morphology is rich and complex. In bilingual processing, aligning the most similar units and entity stems is the first step, so separating sentences into morpheme sequences is essential for cross-lingual processing tasks. Uyghur words in text consist of stems joined with several suffixes or prefixes, and the rich, complex affix forms produce many derivative words. This easily increases the repetition rate of features in the text, which reduces the efficiency of bilingual dictionary extraction. In this work, we explore resource construction and granularity optimization for minority low-resource languages and learn cross-lingual word embeddings without the supervision of parallel data. We propose a Chinese-Uyghur bilingual dictionary extraction method based on neural cross-lingual word embeddings and a multilingual morphological analyzer. Experiments show that the morpheme-sequence-based approach achieves a significant improvement over the baseline model based on word sequences.
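As a hedged illustration of the cross-lingual word-embedding component, the sketch below shows the standard Procrustes mapping and nearest-neighbour dictionary induction steps used in many unsupervised bilingual-embedding pipelines; the paper's exact training procedure is not given in the abstract, so `src_emb`, `tgt_emb`, and `pairs` are assumed inputs.

```python
import numpy as np

def procrustes(src_emb, tgt_emb, pairs):
    """Orthogonal mapping W minimizing ||X W^T - Y|| over aligned pairs
    (the refinement step used in many cross-lingual embedding methods)."""
    X = src_emb[[i for i, _ in pairs]]
    Y = tgt_emb[[j for _, j in pairs]]
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return U @ Vt

def induce_dictionary(src_emb, tgt_emb, W, k=1):
    """Nearest-neighbour dictionary induction after mapping source units
    into the target embedding space."""
    mapped = src_emb @ W.T
    mapped /= np.linalg.norm(mapped, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = mapped @ tgt.T
    return np.argsort(-sims, axis=1)[:, :k]    # top-k target indices per source unit
```

In the morpheme-level setting described above, `src_emb` would hold embeddings of Uyghur morphemes produced by the morphological analyzer and `tgt_emb` the Chinese-side embeddings; alternating the two functions gives the usual self-learning refinement loop.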
Sparse Subspace Clustering (SSC) is integral to image processing, drawing from spectral clustering foundations. However, prevalent methods, relying on an ℓ1-norm constraint, fail to capture nuanced inter-region correlations, affecting segmentation efficacy. To remedy this, we introduce an Adaptive Gaussian Regularization Constrained SSC for enhanced image segmentation. This method begins with superpixel preprocessing to enrich local information. Given the Gaussian nature of the SSC’s sparse coefficient matrix, a Gaussian probability density function is infused as a regularization term, reinforcing regional image ties and facilitating similarity matrix creation. Using spectral clustering, we then define superpixel clusters leading to the final segmentation. When tested against the BSDS500 and SBD datasets and other leading algorithms, our model showcases marked improvements in natural image segmentation.
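A minimal sketch of the overall pipeline, assuming superpixel-level feature vectors as input: self-expressive sparse coding with an ℓ1 (Lasso) penalty, an illustrative Gaussian reweighting standing in for the paper's Gaussian-PDF regularization term (whose exact form the abstract does not give), and spectral clustering of the resulting affinity.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def ssc_segment(features, n_clusters, alpha=0.01, sigma=1.0):
    """Sparse-subspace clustering of superpixel features.

    features: (n_superpixels, dim) array, e.g. mean colour/texture per superpixel.
    Each sample is expressed as a sparse combination of the others (l1 / Lasso);
    the Gaussian weighting of the coefficients below is illustrative only.
    """
    n = features.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        lasso = Lasso(alpha=alpha, max_iter=5000)
        lasso.fit(features[idx].T, features[i])     # self-expressive coding
        C[i, idx] = lasso.coef_
    # Gaussian reweighting of coefficients (zero-mean assumption on C).
    C = np.abs(C) * np.exp(-(C ** 2) / (2 * sigma ** 2))
    W = C + C.T                                     # symmetric, non-negative affinity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(W)
```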
ISBN (digital): 9798350390155
ISBN (print): 9798350390162
In this paper, we propose a novel loss by integrating a deep clustering (DC) loss at the frame level and a speaker recognition loss at the segment level into a single network, without additional data requirements or exhaustive computation. The DC loss implicitly generates soft pseudo-phoneme labels for each frame-level feature, which facilitates extracting more discriminative speaker representations by suppressing phonetic content information. We study the DC loss not only on the acoustic feature but also on features extracted by pre-trained models such as wav2vec 2.0, HuBERT, and WavLM. Experimental results on the VoxCeleb dataset show that systems based on pre-trained model features outperform the one based on the acoustic feature. The proposed loss is significantly effective for systems based on the acoustic feature and yields a marginal improvement for systems based on pre-trained model features.
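A hedged sketch of how such a combined objective can be assembled: the classic deep-clustering affinity loss in its memory-efficient form, with soft assignments to a bank of learnable centroids acting as the pseudo-phoneme labels. The centroid bank, `lambda_dc`, and the AAM-softmax head in the usage comment are assumptions, not details from the paper.

```python
import torch
import torch.nn.functional as F

def deep_clustering_loss(frame_emb, assignments):
    """Deep-clustering objective ||V V^T - A A^T||_F^2 in its memory-efficient
    form, with soft assignments A acting as pseudo-labels.

    frame_emb:   (T, D) frame-level embeddings V.
    assignments: (T, K) soft pseudo-phoneme assignments A (rows sum to 1).
    """
    V = F.normalize(frame_emb, dim=1)
    A = assignments
    return ((V.t() @ V) ** 2).sum() \
         - 2 * ((V.t() @ A) ** 2).sum() \
         + ((A.t() @ A) ** 2).sum()

# One way to combine it with a segment-level speaker loss (lambda_dc, the
# centroid bank and the AAM-softmax head are illustrative placeholders):
# A = torch.softmax(frame_emb @ centroids.t(), dim=1)          # (T, K) soft labels
# loss = aam_softmax(segment_emb, spk_label) + lambda_dc * deep_clustering_loss(frame_emb, A)
```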
Deep neural network models based on the x-vector have become the most popular framework for speaker recognition, and the quality of speaker features (embeddings) is important for open-set tasks such as speaker verification and speaker diarization. Currently, the most popular loss functions are based on margin penalties; however, they only consider enlarging inter-class distances while neglecting to reduce intra-class feature differences. Therefore, we propose a multi-view learning approach that divides the training process into two views at the speaker embedding level. The classification view focuses on distinguishing different speakers, while the clustering view focuses on shrinking the feature boundaries of the same speaker, making intra-class differences smaller. The combined effect of the two views achieves large inter-class distances and small intra-class distances, resulting in more discriminative and stable speaker embeddings. We test the method on both speaker verification and speaker diarization tasks, and the results demonstrate the effectiveness of our approach.
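One way to realize the two views, sketched under assumptions: a cross-entropy/margin classification term for inter-class separation plus a center-loss-style clustering term that pulls embeddings toward their speaker centroid. The paper's exact clustering-view loss is not specified in the abstract, so the center loss here is only a stand-in; `classifier` and `lambda_c` are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CenterLoss(nn.Module):
    """Clustering-view term: pull each embedding toward its speaker centroid,
    shrinking intra-class spread."""
    def __init__(self, num_speakers, emb_dim):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(num_speakers, emb_dim))

    def forward(self, emb, labels):
        # Mean squared distance between each embedding and its class center.
        return ((emb - self.centers[labels]) ** 2).sum(dim=1).mean()

# Combined objective: classification view + lambda_c * clustering view.
# logits = classifier(emb)                                  # (batch, num_speakers)
# loss = F.cross_entropy(logits, labels) + lambda_c * center_loss(emb, labels)
```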
We propose a deep neural network with spectrogram matching and mutual attention (SMMA-Net) for audio clue-based target speaker extraction (TSE). To effectively use the auxiliary speech, we propose a spectrogram matching (SM) strategy and a mutual attention (MA) block. We conducted all experiments on the WSJ0-2mix-extr dataset. The ablation and comparison studies verified the effectiveness of the SM strategy and the MA block. The experimental results show that our proposed method outperforms state-of-the-art methods by a sizable margin of 1.3 dB in scale-invariant signal-to-distortion ratio improvement. Additionally, under a similar architecture, SMMA-Net's performance on the TSE task exceeds its performance on the speaker separation task. The main code will be available at https://***/Ht-Xu/SMMA-Net.
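The abstract does not detail the MA block, so the following is a generic stand-in: bidirectional cross-attention between mixture and enrollment features followed by a simple fusion layer, written with standard PyTorch modules. The dimensions and the fusion scheme are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class MutualAttentionBlock(nn.Module):
    """Bidirectional cross-attention between mixture and enrollment features,
    a generic sketch of a mutual-attention block."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.mix_to_aux = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.aux_to_mix = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.fuse = nn.Linear(2 * dim, dim)

    def forward(self, mix, aux):
        # mix: (B, T_mix, D) mixture features; aux: (B, T_aux, D) enrollment features.
        mix_att, _ = self.mix_to_aux(mix, aux, aux)   # mixture queries enrollment
        aux_att, _ = self.aux_to_mix(aux, mix, mix)   # enrollment queries mixture
        aux_summary = aux_att.mean(dim=1, keepdim=True).expand_as(mix)
        return self.fuse(torch.cat([mix_att, aux_summary], dim=-1))

# block = MutualAttentionBlock(dim=256)
# out = block(torch.randn(2, 200, 256), torch.randn(2, 80, 256))  # -> (2, 200, 256)
```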
In the field of spectral analysis, the common Raman spectral feature selection model can extract features effectively, but it will change the original data *** teacher model assists the student model in distillation t...
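The truncated abstract only indicates that a teacher model assists a student model through distillation; for context, the sketch below is the standard knowledge-distillation objective (soft-target KL term at temperature T plus hard-label cross-entropy), not the paper's Raman-spectrum-specific formulation.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Standard knowledge-distillation loss: temperature-scaled KL divergence
    to the teacher's soft targets plus the usual hard-label cross-entropy."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```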