ISBN:
(Print) 9781510872219
Visual information can be incorporated into automatic speech recognition (ASR) systems to improve their robustness in adverse acoustic conditions. Conventional audio-visual speech recognition (AVSR) systems require highly specialized audio-visual (AV) data in both system training and evaluation. For many real-world speech recognition applications, only audio information is available. This presents a major challenge to wider application of AVSR systems. To address this challenge, this paper proposes a semi-supervised visual feature learning approach for developing AVSR systems on a DARPA GALE Mandarin broadcast transcription task. Audio-to-visual feature inversion long short-term memory neural networks (LSTMs) were initially constructed using limited amounts of out-of-domain AV data. The acoustic feature domain mismatch against the broadcast data was further reduced using multi-level domain adaptive deep networks. Visual features were then automatically generated from the broadcast speech audio and used at both AVSR system training and testing time. Experimental results suggest that a CNN based AVSR system using the proposed semi-supervised cross-domain audio-to-visual feature generation technique outperformed the baseline audio-only CNN ASR system by an average relative CER reduction of 6.8%. In particular, on the most difficult Phoenix TV subset, a CER reduction of 1.32% absolute (8.34% relative) was obtained.
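The core of this pipeline is a regression network that maps acoustic features to visual features so that no camera stream is needed at recognition time. Below is a minimal PyTorch sketch of such an audio-to-visual inversion LSTM; the feature dimensions, layer sizes, and MSE training objective are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class AudioToVisualLSTM(nn.Module):
    """Regresses per-frame visual features from acoustic features.
    Dimensions (40-d acoustic in, 32-d visual out) are assumed."""
    def __init__(self, audio_dim=40, visual_dim=32, hidden=256, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(audio_dim, hidden, num_layers=layers,
                            batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, visual_dim)

    def forward(self, audio):            # audio: (batch, frames, audio_dim)
        out, _ = self.lstm(audio)
        return self.proj(out)            # (batch, frames, visual_dim)

# One training step on parallel AV data (synthetic tensors stand in for
# the limited out-of-domain AV corpus described in the abstract).
model = AudioToVisualLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
audio = torch.randn(8, 100, 40)          # acoustic feature frames
visual = torch.randn(8, 100, 32)         # time-aligned visual features
opt.zero_grad()
loss = nn.functional.mse_loss(model(audio), visual)
loss.backward()
opt.step()
```

Once trained, the same network generates pseudo visual features directly from the broadcast audio for both AVSR training and testing.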
ISBN:
(Print) 9798350349405; 9798350349399
Deep learning for automated cell imaging analysis has become a tool of choice for processing large amounts of data. However, many of these methods lack explainability, slowing their deployment for tasks such as diagnosis. We present a prototype-based framework for analyzing structural changes that addresses the specific challenges of explainability in the context of cell imaging. Our method relies on classification between two distinct cell populations in a weakly supervised context where no label for individual cells is available. Our model extracts typical features from each population, representing intra-cellular structure, and explains its classification decision by creating visualizations of the local textures corresponding to the structures of interest. We show a real application where it effectively highlights a change in the organization of the actin content of the cells.
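As a rough illustration of a prototype-based decision rule of this kind, the sketch below scores an image embedding by its distance to learned per-population prototypes; the embedding dimension, the number of prototypes per class, and the nearest-prototype scoring are assumptions, since the paper's exact architecture is not reproduced here.

```python
import torch
import torch.nn as nn

class PrototypeClassifier(nn.Module):
    """Two populations, k learned prototypes each; a sample is scored
    by its distance to the nearest prototype of every class."""
    def __init__(self, feat_dim=128, k=5, n_classes=2):
        super().__init__()
        self.n_classes = n_classes
        self.prototypes = nn.Parameter(torch.randn(n_classes * k, feat_dim))

    def forward(self, feats):                         # (batch, feat_dim)
        d = torch.cdist(feats, self.prototypes)       # (batch, n_classes*k)
        d = d.view(feats.size(0), self.n_classes, -1) # (batch, class, k)
        return -d.min(dim=2).values                   # higher = closer

scores = PrototypeClassifier()(torch.randn(4, 128))   # (4, 2) class scores
```

Because each prototype lives in the same space as the image features, the local regions most similar to a prototype can be rendered back as texture visualizations, which is the kind of explanation the abstract describes.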
ISBN:
(Print) 9781450379885
There is a huge market demand for searching for products by image on e-commerce sites. Visual features play the most important role in solving this content-based image retrieval task. Most existing methods leverage models pre-trained on other large-scale datasets with well-annotated labels, e.g. the ImageNet dataset, to extract visual features. However, due to the large difference between product images and the images in ImageNet, a feature extractor trained on ImageNet is not efficient at extracting the visual features of product images. Retraining the feature extractor on the product images, meanwhile, faces the dilemma of lacking annotated labels. In this paper, we utilize easily accessible text information, namely the product title, as a supervision signal to learn features of the product image. Specifically, we use the n-grams extracted from the product title as the label of the product image to construct a dataset for image classification. This dataset is then used to fine-tune a pre-trained model. Finally, the basic max-pooling activation of convolutions (MAC) feature is extracted from the fine-tuned model. As a result, we achieve fourth place in the Grand Challenge of AI Meets Beauty at 2020 ACM Multimedia using only a single ResNet-50 model without any human annotations or pre-processing or post-processing tricks. Our code is available at: https://***/FanpdangFeng/AI-Meets-Beauty-2020.
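A hedged sketch of the two ingredients described above: n-gram labels mined from product titles, and MAC extraction from a ResNet-50. The n-gram size, the frequency threshold for keeping an n-gram as a class, and the use of torchvision are illustrative choices, not the authors' exact setup.

```python
from collections import Counter
import torch
import torchvision.models as models

def title_ngrams(title, n=2):
    """Word n-grams from a product title, used as weak class labels."""
    words = title.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# Keep only n-grams frequent enough to act as classes (threshold assumed).
titles = ["matte lipstick long lasting", "long lasting eye liner"]
counts = Counter(g for t in titles for g in title_ngrams(t))
label_set = sorted(g for g, c in counts.items() if c >= 2)

# MAC descriptor: global max-pool over the last convolutional feature map.
resnet = models.resnet50(weights=None)   # load the fine-tuned weights in practice
backbone = torch.nn.Sequential(*list(resnet.children())[:-2])  # drop avgpool/fc

def mac_feature(images):                 # images: (batch, 3, H, W)
    fmap = backbone(images)              # (batch, 2048, h, w)
    return torch.amax(fmap, dim=(2, 3))  # (batch, 2048) MAC descriptor
```

Retrieval then reduces to nearest-neighbor search over (typically L2-normalized) MAC descriptors of the catalog images.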
Aiming to improve the performance of visual classification in a cost-effective manner, this paper proposes an incremental semi-supervised learning paradigm called deep co-space (DCS). Unlike many conventional semi-supervised learning methods, which are usually performed within a fixed feature space, our DCS gradually propagates information from labeled samples to unlabeled ones along with deep feature learning. We regard deep feature learning as a series of steps pursuing feature transformation, i.e., projecting the samples from a previous space into a new one, which tends to select the reliable unlabeled samples with respect to this setting. Specifically, for each unlabeled image instance, we measure its reliability by calculating the category variations of feature transformation from two different neighborhood variation perspectives and merging them into a unified sample mining criterion derived from the Hellinger distance. Then, those samples that keep a stable correlation to their neighboring samples (i.e., have small category variation in distribution) across successive feature space transformations automatically receive labels and are incorporated into the model for incremental training in terms of classification. Our extensive experiments on standard image classification benchmarks (e.g., Caltech-256 and SUN-397) demonstrate that the proposed framework is capable of effectively mining from large-scale unlabeled images, which boosts image classification performance and achieves promising results compared with other semi-supervised learning methods.
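The sample-mining criterion can be illustrated with a small sketch: for one unlabeled sample, compare the category distribution of its labeled neighbors in two successive feature spaces via the Hellinger distance, H(p, q) = (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2, and accept a pseudo-label only when the distance is small. The neighborhood size, threshold, and use of scikit-learn are assumptions for illustration.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def neighbor_class_dist(feats, labels, x, n_classes, k=10):
    """Category distribution over the k labeled neighbors of x
    in one feature space."""
    knn = NearestNeighbors(n_neighbors=k).fit(feats)
    _, idx = knn.kneighbors(x[None, :])
    counts = np.bincount(labels[idx[0]], minlength=n_classes)
    return counts / counts.sum()

def hellinger(p, q):
    """H(p, q) = (1/sqrt(2)) * ||sqrt(p) - sqrt(q)||_2, in [0, 1]."""
    return np.linalg.norm(np.sqrt(p) - np.sqrt(q)) / np.sqrt(2)

def is_reliable(p_old, p_new, threshold=0.2):
    """Accept the sample if its neighborhood category distribution is
    stable across the feature transformation (threshold is assumed)."""
    return hellinger(p_old, p_new) < threshold
```

Samples passing this test receive the majority label of their neighbors and are folded into the training set, after which the feature space is updated and the mining step repeats.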