Search Results - Inner Mongolia University Library

International Conference on Electrical and Electronics Engineering, ELECO

Authors: Filiz Gürkan; Bilge Günsel; Deniz Kumlu. Affiliation: Multimedia Signal Processing and Pattern Recognition Group, Istanbul Technical University, Turkey

We propose a method that integrates dense motion classification with particle filter tracking to monitor whether a viewer is involved in the screened content. We first apply color-based particle filtering, which lets us track the user's head through the video sequence. Optical flow is then estimated with SIFT flow on the tracked regions. Finally, features extracted from the viewer's head rotation and location are fed into a random forest classifier that reports the involvement level of the tracked person. The probabilistic motion estimation model, supported by tracking, is shown to significantly reduce computational complexity while providing performance comparable to state-of-the-art methods. The proposed scheme allows online monitoring of the viewer and can therefore be integrated into interactive multimedia systems.

Keywords: Head Image motion analysis Computer vision Tracking Particle filters Magnetic heads Motion estimation
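The tracking stage described in the abstract above can be illustrated with a minimal one-dimensional bootstrap particle filter. This is only a sketch under stated assumptions, not the authors' implementation: the paper uses a color-based likelihood on image regions, whereas the Gaussian likelihood, the function names `particle_filter_step` and `estimate`, and the noise parameters below are hypothetical stand-ins.

```python
import math
import random


def particle_filter_step(particles, observation, rng,
                         motion_std=0.2, obs_std=0.5):
    """One predict / weight / resample cycle for a 1-D position.

    The Gaussian likelihood is an assumption that stands in for the
    color-histogram likelihood mentioned in the abstract.
    """
    # Predict: diffuse each particle with random-walk motion noise.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: score each particle against the current observation.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for p in predicted]
    if sum(weights) == 0.0:
        # Every particle is far from the observation: fall back to uniform.
        weights = [1.0] * len(predicted)
    # Resample: draw particles in proportion to their weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Point estimate of the tracked position (particle mean)."""
    return sum(particles) / len(particles)


rng = random.Random(0)
particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
for _ in range(5):
    # Hypothetical measurement: the head stays at position 5.0.
    particles = particle_filter_step(particles, 5.0, rng)
print(estimate(particles))
```

Resampling after every step keeps the sketch short; a practical tracker would usually resample only when the effective number of particles drops.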

AN ADAPTIVE TIME-FREQUENCY RESOLUTION FRAMEWORK FOR SINGLE CHANNEL SOURCE SEPARATION BASED ON NON-NEGATIVE TENSOR FACTORIZATION

IEEE International Conference on Acoustics, Speech, and Signal Processing

Authors: S. Kirbiz; B. Gunsel. Affiliation: Multimedia Signal Processing and Pattern Recognition Group, Dept. of Electronics and Comm. Eng., Istanbul Technical University, Turkey

ISBN: (print) 9781479903573

In this paper, we propose an adaptive time-frequency resolution based single channel sound source separation method using Non-negative Tensor Factorization (NTF). The model aims to alleviate the drawbacks of working with a fixed-length Short Time Fourier Transform (STFT) by minimizing the smearing of signal energy in both time and frequency. A joint optimization scheme based on KL-divergence is applied, where each layer of the tensor represents the mixture at a different resolution. To incorporate sparseness into the factorization, the resynthesis is performed through an adaptive weighted fusion procedure that combines the separated sources in a manner that maximizes the energy concentration. Test results reported over a large sound database indicate that the introduced NTF-based fusion method improves sound quality in terms of both conventional and perceptual distortion measures.

Keywords: Tensor factorization Perceptual Distortion STFT fusion procedure signal energy Resolving power Voice Quality fusion method frequency resolution
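The KL-divergence optimization mentioned in the abstract above can be illustrated on the simpler matrix case. The sketch below applies the standard multiplicative updates for non-negative matrix factorization under the generalized KL divergence to a single matrix V ≈ WH. It is an assumption-laden simplification, not the paper's method: the multi-resolution tensor layers and the adaptive fusion step are not modeled, and the names `kl_nmf`, `kl_divergence`, and `matmul` are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def kl_divergence(v, approx, eps=1e-12):
    """Generalized KL divergence D(V || approx)."""
    total = 0.0
    for v_row, a_row in zip(v, approx):
        for x, y in zip(v_row, a_row):
            if x > 0:
                total += x * math.log(x / (y + eps))
            total += y - x
    return total


def kl_nmf(v, rank, iters=100, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # Update H: H <- H * (W^T (V / WH)) / (column sums of W).
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(m)]
                 for i in range(n)]
        w_t = [list(col) for col in zip(*w)]
        num = matmul(w_t, ratio)
        w_sums = [sum(col) for col in w_t]
        h = [[h[k][j] * num[k][j] / (w_sums[k] + eps) for j in range(m)]
             for k in range(rank)]
        # Update W: W <- W * ((V / WH) H^T) / (row sums of H).
        wh = matmul(w, h)
        ratio = [[v[i][j] / (wh[i][j] + eps) for j in range(m)]
                 for i in range(n)]
        h_t = [list(col) for col in zip(*h)]
        num = matmul(ratio, h_t)
        h_sums = [sum(row) for row in h]
        w = [[w[i][k] * num[i][k] / (h_sums[k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy non-negative "spectrogram" used only for illustration.
v = [[1.0, 2.0, 0.0, 4.0], [2.0, 4.0, 0.5, 8.0], [0.5, 0.1, 3.0, 0.2]]
w, h = kl_nmf(v, rank=2, iters=50)
print(kl_divergence(v, matmul(w, h)))
```

Because the updates are multiplicative, factors that start non-negative stay non-negative, which is why no explicit projection step appears.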

SLEEPINESS DETECTION FROM SPEECH BY PERCEPTUAL FEATURES

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Bilge Gunsel; Cenk Sezgin; Jarek Krajewski. Affiliations: Multimedia Signal Processing and Pattern Recognition Group, Istanbul Technical Univ., Turkey; Experimental Industrial Psychology, Univ. of Wuppertal, Germany

ISBN: (print) 9781479903573

We propose a two-class classification scheme with a small number of features for sleepiness detection. Unlike conventional methods that rely on the linguistic content of speech, we work with prosodic features extracted by psychoacoustic masking in the spectral and temporal domains. Our features also model the variations between non-sleepy and sleepy modes in a quasi-continuum space with the help of code words learned by a bag-of-features scheme. These improve the unweighted recall rates for unseen people and minimize the language dependence. Recall rates reported on the Karolinska Sleepiness Scale (KSS) for Support Vector Machine and Learning Vector Quantization classifiers show that the developed system monitors sleepiness efficiently and with lower complexity than the benchmark results reported for the Sleepy Language Corpus.

Keywords: Speech Sleepiness Linguistics Pragmatics Support Vector Network Retrieval recall codeword recall Mental Recall quasi-continuum learning vector quantization
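The unweighted recall rate referred to in the abstract above is the mean of the per-class recalls, so a rare class counts as much as a frequent one. The sketch below shows that computation on made-up labels. The function name `unweighted_average_recall` and the label values are hypothetical and are not taken from the paper or its corpus.

```python
def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, ignoring how frequent each class is."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        total = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


# Illustrative labels only: 2 sleepy and 8 non-sleepy utterances.
y_true = ["sleepy"] * 2 + ["non-sleepy"] * 8
y_pred = ["sleepy", "non-sleepy"] + ["non-sleepy"] * 8
print(unweighted_average_recall(y_true, y_pred))  # 0.75
```

In this toy case plain accuracy would be 0.9, while the unweighted recall of 0.75 exposes that half of the sleepy samples were missed.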

ITU MSPR TRECVID 2010 video copy detection system

TREC Video Retrieval Evaluation, TRECVID 2010

Authors: Kutluk, Sezer; Gunsel, Bilge. Affiliation: Multimedia Signal Processing and Pattern Recognition Group, Department of Electronics and Communications Engineering, Istanbul Technical University, Maslak/Istanbul 34496, Turkey

In this paper we describe the system designed by the ITU MSPR group for content based video fingerprinting as applied to the TRECVID 2010 Content Based Copy Detection (CBCD) benchmark. This year, the focus of the system was on the integration of audio and video fingerprinting to improve robustness to attacks. The proposed system consists of three main modules: audio/video fingerprint extraction, audio/video search and retrieval, and audiovisual decision fusion. We propose a video feature extraction scheme based on Nonnegative Matrix Factorization (NMF), which is an efficient dimension reduction technique in video processing. The video fingerprint generation module takes the factorization matrices generated by NMF as its input and converts them to binary hashes by differential coding [1, 2]. For audio data, we apply an audio fingerprinting method similar to the one proposed in [3]. Extracted audio and video hashes are indexed into a database. The searching module first applies a hash matching procedure to locate potential matching points in both audio and video. This is followed by decision fusion, which eliminates false alarms and finalizes the matching and retrieval.

Keywords: Matrix algebra
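The hashing and matching steps described in the abstract above can be sketched as follows. The abstract does not specify the exact differential coding rule, so the sketch assumes one common variant: a bit is 1 when a coefficient is larger than its predecessor. The names `differential_hash`, `hamming_distance`, and `best_match`, as well as the feature values, are hypothetical.

```python
def differential_hash(features):
    """Binary hash: bit i is 1 if features[i + 1] > features[i] (assumed rule)."""
    return [1 if b > a else 0 for a, b in zip(features, features[1:])]


def hamming_distance(h1, h2):
    """Number of positions at which two equal-length hashes differ."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    return sum(x != y for x, y in zip(h1, h2))


def best_match(query_hash, database):
    """Return the key of the database hash closest to the query."""
    return min(database,
               key=lambda name: hamming_distance(query_hash, database[name]))


# Made-up feature vectors standing in for NMF factor coefficients.
database = {
    "clip_a": differential_hash([0.2, 0.5, 0.4, 0.9]),
    "clip_b": differential_hash([0.9, 0.3, 0.6, 0.1]),
}
# A brightened copy of clip_a: values change, but their ordering does not.
query = differential_hash([0.23, 0.56, 0.45, 1.00])
print(differential_hash([0.2, 0.5, 0.4, 0.9]))  # [1, 0, 1]
print(best_match(query, database))  # clip_a
```

Coding only the sign of neighboring differences is what makes such a hash insensitive to monotonic changes such as gain or gamma adjustments. A linear scan is used here for brevity, where a real system would index the hashes.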

Istanbul Technical University at TRECVID 2008

TREC Video Retrieval Evaluation, TRECVID 2008

Authors: Gursoy, O.; Gunsel, B. Affiliation: Multimedia Signal Processing and Pattern Recognition Group, Department of Electronics and Communications Engineering, Istanbul Technical University, Maslak/Istanbul 34496, Turkey

The ITU MSPR group participates in the Content Based Copy Detection (CBCD) task of the TREC Video Retrieval Evaluation (TRECVID). The system proposed by ITU MSPR consists of two main modules: extraction of video fingerprints and search/retrieval. We propose a feature extraction scheme based on Nonnegative Matrix Factorization (NMF) [1], which is an efficient dimension reduction technique in video processing [2]. The video fingerprint generation module takes the factorization matrices generated by NMF as its input and converts them to binary hashes by differential coding. Extracted hashes are indexed into a database. The searching module first applies a hash matching procedure to locate potential matching points. This is followed by temporal merging, which eliminates false alarms while combining subsegments. Initial results are promising for pattern insertion, re-encoding, blurring, gamma change, and noise addition. Future work will include improving the current results and investigating robustness to geometric transformations such as shift, crop, flip, and picture-in-picture.

Keywords: Matrix algebra

A Region-Based Representation of Images in MARS

Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, 1998, vol. 20, no. 1-2, pp. 137-150

Authors: Servetto, Sergio D.; Rui, Yong; Ramchandran, Kannan; Huang, Thomas S. Affiliations: Beckman Inst. Adv. Sci. and Technol. Univ. Illinois at Urbana-Champaign Urbana IL 61801 United States Universidad Nacional de La Plata Argentina Univ. Illinois at Urbana-Champaign United States Comp. Res. Adv. Applications Group IBM Argentina Argentina Image Formation and Processing Group Beckman Institute UIUC United States Department of Computer Science UNLP Argentina Dept. of Elec. and Comp. Engineering UIUC United States Multimedia Commun. Res. Department Bell Laboratories Murray Hill NJ United States Info. Sciences Research Department AT and T Labs. Florham Park NJ United States Department of Computer Science UIUC United States Southeast University China Tsinghua University China University of Illinois Urbana-Champaign IL United States Image Formation and Processing Group Beckman Inst. Advance Sci. Technol. UIUC United States Vis. Technol. Grp. of Microsoft Res. Redmond WA United States City College of New York United States Columbia University United States AT and T Bell Labs. United States Ctr. for Telecommunications Research Columbia University United States Elec. and Comp. Eng. Department United States Beckman Institute Coordinated Science Laboratory IL United States IEEE Signal Processing Society United States IEEE IMDSP Technical Committee United States IEEE Transactions on Image Proc. United States National Taiwan University Taipei Taiwan Massachusetts Inst. of Technology Cambridge MA United States Department of Electrical Engineering MIT United States School of Electrical Engineering United States Lab. for Info. and Signal Processing Purdue University United States Dept. of Elec. and Comp. Engineering United States Coordinated Science Laboratory United States Image Formation and Processing Group Beckman Inst. Adv. Sci. and Technol. United States MIT Lincoln Laboratory IBM Thomas J. Watson Research Center Rheinishes Landes Museum Bonn Germany Swiss Institutes of Technology Zurich Switzerland Swiss Institutes of Technology Lausanne S

We study the problem of representing images within a multimedia Database Management System (DBMS), in order to support fast retrieval operations without compromising storage efficiency. To achieve this goal, we propose new image coding techniques which combine a wavelet representation, embedded coding of the wavelet coefficients, and segmentation of image-domain regions in the wavelet domain. A bitstream is generated in which each image region is encoded independently of other regions, without having to explicitly store information describing the regions. Simulation results show that our proposed algorithms achieve coding performance which compares favorably, both perceptually and objectively, to that achieved using state-of-the-art image/video coding techniques while additionally providing region-based support.
