In this paper, we introduce a new dataset for scene classification based on camera *** classify the most common scenes that have been researched much *** dataset consists of 12 scene *** category contains 500 to 2000 *** images are high resolution such as 2000x2000 *** images in the dataset are original, namely, each image comes with its camera metadata (EXIF). Various types, metadata cues of photos, and strict definitions among scenes are the characteristic factors that make this dataset a very challenging testbed for photo *** supply the scene photos together with scene labeling, as well as the EXIF information extraction methodology, and we have so far applied the dataset to semantic scene classification.
The a priori signal-to-noise ratio (SNR) is one of the most important parameters in short-time spectrum estimation techniques for speech enhancement. A new and convenient algorithm to estimate the a priori SNR is involved ...
To solve the frame delay problem and match the previous frame, Plapous et al. [IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(6): 2098–2108] introduced a novel approach called the two-step noise reduction (TSNR) technique to improve the performance of the speech enhancement ***, TSNR approach results in spectral peaks of short duration and the broken spectral outlier, which degrade the spectral characteristics of the *** solve this problem, a cepstral smoothing step is added in order to remove these spectral peaks brought by TSNR *** analysis shows that the proposed approach can effectively smooth the spectral peaks and keep the spectral outlier so as to protect the speech *** results also show that the proposed approach brings a significant improvement over the decision-directed (DD) and TSNR approaches, especially in non-stationary noisy environments.
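For orientation, the decision-directed (DD) baseline mentioned above can be sketched for a single frequency bin as follows. This is the commonly cited textbook form of the DD a priori SNR estimate paired with a Wiener gain, not the TSNR or cepstral smoothing steps of the paper. The function names, the argument names, and the default smoothing factor of 0.98 are assumptions made for this illustration.

```python
def decision_directed_snr(prev_clean_power, noisy_power, noise_power, alpha=0.98):
    """Decision-directed a priori SNR estimate for one frequency bin.

    prev_clean_power: estimated clean-speech power in the previous frame
    noisy_power:      power of the noisy observation in the current frame
    noise_power:      estimated noise power (must be positive)
    alpha:            smoothing factor in [0, 1] (0.98 is an assumed default)
    """
    if noise_power <= 0:
        raise ValueError("noise_power must be positive")
    # A posteriori SNR of the current frame.
    posteriori = noisy_power / noise_power
    # Weighted mix of the previous-frame estimate and the current
    # half-wave rectified instantaneous estimate.
    return (alpha * prev_clean_power / noise_power
            + (1.0 - alpha) * max(posteriori - 1.0, 0.0))


def wiener_gain(xi):
    """Wiener spectral gain for a given a priori SNR xi (xi >= 0)."""
    return xi / (1.0 + xi)
```

Because the first term reuses the previous frame's clean-speech estimate, the DD estimate lags the true SNR by about one frame, which is the frame delay that the abstract says TSNR was designed to address.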
The van der Waerden number W(r, k) is the least integer N such that every r-coloring of {1, 2, ..., N} contains a monochromatic arithmetic progression of length at least k. Rabung gave a method to obtain lower bounds on W(2, k) based on quadratic residues, and performed computations on all primes no greater than 20117. By improving the efficiency of the algorithm of Rabung, we perform the computation for all primes up to 6 × 10^7, and obtain lower bounds on W(2, k) for k between 11 and 23.
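The definition above can be made concrete with a small brute-force sketch. It checks whether a coloring of {1, ..., N} contains a monochromatic arithmetic progression of length k, and it builds the basic quadratic-residue coloring modulo a prime. This only illustrates the definitions; it is not Rabung's algorithm or the improved version described in the abstract, and the function names are hypothetical.

```python
def has_mono_ap(coloring, k):
    """Return True if the coloring contains a monochromatic k-term
    arithmetic progression.

    coloring[i] is the color of the integer i + 1, so a list of
    length N colors {1, ..., N}. Requires k >= 2.
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    n = len(coloring)
    for start in range(n):
        # Largest common difference that keeps all k terms in range.
        max_d = (n - 1 - start) // (k - 1)
        for d in range(1, max_d + 1):
            color = coloring[start]
            if all(coloring[start + j * d] == color for j in range(1, k)):
                return True
    return False


def quadratic_residue_coloring(p):
    """Color x in {1, ..., p-1} with 0 if x is a quadratic residue
    modulo the prime p, and with 1 otherwise."""
    residues = {(x * x) % p for x in range(1, p)}
    return [0 if x in residues else 1 for x in range(1, p)]
```

A coloring of {1, ..., N} for which `has_mono_ap(coloring, k)` is False certifies that W(2, k) > N. For example, the coloring 0, 0, 1, 1, 0, 0, 1, 1 of {1, ..., 8} has no monochromatic 3-term progression, consistent with the known value W(2, 3) = 9.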
On mobile terminals, voice-based local search services [1] are quickly becoming a new important application. Voice search is essentially a large vocabulary speech recognition task with an open ended vocabulary, and th...
Authors: Duo Chen, Jun Cheng, Dacheng Tao
College of Communication Engineering, Chongqing University, Chongqing 400044, China. He is also with the Shenzhen Key Laboratory of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences.
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China. He is also with the Chinese University of Hong Kong and Guangdong Provincial Key Laboratory of Robotics and Intelligent System.
Center for Quantum Computation and Intelligent System, Faculty of Engineering and Information Technology, University of Technology, Sydney, New South Wales 2007, Australia.
ISBN (digital): 9781467317368
ISBN (print): 9781467317375
To facilitate human-robot interactions, human gender information is very important. Motivated by the success of manifold learning for visual recognition, we present a novel clustering-based discriminative locality alignment (CDLA) algorithm that discovers the low-dimensional intrinsic submanifold embedded in the high-dimensional ambient space in order to improve face gender recognition performance. In particular, CDLA exploits the global geometry through k-means clustering, extracts the discriminative information through margin maximization, and explores the local geometry through intra-cluster sample concentration. These three properties uniquely characterize CDLA for face gender recognition. Experimental results on the FERET data sets suggest that the proposed method is superior to several representative methods in terms of recognition speed and accuracy.
Audio event detection has become a hot research topic due to its wide applications in many fields, such as multimedia retrieval. Detection needs large amounts of labeled samples to train the audio event models, but in real life labeled samples are expensive to obtain, and their shortage is a big obstacle. Active learning is an efficient way to deal with the problem of insufficient labeled samples. The most popular active learning method for support vector machines is margin based sampling (MBS), which queries the sample closest to the current hyperplane. However, when the current hyperplane is far from the true hyperplane, the sample closest to the current hyperplane is not very informative, and querying such samples adjusts the hyperplane much more slowly. To accelerate the adjustment, this paper proposes the misclassification and margin based sampling (MMBS) active learning algorithm. To query more informative samples, MMBS selects samples based on the KL divergence of misclassified samples in the first few iterations; after that, considering the lower misclassification confidence and the outlier problem, it switches to MBS. Experiments show that, compared to MBS and representative sampling (RepS), MMBS achieves the highest detection performance under the same human annotation workload.
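The MBS query rule described above can be sketched in a few lines. The sketch assumes a linear decision function with a known weight vector and bias, and it covers only the plain MBS step, not the KL-divergence-based selection that MMBS uses in its first iterations. The function name and the argument layout are hypothetical.

```python
def margin_based_query(w, b, unlabeled):
    """Return the index of the unlabeled sample closest to the
    hyperplane w . x + b = 0.

    w:         weight vector of the current linear classifier
    b:         bias term
    unlabeled: non-empty list of feature vectors with the same length as w
    """
    def score(x):
        # Signed decision value; its magnitude is proportional to the
        # distance from the hyperplane (the factor 1/||w|| is the same
        # for every sample, so it does not affect the argmin).
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    return min(range(len(unlabeled)), key=lambda i: abs(score(unlabeled[i])))
```

In an active learning loop, the returned sample would be sent to a human annotator, added to the labeled set, and the classifier retrained before the next query.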
Keypoint detection is important for object recognition, image retrieval, mosaicing, and similar tasks, and has attracted ample research. In this paper, we propose a novel wavelet-based detector (NWBD) that builds on previous research on keypoint detection. NWBD operates in wavelet pyramid space: it extracts the local extrema of the energy map computed by the intra-scale coefficient product (ISCP) as candidate keypoints, and then discards some points using the Hessian matrix. In the experiments, the novel detector was compared with the Harris and SIFT detectors in terms of repeatability, and it achieved better performance for some scenes in the database provided by Mikolajczyk and Schmid, such as wall, trees, and graffiti.
Content-based image retrieval (CBIR) has attracted intense interest and seen considerable progress over the last decade. But most of the time it is only applied in the laboratory. One important reason for this is the diversi...
In microblogs, a user usually follows or is followed by many other users. Content updating and reading is a complex process involving intensive interactions among publishers and readers. It also forms the basis of information diffusion in social networks. When a user has massive numbers of followers, tweet reading depends heavily on user behaviors and interactions. The tweets reading probability (TRP) is a vital parameter for measuring the effectiveness and influence of tweets. Our work proposes a fundamental model, namely competing-window, to simulate the process of multi-node interactions and to analyze TRP in a social network. Based on Sina Microblog, we built a standard data set and ran massive experiments on empirical data to extract user behavior patterns. Using simulation, we obtained the TRP in a non-preference social network. The results indicate that the typical TRP is about 8% and that different user behaviors affect TRP differently.