Channel attention mechanisms are commonly applied in visual tasks to improve performance. They reinforce informative channels and suppress uninformative ones. Recentl...
We introduce the novel problem of scene sketch zero-shot learning (SSZSL), which is a challenging task because (i) unlike photos, the gap between the common semantic domain (e.g., word vectors) and sketches is too huge t...
Quantitative estimation of acute ischemic infarcts is crucial for improving neurological outcomes in patients with stroke symptoms. Since the density of lesions is subtle and can be confounded by normal physiologi...
Cryo-electron microscopy (cryo-EM) has become a mainstream technology for solving spatial structures of biomacromolecules, while the processing of cryo-EM images is a very challenging task. One of the great challenges...
This paper describes a model for action classification in real-time video streams. The model analyzes the spatial and temporal information of the video jointly under a low-delay constraint. In addition, to prevent the model from classifying motionless segments as motion, it is equipped with the ability to distinguish segments that contain motion from stationary ones. The experimental results show that the model completes the action classification task with little delay, so that classification results can be output in real time as video frames continue to arrive.
A novel text-independent speaker identification (SI) method is proposed. The method uses Mel-frequency cepstral coefficients (MFCCs) and the dynamic information among adjacent frames as feature sets to capture sp...
In this paper, we propose an end-to-end speech language identification (SLD) approach for short utterances based on a Long Short-Term Memory (LSTM) neural network, which is especially suitable for SLD applications in intelligen...
Recommender systems are increasingly important with the development of e-commerce, news, and multimedia applications. Traditional recommendation algorithms, such as collaborative-filtering-based and graph-based methods, mainly use items' original attributes and the relationships between items and users, ignoring the chronological order of items within browsing sessions. In recent years, RNN-based methods have shown their superiority in handling sequential data, and several modified RNN models have been proposed. However, these RNN models use only the sequence order of items and neglect how long each item was browsed. It is widely accepted that users tend to spend more time on items that interest them, and that these items are always closely related to the users' current target. On this view, an item's browsing time is an important feature for recommendation. In this paper, we propose a modified RNN-based recommender system called TA4Rec, which recommends the item most likely to be clicked next. Our main contribution is a method for calculating time-attention factors from the browsing duration of items and adding these factors to the RNN-based model. We conduct experiments on the RecSys Challenge 2015 dataset, and the results show that TA4Rec clearly improves on the classic session-based recommender method in session-based recommendation.
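The abstract does not give the formula for the time-attention factors, so the following is only a minimal sketch of one plausible reading, not the TA4Rec method itself. It assumes the factors are a softmax over log-scaled dwell times within a session and that they scale each item's embedding before it enters the RNN. The names `time_attention_factors`, `weight_embeddings`, and the `temperature` parameter are hypothetical.

```python
import math


def time_attention_factors(dwell_seconds, temperature=1.0):
    """Return one weight per browsed item; weights sum to 1 within a session.

    ASSUMPTION: softmax over log(1 + dwell time). The paper's actual
    formula is not stated in the abstract.
    """
    if not dwell_seconds:
        return []
    # Log-scaling damps very long dwell times (e.g. an idle browser tab).
    scores = [math.log1p(max(t, 0.0)) / temperature for t in dwell_seconds]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_embeddings(embeddings, factors):
    """Scale each item embedding by its time-attention factor (assumed usage)."""
    return [[f * x for x in vec] for vec, f in zip(embeddings, factors)]


# Toy session: the second item was viewed three times longer than the first.
factors = time_attention_factors([1.0, 3.0])
weighted = weight_embeddings([[1.0, 0.0], [0.0, 1.0]], factors)
```

With `temperature=1.0` the softmax of `log(1 + t)` reduces to `(1 + t) / sum(1 + t)`, so items browsed longer receive proportionally larger weights.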
We introduce a novel approach for converting images into corresponding language descriptions. The method follows the popular encoder-decoder architecture. The encoder uses the recently proposed densely connected convolutional network (DenseNet) to extract feature maps, while the decoder uses long short-term memory (LSTM) to parse the feature maps into descriptions. We predict the next word of a description by combining the feature maps with the word embedding of the current input word through a "visual attention switch". Finally, we compare the proposed model with baseline models and achieve good results.
In cardiac arterial interventional therapy, coronary angiography provides physicians with key information for selecting a treatment strategy, but lesion identification is time-consuming and error-prone even for experienced doctors. This paper proposes, for the first time, a method for the automatic detection of lesions in coronary angiography based on deep learning with convolutional neural networks. We used 2925 medical images to build the model. Each image contains several lesions on the vessel, and we treat these lesion areas as objects distinct from the background. We designed a model based on the convolutional neural network, using building blocks such as CReLU and Inception together with techniques such as batch normalization, residual connections, and skip-layer connections. After training, the network can distinguish a lesion area from a normal vessel area (background) and can detect the location of coronary artery lesions in real time without manual intervention. For stenosis lesions, the detection recall reaches 0.88.
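As a reminder of what a detection recall such as the 0.88 above measures, here is a minimal sketch of recall for box detections. The abstract does not state the matching rule, so the intersection-over-union (IoU) criterion, the 0.5 threshold, the `(x1, y1, x2, y2)` box format, and the function names are all assumptions made for illustration, and the boxes are toy values rather than data from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_recall(ground_truth, predictions, iou_threshold=0.5):
    """Fraction of annotated lesions matched by some predicted box.

    ASSUMPTION: greedy one-to-one matching at a fixed IoU threshold.
    """
    if not ground_truth:
        return 0.0
    unused = list(predictions)
    hits = 0
    for gt in ground_truth:
        best = max(unused, key=lambda p: iou(gt, p), default=None)
        if best is not None and iou(gt, best) >= iou_threshold:
            hits += 1
            unused.remove(best)  # each prediction can match only one lesion
    return hits / len(ground_truth)


# Toy example: two annotated lesions, one of which is detected.
annotated = [(0, 0, 10, 10), (20, 20, 30, 30)]
detected = [(1, 1, 10, 10)]
recall = detection_recall(annotated, detected)
```

Recall counts only missed lesions (false negatives), so it says nothing about false alarms; precision would be needed alongside it for a full picture.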