In this paper, a novel methodology is presented to settle the region of interest (ROI) detection problem in vehicle color recognition, so as to remove the redundant components of vehicles that interfere greatly with color recognition. In order to make full use of the local color and spatial information, vehicle images are first divided into different superpixels. The spatial relationship between the superpixels and the outermost pixels is then used for the background removal of vehicle images. By comparing with the vehicle window clustering centroids obtained by k-means, the superpixels close to the universal color characteristics of windows are removed, so that the dominant color superpixels are retained. Finally, a linear Support Vector Machine classifier is trained for color recognition. Extensive experiments demonstrate that the proposed methodology is effective for color region of interest detection and thus contributes to vehicle color recognition.
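The pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: a fixed grid stands in for the superpixel segmentation, and the block size, cluster count, and assumed window colour are all hypothetical values chosen for the toy example.

```python
import numpy as np
from sklearn.cluster import KMeans

def dominant_color_blocks(image, block=8, window_color=(60, 60, 60), n_clusters=2):
    """Grid blocks stand in for superpixels: compute each block's mean colour,
    cluster the means, and drop the cluster closest to the assumed window colour,
    keeping the dominant body-colour regions."""
    h, w, _ = image.shape
    means = []
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            means.append(image[y:y + block, x:x + block].reshape(-1, 3).mean(axis=0))
    means = np.array(means)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(means)
    # Remove the cluster whose centroid matches the window colour characteristics.
    dists = np.linalg.norm(km.cluster_centers_ - np.array(window_color), axis=1)
    window_cluster = int(np.argmin(dists))
    return means[km.labels_ != window_cluster]

# Toy image: red vehicle body with a dark grey "window" band.
img = np.zeros((32, 32, 3))
img[:, :] = (200, 30, 30)
img[8:16, :] = (60, 60, 60)
kept = dominant_color_blocks(img)
```

The retained block colours would then be fed to the linear SVM classifier for the final recognition step.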
ISBN (digital): 9781728132488
ISBN (print): 9781728132495
Remote sensing image scene classification is one of the key problems in remote sensing image interpretation. Traditional handcrafted features for remote sensing scene classification are not sufficiently discriminative, while extracting semantic features with deep learning is a complex process. This paper proposes a fused-feature remote sensing image scene classification method based on handcrafted features and deep learning semantic features. Firstly, the SURF features of the remote sensing image are extracted and encoded with the VLAD algorithm, and the semantic features of the image are extracted by transfer learning. Then dimensionality reduction is performed with the PCA algorithm and the two kinds of features are fused. Finally, the scene classifier is trained using the random forest algorithm. The experimental results show that the method achieves a higher classification accuracy and Kappa coefficient than the compared methods, demonstrating its effectiveness.
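The fusion step described above (per-modality PCA reduction, concatenation, random forest training) can be sketched as follows. The feature matrices here are random surrogates for the real VLAD-encoded SURF and transfer-learned deep features, and all dimensions are illustrative assumptions:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 200
# Surrogates for the two modalities: VLAD-encoded SURF and deep semantic features.
handcrafted = rng.normal(size=(n, 64))
deep = rng.normal(size=(n, 128))
labels = (handcrafted[:, 0] + deep[:, 0] > 0).astype(int)

# Reduce each modality with PCA, then concatenate to form the fused feature.
f1 = PCA(n_components=16, random_state=0).fit_transform(handcrafted)
f2 = PCA(n_components=16, random_state=0).fit_transform(deep)
fused = np.hstack([f1, f2])

# Train the scene classifier with a random forest.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(fused, labels)
acc = clf.score(fused, labels)
```

Reducing each modality separately before concatenation keeps either feature type from dominating the fused representation by sheer dimensionality.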
Background/Purpose: Multiple parametric imaging in positron emission tomography (PET) is challenging due to the noisy dynamic data and the complex mapping to kinetic parameters. Although methods like direct parametric reconstruction have been proposed to improve the image quality, limitations persist, particularly for nonlinear and small-value micro-parameters (e.g., k2, k3). This study presents a novel unsupervised deep learning approach to reconstruct and improve the quality of these micro-parameters. Methods: We proposed a direct parametric image reconstruction model, DIP-PM, integrating deep image prior (DIP) with a parameter magnification (PM) strategy. The model employs a U-Net generator to predict multiple parametric images using a CT image prior, with each output channel subsequently magnified by a factor to adjust its intensity. The model was optimized with a log-likelihood loss computed between the measured projection data and the forward-projected data. Two tracer datasets were simulated for evaluation: 82Rb data using the 1-tissue compartment (1TC) model and 18F-FDG data using the 2-tissue compartment (2TC) model, with 10-fold magnification applied to the 1TC k2 and the 2TC k3, respectively. DIP-PM was compared to the indirect method, a direct algorithm (OTEM), and the DIP method without parameter magnification (DIP-only). Performance was assessed on phantom data using peak signal-to-noise ratio (PSNR), normalized root mean square error (NRMSE) and structural similarity index (SSIM), as well as on a real 18F-FDG scan from a male subject. Results: For the 1TC model, OTEM performed well in K1 reconstruction, but both the indirect and OTEM methods showed high noise and poor performance in k2. The DIP-only method suppressed noise in k2 but failed to reconstruct fine structures in the myocardium. DIP-PM outperformed the other methods with well-preserved detailed structures, particularly in k2, achieving the best metrics (PSNR: 19.00, NRMSE: 0.3002, SSIM: 0
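The two core ingredients of DIP-PM, the Poisson log-likelihood objective on projection data and the parameter magnification trick, can be illustrated in a few lines. The 10-fold factor follows the abstract; the count values are toy stand-ins for real sinogram data:

```python
import numpy as np

def poisson_loglik(y, ybar, eps=1e-8):
    """Poisson log-likelihood of measured counts y given expected
    forward-projected counts ybar (the constant log(y!) term is dropped)."""
    ybar = np.maximum(ybar, eps)
    return float(np.sum(y * np.log(ybar) - ybar))

# Parameter magnification: train on k3 * MAG so that the small-valued
# micro-parameter contributes gradients on the same scale as K1, then
# divide the factor back out to obtain the final parametric image.
MAG = 10.0
k3_true = np.array([0.02, 0.05, 0.03])
k3_trained = k3_true * MAG        # what the network is asked to predict
k3_recovered = k3_trained / MAG   # final parametric image

# The likelihood is maximised when the forward projection matches the data.
y = np.array([5.0, 8.0, 3.0])
ll_exact = poisson_loglik(y, y)
ll_off = poisson_loglik(y, 1.3 * y)
```

Since the Poisson log-likelihood peaks when the expected counts equal the measured counts, any mismatch in the magnified parameters is penalised, which is what drives the DIP generator toward the correct kinetic maps.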
ISBN (print): 9781538646748; 9781538646731
Person re-identification is an important task in intelligent video surveillance and has become one of the research hotspots in computer vision. Video-based person re-identification aims to verify the identity of a pedestrian across video sequences captured by non-overlapping cameras at different times. In this paper, we propose a novel feature extractor based on LSTM networks. These LSTM networks are used to extract an effective space-time feature representation named the attribute-constraints space-time feature (ASTF). Unlike other methods, we manually annotate the pedestrians in videos with three attributes, and these attributes, together with the pedestrian IDs, are used as labels to train the feature extractor. The ASTF representation of a testing video is extracted by this feature extractor and serves as an effective space-time representation for video-based re-identification. Extensive experiments on two public datasets demonstrate that our approach outperforms state-of-the-art video-based re-identification methods.
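The idea of summarising a frame sequence into one fixed-length space-time descriptor with an LSTM can be sketched as below. This is a bare single-layer LSTM in NumPy with random weights, a simplified stand-in for the paper's trained attribute-constrained extractor; the dimensions are arbitrary:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_features(frames, Wx, Wh, b):
    """Run a single-layer LSTM over per-frame feature vectors and return the
    final hidden state as a fixed-length space-time descriptor."""
    hdim = Wh.shape[1]
    h = np.zeros(hdim)
    c = np.zeros(hdim)
    for x in frames:
        z = Wx @ x + Wh @ h + b            # stacked gate pre-activations, (4*hdim,)
        i, f, o, g = np.split(z, 4)        # input, forget, output gates + candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)         # update the cell state over time
        h = o * np.tanh(c)
    return h

rng = np.random.default_rng(0)
hdim, xdim, T = 8, 16, 5
Wx = rng.normal(scale=0.1, size=(4 * hdim, xdim))
Wh = rng.normal(scale=0.1, size=(4 * hdim, hdim))
b = np.zeros(4 * hdim)
feat = lstm_features(rng.normal(size=(T, xdim)), Wx, Wh, b)
```

In the actual system the per-frame inputs would be appearance features, and the weights would be trained against the joint ID-plus-attribute labels rather than drawn at random.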
Purpose: This work aims to develop a novel distortion-free 3D-EPI acquisition and image reconstruction technique for fast and robust, high-resolution, whole-brain imaging as well as quantitative T2* mapping. Methods: ...
With the rapid development of artificial intelligence (AI) in medical image processing, deep learning in color fundus photography (CFP) analysis is also evolving. Although there are some open-source, labeled datasets ...
ISBN (print): 9781538653128
This paper presents a study on the use of input codes in the neural network acoustic modeling for expressive TTS. Specifically, we use different kinds of input codes, augmented with the linguistic features, as the input of a BLSTM-based acoustic model, to control the expressivity of the synthesized speech. The input codes, in one-hot representation, include dialogue code, sentiment code and sentence position code. The dialogue code indicates whether the text is a dialogue or narration in an audiobook story. The sentiment code is obtained from a sentiment analysis tool, which labels each sentence as positive, negative and neutral. The sentence position code indicates the position of the sentence in the paragraph. We believe these codes are highly related to the expressiveness of the audiobook speech. Experiments on the data from the Blizzard Challenge 2017 demonstrate the effectiveness of the use of input codes in the neural network approach for expressive TTS.
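Building the augmented input, one-hot codes concatenated with the linguistic feature vector, can be sketched as below. The code vocabularies follow the abstract (dialogue/narration, three sentiment classes, sentence position); the linguistic feature dimension and the position cap are illustrative assumptions:

```python
import numpy as np

DIALOGUE = {"narration": 0, "dialogue": 1}
SENTIMENT = {"negative": 0, "neutral": 1, "positive": 2}

def one_hot(idx, n):
    v = np.zeros(n)
    v[idx] = 1.0
    return v

def make_input(linguistic, dialogue, sentiment, position, n_pos=5):
    """Concatenate the linguistic features with one-hot dialogue,
    sentiment, and sentence-position codes."""
    codes = np.concatenate([
        one_hot(DIALOGUE[dialogue], 2),
        one_hot(SENTIMENT[sentiment], 3),
        one_hot(min(position, n_pos - 1), n_pos),  # clip long paragraphs
    ])
    return np.concatenate([linguistic, codes])

linguistic = np.full(10, 0.5)  # stand-in for real linguistic features
x = make_input(linguistic, "dialogue", "positive", 2)
```

Each frame-level input to the BLSTM acoustic model would carry these codes alongside the usual linguistic context, letting one network render the same text with different expressive styles.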
Different from RGB videos, depth data in RGB-D videos provide key complementary information for tristimulus visual data which potentially could achieve accuracy improvement for action recognition. However, most of the...
Aiming at the problem of low detection accuracy of the multi-mode mean model in complex scenarios, an improved moving-target detection method based on the multi-mode mean model is proposed. Firstly, the background model is constructed using the multi-mode mean model. According to different scene information, different thresholds are set and adjusted adaptively. The foreground image obtained by the background difference method is further verified by the frame difference method, and comparative experiments are conducted and analyzed. The missed-detection and false-detection rates are reduced, and the detection accuracy is improved. Finally, simulation results on three video segments verify the effectiveness of the proposed method.
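The combination of background difference and frame difference can be sketched as below. For simplicity this toy version uses a single static background image and a fixed threshold in place of the paper's multi-mode mean model and adaptive thresholds:

```python
import numpy as np

def detect_moving(frames, bg, thresh=30):
    """Mark a pixel as foreground only when both the background difference
    and the inter-frame difference exceed the threshold (logical AND),
    suppressing ghosting left by either cue alone."""
    masks = []
    prev = frames[0]
    for f in frames[1:]:
        bg_diff = np.abs(f.astype(int) - bg.astype(int)) > thresh
        fr_diff = np.abs(f.astype(int) - prev.astype(int)) > thresh
        masks.append(bg_diff & fr_diff)
        prev = f
    return masks

# Toy sequence: a bright square moves between two positions.
bg = np.zeros((16, 16), dtype=np.uint8)
f1 = bg.copy(); f1[2:6, 2:6] = 255
f2 = bg.copy(); f2[8:12, 8:12] = 255
masks = detect_moving([f1, f2], bg)
```

The frame difference vetoes the stale square position that the background difference alone would miss labelling, which is the kind of false detection the combined check is meant to reduce.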
To address the problems of high time consumption and large error in camera motion estimation during dense trajectory feature extraction from video, a dense trajectory action recognition algorithm based on Improved Speeded-Up Robust Features (SURF) is proposed. The algorithm first performs dense sampling of the video images and then estimates camera motion. In the feature point detection stage, the Gaussian pyramid layers are constructed dynamically to improve the real-time performance and accuracy of feature point extraction. Based on the SURF algorithm, the brightness center (intensity centroid) method is used to obtain the orientation of each feature point. Binary Robust Independent Elementary Features (BRIEF) descriptors are then generated to determine matching points and optimize the images, after which feature tracking and feature extraction are conducted on the images to classify features. The experimental results show that the algorithm is faster at removing camera motion, improving the real-time performance of feature extraction and the accuracy of action recognition.
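The brightness-center orientation step can be sketched as below. This follows the intensity-centroid idea popularised by ORB, which appears to be what the abstract refers to; the patch contents are a toy example:

```python
import numpy as np

def intensity_centroid_angle(patch):
    """Orientation of a feature point by the intensity-centroid
    ('brightness center') method: the angle of the vector from the
    patch center to its intensity-weighted centroid.
    The patch must contain some nonzero intensity."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    m00 = patch.sum()
    cx = (xs * patch).sum() / m00 - (w - 1) / 2  # centroid offset in x
    cy = (ys * patch).sum() / m00 - (h - 1) / 2  # centroid offset in y
    return np.arctan2(cy, cx)

# Patch brighter on its right edge: the centroid lies to the right of
# center, so the orientation points along the +x axis (angle 0).
patch = np.zeros((9, 9))
patch[:, 6:] = 1.0
angle = intensity_centroid_angle(patch)
```

Assigning each keypoint this orientation before computing the BRIEF descriptor makes the binary descriptor comparison robust to in-plane rotation during matching.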