Search Results - Inner Mongolia University Library

International Conference on Computer Vision (ICCV)

Authors: Yunqian Wen, Bo Liu, Jingyi Cao, Rong Xie, Li Song. Affiliations: Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University; School of Computer Science, University of Technology Sydney; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University

Face de-identification involves concealing the true identity of a face while retaining other facial characteristics. Current target-generic methods typically disentangle identity features in the latent space, using adversarial training to balance privacy and utility. However, this pattern often leads to a trade-off between privacy and utility, and the latent space remains difficult to explain. To address these issues, we propose IDeudemon, which employs a "divide and conquer" strategy to protect identity and preserve utility step by step while maintaining good explainability. In Step I, we obfuscate the 3D disentangled ID code calculated by a parametric NeRF model to protect identity. In Step II, we incorporate visual similarity assistance and train a GAN with adjusted losses to preserve image utility. Thanks to the powerful 3D prior and delicate generative designs, our approach protects identity naturally, produces high-quality details, and is robust to different poses and expressions. Extensive experiments demonstrate that the proposed IDeudemon outperforms previous state-of-the-art methods.

Depth-aware video frame interpolation

arXiv, 2019

Authors: Bao, Wenbo; Lai, Wei-Sheng; Ma, Chao; Zhang, Xiaoyun; Gao, Zhiyong; Yang, Ming-Hsuan. Affiliations: Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University; University of California, Merced, United States; Google

Video frame interpolation aims to synthesize nonexistent frames in-between the original frames. While significant advances have been made from the recent deep convolutional neural networks, the quality of interpolation is often reduced due to large object motion or occlusion. In this work, we propose a video frame interpolation method which explicitly detects the occlusion by exploring the depth information. Specifically, we develop a depth-aware flow projection layer to synthesize intermediate flows that preferably sample closer objects than farther ones. In addition, we learn hierarchical features to gather contextual information from neighboring pixels. The proposed model then warps the input frames, depth maps, and contextual features based on the optical flow and local interpolation kernels for synthesizing the output frame. Our model is compact, efficient, and fully differentiable. Quantitative and qualitative results demonstrate that the proposed model performs favorably against state-of-the-art frame interpolation methods on a wide variety of datasets. Copyright © 2019, The Authors. All rights reserved.

Keywords: Deep neural networks
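
As an illustration of the depth-aware flow projection idea in the abstract above, the following minimal sketch projects a forward optical flow to an intermediate time on a single pixel row, averaging competing flows with inverse-depth weights so that closer pixels dominate. The inverse-depth weighting, the 1-D simplification, and the name `project_flow_1d` are assumptions made for illustration; the abstract does not specify them.

```python
def project_flow_1d(flow, depth, t):
    """Project forward flow (frame 0 -> frame 1) to time t on one pixel row.

    flow[y]  : horizontal displacement of source pixel y from frame 0 to frame 1
    depth[y] : positive depth of source pixel y (smaller means closer)
    t        : intermediate time in (0, 1)

    Returns, for each target pixel, the flow pointing from time t back to
    frame 0, or None where no source pixel lands (a hole).
    """
    n = len(flow)
    weighted_sum = [0.0] * n
    weight_total = [0.0] * n
    for y in range(n):
        x = round(y + t * flow[y])  # position of source pixel y at time t
        if 0 <= x < n:
            w = 1.0 / depth[y]      # assumed weighting: closer pixels count more
            weighted_sum[x] += w * flow[y]
            weight_total[x] += w
    return [
        -t * weighted_sum[x] / weight_total[x] if weight_total[x] > 0 else None
        for x in range(n)
    ]


if __name__ == "__main__":
    # Pixel 0 (close, moving) and pixel 1 (far, static) both land on target 1.
    print(project_flow_1d([2.0, 0.0, 0.0, 0.0], [1.0, 4.0, 1.0, 1.0], 0.5))
```

Where several source pixels land on the same target, the one with the smaller depth contributes most to the projected flow; targets that receive no pixel are reported as holes, which a full method would still need to fill.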

Gait Planning and Motion Control Based on Vrep Simulation for Quadruped Robot

WRC Symposium on Advanced Robotics and Automation (WRC SARA)

Authors: Linqi Zhou, Zhihua Chen, Jun Liu, Zhi Liu, Yumeng Chen, Liting Zhang. Affiliations: Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition and MOE Key Lab of Nondestructive Testing Technology, Nanchang Hangkong University, Nanchang, China; State Key Laboratory of Intelligent Control and Decision of Complex Systems, School of Automation, Beijing Institute of Technology, Beijing, China

Gait planning of quadruped robots plays an important role in achieving less walking, including dynamic and static gait. In this article, a static and dynamic gait control method based on the center-of-gravity stability margin is proposed. First, the robot model and its kinematics modeling are introduced. Second, the static and dynamic gaits of the robot's feet are planned and the foot trajectory is designed. Finally, both gaits are simulated in the Vrep simulation software, and the differences in stability and speed between the static and dynamic gaits of a 12-degree-of-freedom robot are analyzed, verifying the effectiveness of the proposed gait control method.
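
A common way to quantify a center-of-gravity stability margin for a static gait is the smallest distance from the ground projection of the center of gravity to the edges of the support polygon formed by the stance feet. The sketch below assumes that definition, a convex polygon with vertices listed counter-clockwise, and a hypothetical function name; the abstract does not state which margin the authors use.

```python
import math


def stability_margin(cog, feet):
    """Smallest signed distance from the projected center of gravity to the
    edges of the support polygon.

    cog  : (x, y) ground projection of the center of gravity
    feet : stance-foot positions (x, y), listed counter-clockwise

    A positive result means the projection lies inside the polygon (statically
    stable under this assumed definition); a negative one means it is outside.
    """
    margins = []
    n = len(feet)
    for k in range(n):
        x1, y1 = feet[k]
        x2, y2 = feet[(k + 1) % n]
        ex, ey = x2 - x1, y2 - y1
        # Cross product is positive when cog is to the left of the edge,
        # which is the interior side for counter-clockwise vertices.
        cross = ex * (cog[1] - y1) - ey * (cog[0] - x1)
        margins.append(cross / math.hypot(ex, ey))
    return min(margins)


if __name__ == "__main__":
    # Three stance feet while the fourth leg swings (illustrative values).
    support = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
    print(stability_margin((1.0, 1.0), support))
```

A gait planner could evaluate this margin at each step of a candidate foot sequence and reject steps where it falls below a chosen threshold.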

Image Fusion Algorithm Based on Spatial Frequency-Motivated Pulse Coupled Neural Networks in Nonsubsampled Contourlet Transform Domain

Acta Automatica Sinica, 2008, Vol. 34, No. 12, pp. 1508-1514

Authors: Xiao-Bo QU, Jing-Wen YAN, Hong-Zhi XIAO, Zi-Qian ZHU. Affiliations: Department of Communication Engineering, Xiamen University, Xiamen 361005, P. R. China; Key Laboratory of Digital Signal and Image Processing of Guangdong Province, Shantou University, Shantou 515063, P. R. China; Research Institute of Chinese Radar Electronic Equipment, Wuxi 214063, P. R. China

Nonsubsampled contourlet transform (NSCT) provides flexible multiresolution, anisotropy, and directional expansion for images. Compared with the original contourlet transform, it is shift-invariant and can overcome the pseudo-Gibbs phenomena around singularities. The pulse coupled neural network (PCNN) is a visual cortex-inspired neural network characterized by the global coupling and pulse synchronization of neurons. It has been proven suitable for image processing and successfully employed in image fusion. In this paper, NSCT is combined with PCNN and used in image fusion to make full use of the characteristics of both. Spatial frequency in the NSCT domain is input to motivate the PCNN, and coefficients in the NSCT domain with large firing times are selected as coefficients of the fused image. Experimental results demonstrate that the proposed algorithm outperforms typical wavelet-based, contourlet-based, PCNN-based, and contourlet-PCNN-based fusion algorithms in terms of objective criteria and visual appearance.

Keywords: Contourlet; pulse coupled neural networks (PCNN); wavelet; image fusion; multiscale transform
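
The fusion rule described in the abstract above can be sketched in two small steps: measure spatial frequency on a block of coefficients, then keep, at each position, the coefficient whose neuron fired more often. The sketch assumes the standard row/column definition of spatial frequency and takes the PCNN firing counts as given inputs; the function names are hypothetical, and neither the NSCT decomposition nor the PCNN iteration is shown.

```python
import math


def spatial_frequency(block):
    """Spatial frequency of a 2-D block: sqrt(RF^2 + CF^2), where RF and CF
    are the root-mean-square differences between horizontally and vertically
    adjacent entries (standard definition, assumed here)."""
    rows, cols = len(block), len(block[0])
    row_sq = sum((block[i][j] - block[i][j - 1]) ** 2
                 for i in range(rows) for j in range(1, cols))
    col_sq = sum((block[i][j] - block[i - 1][j]) ** 2
                 for i in range(1, rows) for j in range(cols))
    return math.sqrt((row_sq + col_sq) / (rows * cols))


def fuse_by_firing_times(coeff_a, coeff_b, fire_a, fire_b):
    """Pick, per position, the coefficient whose firing count is larger
    (ties go to the first source)."""
    return [
        [a if fa >= fb else b for a, b, fa, fb in zip(ra, rb, rfa, rfb)]
        for ra, rb, rfa, rfb in zip(coeff_a, coeff_b, fire_a, fire_b)
    ]


if __name__ == "__main__":
    print(spatial_frequency([[0, 1], [0, 1]]))
    print(fuse_by_firing_times([[1, 2]], [[3, 4]], [[5, 1]], [[2, 6]]))
```

In a full pipeline the spatial-frequency values would drive the PCNN as its external stimulus, and the resulting firing counts would be passed to the selection step for every subband.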

Hybrid Deep Neural Network--Hidden Markov Model (DNN-HMM) Based Speech Emotion Recognition

International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII

Authors: Longfei Li, Yong Zhao, Dongmei Jiang, Yanning Zhang, Fengna Wang, Isabel Gonzalez, Enescu Valentin, Hichem Sahli. Affiliations: VUB-NPU Joint AVSP Research Lab, Northwestern Polytechnical University, Xi'an, China; Shaanxi Provincial Key Lab on Speech and Image Information Processing; VUB-NPU Joint AVSP Research Lab, Vrije Universiteit Brussel, Brussels, Belgium; Electronics & Informatics Department

Deep Neural Network Hidden Markov Models, or DNN-HMMs, are recently very promising acoustic models achieving good speech recognition results over Gaussian mixture model based HMMs (GMM-HMMs). In this paper, for emotion recognition from speech, we investigate DNN-HMMs with restricted Boltzmann Machine (RBM) based unsupervised pre-training, and DNN-HMMs with discriminative pre-training. Emotion recognition experiments are carried out on these two models on the eNTERFACE'05 database and Berlin database, respectively, and results are compared with those from the GMM-HMMs, the shallow-NN-HMMs with two layers, as well as the Multi-layer Perceptrons HMMs (MLP-HMMs). Experimental results show that when the numbers of hidden layers and hidden units are properly set, the DNN could extend the labeling ability of GMM-HMM. Among all the models, the DNN-HMMs with discriminative pre-training obtain the best results. For example, for the eNTERFACE'05 database, the recognition accuracy improves 12.22% from the DNN-HMMs with unsupervised pre-training, 11.67% from the GMM-HMMs, 10.56% from the MLP-HMMs, and even 17.22% from the shallow-NN-HMMs, respectively.

Keywords: Hidden Markov models; Emotion recognition; Training; Speech recognition; Speech; Databases; Neural networks

A Pilot Study of Relating MYCN-Gene Amplification with Neuroblastoma-Patient CT Scans

arXiv, 2022

Authors: Zhang, Zihan; Xiang, Xiang; Peng, Xuehua; Shao, Jianbo. Affiliations: MoE Key Lab of Image Info Processing & Intelligent Control, School of Artificial Intelligence & Automation, China; Wuhan Children's Hospital, Tongji Medical College, China; Huazhong University of Science and Technology, Wuhan 430074, China

Neuroblastoma is one of the most common cancers in infants, and the initial diagnosis of this disease is difficult. At present, the MYCN gene amplification (MNA) status is detected by invasive pathological examination of tumor samples. This is time-consuming and may have a hidden impact on children. To handle this problem, we adopt multiple machine learning (ML) algorithms to predict the presence or absence of MYCN gene amplification. The dataset is composed of retrospective CT images of 23 neuroblastoma patients. Different from previous work, we develop the algorithm without manual segmentation of primary tumors, which is time-consuming and not practical. Instead, we only need the coordinate of the center point and the number of tumor slices given by a subspecialty-trained pediatric radiologist. Specifically, the CNN-based method uses a pre-trained convolutional neural network, and the radiomics-based method extracts radiomics features. Our results show that the CNN-based method outperforms the radiomics-based method. Copyright © 2022, The Authors. All rights reserved.

Keywords: Computerized tomography

Multimodal depression recognition with dynamic visual and audio cues

International Conference on Affective Computing and Intelligent Interaction and Workshops, ACII

Authors: Lang He, Dongmei Jiang, Hichem Sahli. Affiliations: NPU-VUB Joint AVSP Research Lab, Shaanxi Key Lab on Speech and Image Information Processing, Xi'an, China; Department of Electronics & Informatics (ETRO), Vrije Universiteit Brussel (VUB), Brussels, Belgium; Interuniversity Microelectronics Centre (IMEC), Heverlee, Belgium

In this paper, we present our system design for audio-visual multi-modal depression recognition. To improve the estimation accuracy of the Beck Depression Inventory (BDI) score, besides the Low Level Descriptors (LLD) features and the Local Gabor Binary Pattern-Three Orthogonal Planes (LGBP-TOP) features provided by the 2014 Audio/Visual Emotion Challenge and Workshop (AVEC2014), we extract extra features to capture key behavioural changes associated with depression. From audio we extract the speaking rate, and from video, the head pose features, the Space-Temporal Interesting Point (STIP) features, and local kinematic features via the Divergence-Curl-Shear descriptors. These features describe body movements and spatio-temporal changes within the image sequence. We also consider global dynamic features, obtained using motion history histogram (MHH), bag of words (BOW) features and vector of local aggregated descriptors (VLAD). To capture the complementary information within the used features, we evaluate two fusion systems: the feature fusion scheme, and the model fusion scheme via local linear regression (LLR). Experiments are carried out on the training set and development set of the Depression Recognition Sub-Challenge (DSC) of AVEC2014. We obtain a root mean square error (RMSE) of 7.6697 and a mean absolute error (MAE) of 6.1683 on the development set, which are better than or comparable to the state-of-the-art results of the AVEC2014 challenge.

Keywords: Feature extraction; Visualization; Speech; Head; Histograms; Optical imaging; History
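
For readers unfamiliar with the two error measures reported in the abstract above, the following short sketch shows how RMSE and MAE are computed from predicted and reference scores. The function names and the sample values are illustrative assumptions and are not data from the paper.

```python
import math


def rmse(predicted, reference):
    """Root mean square error between two equal-length score lists."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mae(predicted, reference):
    """Mean absolute error between two equal-length score lists."""
    absolute = [abs(p - r) for p, r in zip(predicted, reference)]
    return sum(absolute) / len(absolute)


if __name__ == "__main__":
    # Hypothetical predicted and reference scores, for illustration only.
    predicted = [10.0, 20.0]
    reference = [13.0, 16.0]
    print(rmse(predicted, reference), mae(predicted, reference))
```

RMSE penalizes large individual errors more heavily than MAE, which is why the two are usually reported together.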

Research on the Strike Strategy of Quadcopter Unmanned Aerial Vehicles for Mobile Targets

WRC Symposium on Advanced Robotics and Automation (WRC SARA)

Authors: Wencai Zhang, Junjie Sun, Yanying Li, Shen Yan, Ning Zhang, Jiaqi Wang, Zhihua Chen, Mingran Pan. Affiliations: Engineering Technicians of Norinco Group, Liao Shen Industries Group CO.LTD, Shenyang, China; Key Laboratory of Jiangxi Province for Image Processing and Pattern Recognition and MOE Key Lab of Nondestructive Testing Technology, Nanchang Hangkong University, Nanchang, China

ISBN (digital): 9798331506100

ISBN (print): 9798331506117

This paper studies a strike path strategy for quadcopter drones targeting ground maneuvering targets. The strategy sets the strike path to two different strike speeds, which improves the stability and robustness of quadcopter drones while shortening strike time and increasing hit rates. The attitude control of the quadcopter unmanned aerial vehicle during motion is considered, and the flight reliability of the mechanism is verified through simulation experiments. Strike paths with different slopes are set and the optimal strike path is obtained through experiments, demonstrating the effectiveness of this strike strategy in engineering applications.

Keywords: Attitude control; Prototypes; Autonomous aerial vehicles; Stability analysis; Robustness; Robotics and automation; Aircraft propulsion

Indirect effects among biodiversity loss of mutualistic ecosystems

National Science Open, 2022, Vol. 1, No. 2, pp. 8-18

Authors: Guangwei Wang, Xueming Liu, Guanrong Chen, Hai-Tao Zhang. Affiliations: School of Artificial Intelligence and Automation, Key Laboratory of Image Processing and Intelligent Control, Huazhong University of Science and Technology, Wuhan 430074, China; State Key Lab of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China; Department of Electrical Engineering, City University of Hong Kong, Hong Kong, China

Drastic reduction in biodiversity has been a severe threat to ecosystems, which is exacerbated when losing few species leads to disastrous and even irreparable ***, revealing the mechanism underlying biodiversity loss is of uttermost *** this study, we show that abundant indirect interactions among mutualistic ecosystems are critical in determining species’ *** topological and ecological characteristics, we propose an indicator derived from a dynamic model to identify keystone species and quantify their influence, which outperforms widely-used indicators like degree in realistic and simulated ***, we demonstrate that networks with high modularity, heterogeneity, biodiversity, and less intimate interactions tend to have larger indirect effects, which are more amenable in predicting decline of biodiversity with the proposed *** findings shed some light onto the influence of apposite biodiversities, paving the way from complex network theory to ecosystem protection and restoration.

Keywords: biodiversity; indirect effect; complex network; mutualism

MGTN: Multi-scale Graph Transformer Network for 3D Point Cloud Semantic Segmentation

IEEE Visual Communications and Image Processing (VCIP)

Authors: Da Ai, Siyu Qin, Zihe Nie, Hui Yuan, Ying Liu. Affiliations: Xi’an Key Laboratory of Image Processing Technology and Applications for Public Security, Xi’an University of Posts and Telecommunications, Xi’an, China; School of Communication and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an, China; School of Control Science and Engineering, Shandong University, Jinan, China

ISBN (digital): 9798331529543

ISBN (print): 9798331529550

The structural similarity of point clouds presents challenges in accurately recognizing and segmenting semantic information at the demarcation points of complex scenes or objects. In this study, we propose a multi-scale graph transformer network (MGTN) for 3D point cloud semantic segmentation. First, a multi-scale graph convolution (MSG-Conv) is devised to address the limitations faced by existing methods when extracting local and global features of point cloud data with varying densities simultaneously. Subsequently, we employ a graph-transformer (G-T) module to enhance edge details and spatial position information in the point cloud, thereby improving recognition accuracy for small objects and confusing elements such as columns and beams. Extensive testing on ShapeNet parts and S3DIS datasets was conducted to demonstrate the effectiveness of MGTN. Compared to the baseline network DGCNN, our proposed MGTN achieves substantial performance improvements, as evidenced by notable increases in mIoU of 1.5% and 18.5% on the ShapeNet parts and S3DIS datasets respectively. Additionally, MGTN outperforms the recent CFSA-Net by 2.3% and 3.4% on OA and mIoU respectively.

Keywords: Point cloud compression; Three-dimensional displays; Visual communication; Convolution; Semantic segmentation; Semantics; Transformers; Feature extraction; Data mining; Testing
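
The two metrics quoted in the abstract above, overall accuracy (OA) and mean intersection over union (mIoU), can be computed from per-point labels as in the following minimal sketch. It assumes the plain per-class definition averaged over classes that appear in the prediction or the reference; benchmark-specific protocols (for example, averaging per shape or per category) may differ, and the function names are hypothetical.

```python
def overall_accuracy(pred, true):
    """Fraction of points whose predicted label equals the reference label."""
    correct = sum(1 for p, t in zip(pred, true) if p == t)
    return correct / len(true)


def mean_iou(pred, true, num_classes):
    """Mean of per-class IoU = |pred AND true| / |pred OR true|.

    Classes absent from both the prediction and the reference are skipped;
    the inputs are assumed to be non-empty.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, true) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, true) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)


if __name__ == "__main__":
    # Four points, two classes present out of three (illustrative labels).
    pred = [0, 0, 1, 1]
    true = [0, 1, 1, 1]
    print(overall_accuracy(pred, true), mean_iou(pred, true, 3))
```

Because mIoU averages over classes rather than points, it is more sensitive than OA to errors on rare classes such as small objects.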
