Versatile Video Coding (VVC) now supports Screen Content Coding (SCC) by integrating two efficient coding modes: Intra Block Copy (IBC) and Palette (PLT). However, the numerous modes and the Quad-Tree plus Multi-Type Tree (QTMT) partitioning structure inherent to VVC result in very high coding complexity. To effectively reduce the computational complexity of VVC SCC, we propose a fast Intra mode prediction algorithm. More specifically, we first use the difference between the minimum Sum of Absolute Transformed Differences (SATD) cost of the four Intra Directional Modes (DMs) and the SATD cost of the IBC-merge mode to decide whether to skip Intra checking early. Subsequently, we use a decision tree to decide whether to terminate checking early after block differential pulse coded modulation (BDPCM). Finally, we employ a decision tree to decide whether to skip multiple transform selection (MTS) and low-frequency non-separable transform (LFNST) checking. The results demonstrate that our algorithm achieves an average encoding time reduction of 34.34% with a negligible Bjøntegaard delta bitrate increase of 0.46%.
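As a rough, non-authoritative illustration of the first step, the Python sketch below shows how the gap between the best DM SATD cost and the IBC-merge SATD cost could drive the early-skip decision. The helper name `should_skip_intra`, the threshold, and the sample SATD values are assumptions for illustration only, not details from the paper or the VTM reference software.

```python
# Minimal sketch of the Intra early-skip test described above (step 1).
# The threshold and all names/values here are illustrative assumptions.

def should_skip_intra(dm_satd_costs, ibc_merge_satd, threshold):
    """Decide whether full Intra mode checking can be skipped.

    dm_satd_costs  : SATD costs of the four Intra Directional Modes (DMs)
    ibc_merge_satd : SATD cost of the IBC-merge candidate
    threshold      : tuning parameter controlling how aggressive the skip is
    """
    min_dm_satd = min(dm_satd_costs)
    # If IBC-merge is already much cheaper than the best directional mode,
    # checking the remaining Intra modes is unlikely to win the RD decision.
    return (min_dm_satd - ibc_merge_satd) > threshold

# Example usage with made-up SATD values:
print(should_skip_intra([5200, 4800, 5100, 4950], 3600, 800))  # -> True
```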
ISBN (digital): 9798350368741; ISBN (print): 9798350368758
Rate control is a critical component of image and video compression. Particularly under limited network bandwidth, bitrate control is essential for efficient image transmission through effective allocation of channel resources. In this work, since both channel-wise and spatial-wise characteristics are related to rate allocation, we first propose a joint channel-wise and spatial-wise quantization scheme to determine optimal quantization parameters. Subsequently, we develop a quantization step estimation network that produces the parameters needed to allocate the rate efficiently according to the target rate. Experiments demonstrate that our algorithm significantly improves compressed image quality with minimal bitrate distortion and achieves accurate rate control with an average bitrate error of nearly 3%.
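As a hedged sketch of what a joint channel-wise and spatial-wise quantization step might look like for a learned-codec latent, the Python snippet below scales a latent tensor by a per-channel step modulated by a spatial map before rounding. The shapes, the `joint_quantize` helper, and the toy step values stand in for the outputs of the paper's quantization step estimation network, which are not specified in the abstract.

```python
import numpy as np

# Illustrative sketch of joint channel-wise and spatial-wise quantization of a
# latent tensor y with shape (C, H, W). The per-channel steps q_c and the
# spatial modulation map q_s are assumptions, not the paper's learned outputs.

def joint_quantize(y, q_c, q_s):
    """Quantize latents with a combined channel/spatial step.

    y   : latent tensor, shape (C, H, W)
    q_c : per-channel quantization steps, shape (C,)
    q_s : per-position modulation map, shape (H, W)
    """
    step = q_c[:, None, None] * q_s[None, :, :]   # broadcast to (C, H, W)
    y_hat = np.round(y / step) * step             # quantize and dequantize
    return y_hat

rng = np.random.default_rng(0)
y = rng.normal(size=(8, 4, 4))
q_c = np.linspace(0.5, 2.0, 8)     # coarser steps for less important channels
q_s = np.ones((4, 4))              # uniform spatial allocation in this toy case
print(joint_quantize(y, q_c, q_s).shape)  # (8, 4, 4)
```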
Group re-identification (G-ReID) aims to re-identify a group of people observed by non-overlapping camera systems. The existing literature has mainly addressed the RGB-based problem, while the RGB-infrared (RGB-IR) cross-modality matching problem has not yet been studied. In this paper, we propose a metric learning method, Closest Permutation Matching (CPM), for RGB-IR G-ReID. We model each group as a set of single-person features extracted by MPANet, and then propose the Closest Permutation Distance (CPD) metric to measure the similarity between two sets of features. CPD is invariant to changes in the order of group members, so it addresses the layout change problem in G-ReID. Furthermore, we introduce the problem of G-ReID without person labels. For this weakly supervised case, we design a Relation-aware Module (RAM) that exploits visual context and relations among group members to produce a modality-invariant order of features within each group, with which the group member features can be sorted to form a group representation that is robust to modality change. To support the study of RGB-IR G-ReID, we construct a new large-scale RGB-IR G-ReID dataset, CM-Group. The dataset contains 15,440 RGB images and 15,506 infrared images of 427 groups and 1,013 identities. Extensive experiments on the new dataset demonstrate the effectiveness of the proposed models and the complexity of CM-Group. The code and dataset are available at: https://***/WhollyOat/CM-Group.
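The abstract does not give the exact form of CPD, so the sketch below shows one natural permutation-invariant set distance in the same spirit: the cost of the best one-to-one matching between the member features of two groups, computed with the Hungarian algorithm. The cosine cost, the equal group sizes, and the function name are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# Permutation-invariant set distance in the spirit of CPD (assumed form):
# the average cost of the optimal one-to-one matching between group members.

def closest_permutation_distance(feats_a, feats_b):
    """feats_a, feats_b: arrays of shape (n_members, feat_dim)."""
    cost = cdist(feats_a, feats_b, metric="cosine")   # pairwise member costs
    rows, cols = linear_sum_assignment(cost)          # optimal permutation
    return cost[rows, cols].mean()

rng = np.random.default_rng(0)
group_rgb = rng.normal(size=(4, 256))                 # 4 members, one modality
group_ir = group_rgb[[2, 0, 3, 1]] + 0.05 * rng.normal(size=(4, 256))
# Small distance: the second group is the same members, reordered and perturbed.
print(closest_permutation_distance(group_rgb, group_ir))
```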
ISBN (print): 9798331314385
How can we test AI performance? This question seems trivial, but it isn't. Standard benchmarks often have problems such as in-distribution and small-size test sets, oversimplified metrics, unfair comparisons, and short-term outcome pressure. As a consequence, good performance on standard benchmarks does not guarantee success in real-world scenarios. To address these problems, we present Touchstone, a large-scale collaborative segmentation benchmark of 9 types of abdominal organs. This benchmark is based on 5,195 training CT scans from 76 hospitals around the world and 5,903 testing CT scans from 11 additional hospitals. This diverse test set enhances the statistical significance of benchmark results and rigorously evaluates AI algorithms across out-of-distribution scenarios. We invited 14 inventors of 19 AI algorithms to train their algorithms, while our team, as a third party, independently evaluated these algorithms. In addition, we evaluated pre-existing AI frameworks, which, unlike individual algorithms, are more flexible and can support different algorithms; these include MONAI from NVIDIA, nnU-Net from DKFZ, and numerous other open-source frameworks. We are committed to expanding this benchmark to encourage more innovation of AI algorithms for the medical domain.
Blind video quality assessment (BVQA) plays an indispensable role in monitoring and improving the end-users’ viewing experience in various real-world video-enabled media applications. As an experimental field, the im...
Due to multi-layer encoding and inter-layer prediction, Spatial Scalable High-Efficiency Video Coding (SSHVC) has extremely high coding complexity. Improving its coding speed is therefore crucial to promote widespread and cost-effective SSHVC applications. In this paper, we propose a novel Mode Selection-Based Fast Intra Prediction algorithm for SSHVC. We reveal that the RD costs of the Inter-layer Reference (ILR) mode and the Intra mode differ significantly, and that the RD costs of these two modes follow Gaussian distributions. Based on this observation, we propose to apply the classic Gaussian Mixture Model and Expectation Maximization from machine learning to determine whether ILR is the best mode, so that the Intra mode check can be skipped. Experimental results demonstrate that the proposed algorithm significantly improves coding speed with negligible coding efficiency loss.
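One plausible reading of this decision rule is sketched below: fit a two-component Gaussian mixture to ILR RD costs collected from previously coded CUs, then skip the Intra check when a new CU is confidently assigned to the low-cost component where ILR tends to win. The synthetic costs, the single RD-cost feature, and the 0.9 posterior threshold are assumptions rather than details from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Sketch of a GMM/EM-based early decision (assumed reading of the abstract).
rng = np.random.default_rng(0)
# Synthetic ILR RD costs from previously coded CUs: a low-cost cluster where
# ILR wins and a high-cost cluster where Intra tends to win.
ilr_rd_costs = np.concatenate([rng.normal(2e3, 3e2, 500),
                               rng.normal(8e3, 1e3, 500)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(ilr_rd_costs)
ilr_best_comp = int(np.argmin(gmm.means_))      # component with lower mean cost

def skip_intra(ilr_rd_cost, posterior_thr=0.9):
    """Skip the Intra check if the CU is confidently in the 'ILR wins' cluster."""
    post = gmm.predict_proba([[ilr_rd_cost]])[0, ilr_best_comp]
    return post > posterior_thr

print(skip_intra(1.8e3), skip_intra(7.5e3))     # -> True False
```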
Establishing reliable correspondences between two sets of feature points is a critical preprocessing step in many computer vision and pattern recognition tasks. In this paper, we propose a novel robust Local Neighbor ...
With the idea of divide and rule, there exist two different forms of semantic features flowing in two-stage instance segmentation paradigms. They are the global features at the image level and the instance feature...
Limited-angle and sparse-view computed tomography (LACT and SVCT) are crucial for expanding the scope of X-ray CT applications. However, they face challenges due to incomplete data acquisition, resulting in diverse ar...
ISBN (print): 9781665429825
Alzheimer’s disease (AD) is one of the major causes of dementia and is characterized by slow progression over several years. There have been efforts to identify the risk of developing AD at its earliest stage. Recently, multi-task feature learning (MTFL) methods with the sparsity-inducing $\ell_{2,1}$-norm have been widely studied for selecting a discriminative feature subset from MRI features. However, they ignore the complex relationships among imaging markers and among cognitive outcomes, and constructing these relationships with a simple Pearson correlation coefficient may degrade model generalizability. To better capture the complicated yet flexible relationship between cognitive scores and neuroimaging measures, we propose a two-stage framework that jointly learns the structure within the feature correlations as well as within the task correlations. Moreover, we propose a dual graph regularization to encode the learned correlation structure, which guides the training of MTFL by incorporating both inherent correlations. Extensive results on benchmark datasets show that the proposed FTSMTFL model trained with the dual graph regularization outperforms existing methods and achieves state-of-the-art cognitive prediction performance for AD.
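For concreteness, a generic dual-graph-regularized MTFL objective consistent with this description can be written as

$$
\min_{W}\ \|XW - Y\|_F^2 \;+\; \lambda \|W\|_{2,1} \;+\; \gamma_1\, \operatorname{tr}\!\big(W^{\top} L_f W\big) \;+\; \gamma_2\, \operatorname{tr}\!\big(W L_t W^{\top}\big),
$$

where $X \in \mathbb{R}^{n \times d}$ holds the MRI features, $Y \in \mathbb{R}^{n \times t}$ the cognitive scores, $W \in \mathbb{R}^{d \times t}$ the regression weights, and $L_f \in \mathbb{R}^{d \times d}$, $L_t \in \mathbb{R}^{t \times t}$ are graph Laplacians built from the learned feature and task correlation structures, with trade-off hyperparameters $\lambda$, $\gamma_1$, $\gamma_2$. This is only a plausible form; the exact FTSMTFL objective and its Laplacian construction are not specified in the abstract.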