Search Results - Inner Mongolia University Library

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Wu, XiangJi; Zhang, Ziwen; Feng, Jie; Zhou, Lei; Wu, Junmin. Affiliation: Tucodec Inc Shanghai Peoples R China

ISBN: (digital) 9781728193601

ISBN: (print) 9781728193601

We present an end-to-end trainable framework for P-frame compression. A joint motion vector (MV) and residual prediction network, MV-Residual, takes two successive frames as input and extracts ensembled features of motion representations and residual information. The prior probability of the latent representations is modeled by a hyperprior auto-encoder trained jointly with the MV-Residual network. Specifically, spatially-displaced convolution is applied for video frame prediction: a motion kernel is learned for each pixel, and the predicted pixel is generated by applying that kernel at a displaced location in the source image. Finally, novel rate allocation and post-processing strategies are used to produce the final compressed bits, taking into account the bit constraint of the challenge. Experimental results on the validation set show that the proposed optimized framework achieves the highest MS-SSIM in the P-frame compression competition.

Keywords: Image coding; Video compression; Convolution; Kernel; Computer vision; Conferences; Pattern recognition

Pay Attention to Virality: Understanding Popularity of Social Media Videos with the Attention Mechanism

31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Bielski, Adam; Trzcinski, Tomasz. Affiliations: Tooploox Wroclaw Poland; Warsaw Univ Technol Tooploox Warsaw Poland

ISBN: (digital) 9781538661000

ISBN: (print) 9781538661000

Predicting the popularity of social media videos before they are published is a challenging task, mainly due to the complexity of the content distribution network and the number of factors that play a part in this process. Because solving this task is of great help to media content creators, many successful machine learning methods have been proposed. In this work, we change the viewpoint and postulate that it is not only the predicted popularity that matters but also, perhaps even more importantly, an understanding of how individual parts influence the final popularity score. To that end, we propose combining the Grad-CAM visualization method with a soft attention mechanism. Our preliminary results show that this approach allows a more intuitive interpretation of the impact of content on video popularity while achieving competitive prediction accuracy.

Keywords: Videos; Visualization; Social network services; Conferences; Task analysis; Heating systems; Computer vision
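As an illustration of the soft attention mechanism named in the abstract above, the following minimal Python sketch pools per-frame features using softmax weights. It is not the authors' implementation: the function names, the toy features, and the relevance scores are all hypothetical, and the Grad-CAM component is not shown.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(frame_features, frame_scores):
    """Return (attention weights, attention-weighted feature vector)."""
    weights = softmax(frame_scores)
    dim = len(frame_features[0])
    pooled = [
        sum(w * feature[i] for w, feature in zip(weights, frame_features))
        for i in range(dim)
    ]
    return weights, pooled


# Three hypothetical frames with 2-D features and unnormalised relevance scores.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.0, 0.0, 2.0]
weights, pooled = soft_attention(features, scores)
print(weights, pooled)
```

Because the weights sum to 1, each weight can be read as that frame's share of the pooled representation, which is what makes this kind of attention straightforward to inspect.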

Foreword to the Special Issue on "Geovision: Computer Vision for Geospatial Applications"

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2016, vol. 9, no. 7, pp. 2840-2843

Authors: Tuia, Devis; Wegner, Jan Dirk; Mallet, Clement; Yang, Michael Ying. Affiliations: Univ Zurich CH-8057 Zurich Switzerland; Swiss Fed Inst Technol CH-8093 Zurich Switzerland; Univ Paris Est IGN LaSTIG 73 Ave Paris F-94160 St Mande France; Univ Twente NL-7500 AE Enschede Netherlands

The nine papers in this special section focus on the development of new computer vision techniques for the interpretation of remote sensing images. These papers are a follow-up to two workshops, EARTHvision 2015 and MSF 2015, held in conjunction with the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2015 in Boston, MA. The purpose of both workshops and of this special issue is to foster fruitful collaboration among the computer vision, Earth observation, and geospatial analysis communities.

Keywords: Special issues and sections; Meetings; Computer vision; Geospatial analysis; Remote sensing; Pattern recognition

Application of computer vision and vector space model for tactical movement classification in badminton

30th IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Authors: Weeratunga, Kokum; Dharmaratne, Anuja; How, Khoo Boon. Affiliations: Monash Univ Sch Informat Technol Clayton Vic Australia; Monash Univ Sch Engn Clayton Vic Australia; Natl Sports Inst Malaysia Kuala Lumpur Malaysia

ISBN: (print) 9781538607336

Performance profiling in sports allows opponents' tactics to be evaluated and counter-tactics to be developed to gain a competitive advantage. This work develops a comprehensive methodology to automate tactical profiling in elite badminton. The proposed approach uses computer vision techniques to automate data gathering from video footage. The image processing algorithm is validated using video footage of the highest-level tournaments, including the Olympic Games. The average accuracy of player position detection is 96.03% and 97.09% on the two halves of a badminton court. Next, frequent trajectories of badminton players are extracted and classified according to their tactical relevance. The classification achieves 97.79% accuracy, 97.81% precision, 97.44% recall, and a 97.62% F-score. The combination of automated player position detection, frequent trajectory extraction, and the subsequent classification can be used to automatically generate player tactical profiles.

Keywords: Estimation; Field programmable gate arrays; OFDM; Modulation; Filtration; Clocks; Signal processing algorithms
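The F-score reported in the abstract above can be checked against the reported precision and recall. The sketch below assumes the F-score is the usual F1 score (the harmonic mean of precision and recall); the abstract does not say which F-measure was used, and the function name is illustrative.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)


# Percentages quoted in the abstract.
reported_precision = 97.81
reported_recall = 97.44

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))  # → 97.62
```

Under that assumption, the computed value agrees with the 97.62% F-score quoted in the abstract.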

Large Receptive Field Networks for High-Scale Image Super-Resolution

31st IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Seif, George; Androutsos, Dimitrios. Affiliation: Ryerson Univ Toronto ON Canada

ISBN: (digital) 9781538661000

ISBN: (print) 9781538661000

Convolutional Neural Networks have been the backbone of recent rapid progress in Single-Image Super-Resolution. However, existing networks are very deep and have many parameters, so they have a large memory footprint and are challenging to train. We propose Large Receptive Field Networks, which strive to directly expand the receptive field of Super-Resolution networks without increasing depth or parameter count. In particular, we use two different methods to expand the network receptive field: 1-D separable kernels and atrous convolutions. We conduct considerable experiments to study the performance of various arrangement schemes of the 1-D separable kernels and atrous convolutions in terms of accuracy (PSNR / SSIM), parameter count, and speed, while focusing on the more challenging high upscaling factors. Extensive benchmark evaluations demonstrate the effectiveness of our approach.

Keywords: Kernel; Image resolution; Convolutional codes; Convolution; Signal resolution; Memory management; Computer vision
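To make the two ideas in the abstract above concrete, the following sketch uses standard convolution arithmetic to show how atrous (dilated) convolutions enlarge the receptive field without adding weights, and how a pair of 1-D kernels uses fewer weights than one full 2-D kernel. The layer configurations, the channel count of 64, and the function names are assumptions chosen for illustration; they are not taken from the paper.

```python
def effective_kernel_size(kernel_size, dilation):
    """Span covered by a dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def receptive_field(layers):
    """Receptive field of stacked stride-1 convolutions.

    `layers` is a list of (kernel_size, dilation) pairs.
    """
    field = 1
    for kernel_size, dilation in layers:
        field += effective_kernel_size(kernel_size, dilation) - 1
    return field


def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in one convolution layer (bias ignored)."""
    return kernel_h * kernel_w * in_channels * out_channels


# Three 3x3 layers: plain versus dilation rates 1, 2, 4 (hypothetical setup).
plain_field = receptive_field([(3, 1), (3, 1), (3, 1)])
atrous_field = receptive_field([(3, 1), (3, 2), (3, 4)])

# One 5x5 layer versus a 5x1 layer followed by a 1x5 layer, 64 channels each.
full_weights = conv_weights(5, 5, 64, 64)
separable_weights = conv_weights(5, 1, 64, 64) + conv_weights(1, 5, 64, 64)

print(plain_field, atrous_field)        # → 7 15
print(full_weights, separable_weights)  # → 102400 40960
```

In this toy setting the dilated stack more than doubles the receptive field with the same number of weights, and the separable pair covers the same 5x5 extent with 40% of the weights of the full kernel.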

Collective Activity Detection using Hinge-loss Markov Random Fields

26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: London, Ben; Khamis, Sameh; Bach, Stephen H.; Huang, Bert; Getoor, Lise; Davis, Larry. Affiliation: Univ Maryland College Pk MD 20742 USA

ISBN: (print) 9780769549903

We propose hinge-loss Markov random fields (HL-MRFs), a powerful class of continuous-valued graphical models, for high-level computer vision tasks. HL-MRFs are characterized by log-concave density functions, and are able to perform efficient, exact inference. Their templated hinge-loss potential functions naturally encode soft-valued logical rules. Using the declarative modeling language probabilistic soft logic, one can easily define HL-MRFs via familiar constructs from first-order logic. We apply HL-MRFs to the task of activity detection, using principles of collective classification. Our model is simple, intuitive and interpretable. We evaluate our model on two datasets and show that it achieves significant lift over the low-level detectors.

Keywords: Markov processes; computer vision; image classification; object detection; probabilistic logic; random processes
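The abstract above says that hinge-loss potentials encode soft-valued logical rules. A minimal sketch of that idea follows, based on the general formulation used in probabilistic soft logic: a potential has the form max(0, l)^p with p in {1, 2}, where l is a linear function of truth values in [0, 1]. The rule, the truth values, and the function names here are hypothetical and are not taken from the paper's activity-detection model.

```python
def hinge_loss_potential(linear_value, exponent=1):
    """max(0, linear_value) ** exponent, with exponent 1 or 2."""
    if exponent not in (1, 2):
        raise ValueError("exponent must be 1 or 2")
    return max(0.0, linear_value) ** exponent


def rule_linear(a, b, c):
    """Linear term for the soft rule (A AND B) -> C.

    Under the Lukasiewicz relaxation the distance to satisfaction is
    max(0, a + b - 1 - c); this returns the part inside the hinge.
    """
    return a + b - 1.0 - c


# Hypothetical truth values: the body is mostly true but the head is not.
violated = hinge_loss_potential(rule_linear(0.9, 0.8, 0.3))
violated_squared = hinge_loss_potential(rule_linear(0.9, 0.8, 0.3), exponent=2)

# Raising the head's truth value satisfies the rule and zeroes the potential.
satisfied = hinge_loss_potential(rule_linear(0.9, 0.8, 0.9))

print(violated, violated_squared, satisfied)
```

A weighted sum of such potentials is convex in the truth values, which is consistent with the log-concave densities and efficient exact inference described in the abstract.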

Efficient Online Multi-Camera Tracking with Memory-Efficient Accumulated Appearance Features and Trajectory Validation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Lap Quoc Tran; Huan Duc Vi. Affiliation: Asilla Tokyo Japan

ISBN: (print) 9798350365474

Multi-camera tracking (MCT) plays a crucial role in various computer vision applications. However, accurate tracking of individuals across multiple cameras faces challenges, particularly with identity switches. In this paper, we present an efficient online MCT system that tackles these challenges through online processing. Our system leverages memory-efficient accumulated appearance features to provide stable representations of individuals across cameras and time. By incorporating trajectory validation using hierarchical agglomerative clustering (HAC) in overlapping regions, ID transfers are identified and rectified. Evaluation on the 2024 AI City Challenge Track 1 dataset [39] demonstrates the competitive performance of our system, achieving accurate tracking in both overlapping and non-overlapping camera networks. With a 40.3% HOTA score [29], our system ranked 9th in the challenge. The integration of trajectory validation enhances performance by 8% over the baseline, and the accumulated appearance features further contribute to a 17% improvement.

Keywords: Computer vision

Finding Facial Forgery Artifacts with Parts-Based Detectors

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Schwarcz, Steven; Chellappa, Rama. Affiliations: Univ Maryland College Pk MD 20742 USA; Johns Hopkins Univ Baltimore MD USA

ISBN: (print) 9781665448994

Manipulated videos, especially those where the identity of an individual has been modified using deep neural networks, are becoming an increasingly relevant threat in the modern day. In this paper, we seek to develop a generalizable, explainable solution for detecting these manipulated videos. To achieve this, we design a series of forgery detection systems that each focus on one individual part of the face. These parts-based detection systems, which can be combined and used together in a single architecture, meet all of our desired criteria: they generalize effectively between datasets and give us valuable insights into what the network is looking at when making its decision. We thus use these detectors to perform detailed empirical analysis on the FaceForensics++, Celeb-DF, and Facebook Deepfake Detection Challenge datasets, examining not just what the detectors find but also collecting and analyzing useful related statistics on the datasets themselves.

Keywords: Deep learning; Computer vision; Social networking (online); Face recognition; Conferences; Neural networks; Detectors

Contrastive Domain Adaptation

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Thota, Mamatha; Leontidis, Georgios. Affiliations: Univ Lincoln Sch Comp Sci Lincoln LN6 7TS England; Univ Aberdeen Dept Comp Sci Aberdeen AB24 3UE Scotland

ISBN: (print) 9781665448994

Recently, contrastive self-supervised learning has become a key component for learning visual representations across many computer vision tasks and benchmarks. However, contrastive learning in the context of domain adaptation remains largely underexplored. In this paper, we propose to extend contrastive learning to a new domain adaptation setting, in which the similarity is learned and deployed on samples that follow different probability distributions, without access to labels. Contrastive learning learns by comparing and contrasting positive and negative pairs of samples in an unsupervised setting, without access to source and target labels. We have developed a variation of a recently proposed contrastive learning framework that helps tackle the domain adaptation problem by further identifying and removing possible negatives that are similar to the anchor, which mitigates the effect of false negatives. Extensive experiments demonstrate that the proposed method adapts well and improves performance on the downstream domain adaptation task.

Keywords: Training; Computer vision; Adaptation models; Visualization; Machine learning algorithms; Pipelines; Machine learning
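The idea in the abstract above of removing negatives that are too similar to the anchor can be sketched with an InfoNCE-style contrastive loss. This is an illustrative stand-in, not the authors' method: the abstract does not specify the loss or the removal rule, so the cosine-similarity threshold, the temperature, the toy embeddings, and all names below are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.5,
             false_negative_threshold=None):
    """InfoNCE-style loss for one anchor.

    If `false_negative_threshold` is given, negatives whose cosine
    similarity to the anchor reaches it are dropped before the loss is
    computed (a hypothetical removal rule).
    """
    kept = [
        n for n in negatives
        if false_negative_threshold is None
        or cosine(anchor, n) < false_negative_threshold
    ]
    pos_term = math.exp(cosine(anchor, positive) / temperature)
    neg_term = sum(math.exp(cosine(anchor, n) / temperature) for n in kept)
    return -math.log(pos_term / (pos_term + neg_term))


# Toy 2-D embeddings; the first "negative" is nearly identical to the anchor.
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.95, 0.05], [0.0, 1.0], [-1.0, 0.0]]

loss_plain = info_nce(anchor, positive, negatives)
loss_filtered = info_nce(anchor, positive, negatives,
                         false_negative_threshold=0.9)
print(loss_plain, loss_filtered)
```

In this toy case the filtered loss is lower because the near-duplicate negative no longer pushes the anchor away from a sample that probably shares its class.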

Next generation FPGAs and SOCs - How embedded systems can profit

26th IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Eberli, Felix. Affiliation: Supercomp Syst AG CH-8005 Zurich Switzerland

ISBN: (print) 9780769549903

New SoCs such as the Xilinx Zynq 7045 allow researchers and developers to combine the advantages of writing software for control functionality with accelerators in the FPGA logic for number crunching. The dual-core ARM Cortex-A9 processor runs at up to 1 GHz, and the FPGA has up to 900 DSP slices, allowing a performance of up to 1,334 GMACs. SCS is porting many algorithms, such as SGM stereo [1], Stixel clustering, and optical flow [2], to such devices, allowing new cars to see their environment and react appropriately. The newly developed SCS Zynq 7045 module will allow accelerated development using this technology.

Keywords: digital signal processing chips; embedded systems; field programmable gate arrays; image sequences; pattern clustering; stereo image processing; system-on-chip
