Search criteria: "Institution=Lab. of Visual Info. Processing"
151 records; showing 1-10
Lumen: Unleashing Versatile Vision-Centric Capabilities of Large Multimodal Models
38th Conference on Neural Information Processing Systems, NeurIPS 2024
Authors: Jiao, Yang; Chen, Shaoxiang; Jie, Zequn; Chen, Jingjing; Ma, Lin; Jiang, Yu-Gang
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center on Intelligent Visual Computing, China; Meituan, China
Large Multimodal Model (LMM) is a hot research topic in the computer vision area and has also demonstrated remarkable potential across multiple disciplinary fields. A recent trend is to further extend and enhance the ...
OmniTokenizer: A Joint Image-Video Tokenizer for Visual Generation
38th Conference on Neural Information Processing Systems, NeurIPS 2024
Authors: Wang, Junke; Jiang, Yi; Yuan, Zehuan; Peng, Binyue; Wu, Zuxuan; Jiang, Yu-Gang
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center on Intelligent Visual Computing, China; Bytedance Inc., China
Tokenizer, serving as a translator to map the intricate visual data into a compact latent space, lies at the core of visual generative models. Based on the finding that existing tokenizers are tailored to image or vid...
DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs
38th Conference on Neural Information Processing Systems, NeurIPS 2024
Authors: Meng, Lingchen; Yang, Jianwei; Tian, Rui; Dai, Xiyang; Wu, Zuxuan; Gao, Jianfeng; Jiang, Yu-Gang
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center of Intelligent Visual Computing, China; Microsoft Corporation, United States
Most large multimodal models (LMMs) are implemented by feeding visual tokens as a sequence into the first layer of a large language model (LLM). The resulting architecture is simple but significantly increases computa...
GenRec: Unifying Video Generation and Recognition with Diffusion Models
38th Conference on Neural Information Processing Systems, NeurIPS 2024
Authors: Weng, Zejia; Yang, Xitong; Xing, Zhen; Wu, Zuxuan; Jiang, Yu-Gang
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center of Intelligent Visual Computing, China; Department of Computer Science, University of Maryland, United States
Video diffusion models are able to generate high-quality videos by learning strong spatial-temporal priors on large-scale datasets. In this paper, we aim to investigate whether such priors derived from a generative pr...
Lumen: unleashing versatile vision-centric capabilities of large multimodal models
Proceedings of the 38th International Conference on Neural Information Processing Systems
Authors: Yang Jiao; Shaoxiang Chen; Zequn Jie; Jingjing Chen; Lin Ma; Yu-Gang Jiang
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University and Shanghai Collaborative Innovation Center on Intelligent Visual Computing and Meituan; Meituan; Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University and Shanghai Collaborative Innovation Center on Intelligent Visual Computing
Large Multimodal Model (LMM) is a hot research topic in the computer vision area and has also demonstrated remarkable potential across multiple disciplinary fields. A recent trend is to further extend and enhance the ...
Spk2ImgMamba: Spiking Camera Image Reconstruction with Multi-Scale State Space Models
2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
Authors: Yin, Jiaoyang; Fan, Bin; Xu, Chao; Huang, Tiejun; Shi, Boxin
Affiliations: School of Computer Science, Peking University, State Key Lab of Multimedia Info. Processing, China; School of Computer Science, Peking University, Nat'l Eng. Research Ctr. of Visual Technology, China; School of Intelligence Science and Technology, Peking University, Nat'l Key Lab of General AI, China
As a bio-inspired vision sensor, the spiking camera has showcased remarkable capability in high-speed imaging with a sampling rate of 40,000 Hz. Reconstructing clear images from continuous spike streams, which is obta...
FOCUS: Towards Universal Foreground Segmentation
arXiv, 2025
Authors: You, Zuyao; Kong, Lingyu; Meng, Lingchen; Wu, Zuxuan
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center of Intelligent Visual Computing, China
Foreground segmentation is a fundamental task in computer vision, encompassing various subdivision tasks. Previous research has typically designed task-specific architectures for each task, leading to a lack of unific...
DeepStack: deeply stacking visual tokens is surprisingly simple and effective for LMMs
Proceedings of the 38th International Conference on Neural Information Processing Systems
Authors: Lingchen Meng; Jianwei Yang; Rui Tian; Xiyang Dai; Zuxuan Wu; Jianfeng Gao; Yu-Gang Jiang
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University and Shanghai Collaborative Innovation Center of Intelligent Visual Computing; Microsoft Corporation
Most large multimodal models (LMMs) are implemented by feeding visual tokens as a sequence into the first layer of a large language model (LLM). The resulting architecture is simple but significantly increases computa...
FOCUS: Towards Universal Foreground Segmentation
39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025
Authors: You, Zuyao; Kong, Lingyu; Meng, Lingchen; Wu, Zuxuan
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center of Intelligent Visual Computing, China
Foreground segmentation is a fundamental task in computer vision, encompassing various subdivision tasks. Previous research has typically designed task-specific architectures for each task, leading to a lack of unific...
BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection
Conference on Computer Vision and Pattern Recognition (CVPR)
Authors: Zhenxin Li; Shiyi Lan; Jose M. Alvarez; Zuxuan Wu
Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University; Shanghai Collaborative Innovation Center of Intelligent Visual Computing; NVIDIA
Recently, the rise of query-based Transformer decoders is reshaping camera-based 3D object detection. These query-based decoders are surpassing the traditional dense BEV (Bird's Eye View)-based methods. However, w...