arXiv

OBI-BENCH: CAN LMMS AID IN STUDY OF ANCIENT SCRIPT ON ORACLE BONES?

Authors: Chen, Zijian; Chen, Tingzhu; Zhang, Wenjun; Zhai, Guangtao

Affiliations: Institute of Image Communication and Information Processing, Shanghai Jiao Tong University, China; School of Humanities, Shanghai Jiao Tong University, China

Publication: arXiv

Year: 2024


Subject: Benchmarking

Abstract: We introduce OBI-Bench, a holistic benchmark crafted to systematically evaluate large multi-modal models (LMMs) on whole-process oracle bone inscriptions (OBI) processing tasks demanding expert-level domain knowledge and deliberate cognition. OBI-Bench includes 5,523 meticulously collected images from diverse sources, covering five key domain problems: recognition, rejoining, classification, retrieval, and deciphering. These images span centuries of archaeological findings and years of research by front-line scholars, comprising multi-stage font appearances from excavation to synthesis, such as original oracle bones, inked rubbings, oracle bone fragments, cropped single characters, and handprinted characters. Unlike existing benchmarks, OBI-Bench focuses on advanced visual perception and reasoning with OBI-specific knowledge, challenging LMMs to perform tasks akin to those faced by experts. The evaluation of 6 proprietary LMMs and 17 open-source LMMs highlights the substantial challenges and demands posed by OBI-Bench. Even the latest versions of GPT-4o, Gemini 1.5 Pro, and Qwen-VL-Max are still far from public-level humans in some fine-grained perception tasks. However, they perform at a level comparable to untrained humans in deciphering tasks, indicating remarkable capabilities in offering new interpretative perspectives and generating creative guesses. We hope OBI-Bench can help the community develop domain-specific multi-modal foundation models for ancient language research and delve deeper to discover and enhance these untapped potentials of LMMs. © 2024, CC BY.
