Knowledge distillation (KD) is known as a promising solution to compress large language models (LLMs) via transferring their knowledge to smaller *** this process, white-box KD methods usually minimize the distance be...
详细信息
The recently introduced path-star task is a minimal task designed to exemplify limitations to the abilities of language models (Bachmann and Nagarajan, 2024). It involves a path-star graph where multiple arms radiate ...
详细信息
The fashion domain includes a range of real-world multimodal tasks, such as multimodal retrieval and generation. Recent advancements in AI-generated content, particularly large language models for text and diffusion m...
详细信息
To evaluate knowledge in large language models (LLMs), current methods query the model and then evaluate its generated responses. In this work, we ask whether evaluation can be done before the model has generated any ...
详细信息
Large language models (LLMs) have become increasingly prevalent in our daily lives, leading to an expectation for LLMs to be trustworthy - both accurate and well-calibrated (the prediction confidence should align with...
详细信息
Large language models (LLMs) can solve complex multi-step problems, but little is known about how these computations are implemented internally. Motivated by this, we study how LLMs answer multi-hop queries such as &q...
详细信息
Large language models (LLMs) have demonstrated impressive language understanding and generation capabilities, enabling them to answer a wide range of questions across various domains. However, these models are not fla...
详细信息
As large vision-language models (LVLMs) evolve rapidly, the demand for high-quality and diverse data to align these models becomes increasingly crucial. However, the creation of such data with human supervision proves...
详细信息
Recent works often assume that Vision-language Model (VLM) representations are based on visual attributes like shape. However, it is unclear to what extent VLMs prioritize this information to represent concepts. We pr...
详细信息
Large language models (LLMs) have been adopted to process textual task description and accomplish procedural planning in embodied AI tasks because of their powerful reasoning ability. However, there is still lack of s...
详细信息
暂无评论