Private data, being larger and higher in quality than public data, can greatly improve large language models (LLMs). However, due to privacy concerns, this data is often dispersed in multiple silos, making its secure util...
Following multiple instructions is a crucial ability for large language models (LLMs). Evaluating this ability comes with significant challenges: (i) limited coherence between multiple instructions, (ii) positional bi...
ISBN: (print) 9798891760615
The emotions we experience involve complex processes; besides physiological aspects, research in psychology has studied cognitive appraisals, in which people assess their situations subjectively, according to their own values (Scherer, 2005). Thus, the same situation can often result in different emotional experiences. While the detection of emotion is a well-established task, there is very limited work so far on the automatic prediction of cognitive appraisals. This work fills the gap by presenting COVIDET-APPRAISALS, the most comprehensive dataset to date that assesses 24 appraisal dimensions, each with a natural language rationale, across 241 Reddit posts. COVIDET-APPRAISALS presents an ideal testbed to evaluate the ability of large language models, which excel at a wide range of NLP tasks, to automatically assess and explain cognitive appraisals. We found that while the best models are performant, open-sourced LLMs fall short at this task, presenting a new challenge in the future development of emotionally intelligent models. We release our dataset at https://***/honglizhan/CovidET-Appraisals-Public.
This article presents a novel conversational artificial intelligence (CAI)-enabled active ideation system as a creative idea generation tool to assist novice product designers in mitigating the initial latency and ideation bottlenecks that are commonly observed. It is a dynamic, interactive, and contextually responsive approach, actively involving a large language model (LLM) from the domain of natural language processing (NLP) in artificial intelligence (AI) to produce multiple statements of potential ideas for different design problems. Integrating such AI models with ideation creates what we refer to as an active ideation scenario, which helps foster continuous dialog-based interaction, context-sensitive conversation, and prolific idea generation. An empirical study was conducted with 30 novice product designers to generate multiple ideas for given problems using traditional methods and the new CAI-based interface. The ideas generated by both methods were qualitatively evaluated by a panel of experts. The findings demonstrated the relative superiority of the proposed tool for generating prolific, meaningful, novel, and diverse ideas. The interface was enhanced by incorporating a prompt-engineered structured dialog style for each ideation stage to make it uniform and more convenient for the product designers. A pilot study was conducted and the resulting responses of such a structured CAI interface were found to be more succinct and aligned toward the subsequent design stage. The article thus established the rich potential of using generative AI (Gen-AI) for the early ill-structured phase of the creative product design process.
ISBN: (print) 9798891760615
Visual storytelling aims to generate compelling narratives from image sequences. Existing models often focus on enhancing the representation of the image sequence, e.g., with external knowledge sources or advanced graph structures. Despite recent progress, the stories are often repetitive, illogical, and lacking in detail. To mitigate these issues, we present a novel framework which integrates visual representations with pretrained language models and planning. Our model translates the image sequence into a visual prefix, a sequence of continuous embeddings which language models can interpret. It also leverages a sequence of question-answer pairs as a blueprint plan for selecting salient visual concepts and determining how they should be assembled into a narrative. Automatic and human evaluation on the VIST benchmark (Huang et al., 2016) demonstrates that blueprint-based models generate stories that are more coherent, interesting, and natural compared to competitive baselines and state-of-the-art systems.
Multilingual large language models are designed, claimed, and expected to cater to speakers of varied languages. We hypothesise that the current practices of fine-tuning and evaluating these models may not perfectly a...
Recent advancements in Large Language Models (LLMs) have facilitated the development of Multimodal LLMs (MLLMs). Despite their impressive capabilities, MLLMs often suffer from over-reliance on unimodal biases (e.g., l...
k-nearest-neighbor machine translation has demonstrated remarkable improvements in machine translation quality by creating a datastore of cached examples. However, these improvements have been limited to high-resource...
ISBN: (print) 9798891760608
Social media platforms are extensively used for expressing opinions or conveying information. The information available on such platforms can be used for various humanitarian and disaster-related tasks, as distributing messages in different formats through social media is quick and easy. Often, this useful information goes to waste during disaster events because no efficient systems exist to turn the unstructured data into a meaningful format that can ultimately assist aid agencies. In disaster identification and assessment, the available information is naturally multimodal; however, most existing work has focused solely on single modalities, e.g., images or text separately. When information from different modalities is integrated, it produces significantly better results. In this paper, we explore different models that can lead to the development of a system that handles multimodal datasets and performs sequential hierarchical classification. Specifically, we aim to identify damage and its severity and to classify the data into humanitarian categories. The model for each stage of the hierarchical classification was selected by experimenting with many modality-specific models and multimodal classification approaches, including multi-task learning. The hierarchical model can give results at different abstraction levels, depending on the use case. Through extensive quantitative and qualitative analysis, we show that our system is effective in classifying multimodal tweets, with excellent computational efficiency and assessment performance. With our approach, we aim to support disaster management by identifying situations involving humanitarian tragedies and by helping to assess the severity and type of damage.
Programming augmented by large language models (LLMs) opens up many new application areas, but also requires care. LLMs are accurate enough, on average, to replace core functionality, yet make basic mistakes that demo...