ISBN: (print) 9789819607853; 9789819607860
The integration of large language models (LLMs) with robotic systems has opened new avenues for developing empathetic and interactive robot partners. This paper introduces a service robot system that incorporates multi-modal emotion recognition and LLM-based emotion dialogue generation. The system captures user emotions through a tri-modal emotion recognition model (TriMER), which processes audio, text, and facial expressions using BiLSTM, CNN, and Deformable Convolutional Networks (DCN). Experiments on the IEMOCAP dataset show that the TriMER model achieves an accuracy of 74.15% in recognizing emotions. By combining emotion recognition with an LLM, the robot can better understand and respond to human emotions, facilitating more natural and empathetic interactions. This development holds promise for applications in elder care, aiming to enhance both physical and mental well-being.
ISBN: (print) 9789819607884; 9789819607891
The integration of multiple pre-trained models in robotic navigation has the advantage of combining diverse strengths, leading to robust and generalized performance. However, the effectiveness of these models is often limited by path planning strategies, necessitating improvements in navigation capabilities. To overcome this, we introduce the Free-form Instruction Guided Robotic Navigation Path Planning with Large Vision-Language Model (FIG-RN). This model extracts landmarks and directional cues from free-form instructions and uses a pre-trained vision-language model to associate these landmarks with map nodes, thereby laying the groundwork for subsequent path planning. It evaluates landmark-node matches, node accessibility, and orientation to optimize path planning. Compared to traditional models, FIG-RN offers significant benefits: (i) it requires no map annotations due to its use of high-quality pre-trained models, (ii) it maximizes information use from instructions for better path efficacy, and (iii) it refines vision-language model matching values for improved local navigation. Experimentally, FIG-RN outperforms LM-Nav in success rate, efficiency, and accuracy, with improvements of 0.2, 0.2143, and 0.208, respectively.
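The general idea of matching instruction landmarks to map nodes and then planning over the graph can be illustrated with a minimal sketch. This is not the FIG-RN objective, which the abstract does not specify: the function names, the `alpha` distance weight, and the toy match scores below are all hypothetical, and the sketch models only landmark-node match scores and travel distance, leaving out the orientation and accessibility terms the abstract mentions.

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra over a dict-of-dicts graph; returns distances from source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def plan_landmark_nodes(graph, match, start, alpha=0.1):
    """Choose one map node per landmark, in instruction order.

    match[i][v] is an assumed similarity score between landmark i and node v
    (for example, from a vision-language model). The plan maximizes the sum of
    match scores minus alpha times the travel distance (a hypothetical objective).
    """
    nodes = list(graph)
    dist = {u: shortest_paths(graph, u) for u in nodes}
    neg_inf = float("-inf")
    # best[v]: best score of a partial plan that currently ends at node v
    best = {v: (0.0 if v == start else neg_inf) for v in nodes}
    choices = []
    for scores in match:
        new_best, choice = {}, {}
        for v in nodes:
            prev_score, prev = max(
                ((best[u] - alpha * dist[u][v], u) for u in nodes if v in dist[u]),
                default=(neg_inf, start),
            )
            new_best[v] = prev_score + scores.get(v, 0.0)
            choice[v] = prev
        choices.append(choice)
        best = new_best
    # Backtrack from the best final node to recover the chosen nodes
    node = max(nodes, key=lambda v: best[v])
    plan = []
    for choice in reversed(choices):
        plan.append(node)
        node = choice[node]
    return plan[::-1]

# Toy map: a line of four nodes A - B - C - D with unit edge lengths
graph = {
    "A": {"B": 1.0},
    "B": {"A": 1.0, "C": 1.0},
    "C": {"B": 1.0, "D": 1.0},
    "D": {"C": 1.0},
}
# Hypothetical match scores for two landmarks mentioned in an instruction
match = [{"B": 0.9, "D": 0.8}, {"C": 0.7, "D": 0.95}]
print(plan_landmark_nodes(graph, match, start="A"))  # ['B', 'D']
```

Raising `alpha` in this sketch makes the planner favor nearby nodes over high-scoring but distant ones, which is one simple way to trade match quality against path length.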
ISBN: (digital) 9789819607921
ISBN: (print) 9789819607914; 9789819607921
Task-oriented dialog (TOD) systems use external knowledge sources to help users accomplish specific tasks. While most current TOD research focuses on simple information-collecting tasks in a slot-filling framework, multi-step reasoning tasks like troubleshooting remain under-explored. Leveraging the advancements of large language models (LLMs), we propose a novel LLM-based multi-agent learning framework to build troubleshooting dialogue systems and evaluate the effectiveness of various multi-agent learning settings in a TOD system. Our results show that LLMs designed for open-domain dialog face challenges when directly applied to TOD systems, but with multi-agent cooperative enhancements, LLMs can achieve commendable performance.
Improving the precision of GPS/IMU localisation in autonomous cars is crucial for ensuring safe and efficient navigation. Several research studies have concentrated on enhancing the precision of localisation systems b...
This paper introduces a novel design for flapping wing micro aerial vehicles (FWMAVs) that employs dynamic amplification. By utilising mechanical resonance, the design amplifies wing stroke and pitching motions, reduc...
Arc welding, a predominant joining process, is increasingly being utilized in new applications, particularly metal 3D printing. Before 3D printing a part, its 3D CAD model is sliced into thin layers. Accurate identifi...
Accurate estimation of spatial gait parameters is crucial for assessing fall risk in older adults, helping to identify potential movement impairments and prevent falls. Traditionally, extracting stride length from wea...
Shape part segmentation is a critical task in computer graphics and robotics. However, traditional supervised methods rely heavily on large amounts of labeled data, which poses significant challenges in many real-worl...
Robot-based Non-Planar Additive Manufacturing, as a manufacturing technique, may be a candidate for the fabrication of soft robotics compliant mechanisms, owing to the increased design freedom and final part mechanica...
Underwater images are widely used in marine science, ocean engineering, and underwater robotics. However, challenges such as insufficient lighting, scattering, and absorption often degrade image quality, limiting thei...