Abstract: This paper addresses a challenging interactive task learning scenario we call rearrangement under unawareness:
an agent must manipulate a rigid-body environment without knowing a key concept necessary for solving the task and must learn about it during deployment.
For example, the user may ask to "put the two granny smith apples inside the basket", but the agent cannot correctly identify which objects in the
environment are "granny smith" because it has not been exposed to that concept before. We introduce SECURE, an interactive task learning policy
designed to tackle such scenarios. The unique feature of SECURE is its ability to enable agents to engage in semantic analysis when processing embodied
conversations and making decisions. Through embodied conversation, a SECURE agent adjusts its deficient domain model by engaging in dialogue to identify
and learn about previously unforeseen possibilities. The SECURE agent learns from the user's embodied corrective feedback when mistakes are made and
strategically engages in dialogue to uncover useful information about novel concepts relevant to the task. These capabilities enable the SECURE
agent to generalize to new tasks with the acquired knowledge. We demonstrate in the simulated Blocksworld and the real-world apple manipulation
environments that the SECURE agent, which solves such rearrangements under unawareness, is more data-efficient than agents that do not engage in
embodied conversation or semantic analysis.
Introduction
Figure 1: Comparison between Grounding DINO predictions and the ground-truth domain model.
In real-world scenarios, a robot often has to solve tasks under unawareness: it holds uncertain and false beliefs about the structure and parameters of the domain model (see Figure 1).
Embodied conversation allows the agent to cope with unawareness by enabling interactive symbol grounding.
We propose an interactive task learning framework that processes embodied conversation using semantic analysis, making the robot semantics-aware.
Background: Semantic Analysis
Formal semantic analysis makes it possible to interpret messages in an embodied conversation together with their logical consequences.
Sentence-level analysis: "Put the two granny smiths inside a basket" entails that "there are only two granny smiths".
Discourse-level analysis: a correction on a pick with the message "No. This is golden delicious" entails that the picked object "is not a granny smith" (see the sketch below).
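To make these entailments concrete, here is a minimal Python sketch of how they might be represented as symbolic constraints on grounding. The `Fact` structure, predicate names, and object identifiers are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    """A symbolic fact about one object, e.g. granny_smith(obj_1) = False."""
    predicate: str
    obj: str
    holds: bool

def sentence_level_entailments(count: int, predicate: str, objects: list[str]) -> dict:
    """'Put the two granny smiths inside a basket' entails that exactly
    `count` of the scene objects satisfy `predicate`: a cardinality
    constraint over all candidate groundings."""
    return {"predicate": predicate, "count": count, "domain": objects}

def discourse_level_entailments(picked: str, stated: str, intended: str) -> list[Fact]:
    """A correction 'No. This is golden delicious' on a pick entails that the
    picked object is a golden delicious and is not a granny smith."""
    return [Fact(stated, picked, True), Fact(intended, picked, False)]

# Example: the robot picked obj_1 while searching for a granny smith.
constraint = sentence_level_entailments(2, "granny_smith", ["obj_1", "obj_2", "obj_3"])
facts = discourse_level_entailments("obj_1", "golden_delicious", "granny_smith")
```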
Framework Overview
The agent's belief state contains a domain theory built over the course of embodied conversation and exemplars of observation-symbol pairs gathered from experience.
The dialogue strategy measures the value of asking the teacher particular questions.
The query value is measured using expected information gain: \(I(b,a) = H(b) - \mathbb{E}_{\phi \sim \mathrm{Result}(a)}[H(\mathrm{Update}(b,\phi))]\).
The query value is then weighted against the expected reward, which includes the cost of a wrong prediction, in the value function \(Q(b,a)=\theta_1 I(b,a)+\theta_2\mathbb{E}_{b}[R(a)]\); a minimal sketch follows.
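As a worked illustration, and under simplifying assumptions rather than as the authors' implementation, the belief can be represented as a probability distribution over candidate groundings, and each possible answer to a question as the set of hypotheses consistent with it. The formulas above then read:

```python
import math

def entropy(belief: dict[str, float]) -> float:
    """Shannon entropy H(b) of a belief over grounding hypotheses."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def update(belief: dict[str, float], consistent: set[str]) -> dict[str, float]:
    """Update(b, phi): drop hypotheses inconsistent with answer phi, renormalize."""
    z = sum(belief[h] for h in consistent)
    return {h: belief[h] / z for h in consistent}

def info_gain(belief: dict[str, float], answers: list[set[str]]) -> float:
    """I(b, a) = H(b) - E_{phi ~ Result(a)}[H(Update(b, phi))], where each
    possible answer phi is given by the set of hypotheses consistent with it."""
    expected = sum(
        sum(belief[h] for h in phi) * entropy(update(belief, phi))
        for phi in answers if any(belief[h] > 0 for h in phi)
    )
    return entropy(belief) - expected

def q_value(belief, answers, expected_reward, theta1=1.0, theta2=1.0) -> float:
    """Q(b, a) = theta_1 * I(b, a) + theta_2 * E_b[R(a)]."""
    return theta1 * info_gain(belief, answers) + theta2 * expected_reward

# Four equally likely groundings; a yes/no question that splits them in half
# is worth exactly one bit of information.
b = {"h1": 0.25, "h2": 0.25, "h3": 0.25, "h4": 0.25}
print(info_gain(b, [{"h1", "h2"}, {"h3", "h4"}]))  # 1.0
```

Here \(\theta_1\) and \(\theta_2\) trade off curiosity (reducing uncertainty about the domain) against expected task reward, including the cost of acting on a wrong prediction.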
Belief Update Examples
Figure 2: Human-robot interaction using embodied conversation, either to ask a question that reduces uncertainty about the domain or to process the user's corrective feedback after a wrong action.
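Continuing the illustrative sketch above, a correction such as "No. This is golden delicious" can be processed as evidence that eliminates every grounding hypothesis inconsistent with the discourse-level entailment. The hypothesis space below is hypothetical.

```python
# Hypothetical grounding hypotheses: each maps scene objects to fruit concepts.
hypotheses = {
    "h1": {"obj_1": "granny_smith", "obj_2": "granny_smith"},
    "h2": {"obj_1": "golden_delicious", "obj_2": "granny_smith"},
    "h3": {"obj_1": "golden_delicious", "obj_2": "golden_delicious"},
}
belief = {"h1": 1 / 3, "h2": 1 / 3, "h3": 1 / 3}

# "No. This is golden delicious" (on picking obj_1) entails that obj_1 is a
# golden delicious, not a granny smith: keep only the consistent hypotheses
# and renormalize.
consistent = {h for h, g in hypotheses.items() if g["obj_1"] == "golden_delicious"}
z = sum(belief[h] for h in consistent)
belief = {h: belief[h] / z for h in sorted(consistent)}
print(belief)  # {'h2': 0.5, 'h3': 0.5}
```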
Experiments and Results
We evaluate different agents in a simulated blocks domain and a real-world fruit domain, in which agents start unaware of key domain-level concepts and, through interaction, learn to ground the newly discovered concepts.
Engaging in embodied conversation and processing it with formal semantic analysis has compounding benefits for bootstrapping interactive task learning.
Semantics-aware agents can cope with false initial beliefs and revise them using evidence acquired from extended interaction in the domain.
Citation
@misc{secure2025,
      title={SECURE: Semantics-aware Embodied Conversation under Unawareness for Lifelong Robot Learning},
      author={Rimvydas Rubavicius and Peter David Fagan and Alex Lascarides and Subramanian Ramamoorthy},
      year={2025},
      eprint={2409.17755},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2409.17755},
}