Grounding Gaps in Language Model Generations

Omar Shaikh*

Kristina Gligorić*

Ashna Khetan

Matthias Gerstgrasser

Diyi Yang

Dan Jurafsky

Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2024

*Authors contributed equally

Abstract

Effective conversation requires common ground: a shared understanding between the participants. Common ground, however, does not emerge spontaneously in conversation. Speakers and listeners work together to both identify and construct a shared basis while avoiding misunderstanding. To accomplish grounding, humans rely on a range of dialogue acts, like clarification (What do you mean?) and acknowledgment (I understand.). In domains like teaching and emotional support, carefully constructing grounding prevents misunderstanding. However, it is unclear whether large language models (LLMs) leverage these dialogue acts in constructing common ground. To this end, we curate a set of grounding acts and propose corresponding metrics that quantify attempted grounding. We study whether LLMs use these grounding acts, simulating them taking turns from several dialogue datasets, and comparing the results to humans. We find that current LLMs are presumptive grounders, biased towards assuming common ground without using grounding acts. To understand the roots of this behavior, we examine the role of instruction tuning and reinforcement learning with human feedback (RLHF), finding that RLHF leads to less grounding. Altogether, our work highlights the need for more research investigating grounding in human-AI interaction.

Materials

Project

PDF

Code

BibTeX

@misc{shaikh2024grounding,
  title={Grounding Gaps in Language Model Generations},
  author={Omar Shaikh and Kristina Gligorić and Ashna Khetan and Matthias Gerstgrasser and Diyi Yang and Dan Jurafsky},
  year={2024},
  eprint={2311.09144},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}