Gabriele Cimolino

Automation Confusion

CHI 2023  ·  Grounded Theory  ·  Cimolino, Chen, Gutwin, Graham

Watch the CHI 2023 presentation →

Partial automation makes games simpler by performing some actions on behalf of the player. It also creates an interpretive challenge: the player must form an accurate model of what they control and what the AI controls in order to coordinate effectively. This study asked what happens when that model is wrong — not as a secondary concern, but as the primary object of investigation.

Ten non-gamer participants played two partially automated games while thinking aloud. Sessions were recorded and transcribed. Grounded theory analysis of the resulting data produced a taxonomy of twelve categories of mental model error, organised into two families.

False causation errors involve misattribution of control: players believing they cause actions the AI performs, or believing they do not cause actions they in fact do. Both directions occurred. Players who over-attributed their own control sometimes stopped performing actions the AI was waiting for, because they believed they were already performing them. Players who under-attributed their own control sometimes stopped playing altogether, concluding they had no meaningful role.

Explanation errors involve the rules players construct to account for AI behaviour: rules too simple to explain what they observe, rules that correspond to no aspect of the system, or the absence of any rule at all. Players with wrong explanatory rules produced behaviour that was locally coherent — they acted consistently with their models — but counterproductive within the actual system.

The finding with the most direct design implication: feedback whose meaning players misunderstand is worse than no feedback. Players who received ambiguous output from the AI incorporated it into their models — incorrectly — and became more confident in wrong beliefs than players who received nothing. Adding interface elements to a confusing system can make the confusion worse.

The study was deliberately small. Grounded theory is a method for building theory from data, not for measuring effect sizes. Ten participants in think-aloud sessions produced a twelve-category taxonomy with design implications at the level of individual error types. The same investment in a confirmatory study would have produced a single answer — whether confusion was present — and nothing about its structure.

Read the paper →