Gabriele Cimolino

Flibbertigibbet

2024  ·  LLM · Classical Planning · Voice

Watch the demo →

A voice-controlled game featuring a talking dragon, built on a hybrid architecture that separates natural language understanding from world-action execution. The player speaks; the system transcribes the input and passes the transcript to a large language model, prompted to parse the player's utterance into goals expressed as structured JSON against a small fixed schema. A classical planning algorithm receives those goals and produces a sequence of actions that satisfies them. The language model then narrates and enacts that sequence in character.
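The planner half of that pipeline can be sketched as a breadth-first search over a STRIPS-style domain. Everything here is illustrative: the action names, predicates, and toy domain are assumptions for the sketch, not the game's actual planning domain.

```python
from collections import deque

# Hypothetical STRIPS-style domain: each action has preconditions that must
# hold, plus add/delete effects on the world state (a set of predicates).
ACTIONS = {
    "fly_to_cave":   {"pre": set(),                    "add": {"at_cave"},        "del": {"at_meadow"}},
    "fly_to_meadow": {"pre": set(),                    "add": {"at_meadow"},      "del": {"at_cave"}},
    "pick_up_gem":   {"pre": {"at_cave"},              "add": {"has_gem"},        "del": set()},
    "give_gem":      {"pre": {"at_meadow", "has_gem"}, "add": {"player_has_gem"}, "del": {"has_gem"}},
}

def plan(state, goals):
    """Breadth-first search for a shortest action sequence achieving all goals."""
    frontier = deque([(frozenset(state), [])])
    seen = {frozenset(state)}
    while frontier:
        current, steps = frontier.popleft()
        if goals <= current:          # every goal predicate holds
            return steps
        for name, a in ACTIONS.items():
            if a["pre"] <= current:   # action is applicable here
                nxt = frozenset((current - a["del"]) | a["add"])
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None  # goals unreachable in this domain

# Goals as the LLM might emit them, e.g. {"goals": ["player_has_gem"]}:
print(plan({"at_meadow"}, {"player_has_gem"}))
# → ['fly_to_cave', 'pick_up_gem', 'fly_to_meadow', 'give_gem']
```

Breadth-first search suffices for a toy state space; a real domain would use a proper planner, but the interface — goal set in, action sequence out — is the same.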

The architectural decision — LLM for intent parsing, planner for execution — was motivated by a design constraint: the dragon's behaviour in the world needed to be predictable and consistent, properties that a generative model alone cannot reliably provide. The planner guarantees that the world changes only in ways the planning domain allows. The language model provides the flexibility and personality that make the interaction feel natural.
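The guarantee rests on the executor checking every action against the domain before applying it; nothing the model says can mutate the world outside those rules. A minimal sketch of that guard, with a hypothetical one-action domain standing in for the real one:

```python
DOMAIN = {
    # Hypothetical action, not from the game's actual domain.
    "open_door": {"pre": {"has_key"}, "add": {"door_open"}, "del": set()},
}

def execute(state, steps, actions):
    """Apply a plan step by step, refusing any action the domain disallows.

    `actions` maps names to {"pre", "add", "del"} predicate sets. Unknown
    actions and unmet preconditions both raise, so the world can only change
    in ways the domain permits.
    """
    state = set(state)
    for name in steps:
        a = actions.get(name)
        if a is None or not a["pre"] <= state:
            raise ValueError(f"action {name!r} not permitted in current state")
        state = (state - a["del"]) | a["add"]
    return state

execute({"has_key"}, ["open_door"], DOMAIN)   # → {'has_key', 'door_open'}
```

The same precondition check the planner used during search is re-run at execution time, so even a hand-crafted (or hallucinated) action list cannot violate the domain.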

This was built before LLM tool-calling existed as a standardised interface. Structured JSON output from the language model was the available primitive for grounding natural language intent to executable actions. The architecture anticipates what tool-calling formalised: a separation between what the model understands and what the system is permitted to do.
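With only structured JSON as the primitive, the grounding step reduces to validating the model's output against the fixed schema before the planner ever sees it. A sketch of that validation, assuming a hypothetical `{"goals": [...]}` shape and goal vocabulary (the real schema is not specified here):

```python
import json

# Hypothetical goal vocabulary the planning domain understands.
KNOWN_GOALS = {"player_has_gem", "door_open", "dragon_at_meadow"}

def parse_goals(llm_output):
    """Validate the model's structured output; return a goal set or None.

    Returning None instead of raising lets the caller re-prompt the model,
    the usual fallback loop before tool-calling interfaces standardised it.
    """
    try:
        data = json.loads(llm_output)
    except json.JSONDecodeError:
        return None
    goals = data.get("goals") if isinstance(data, dict) else None
    if not isinstance(goals, list):
        return None
    if not all(isinstance(g, str) and g in KNOWN_GOALS for g in goals):
        return None
    return set(goals)

parse_goals('{"goals": ["door_open"]}')   # → {'door_open'}
parse_goals('open the door please')       # → None (not JSON; re-prompt)
```

Rejecting anything outside the vocabulary is what keeps the model's understanding and the system's permissions separate: the model can say anything, but only schema-valid goals reach the planner.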

The design problem the project was exploring: what does it mean for a player to collaborate with a character who can understand anything they say, but whose actions in the world are entirely determined by a separate reasoning system? This is a question about the design of the human-AI interface, not about the capabilities of either component. It remains interesting.