Chat + tool-call storytelling
Same retrieve_context query, run in two phases:
Phase A retrieves with similarity-only ranking (no outcome history yet).
Phase B retrieves after outcomes have been recorded. The learned Q-values
shift the ranking so that high-confidence-but-low-utility nodes drop below
high-utility ones. Retrieval is thus self-correcting: recorded outcomes
feed back into the ranking of later calls to the same query.
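
The two phases can be illustrated with a minimal sketch. It is not the actual retrieve_context implementation: the Node class, the rank and record_outcome helpers, the additive scoring rule, and the alpha and q_weight values are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    similarity: float      # query-to-node similarity (hypothetical score in [0, 1])
    q_value: float = 0.0   # learned utility estimate; 0.0 means no outcome history


def record_outcome(node: Node, reward: float, alpha: float = 0.5) -> None:
    """Move the node's Q-value toward the observed reward (incremental update)."""
    node.q_value += alpha * (reward - node.q_value)


def rank(nodes: list[Node], q_weight: float = 0.5) -> list[str]:
    """Rank by similarity plus weighted Q-value (assumed scoring rule)."""
    ordered = sorted(
        nodes,
        key=lambda n: n.similarity + q_weight * n.q_value,
        reverse=True,
    )
    return [n.name for n in ordered]


nodes = [
    Node("stale_doc", similarity=0.9),   # closest match, but turns out unhelpful
    Node("useful_doc", similarity=0.8),  # slightly weaker match, but helpful
]

# Phase A: every Q-value is still 0.0, so the ranking is similarity-only.
phase_a = rank(nodes)

# Outcomes are recorded between the two retrievals.
record_outcome(nodes[0], reward=-1.0)
record_outcome(nodes[1], reward=1.0)

# Phase B: same query, same similarities, but Q-values now shift the order.
phase_b = rank(nodes)

print(phase_a)  # ['stale_doc', 'useful_doc']
print(phase_b)  # ['useful_doc', 'stale_doc']
```

In this sketch the similarity scores never change between phases; only the recorded outcomes move stale_doc below useful_doc.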