Multi-turn Conversation Flow with Context Management

ML & AI · flowchart diagram · MIT

This diagram illustrates the process of handling multi-turn conversations for a language model, including appending user messages, rendering prompts, encod

Source: https://github.com/jiaran-king/MicroLM/blob/782ae02f10c14b484a317f22115a066b3b10b91d/Readme/%E9%A1%B9%E7%9B%AE%E5%85%A8%E6%99%AF%E5%9B%BE/00-%E5%85%A8%E6%B5%81%E7%A8%8B%E5%88%86%E6%9E%90%EF%BC%88%E8%AE%AD%E7%BB%83%E3%80%81%E6%8E%A8%E7%90%86%E3%80%81%E8%AF%84%E6%B5%8B%E4%B8%8E%E9%83%A8%E7%BD%B2%EF%BC%89.md
Curated by jiaran-king
LLM Chatbot Conversation flow Prompt management Context window AI State management

Mermaid source

%%{init: {"theme": "base", "themeVariables": {"background": "#ffffff", "primaryColor": "#eef6ff", "primaryBorderColor": "#60a5fa", "primaryTextColor": "#0f172a", "lineColor": "#64748b"}}}%%
flowchart TB
    U["user_message"] --> H1["追加到 conversations 历史"]
    H1 --> R["_render_prompt() / build_generation_prompt()<br>渲染 system / user / assistant 历史"]
    R --> E["encode → input_ids"]
    E --> C{"是否超出 context_length?"}
    C -- 是 --> CL["裁剪最早历史轮次<br>保留最近上下文"]
    C -- 否 --> G["model.generate()"]
    CL --> G
    G --> D["decode → assistant_text"]
    D --> S["sanitize<br>清理 surrogate · 条件性追加 EOS"]
    S --> H2["写回 conversations 历史"]
    H2 --> N["下一轮继续"]

    classDef hist fill:#eff6ff,stroke:#60a5fa,color:#0f172a;
    classDef prompt fill:#f8fafc,stroke:#94a3b8,color:#0f172a;
    classDef run fill:#f0fdf4,stroke:#22c55e,color:#0f172a;
    classDef risk fill:#fff7ed,stroke:#fb923c,color:#0f172a;
    class U,H1,H2,N hist;
    class R,E prompt;
    class G,D,S,CL run;
    class C risk;

What this diagram shows

It details the lifecycle of a user message in a multi-turn conversation, from being appended to history, rendered into a prompt, encoded, checked against context length (with truncation if needed), generated by the model, decoded, sanitized, and then stored back into history for subsequent turns. It highlights context management and prompt engineering aspects.

When to use it

This diagram is useful when designing or understanding the conversational flow for AI chatbots, especially those based on large language models, where managing conversation history, context window, and prompt construction is critical for coherent multi-turn interactions. It's also relevant for debugging issues related to long conversations or token limits.

How to adapt it for your project

This flow can be adapted by changing the prompt rendering strategy, implementing different context truncation policies (e.g., summarization instead of simple truncation), integrating different sanitization rules, or adding steps for persona management or external tool calls within the conversation loop. The 'model.generate()' step can be replaced with specific API calls or custom inference logic.

Key concepts

Multi-turn conversation
Context management
Prompt engineering
Tokenization and encoding
Context window truncation