Three-Stage Training and Inference Flow for Memory-Augmented AI

ML & AI · flowchart diagram · GPL-3.0

Illustrates a three-stage training process for an AI model incorporating memory grids, followed by its inference generation flow, including a detailed single-step DAG of the inference engine.

Source: https://github.com/DslsDZC/LCM/blob/2d68ce9ccd51a0b72a51c76b9e6ee4fafc34abc5/readme/README_cn.md
Curated by DslsDZC
AI Training, Inference Flow, Memory Networks, Machine Learning, Deep Learning, Flowchart, System Architecture

Mermaid source

flowchart TB
    subgraph Train["Three-Stage Training Pipeline"]
        T0[Raw text] --> T1[BPE Tokenizer]
        T1 --> T2[uint16 mmap]

        subgraph S1["Stage 1: LM Pre-training"]
            direction LR
            S1A[tokens] --> S1B[GenHead decoder]
            S1B --> S1C[Cross-entropy loss]
            S1C --> S1D[Train decoder only]
        end

        subgraph S2["Stage 2: Memory Training"]
            direction LR
            S2A[tokens] --> S2B[Encoder + 6-grid codebooks]
            S2B --> S2C[VQ + contrastive + orthogonality losses]
            S2C --> S2D[Train encoder/codebooks<br/>decoder frozen]
        end

        subgraph S3["Stage 3: Joint Fine-tuning"]
            direction LR
            S3A[tokens] --> S3B[All parameters]
            S3B --> S3C[Combined loss]
            S3C --> S3D[Low-learning-rate fine-tuning]
        end

        T2 --> S1
        S1 -->|load decoder weights| S2
        S2 --> S3
    end

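    subgraph Infer["Inference Generation Flow"]
        I0[User prompt] --> I1[Tokenizer]
        I1 --> I2{First step?}
        I2 -->|yes| I3[Full encoder pass<br/>+ build incremental state]
        I2 -->|no| I4{Every 256 steps?}
        I4 -->|yes| I5[Full re-encode<br/>reset accumulated drift]
        I4 -->|no| I6["Incremental encoding: O(d²) single-step update"]
        I3 --> I7[Bottleneck vector z]
        I5 --> I7
        I6 --> I7
        I7 --> I8[C inference engine<br/>multi-step DAG cognitive loop]
        I8 --> I9[GenHead decoder<br/>linear attention + GLU]
        I9 --> I10[Temperature sampling]
        I10 --> I11{EOS reached?}
        I11 -->|no| I12[Append token<br/>update state]
        I12 --> I2
        I11 -->|yes| I13[Output text]
    end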

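    subgraph DAG["Inference Engine Single-Step DAG"]
        direction TB
        Z([z]) --> Route[Distance routing]
        Route --> HRQ[Hyperbolic hierarchical grid<br/>HRQ retrieval]
        Route --> SP[Sparse grid<br/>VQ retrieval]
        Route --> LR[Low-rank grid<br/>shared-basis retrieval]
        Route --> MF[Manifold grid<br/>tangent-space sliding]
        Route --> BD[Binding grid<br/>HRR bind/unbind]
        Route --> CT[Contrastive grid<br/>dual-codebook retrieval]
        HRQ & SP & LR & MF & BD & CT --> Fusion[Distance-weighted fusion]
        Fusion --> GVal[Global value grid<br/>Three Laws safety check]
        GVal --> Danger{Danger-grid detection}
        Danger -->|danger| Halt[Hard interrupt]
        Danger -->|safe| Conv{Converged?<br/>Δz < threshold}
        Conv -->|no| Route
        Conv -->|yes| ZQ([z_q output])
    end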

What this diagram shows

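This diagram details a three-stage training pipeline and the inference generation flow of a memory-augmented AI system. Training runs in three stages: LM pre-training, which trains only the GenHead decoder with a cross-entropy loss; memory training, which trains the encoder and six memory-grid codebooks with VQ, contrastive, and orthogonality losses while the decoder stays frozen; and joint fine-tuning of all parameters at a low learning rate. At inference, the user prompt is tokenized and encoded into a bottleneck vector z: a full encode on the first step, a full re-encode every 256 steps to reset accumulated drift, and an O(d²) incremental update otherwise. z then passes through the multi-step C inference engine and the GenHead decoder, and tokens are drawn by temperature sampling until EOS. The single-step DAG of the inference engine routes z by distance to six specialized memory grids (Hyperbolic Hierarchical, Sparse, Low-Rank, Manifold, Binding, Contrastive), fuses their results with distance weighting, runs the global value grid's safety check (with a hard interrupt when danger is detected), and repeats until Δz falls below a threshold.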
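The encoder-path decision and the sampling step in the inference flow can be sketched in a few lines of Python. This is a minimal sketch, not the project's code: the function names `encode_mode` and `sample_with_temperature` are hypothetical, and it assumes steps are counted from 0 and that the periodic re-encode fires whenever the step index is a multiple of 256 (the diagram only says "every 256 steps").

```python
import math
import random

REENCODE_EVERY = 256  # interval shown in the diagram


def encode_mode(step: int, reencode_every: int = REENCODE_EVERY) -> str:
    """Pick the encoder path for a 0-based generation step (assumed convention)."""
    if step == 0:
        return "full"         # first step: full encode + build incremental state
    if step % reencode_every == 0:
        return "reencode"     # periodic full re-encode resets accumulated drift
    return "incremental"      # otherwise: single-step incremental update


def sample_with_temperature(logits, temperature=1.0, rng=random):
    """Draw one token index from softmax(logits / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

Lower temperatures concentrate probability on the highest logit, while higher temperatures flatten the distribution.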

When to use it

Use this diagram when designing, explaining, or documenting complex AI systems that feature multi-stage training, external memory components, and iterative inference processes. It is particularly useful for architectures involving specialized memory grids and a structured inference engine.

How to adapt it for your project

Adapt this diagram by modifying the specific types and number of memory grids to suit different data modalities or cognitive functions. Adjust the training stages, loss functions (e.g., VQ, contrastive, orthogonal), and parameter freezing strategies. Customize the inference engine's DAG structure, routing algorithms, safety checks, and re-encoding frequency (e.g., 256 steps) to optimize for performance, accuracy, or specific application requirements.
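To make the adaptation points concrete, the single-step DAG can be treated as a loop over a configurable list of grids. The following is a sketch under stated assumptions: each grid is modeled as a plain function from z to a retrieved vector, fusion uses inverse-distance weights (the diagram only says "distance-weighted"), and `max_steps` is an added safety bound that the diagram does not show. All names (`run_engine`, `fuse`, `HardInterrupt`) are hypothetical.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]
Grid = Callable[[Vector], Vector]  # hypothetical: a grid maps z to a retrieved vector


class HardInterrupt(RuntimeError):
    """Raised when the safety check flags the fused state as dangerous."""


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuse(z: Vector, candidates: Sequence[Vector], eps: float = 1e-8) -> Vector:
    """Inverse-distance weighting (an assumption): closer candidates weigh more."""
    weights = [1.0 / (distance(z, c) + eps) for c in candidates]
    total = sum(weights)
    return [
        sum(w * c[i] for w, c in zip(weights, candidates)) / total
        for i in range(len(z))
    ]


def run_engine(
    z: Vector,
    grids: Sequence[Grid],
    is_dangerous: Callable[[Vector], bool],
    threshold: float = 1e-4,
    max_steps: int = 32,  # assumed bound, not part of the diagram
) -> Vector:
    for _ in range(max_steps):
        candidates = [grid(z) for grid in grids]   # query every grid
        z_next = fuse(z, candidates)               # distance-weighted fusion
        if is_dangerous(z_next):                   # safety check before convergence
            raise HardInterrupt("danger grid triggered")
        if distance(z_next, z) < threshold:        # converged: delta z below threshold
            return z_next
        z = z_next
    return z  # step budget exhausted; return the latest state
```

Swapping grids in or out only changes the `grids` list, and the convergence threshold and safety predicate are ordinary parameters, which is the kind of customization described above.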

Key concepts

Three-Stage Training
Memory Grids
C Inference Engine
Directed Acyclic Graph (DAG)
Encoder-Decoder Architecture
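
As a rough illustration of the Three-Stage Training concept listed above, the freezing schedule can be written down as data. This is a sketch only: the component names, the `Stage` class, and the `lr_scale` values are hypothetical (the diagram states "low learning rate" for Stage 3 without giving a number), and Stage 1 marks the encoder and codebooks as not trained because the diagram shows only the decoder in that stage.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

COMPONENTS = ("encoder", "codebooks", "decoder")  # assumed component names


@dataclass(frozen=True)
class Stage:
    name: str
    trainable: Tuple[str, ...]  # components updated in this stage
    losses: Tuple[str, ...]     # loss terms named in the diagram
    lr_scale: float             # hypothetical relative learning rate


STAGES = (
    Stage("lm_pretraining", ("decoder",), ("cross_entropy",), 1.0),
    Stage("memory_training", ("encoder", "codebooks"),
          ("vq", "contrastive", "orthogonality"), 1.0),
    Stage("joint_finetuning", COMPONENTS, ("combined",), 0.1),  # 0.1 is a placeholder
)


def requires_grad(stage: Stage) -> Dict[str, bool]:
    """Map each component to whether it is updated during the given stage."""
    return {name: name in stage.trainable for name in COMPONENTS}
```

For example, `requires_grad(STAGES[1])` marks the encoder and codebooks as trainable and the decoder as frozen, matching Stage 2 of the diagram.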