Data Pipeline Diagrams

ETL flows, streaming pipelines, lakehouse architectures, Kafka topologies, Airflow DAGs — modern data engineering patterns.

7 diagrams

Kafka Stream Processing Pipeline (Producer → Topics → Consumers → Sinks)

End-to-end Kafka pipeline showing app events, change data capture (CDC), and IoT telemetry flowing through topics and stream processors (Kafka Streams, ksqlDB, Flink) into Snowflake, Elasticsearch, and S3 sinks.

flowchart · Apache-2.0

InstructIE Six-Step Data Pipeline

Illustrates the InstructIE six-step data pipeline, which transforms 171K raw data items into 28.5K items of structured, auditable training data for LLMs.

flowchart · MIT

InstructIE Six-Step Data Pipeline

This diagram illustrates the InstructIE six-step data pipeline, transforming 171K raw data into 28.5K structured training data through an auditable, engine

flowchart · MIT

BPE Training Data Preparation Flow

This flowchart illustrates the sequential steps for preparing a raw Chinese corpus for Byte Pair Encoding (BPE) training, covering cleaning, splitting, and

flowchart · MIT

Text Tokenization and Memmap Data Preparation

This flowchart illustrates the process of preparing text data for machine learning models, involving BPE tokenization and efficient storage using memmap .n

flowchart · MIT

Data Flow in a Visualization Tool

Illustrates the data flow from various sources through a GraphQL layer, configurator, and renderer, with persistence for chart configurations.

flowchart · BSD-3-Clause

Resume Parsing and User Profile Generation Flow

Diagram illustrating the process of parsing a user's resume to generate a structured XML user profile for personalized interactions, including caching and

flowchart · NOASSERTION