Show raw, refined and curated zones of a data lake feeding analytics and ML.
Free to start · Fully editable · Export to SVG, PNG, GIF & MP4
7 connected components you can rename, recolor, and extend with AI.
A data lake architecture diagram shows how raw data of any format is ingested into low-cost object storage and progressively refined. The central object store is surrounded by ingestion connectors, zoned layers for raw, refined and curated data, a catalog for metadata and discovery, a processing engine like Spark, and consumers including analytics, machine learning and lakehouse query engines.
Data engineers and ML platform teams use this diagram when designing scalable storage for structured and unstructured data on Amazon S3, Azure Data Lake Storage (ADLS) or Google Cloud Storage (GCS). It communicates a data lake or lakehouse design, the medallion zone strategy, and how raw inputs become governed datasets for analytics and model training.
A data lake is a centralized repository, usually on cloud object storage, that stores raw structured, semi-structured and unstructured data at scale until it is needed for analytics or machine learning.
A common medallion design uses a raw (bronze) zone for unprocessed data, a refined (silver) zone for cleaned data, and a curated (gold) zone for business-ready datasets.
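The zone progression described above can be sketched in a few lines of Python. This is a minimal illustration only: the `Lake` class, the `ingest`, `refine` and `curate` functions, and the sample order records are all hypothetical, and in-memory lists stand in for files in object storage.

```python
from dataclasses import dataclass, field


@dataclass
class Lake:
    """Hypothetical in-memory stand-in for the three medallion zones."""
    bronze: list = field(default_factory=list)  # raw, unprocessed records
    silver: list = field(default_factory=list)  # cleaned, typed records
    gold: dict = field(default_factory=dict)    # business-ready aggregates


def ingest(lake, records):
    """Raw (bronze) zone: keep records exactly as received, no validation."""
    lake.bronze.extend(records)


def refine(lake):
    """Refined (silver) zone: drop malformed rows and normalize types."""
    lake.silver = []
    for rec in lake.bronze:
        try:
            lake.silver.append({
                "customer": rec["customer"].strip().lower(),
                "amount": float(rec["amount"]),
            })
        except (KeyError, ValueError, TypeError, AttributeError):
            # Malformed rows stay in bronze but are not promoted.
            continue


def curate(lake):
    """Curated (gold) zone: total amount per customer."""
    totals = {}
    for row in lake.silver:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    lake.gold = totals


lake = Lake()
ingest(lake, [
    {"customer": " Acme ", "amount": "10.5"},
    {"customer": "acme", "amount": 4.5},
    {"customer": "Globex", "amount": "not-a-number"},  # bad amount
    {"amount": 3},                                      # missing customer
])
refine(lake)
curate(lake)
print(lake.gold)  # {'acme': 15.0}
```

The raw zone keeps all four records, including the two malformed ones, so they can be reprocessed later if the cleaning rules change. Only the refined and curated zones apply rules.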
A lakehouse adds warehouse-like features such as ACID transactions, schema enforcement and faster SQL queries on top of data lake storage, using open table formats such as Delta Lake or Apache Iceberg.
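To make schema enforcement and all-or-nothing writes concrete, here is a toy sketch in plain Python. It does not use the Delta Lake or Iceberg APIs. The `Table` and `SchemaError` names are hypothetical, and a real table format enforces these rules through a transaction log rather than an in-memory list.

```python
class SchemaError(ValueError):
    """Raised when a row does not match the table schema."""


class Table:
    """Hypothetical table that validates rows before committing a batch."""

    def __init__(self, schema):
        self.schema = schema  # column name -> expected Python type
        self.rows = []

    def append(self, batch):
        # Validate the whole batch first, so one bad row leaves the
        # table unchanged (a rough analogue of an atomic commit).
        for row in batch:
            if set(row) != set(self.schema):
                raise SchemaError(f"unexpected columns: {sorted(row)}")
            for column, expected in self.schema.items():
                if not isinstance(row[column], expected):
                    raise SchemaError(f"{column!r} must be {expected.__name__}")
        self.rows.extend(batch)


orders = Table({"order_id": int, "amount": float})
orders.append([{"order_id": 1, "amount": 9.99}])

try:
    # The second row has the wrong type, so the whole batch is rejected.
    orders.append([
        {"order_id": 2, "amount": 5.0},
        {"order_id": 3, "amount": "five"},
    ])
except SchemaError as err:
    rejected = str(err)

print(len(orders.rows))  # 1
```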
A catalog stores metadata and lineage so users can discover, understand and trust datasets, preventing the lake from becoming an unusable data swamp.
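A catalog can be pictured as a registry of dataset descriptions plus the links between them. The sketch below is a simplified assumption, not any specific catalog product: the `DatasetEntry` and `Catalog` classes and the dataset names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Hypothetical metadata record for one dataset in the lake."""
    name: str
    zone: str
    owner: str
    description: str
    upstream: list = field(default_factory=list)  # names of source datasets


class Catalog:
    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def search(self, keyword):
        """Discovery: match the keyword against names and descriptions."""
        kw = keyword.lower()
        return sorted(
            e.name for e in self._entries.values()
            if kw in e.name.lower() or kw in e.description.lower()
        )

    def lineage(self, name):
        """Lineage: all upstream datasets, nearest first (breadth-first)."""
        seen = []
        queue = list(self._entries[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            if current in self._entries:
                queue.extend(self._entries[current].upstream)
        return seen


catalog = Catalog()
catalog.register(DatasetEntry("orders_raw", "bronze", "ingest-team",
                              "Order events as received"))
catalog.register(DatasetEntry("orders_clean", "silver", "data-eng",
                              "Validated order rows", ["orders_raw"]))
catalog.register(DatasetEntry("revenue_by_customer", "gold", "analytics",
                              "Revenue totals per customer", ["orders_clean"]))

print(catalog.search("revenue"))               # ['revenue_by_customer']
print(catalog.lineage("revenue_by_customer"))  # ['orders_clean', 'orders_raw']
```

Tracing lineage back to the raw zone is what lets a user judge whether a curated dataset can be trusted.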
Visualize how raw data is extracted, transformed, and loaded into a data warehouse
Show how sources, staging, storage layers and BI tools fit a modern warehouse
Map real-time event flow from producers through a broker to stream processors and sinks
Show domain-owned data products connected by a self-serve platform and governance
Map how warehouse data, a semantic layer and caching power business dashboards
Map governance roles, policies and controls from council down to data assets
Map independent services, an API gateway, databases and a message bus in a microservices system
Map API Gateway, Lambda functions, managed databases and event triggers in a serverless app
Open the data lake architecture diagram in the Infogiph canvas, then edit, animate, and export.