Bronze Data
Raw, source-aligned data. Minimal cleaning, retains history.
Silver Data
Cleaned, filtered, and transformed data. Entity-aligned, ready for light analysis.
Gold Data
Highly curated, aggregated, and modeled data (e.g., Star Schema). Optimized for reporting.
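As a rough illustration of how the three layers relate, here is a minimal in-memory sketch. The sample records, the field names, and the helpers `to_silver` and `to_gold` are hypothetical and exist only for this example. A real pipeline would read from and write to storage rather than Python lists.

```python
from collections import defaultdict

# Bronze: hypothetical raw records kept exactly as the source delivered them.
bronze = [
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},
    {"order_id": "2", "customer": "BOB", "amount": "not-a-number"},
    {"order_id": "3", "customer": "Alice", "amount": "4.50"},
    {"order_id": "3", "customer": "Alice", "amount": "4.50"},  # duplicate delivery
]


def to_silver(rows):
    """Clean, type, and deduplicate bronze rows; skip rows whose amount cannot be parsed."""
    seen, cleaned = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # unparseable rows stay in bronze only
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # drop repeated deliveries of the same order
        seen.add(order_id)
        cleaned.append(
            {
                "order_id": order_id,
                "customer": row["customer"].strip().lower(),
                "amount": amount,
            }
        )
    return cleaned


def to_gold(rows):
    """Aggregate silver rows into a reporting-ready total per customer."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Bronze is never modified here: the rejected and duplicate rows remain available in the raw layer, which is what allows silver and gold to be rebuilt later with different rules.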
Data Lake
A centralized repository for storing large volumes of structured, semi-structured, and unstructured data in its native format.
Data Warehouse
A central repository of integrated data from one or more disparate sources, optimized for reporting and analysis.
Delta Lake
An open-source storage layer that adds ACID transactions and reliability to data lakes by pairing Parquet data files with a transaction log.
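The idea of layering transactions over plain files can be sketched with a toy commit log. This is a simplified illustration, not Delta Lake's actual log format or API: the `log` entries, the file names, and the `snapshot` helper are all hypothetical. It shows only how replaying an ordered list of commits yields a consistent set of data files for any version.

```python
# Hypothetical commit log: each commit atomically adds and/or removes data files.
log = [
    {"version": 0, "add": ["part-000.parquet"], "remove": []},
    {"version": 1, "add": ["part-001.parquet"], "remove": []},
    {"version": 2, "add": ["part-002.parquet"], "remove": ["part-000.parquet"]},
]


def snapshot(commits, as_of=None):
    """Replay commits in version order and return the files visible at that version."""
    files = set()
    for commit in sorted(commits, key=lambda c: c["version"]):
        if as_of is not None and commit["version"] > as_of:
            break
        files -= set(commit["remove"])
        files |= set(commit["add"])
    return sorted(files)


latest = snapshot(log)
earlier = snapshot(log, as_of=1)
```

Because readers only see files listed by fully recorded commits, a half-finished write is invisible to them, and an older version can be reconstructed by stopping the replay early.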
Schema-on-Read
Applying a structure (schema) to data only when it is read, rather than enforcing it when the data is written (typical in Data Lakes).
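A minimal sketch of schema-on-read, assuming the raw data is stored as JSON lines. The `RAW_LINES` sample, the `SCHEMA` mapping, and the `read_with_schema` helper are hypothetical names chosen for this example.

```python
import json

# Raw records are stored untyped; nothing was validated at write time.
RAW_LINES = [
    '{"id": "1", "temp_c": "21.5", "site": "north"}',
    '{"id": "2", "temp_c": "19", "extra": "ignored"}',
]

# The schema belongs to the reader, not to the stored data.
SCHEMA = {"id": int, "temp_c": float, "site": str}


def read_with_schema(lines, schema):
    """Parse each line and cast only the fields the reader asked for."""
    for line in lines:
        record = json.loads(line)
        yield {
            field: (cast(record[field]) if field in record else None)
            for field, cast in schema.items()
        }


rows = list(read_with_schema(RAW_LINES, SCHEMA))
```

A different consumer could read the same lines with a different `SCHEMA`, for example one that keeps the `extra` field, without rewriting the stored data.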
Data Mesh
A decentralized, domain-oriented data architecture paradigm that treats data as a product.
Parquet / ORC
Columnar file formats optimized for analytical queries, storage efficiency, and performance in distributed systems.
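To show why a columnar layout helps analytical queries, the toy sketch below pivots row records into per-column lists. It is not how Parquet or ORC are implemented, since the real formats add row groups or stripes, metadata, and several encodings. The sample `rows` and the helpers `to_columns` and `run_length_encode` are hypothetical.

```python
# Row-oriented records, as an application might produce them.
rows = [
    {"user": "a", "country": "DE", "spend": 12.0},
    {"user": "b", "country": "DE", "spend": 7.5},
    {"user": "c", "country": "FR", "spend": 3.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    return {key: [r[key] for r in records] for key in records[0]}


def run_length_encode(values):
    """Collapse runs of equal adjacent values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)

# An aggregate touches a single column instead of every full row.
total_spend = sum(columns["spend"])

# Values within one column are similar, so they compress well.
country_rle = run_length_encode(columns["country"])
```

The aggregate reads only the `spend` list, and the repeated `country` values collapse into two pairs, which mirrors the two benefits the definition names: query performance and storage efficiency.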