D
Data Lake
A data lake is a centralized repository storing raw data in native format at any scale for analytics.
What is a Data Lake?
A data lake is a centralized repository that stores all structured and unstructured data at any scale in its raw format, enabling diverse analytics and machine learning workloads.
Data lake characteristics
Schema-on-read, Raw data storage, Multiple data types, Scalable storage, Analytics-ready.
Common misconceptions
- "Data lake equals data warehouse" — Different purposes
- "Data lakes are just cheap storage" — Require governance
- "Dump everything in" — Becomes data swamp without management