More than 30 years ago, Bill Inmon defined a data warehouse as “a subject-oriented, integrated, nonvolatile, and time-variant collection of data in support of management’s decisions.” However, early data warehouses could not store massive volumes of heterogeneous data, which led to the creation of data lakes. More recently, the data lakehouse has emerged as a new paradigm: an open data management architecture characterized by strong data analytics and governance capabilities, high flexibility, and open storage.
If I could only use one word to describe the next-gen data lakehouse, it would be unification: