It is questionable whether the classic modeling approaches are still appropriate for all of today's issues and requirements. This consideration gave rise to the Data Vault modeling approach.
Challenges of classic Data Warehouses
In the Data Warehouse environment, the two well-known modeling approaches by Kimball and Inmon have been used for many years to store data. However, they face steadily growing challenges:
- New requirements
- Larger amounts of data
- Growing IT costs
What is Data Vault?
Data Vault is a modeling technique that is particularly well suited to agile Data Warehouses. It offers high flexibility for extensions, complete historization of the data, and allows data loading processes to run in parallel.
This hybrid approach combines the advantages of the third normal form with those of the star schema. Companies today need to transform their businesses in ever shorter cycles and to reflect these transformations in the Data Warehouse. Data Vault supports exactly these requirements without significantly increasing the complexity of the Data Warehouse over time. Unlike the Kimball and Inmon approaches, it avoids the ever-increasing IT costs caused by extensive implementation and testing cycles and a long list of potential dependencies.
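One reason the loads can run in parallel is that, in Data Vault 2.0, the keys of Hubs, Links and Satellites are typically derived as hashes of the business keys instead of being drawn from database sequences. The following is only a minimal Python sketch of this idea; the function name and the choice of MD5 are illustrative assumptions, not something prescribed by the approach:

```python
import hashlib

def hash_key(*business_keys: str) -> str:
    """Derive a deterministic surrogate key from one or more business keys.

    Because the key depends only on the source data and not on a database
    sequence, Hub, Link and Satellite loads do not have to wait for each
    other and can run in parallel.
    """
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

# The same business key always yields the same Hub key, no matter which
# loading process computes it first.
print(hash_key("C-1001"))            # key for a customer Hub row
print(hash_key("C-1001", "O-4711"))  # key for a customer-order Link row
```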
Procedure for Data Vault
The Data Integration Architecture of the Data Vault approach provides robust standards and definition methods for bringing information together in a meaningful way. The model consists of three basic table types: Hubs (the unique business keys of a business object), Links (the relationships between Hubs) and Satellites (the descriptive attributes, fully historized); a minimal sketch of these structures follows the list below. The advantages of the Data Vault approach at a glance:
- Massive reduction in development time when implementing business requirements
- Earlier return on investment (ROI)
- Scalable Data Warehouse
- Traceability of all data back to the source system
- Near-real-time loading (in addition to classic batch runs)
- Big Data processing (terabyte scale and beyond)
- Iterative, agile development cycles with incremental expansion of the DWH
- Few, automatable ETL patterns
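To make the three table types more concrete, here is a minimal Python sketch of the row structures of a Hub, a Link and a Satellite for an assumed "customer places order" example. The table and column names follow common Data Vault conventions but are illustrative assumptions rather than part of a fixed standard:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class HubCustomer:
    """Hub: one row per unique business key of a business object."""
    hub_customer_key: str    # hash of the business key (see sketch above)
    customer_number: str     # business key from the source system
    load_date: datetime      # when the key was first seen
    record_source: str       # traceability back to the source system

@dataclass(frozen=True)
class LinkCustomerOrder:
    """Link: a relationship between two (or more) Hubs."""
    link_customer_order_key: str
    hub_customer_key: str    # reference to HubCustomer
    hub_order_key: str       # reference to an assumed HubOrder
    load_date: datetime
    record_source: str

@dataclass(frozen=True)
class SatCustomer:
    """Satellite: descriptive attributes of a Hub, fully historized."""
    hub_customer_key: str    # parent Hub
    load_date: datetime      # every change adds a new row -> complete history
    record_source: str
    name: str
    address: str
```

New descriptive attributes or relationships can be added as new Satellites or Links without touching existing tables, which is where the flexibility for extensions mentioned above comes from.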