Presented by Sam Bail at Airflow Summit 2021.
Data quality has become a much-discussed topic in data engineering and data science, and data validation is now widely seen as crucial to the reliability of the data products and insights that an organization’s data pipelines produce. This session outlines patterns for combining three popular open-source tools in the data ecosystem (dbt, Airflow, and Great Expectations) to build a robust data pipeline with data validation at each critical step.

0:00 Welcome
3:20 Quick review of dbt
6:32 Overview of Great Expectations
14:00 Integrating dbt and Airflow
27:50 Testing with dbt and Great Expectations
41:40 Wrap-up