Reverse ETL
While listening to a podcast, I stumbled upon the term Reverse ETL.
First, let's align on a few basic terms:
What is a Data Warehouse (DW):
A data warehouse centralizes and consolidates data from multiple sources, such as application log files and transactional applications. Organizations use a DW to gain insights from their data; because it builds up a historical record, it can improve decision making. It can be considered the single source of truth.
What is ETL
ETL stands for Extract, Transform, and Load. ETL is a data integration process that combines data from multiple sources into a single, consistent data set, which is then loaded into a data warehouse or other target system.
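To make the three steps concrete, here is a minimal sketch, not a production pipeline. It assumes two made-up sources (an application log and a transactions list) and uses an in-memory SQLite database as a stand-in for the warehouse; the table name `user_summary` and all field names are hypothetical.

```python
import sqlite3

# Hypothetical source data: an application log and a transactions system.
app_log_rows = [
    {"user": "Alice@Example.com ", "event": "login"},
    {"user": "bob@example.com", "event": "login"},
]
transaction_rows = [
    {"email": "alice@example.com", "amount_cents": 1250},
    {"email": "bob@example.com", "amount_cents": 400},
    {"email": "alice@example.com", "amount_cents": 750},
]


def extract():
    """Extract: read raw rows from each source (here, plain in-memory lists)."""
    return app_log_rows, transaction_rows


def transform(logs, transactions):
    """Transform: normalize emails and combine both sources into one row per user."""
    summary = {}
    for row in logs:
        email = row["user"].strip().lower()
        entry = summary.setdefault(email, {"logins": 0, "total_cents": 0})
        if row["event"] == "login":
            entry["logins"] += 1
    for row in transactions:
        email = row["email"].strip().lower()
        entry = summary.setdefault(email, {"logins": 0, "total_cents": 0})
        entry["total_cents"] += row["amount_cents"]
    return [(email, v["logins"], v["total_cents"]) for email, v in sorted(summary.items())]


def load(conn, rows):
    """Load: write the consistent rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_summary "
        "(email TEXT PRIMARY KEY, logins INTEGER, total_cents INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO user_summary VALUES (?, ?, ?)", rows)
    conn.commit()


# SQLite in memory stands in for the data warehouse (an assumption for this sketch).
warehouse = sqlite3.connect(":memory:")
logs, transactions = extract()
load(warehouse, transform(logs, transactions))
```

After this runs, `user_summary` holds one cleaned-up row per user, which is the kind of consolidated record the definition above describes.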
What is Reverse ETL
Reverse ETL is the process of copying data from a central data warehouse to operational systems of record, including but not limited to SaaS tools used for growth, marketing, sales and support.
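The direction of the copy can be sketched in a few lines. This is only an illustration under stated assumptions: `FakeCrm` is a hypothetical in-memory stand-in for a SaaS tool's API client, the `customer_metrics` table and its columns are made up, and SQLite again plays the role of the warehouse. Real Reverse ETL tools also handle batching, rate limits, and retries, which are left out here.

```python
import sqlite3


class FakeCrm:
    """Hypothetical stand-in for a SaaS tool's API client; keeps contacts in memory."""

    def __init__(self):
        self.contacts = {}

    def upsert_contact(self, email, fields):
        # Create the contact if missing, otherwise merge in the new fields.
        self.contacts.setdefault(email, {}).update(fields)


def reverse_etl(conn, crm):
    """Copy modeled rows from the warehouse into the operational tool."""
    cursor = conn.execute(
        "SELECT email, lifetime_value, churn_risk FROM customer_metrics"
    )
    synced = 0
    for email, lifetime_value, churn_risk in cursor:
        crm.upsert_contact(
            email, {"lifetime_value": lifetime_value, "churn_risk": churn_risk}
        )
        synced += 1
    return synced


# Warehouse side: a table that analytics has already computed (made-up data).
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customer_metrics "
    "(email TEXT PRIMARY KEY, lifetime_value REAL, churn_risk TEXT)"
)
warehouse.executemany(
    "INSERT INTO customer_metrics VALUES (?, ?, ?)",
    [("alice@example.com", 20.0, "low"), ("bob@example.com", 4.0, "high")],
)

# Operational side: the CRM already knows Alice, with a field the warehouse does not own.
crm = FakeCrm()
crm.upsert_contact("alice@example.com", {"owner": "sales-team-1"})

synced_count = reverse_etl(warehouse, crm)
```

The upsert merges warehouse fields into existing contacts rather than replacing them, so fields owned by the operational tool (like `owner` here) survive the sync.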
Discussion:
Hang on! This doesn’t make sense: why would someone do ETL and then reverse it? ETL by itself is a HUGE infrastructure to develop and maintain, so why would I manage both? Well, as always, the devil is in the details.
Let's see an example from Hightouch.
Reasoning (by Hightouch):
Reverse ETL is necessary because your data warehouse — the platform you bought to eliminate data silos — has ironically become a data silo. Without reverse ETL, your business’s core definitions only live in the warehouse.
More detailed information, such as use cases, is available in this blog: Reverse ETL
Conclusion:
The intent here seems to be to bridge the knowledge gap that a data warehouse introduces. Say a business user is used to a particular platform such as Salesforce; that user would prefer the analytics output to be available in that platform. With only a data warehouse, the Salesforce user has to learn new ways to access the data, its schema in the DW, and access mechanisms such as scripts and SQL. Reverse ETL creates a comfortable, familiar space for the user, because all the “Actionable Data” from the analytics is available in their own platform. So, sure, it makes sense.
The only challenge I see, as an organization, is the effort of maintaining both ETL and Reverse ETL infrastructure. Any change to the DW schema must be communicated to both teams, and their deliverables kept in sync, to avoid broken pipelines.
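One way to catch such schema drift early is a small check that runs before the Reverse ETL sync. The sketch below is an assumption-laden illustration, not a recommendation of a specific tool: the `EXPECTED_COLUMNS` contract, the `customer_metrics` table, and the helper `missing_columns` are all hypothetical, and SQLite's `PRAGMA table_info` stands in for whatever metadata query a real warehouse offers.

```python
import sqlite3

# Hypothetical contract: the columns the Reverse ETL job reads from the warehouse.
EXPECTED_COLUMNS = {"customer_metrics": {"email", "lifetime_value", "churn_risk"}}


def missing_columns(conn, expected):
    """Return {table: [missing column names]} for columns the sync depends on."""
    problems = {}
    for table, columns in expected.items():
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        # Table names come from the trusted constant above, not from user input.
        actual = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
        missing = columns - actual
        if missing:
            problems[table] = sorted(missing)
    return problems


warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE customer_metrics (email TEXT, lifetime_value REAL, churn_risk TEXT)"
)
before = missing_columns(warehouse, EXPECTED_COLUMNS)

# Simulate a DW schema change made by the ETL team: churn_risk becomes churn_score.
warehouse.execute("DROP TABLE customer_metrics")
warehouse.execute(
    "CREATE TABLE customer_metrics (email TEXT, lifetime_value REAL, churn_score REAL)"
)
after = missing_columns(warehouse, EXPECTED_COLUMNS)
```

Here `before` is empty and `after` reports `churn_risk` as missing, so the sync can fail loudly before it pushes incomplete data to the business tool.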
Reference: DataWarehouse