Transform Pipelines bring through all columns (Schema Evolution)
Who this is for
Data engineers building pipelines where the schema of the input table can change, for example:
- The same pipeline needs to run over similar but not identical schemas, where a common set of columns exists but other columns vary (e.g. slightly different versions of a source application)
- New columns are added to a source table over time and should be carried through to the target table
What will it unlock
The ability to engineer pipelines that work without modification across tables with different schemas, as long as the common fields (such as join keys) exist in the source schema.
This would mean component changes such as:
Table Input
- an option to select all columns at run time rather than only at design time, so that any new columns are available downstream
Table Join
- all columns that enter the join would be output from the join automatically; only the join keys would need to be consistently available in the source inputs
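To make the two component changes above concrete, here is a minimal sketch of the idea in Python. It is not how the product implements or would implement this feature: it uses an in-memory SQLite database as a stand-in for the warehouse, and the function names (`runtime_columns`, `build_passthrough_join`), the table names, and the `right_` prefix for clashing column names are all hypothetical choices made for illustration.

```python
import sqlite3


def runtime_columns(conn, table):
    """Read the table's column names when the pipeline runs, not when it is designed."""
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def build_passthrough_join(conn, left, right, keys):
    """Build a SELECT that joins on `keys` and outputs every column found at run time."""
    left_cols = runtime_columns(conn, left)
    right_cols = runtime_columns(conn, right)

    # Only the join keys are required to exist on both sides.
    missing = [k for k in keys if k not in left_cols or k not in right_cols]
    if missing:
        raise ValueError(f"join keys missing from source schema: {missing}")

    select = [f'l."{c}" AS "{c}"' for c in left_cols]
    for c in right_cols:
        if c in keys:
            continue  # key values are already output from the left side
        # Assumption: clashing non-key names get a "right_" prefix.
        alias = f"right_{c}" if c in left_cols else c
        select.append(f'r."{c}" AS "{alias}"')

    on = " AND ".join(f'l."{k}" = r."{k}"' for k in keys)
    return f'SELECT {", ".join(select)} FROM "{left}" l JOIN "{right}" r ON {on}'


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 10, 9.5)")
conn.execute("INSERT INTO customers VALUES (10, 'Ada')")

# The source schema changes after the pipeline was designed.
conn.execute("ALTER TABLE customers ADD COLUMN region TEXT")
conn.execute("UPDATE customers SET region = 'EU'")

cursor = conn.execute(build_passthrough_join(conn, "orders", "customers", ["customer_id"]))
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns)
print(rows)
```

The join definition names only `customer_id`, yet the `region` column added later appears in the output without any change to the pipeline code. If a key were missing from either input, the sketch fails early with an error instead of silently producing a wrong result.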
Next Steps
We’re looking to gauge interest in this feature and discuss potential use cases with interested users. If you’d be interested in helping us design this feature, please comment below or reach out via your account manager.