I was just building a pipeline in DPC using Create Table
and S3 Load, both pretty fundamental components. I was working with
a dataset I hadn't used before and found myself going back and forth
between creating the table and getting the metadata correct. In the end I zipped
back to METL and used the load generator tool. This is a really powerful
and effective improvement for the user. It removes a lot of friction by
doing the heavy lifting for you. In general I feel DPC needs more
test/dry-run functionality; the improvement in speed to value is superb.
Are dry runs/testing something we have on the radar? For example, testing
DB connections, metadata definitions, credentials, etc.
We could build a “validator” that checks the table metadata before
trying to load the data. That way customers can be confident they get
upfront feedback instead of waiting for a failure and re-running.
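
To make the validator idea a bit more concrete, here's a rough sketch in Python. It isn't tied to any DPC or METL API: `Column`, `validate`, and the four type names are all hypothetical, and it assumes we can pull a small sample of rows from the file as lists of strings. It just checks the sample against the declared table metadata and reports every mismatch in one go.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Column:
    """Hypothetical stand-in for one column of the Create Table metadata."""
    name: str
    type: str                   # assumed set: VARCHAR, INTEGER, FLOAT, TIMESTAMP
    size: Optional[int] = None  # max length, only checked for VARCHAR
    nullable: bool = True


def _fits(value: str, col: Column) -> bool:
    """Return True if the raw string value can be stored in the column."""
    try:
        if col.type == "INTEGER":
            int(value)
        elif col.type == "FLOAT":
            float(value)
        elif col.type == "TIMESTAMP":
            datetime.fromisoformat(value)
        elif col.type == "VARCHAR":
            return col.size is None or len(value) <= col.size
        else:
            return False  # unknown type in the metadata
    except ValueError:
        return False
    return True


def validate(columns: List[Column], header: List[str],
             rows: List[List[str]]) -> List[str]:
    """Check sample rows against the metadata; return a list of problems."""
    expected = [c.name for c in columns]
    if header != expected:
        return [f"header {header} does not match columns {expected}"]

    problems = []
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(rows, start=2):
        if len(row) != len(columns):
            problems.append(
                f"line {line_no}: expected {len(columns)} fields, got {len(row)}")
            continue
        for col, value in zip(columns, row):
            if value == "":
                if not col.nullable:
                    problems.append(
                        f"line {line_no}: {col.name} is empty but not nullable")
            elif not _fits(value, col):
                problems.append(
                    f"line {line_no}: {col.name}={value!r} does not fit {col.type}")
    return problems


columns = [
    Column("id", "INTEGER", nullable=False),
    Column("name", "VARCHAR", size=5),
    Column("created", "TIMESTAMP"),
]
sample = [["1", "Alice", "2024-01-01"], ["x", "Robert", "2024-01-02"]]
for problem in validate(columns, ["id", "name", "created"], sample):
    print(problem)
```

An empty list would mean the sample looks loadable. Collecting all the problems instead of stopping at the first one is the point: the user fixes the metadata once rather than re-running after each failure.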
Thanks!