CSV Source
Reads raw CSV from S3 bucket
Format: CSV
Rows: 1.2M
42 cols, 98% uptime
Clean
Remove nulls, dedupe rows
Strategy: Dedupe
99.2% pass
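
The Clean step above could be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the function name `clean_rows`, the choice to treat empty strings as nulls, and the "first occurrence wins" dedupe rule are all assumptions.

```python
def clean_rows(rows):
    """Drop rows that contain nulls, then drop exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        # Assumption: None and empty strings both count as null.
        if any(value is None or value == "" for value in row.values()):
            continue
        # Build a hashable key from the row so duplicates can be detected.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # assumption: keep the first occurrence only
        seen.add(key)
        cleaned.append(row)
    return cleaned
```

Rows are assumed to be dictionaries of hashable values, such as those produced by `csv.DictReader`.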
Transform
Type casting, rename cols
Output: Parquet
18 transforms
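
One way the Transform step's renaming and type casting might look is sketched below. The column names in `RENAMES` and `CASTS` are hypothetical, since the diagram does not list the 18 transforms, and writing the Parquet output is left out because it would need a third-party library such as pyarrow.

```python
# Hypothetical mappings; the real column names are not given in the diagram.
RENAMES = {"amt": "amount", "ts": "event_time"}
CASTS = {"amount": float, "user_id": int}


def transform_row(row):
    """Rename columns, then cast each value to its target type."""
    renamed = {RENAMES.get(name, name): value for name, value in row.items()}
    # Columns without an explicit cast are kept as strings.
    return {name: CASTS.get(name, str)(value) for name, value in renamed.items()}
```

Renaming runs before casting here so that the cast table can refer to the final column names.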
Validate
Schema validation, anomaly check
Rules: 24 active
0.3% reject
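
The Validate step combines a schema check with an anomaly check. A minimal sketch under stated assumptions: `SCHEMA`, `MAX_AMOUNT`, and both function names are hypothetical, and the two rules shown stand in for the 24 active rules, which the diagram does not enumerate.

```python
# Hypothetical schema and threshold, for illustration only.
SCHEMA = {"region": str, "amount": float}
MAX_AMOUNT = 1_000_000.0


def validate_row(row):
    """Return a list of rule violations; an empty list means the row passes."""
    errors = []
    # Schema validation: every expected column is present with the right type.
    for name, expected in SCHEMA.items():
        if name not in row:
            errors.append(f"missing column: {name}")
        elif not isinstance(row[name], expected):
            errors.append(f"bad type for {name}")
    # Anomaly check (assumed rule): amount must fall in a plausible range.
    amount = row.get("amount")
    if isinstance(amount, float) and not (0 <= amount <= MAX_AMOUNT):
        errors.append("amount out of range")
    return errors


def split_valid(rows):
    """Partition rows into (valid, rejected) lists."""
    valid, rejected = [], []
    for row in rows:
        (rejected if validate_row(row) else valid).append(row)
    return valid, rejected
```

Returning the list of violations, rather than a boolean, makes it possible to report why a row was rejected.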
Aggregate
Group by region, sum totals
Groups: Region
5 aggregates
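
Grouping by region and summing totals, as the Aggregate step describes, can be illustrated with the standard library alone. The field names `region` and `amount` and the function name `sum_by_region` are assumptions; only one of the 5 aggregates is sketched.

```python
from collections import defaultdict


def sum_by_region(rows, value_field="amount"):
    """Group rows by their 'region' field and sum value_field per group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row[value_field]
    return dict(totals)
```

In a SQL-based pipeline the same step would typically be a `GROUP BY region` with `SUM(...)`.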
Data Warehouse
Write to PostgreSQL warehouse
Target: PostgreSQL
Table: analytics.events
4.8M rows