Data Pipeline
DAG
Running
6 nodes · 6 edges
CSV Source
Reads raw CSV from S3 bucket
Format: CSV
Rows: 1.2M
42 cols
98% uptime
Clean
Remove nulls, dedupe rows
Strategy: Dedupe
99.2% pass
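The Clean node is described only as "Remove nulls, dedupe rows". A minimal sketch of that step, assuming rows arrive as dictionaries of hashable values and that a "null" is either None or an empty string; the function name `clean` and the `required` parameter are hypothetical and not taken from the pipeline itself:

```python
def clean(rows, required):
    """Drop rows with a missing required value, then drop exact duplicates.

    rows: iterable of dicts (assumed shape; values must be hashable).
    required: column names that must be non-null (hypothetical parameter).
    """
    seen = set()
    out = []
    for row in rows:
        # Treat None and "" as null; skip the row if any required column is null.
        if any(row.get(col) in (None, "") for col in required):
            continue
        # Build an order-independent key so identical rows compare equal.
        key = tuple(sorted(row.items()))
        if key in seen:
            continue
        seen.add(key)
        out.append(row)
    return out
```

This keeps the first occurrence of each duplicate and preserves input order. A real pipeline at this row count would more likely dedupe on a key subset or inside the processing engine; that choice is not stated in the view above.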
Transform
Type casting, rename cols
Output: Parquet
18 transforms
Validate
Schema validation, anomaly check
Rules: 24 active
0.3% reject
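The Validate node reports a reject rate alongside its schema validation. One way such a check could look, as a sketch only: the schema is assumed to be a mapping from column name to Python type, and the names `validate` and `reject_rate` are hypothetical. The anomaly check and the 24 active rules are not modeled here because the view does not describe them.

```python
def validate(rows, schema):
    """Split rows into (accepted, rejected) against a {column: type} schema.

    A row is accepted only if it has exactly the schema's columns and each
    value is an instance of the expected type (assumed rule, for illustration).
    """
    accepted, rejected = [], []
    for row in rows:
        ok = set(row) == set(schema) and all(
            isinstance(row[col], typ) for col, typ in schema.items()
        )
        (accepted if ok else rejected).append(row)
    return accepted, rejected


def reject_rate(accepted, rejected):
    """Fraction of rows rejected; 0.0 when there are no rows at all."""
    total = len(accepted) + len(rejected)
    return len(rejected) / total if total else 0.0
```

Returning the rejected rows rather than discarding them makes it possible to compute a rate like the one shown on the card and to inspect failures later.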
Aggregate
Group by region, sum totals
Groups: Region
5 aggregates
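"Group by region, sum totals" can be illustrated with a short sketch. The column names `region` and `total` are assumptions inferred from the card text, and `sum_by_region` is a hypothetical helper, not the node's actual implementation:

```python
from collections import defaultdict


def sum_by_region(rows, group_col="region", value_col="total"):
    """Sum value_col per distinct group_col value.

    rows: iterable of dicts with numeric values in value_col (assumed shape).
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_col]] += row[value_col]
    return dict(totals)
```

For example, two rows for one region with totals 2.0 and 3.0 collapse into a single entry of 5.0 for that region. The card lists 5 aggregates; this sketch covers only the sum.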
Data Warehouse
Write to PostgreSQL warehouse
Target: PostgreSQL
Table: analytics.events
4.8M rows