Two .NET 9 tools, purpose-built for high-throughput data movement and parallel process scheduling — from Progress OpenEdge sources to SQL Server targets, and beyond.
notna.io presents V3, the third generation of an ETL and orchestration platform born inside a hospital information system serving over 2,000 employees. V1 entered clinical production and ran for 8 years, surfacing the real operational constraints that textbook ETL tools ignore: connection pool exhaustion on legacy ODBC drivers, silent data quality failures, manual scheduling bottlenecks, and the cost of reading the same source multiple times to feed multiple targets.
V2 addressed these incrementally. V3 was redesigned from scratch — multicast streaming, statistical anomaly detection, LPT wall-clock scheduling, plugin architecture — with every architectural decision rooted in 8 years of production evidence. Nothing in V3 is theoretical.
Each app is independently deployable and operable via CLI. Together they cover the full ETL orchestration lifecycle.
A modular, cross-database ETL console application that transfers data from one source to one or more targets simultaneously. Designed to run autonomously (from a scheduled task, SSIS ExecuteProcess, or the Orchestrator) with zero user interaction once configured.
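The one-read, many-targets idea can be sketched in a few lines of Python. This is an illustration only, not the Loader's .NET implementation: `batches`, `multicast`, and the writer callables are hypothetical names, and the targets are written one after another per batch rather than truly simultaneously.

```python
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

Row = Tuple
Batch = List[Row]


def batches(rows: Iterable[Row], size: int) -> Iterator[Batch]:
    """Yield lists of at most `size` rows from a single pass over `rows`."""
    batch: Batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly short, batch
        yield batch


def multicast(
    rows: Iterable[Row],
    targets: Sequence[Callable[[Batch], None]],
    size: int = 1000,
) -> int:
    """Read the source once and hand every batch to every target writer.

    Assumption: each target is a callable that accepts one batch.
    Returns the number of rows read from the source.
    """
    total = 0
    for batch in batches(rows, size):
        for write in targets:
            write(batch)  # same in-memory batch, no second source read
        total += len(batch)
    return total
```

With two in-memory lists standing in for targets, `multicast(source_rows, [a.extend, b.extend], size=2)` leaves both lists with identical content after a single pass over the source.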
dbo.incremental_watermark, no CLI changes needed.
@schema=, @truncate, @postprocess inline in the CLI arg.
A two-phase statistical profiling engine. Phase 1: assembly profile --object
scans a source column offline, builds a numeric or categorical signature
(min/max/mean/stddev, coverage threshold, value distribution), and persists it
to the Tools database. Phase 2: profiles are loaded once at startup and applied
per batch during streaming — any row deviating from the signature triggers a
warning without aborting the load. Zero extra DB round-trips during data movement.
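A minimal Python sketch of the two phases, for the numeric case only. The deviation rule used here (outside the observed min/max, or more than `z_limit` standard deviations from the mean) is an assumption made for illustration; the names `NumericProfile`, `build_profile`, and `check_batch` are hypothetical, and categorical signatures, coverage thresholds, and persistence to the Tools database are left out.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class NumericProfile:
    minimum: float
    maximum: float
    mean: float
    stddev: float


def build_profile(values: Iterable[float]) -> NumericProfile:
    """Phase 1 (offline): scan a column once and build its signature."""
    data = list(values)
    if not data:
        raise ValueError("cannot profile an empty column")
    mean = sum(data) / len(data)
    variance = sum((v - mean) ** 2 for v in data) / len(data)
    return NumericProfile(min(data), max(data), mean, math.sqrt(variance))


def check_batch(
    profile: NumericProfile, batch: Sequence[float], z_limit: float = 3.0
) -> List[Tuple[int, float]]:
    """Phase 2 (streaming): return (index, value) for each deviating row.

    Nothing is raised: the caller logs warnings and keeps loading.
    """
    warnings: List[Tuple[int, float]] = []
    for index, value in enumerate(batch):
        out_of_range = value < profile.minimum or value > profile.maximum
        far_from_mean = (
            profile.stddev > 0
            and abs(value - profile.mean) / profile.stddev > z_limit
        )
        if out_of_range or far_from_mean:
            warnings.append((index, value))
    return warnings
```

Because the profile is an in-memory object built ahead of time, checking a batch needs no database access, which is the property the design relies on.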
Transformation rules applied inline to the batch already in memory during the streaming pass — type coercions, column renames, value mappings, derived columns — before the data is written to any target. No staging table, no second pass, no additional DB connection. Transformation cost is absorbed into the streaming time that already exists.
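As an illustration of rules applied to a batch that is already in memory, here is a hedged Python sketch. The rule constructors (`coerce`, `rename`, `map_values`, `derive`) and `transform_batch` are hypothetical names that mirror the four rule kinds listed above; the actual rule syntax of the Loader is not shown in this document.

```python
from typing import Any, Callable, Dict, List, Sequence

Row = Dict[str, Any]
Rule = Callable[[Row], Row]


def coerce(column: str, to_type: Callable[[Any], Any]) -> Rule:
    """Type coercion: convert one column with `to_type`."""
    def apply(row: Row) -> Row:
        out = dict(row)
        out[column] = to_type(out[column])
        return out
    return apply


def rename(old: str, new: str) -> Rule:
    """Column rename."""
    def apply(row: Row) -> Row:
        out = dict(row)
        out[new] = out.pop(old)
        return out
    return apply


def map_values(column: str, mapping: Dict[Any, Any]) -> Rule:
    """Value mapping: unmapped values pass through unchanged."""
    def apply(row: Row) -> Row:
        out = dict(row)
        out[column] = mapping.get(out[column], out[column])
        return out
    return apply


def derive(column: str, fn: Callable[[Row], Any]) -> Rule:
    """Derived column computed from the rest of the row."""
    def apply(row: Row) -> Row:
        out = dict(row)
        out[column] = fn(out)
        return out
    return apply


def transform_batch(batch: Sequence[Row], rules: Sequence[Rule]) -> List[Row]:
    """Apply every rule, in order, to every row of one in-memory batch."""
    result: List[Row] = []
    for row in batch:
        for rule in rules:
            row = rule(row)
        result.append(row)
    return result
```

The transformed batch is what gets handed to the target writers, so no staging table or second read of the source is involved.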
A generic parallel task orchestrator whose primary goal is to minimise total wall-clock time across a batch of heterogeneous tasks. It dispatches any command-line process — fully decoupled from what the command does — and focuses entirely on scheduling its lifecycle for maximum throughput.
A 24-task batch totalling 16 hours of sequential work completes in under 1 hour with 4 threads, automatically balanced by LPT (Longest Processing Time) bin-packing. The orchestrator computes the optimal thread count from the CV (coefficient of variation) of task weights, so no manual tuning is needed.
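For readers unfamiliar with LPT, the classic greedy rule is: sort tasks by weight, longest first, and give each one to the currently least-loaded worker. The Python sketch below shows that rule and the coefficient of variation of the weights. How the Orchestrator turns the CV into a thread count is not described here, so that step is deliberately left out; `lpt_schedule` and `coefficient_of_variation` are hypothetical names.

```python
import heapq
import statistics
from typing import List, Sequence, Tuple


def lpt_schedule(
    weights: Sequence[float], workers: int
) -> Tuple[List[List[int]], float]:
    """Greedy LPT: longest task first, onto the least-loaded worker.

    Returns (task indexes per worker, makespan), where the makespan is
    the load of the busiest worker, i.e. the estimated wall-clock time.
    """
    if workers < 1:
        raise ValueError("workers must be >= 1")
    heap = [(0.0, w) for w in range(workers)]  # (current load, worker id)
    heapq.heapify(heap)
    assignment: List[List[int]] = [[] for _ in range(workers)]
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for task in order:
        load, worker = heapq.heappop(heap)  # least-loaded worker
        assignment[worker].append(task)
        heapq.heappush(heap, (load + weights[task], worker))
    makespan = max(load for load, _ in heap)
    return assignment, makespan


def coefficient_of_variation(weights: Sequence[float]) -> float:
    """CV = population standard deviation divided by the mean."""
    mean = statistics.fmean(weights)
    return statistics.pstdev(weights) / mean if mean else 0.0
```

For example, weights `[7, 5, 4, 3, 2, 1]` on 2 workers are split into loads of 11 and 11. LPT is a heuristic: it balances well in practice but does not guarantee the minimum possible makespan for every input.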
resourceIdentifier group. Other domains continue uninterrupted.
-2 (process failed to start) always HardFails unconditionally.
--recovery and the same GUID: completed tasks are skipped, failed tasks are re-queued.
The Orchestrator is the conductor. The Loader is a musician. Any number of musicians can play in parallel.