Building Parcelo: A Distributed Job Scheduler That Actually Makes Sense
"We need to backfill analytics for 50 million users, but skip anyone who's already been processed."
Simple enough, right? Except with traditional job schedulers, you'd have to:
- Query which 20 million users are already done
- Generate 30 million individual jobs for the remaining ones
- Enqueue them all
- Hope your query results are still valid by the time jobs start running
I kept running into this pattern: the scheduler couldn't skip work intelligently. You had to tell it every single thing to do, even when most of the work was figuring out what not to do.
So I built parcelo—a scheduler that treats workloads as binary trees. Instead of "create 30 million jobs," you say "process range 0-50M, but prune branches where users are already done." It splits the range recursively and skips entire subtrees without ever creating jobs for them.
The Problem That Started It All
Picture this: you need to process 10 million database records. Maybe you're migrating user data, calculating analytics, or syncing with an external API. The naive approach is simple:
```typescript
for (let i = 0; i < 10_000_000; i++) {
  await processRecord(i);
}
```

This works great until:
- Your server crashes at record 8,234,567 and you lose all progress
- You need to skip certain records based on business logic
- You want to pause and resume the job
- You need to run this across multiple servers
Suddenly, your "simple" loop becomes a distributed systems problem.
Why Not Just Use BullMQ or Celery?
Here's the thing: BullMQ is excellent at managing individual jobs. You can queue 10 million tasks, and it'll process them reliably. But you're still manually creating those 10 million tasks. You're figuring out the chunking strategy. You're implementing the resume logic.
Parcelo sits one layer above that. You give it a range (0 to 10M) and say "figure it out." It decides how to split it, when to process leaves, and which subtrees to skip. BullMQ handles the worker coordination—Parcelo handles the intelligent splitting.
The Core Idea: Binary Tree Splitting
The key insight is treating a range as a binary tree. Instead of thinking "I need to process items 0 to 1,000,000," you think "I have a root node covering 0-1M, and I'll split it until each leaf is small enough to process."
Here's how it works:
```
// You start with one big range
{ start: 0, end: 1_000_000 }

// The scheduler splits it in half
{ start: 0, end: 500_000 }         // Left child
{ start: 500_000, end: 1_000_000 } // Right child

// Each half gets split again if still too large
{ start: 0, end: 250_000 }
{ start: 250_000, end: 500_000 }
// ... and so on

// Eventually you get leaf nodes small enough to process
{ start: 0, end: 10_000 }      // Process this!
{ start: 10_000, end: 20_000 } // Process this!
```

The beauty is that splitting is O(log N)—even for a billion items, you only need ~30 splits to get manageable chunks. And since it's a tree, you can skip entire subtrees if your shouldProcess callback says "nope, not interested."
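Here's a minimal sketch of the split-or-prune recursion, with made-up helper signatures; the real scheduler runs this through a priority queue with concurrent workers rather than plain recursion.

```typescript
type NumRange = { start: number; end: number };

// Minimal sketch: split until small enough, prune whole subtrees
// the moment shouldProcess says no.
async function processTree(
  range: NumRange,
  maxRangeSize: number,
  shouldProcess: (r: NumRange) => Promise<boolean>,
  work: (r: NumRange) => Promise<void>,
): Promise<void> {
  // Pruning: skip this node and everything underneath it
  if (!(await shouldProcess(range))) return;

  // Leaf: small enough to execute directly
  if (range.end - range.start <= maxRangeSize) {
    await work(range);
    return;
  }

  // Internal node: split in half and recurse into both children
  const mid = range.start + Math.floor((range.end - range.start) / 2);
  await processTree({ start: range.start, end: mid }, maxRangeSize, shouldProcess, work);
  await processTree({ start: mid, end: range.end }, maxRangeSize, shouldProcess, work);
}
```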
Architecture: How It All Fits Together
Let me walk you through the architecture. I'll be honest—I went through several iterations before landing on this design.
High-Level Overview
The schedulers talk to storage through a pluggable adapter layer, so the same logic runs against different backends. This means:
- Development: Use InMemoryStore—zero setup, instant feedback
- Production: Use RedisStore—persistent, distributed, crash-safe
- Future: Easy to add PostgreSQL, MongoDB, or whatever you need
The scheduler doesn't care what's underneath. It just calls storage.saveJob() and trusts the adapter to handle it.
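The adapter surface is roughly this shape (a sketch inferred from the description above; the actual interface in the repo has more methods and stricter types):

```typescript
// Rough shape of the storage layer the scheduler codes against.
// Only saveJob is mentioned in this post; the other method names
// here are illustrative assumptions.
interface StorageAdapter {
  saveJob(job: { id: string; state: unknown }): Promise<void>;
  loadJob(jobId: string): Promise<{ id: string; state: unknown } | null>;
  saveNode(jobId: string, node: unknown): Promise<void>;
  deleteJob(jobId: string): Promise<void>;
}

// InMemoryStore and RedisStore both implement this shape; the scheduler
// never knows (or cares) which one it's talking to.
```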
Two Schedulers, One Codebase
I built two schedulers that share most of their code:
InMemoryScheduler (for dev/testing):
- No external dependencies
- Fast startup
- Perfect for unit tests
- State lost on crash (but that's fine for dev)
RangeScheduler (for production):
- Extends InMemoryScheduler
- Adds Redis persistence
- Adds BullMQ for distributed processing
- Adds stale node detection
- Adds heartbeat monitoring
The inheritance works beautifully because 90% of the logic is the same—the only difference is where data lives and how workers execute.
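A toy illustration of that split (class names suffixed with Sketch to make clear this isn't the real code): the base class owns the algorithm, and the subclass only changes where state lives and how workers run.

```typescript
// Toy illustration only; the actual classes have far more going on.
interface Store {
  save(key: string, value: string): Promise<void>;
}

class InMemoryStoreSketch implements Store {
  private data = new Map<string, string>();
  async save(key: string, value: string) {
    this.data.set(key, value);
  }
}

class InMemorySchedulerSketch {
  // Splitting, pruning, the priority queue, stats, and events live here.
  constructor(protected store: Store = new InMemoryStoreSketch()) {}
}

class RangeSchedulerSketch extends InMemorySchedulerSketch {
  // Same algorithm; swaps in Redis-backed storage and BullMQ workers,
  // then layers on heartbeats and stale-node detection.
  constructor(redisBackedStore: Store) {
    super(redisBackedStore);
  }
}
```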
The Execution Flow: What Actually Happens
Let me trace through what happens when you create and run a job. This is where the magic (and complexity) lives.
The processing loop is the heart of the system. It continuously:
- Dequeues the highest-priority node (deeper nodes = higher priority)
- Checks if it should be processed (via the shouldProcess callback)
- If it's a leaf: executes the work function
- If it's not a leaf: splits it into two children
- Updates stats and checks for completion
The priority queue ensures we process deeper nodes first, which means we get to executable chunks faster. It's a depth-first approach, but with concurrency.
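The "priority queue" part is simpler than it sounds: nodes are ordered by depth, so the deepest node always comes out first. A stripped-down stand-in (the real implementation would use a proper heap, not a sorted array):

```typescript
type TreeNode = { rangeStart: number; rangeEnd: number; depth: number };

// Stand-in for the scheduler's priority queue: deeper nodes dequeue first,
// so the loop drills down to executable leaves before widening out.
class DepthFirstQueue {
  private nodes: TreeNode[] = [];

  enqueue(node: TreeNode) {
    this.nodes.push(node);
    this.nodes.sort((a, b) => b.depth - a.depth); // deepest first
  }

  dequeue(): TreeNode | undefined {
    return this.nodes.shift();
  }

  get size(): number {
    return this.nodes.length;
  }
}
```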
Design Decisions: The Good, The Bad, and The "I'll Fix It Later"
Decision 1: Binary Tree vs. Fixed Chunks
Why binary tree? I considered fixed-size chunks (just divide 1M by 1000, get 1000 chunks of 1000 items each). But binary trees give you:
- Flexibility: Split until you hit the right size
- Pruning: Skip entire subtrees efficiently
- Efficiency: O(log N) depth, not O(N) chunks
- Natural recursion: The code is cleaner
Trade-off: Slightly more complex, but worth it.
Decision 2: Priority Queue by Depth
Nodes are prioritized by depth—deeper nodes (closer to leaves) get processed first. This means:
- We reach executable chunks faster
- The tree "fills in" from the bottom up
- Better cache locality (processing similar-sized chunks together)
Alternative considered: FIFO (effectively breadth-first). But depth-first reaches executable leaves sooner and keeps the pending-node queue far smaller, which suits a tree-shaped workload better.
Decision 3: Two Schedulers, Not One
I could have made one scheduler that "just works" everywhere. But having two separate classes is clearer:
- Developers know exactly what they're getting
- No hidden Redis dependencies in tests
- Easier to reason about
Trade-off: More code to maintain, but better developer experience.
Decision 4: Event-Driven Architecture
Everything emits events. This might seem like overkill, but it's incredibly useful:
- Progress tracking
- Monitoring and alerting
- Debugging (just listen to all events)
- Integration with other systems
Trade-off: Slight performance overhead, but the observability is worth it.
Decision 5: Generic Type System
The scheduler works with any comparable type:
- Numbers: { start: 0, end: 1000 }
- BigInt: { start: 0n, end: 1000000n }
- Dates: { start: new Date('2024-01-01'), end: new Date('2024-12-31') }
- Strings: { start: 'aaa', end: 'zzz' }
- Custom types: Just implement RangeAdapter<T>
This was harder to implement than I thought. TypeScript's generic constraints are powerful but tricky. Worth it though—the API is clean and type-safe.
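For instance, a calendar-ranged job might look like this. Hypothetical sketch: I'm assuming the date adapter measures range size in days, and reprocessEventsBetween is a stand-in for your own handler.

```typescript
// Hypothetical date-range job; assumes range size is measured in days.
const dateJobId = await scheduler.createJob({
  range: { start: new Date('2024-01-01'), end: new Date('2024-12-31') },
  maxRangeSize: 7, // roughly one week per leaf (assumed unit: days)
  work: async (range) => {
    // Stand-in for whatever your handler actually does
    await reprocessEventsBetween(range.start, range.end);
  },
});
```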
Performance: The Numbers That Matter
The Benchmark Setup
I created realistic benchmarks simulating actual workloads:
- Fast DB queries (10ms): Cache hits, indexed lookups
- Normal DB queries (50ms): Typical PostgreSQL operations
- API calls (100-500ms): External service calls
- File processing (100-500ms): CSV parsing, image processing
Each handler was implemented as await sleep(X) + a bit of CPU, so results are stable and reproducible.
Results at a Glance
The numbers vary depending on handler cost, but here’s the takeaway:
| Workload Type | Handler Time | Concurrency | Throughput | Speedup vs Sequential |
|---|---|---|---|---|
| Normal DB Query | ~50ms | 10 workers | ~430 items/sec | ~20× |
| API Call | ~200ms | 10 workers | ~50 items/sec | ~10× |
| Heavy File Work | ~500ms | 10 workers | ~80–100 items/sec | ~8–12× |
Sequential baselines were:
- 50ms → ~20 items/sec
- 200ms → ~5 items/sec
- 500ms → ~2 items/sec
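A quick sanity check on those figures: with 10 workers each holding a ~200ms call in flight, the theoretical ceiling is about 10 / 0.2 s = 50 items/sec, which is exactly where the API-call row lands.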
What I Learned from Benchmarking
- Throughput depends entirely on your work duration. The scheduler overhead is minimal (<5%), but if your work takes 500ms, you're not going to process 1000 items/sec.
- Concurrency helps, but with diminishing returns. Going from 1 to 10 workers gives you ~5x speedup. Going from 10 to 20 gives you maybe 1.5x more. CPU cores and I/O become the bottleneck.
- Redis adds overhead, but enables reliability. In-memory is faster, but you lose everything on crash. The trade-off is worth it for production.
- The numbers are honest. I could have cherry-picked the best results, but I benchmarked realistic scenarios. For a 50ms DB query, you get ~430 items/sec. That's the truth, and it's good enough for most use cases.
Real-World Example: Processing User Data
Let me show you a real example I built while testing. Imagine you need to migrate user data from an old system to a new one:
```typescript
import { RangeScheduler, SchedulerEvent } from '@harshmange44/parcelo';

const scheduler = new RangeScheduler({
  redisUrl: process.env.REDIS_URL,
  defaultMaxConcurrency: 10,
  enableMetrics: true,
});

// We have 1 million users to migrate
const jobId = await scheduler.createJob({
  range: { start: 0, end: 1_000_000 },
  maxRangeSize: 1000, // Process 1000 users at a time

  // Skip users that are already migrated
  shouldProcess: async (range) => {
    const firstUser = await db.getUser(range.start);
    const lastUser = await db.getUser(range.end - 1);

    // If both are already migrated, skip this entire range
    if (firstUser.migrated && lastUser.migrated) {
      return false; // Skip entire subtree
    }

    // Otherwise, we need to check (might split further)
    return true;
  },

  work: async (range) => {
    // Process 1000 users
    const users = await db.getUsers(range.start, range.end);
    for (const user of users) {
      if (!user.migrated) {
        await migrateUser(user);
        await db.markMigrated(user.id);
      }
    }
  },

  retry: {
    maxAttempts: 3,
    backoffMs: (attempt) => Math.pow(2, attempt) * 1000, // Exponential backoff
  },
});

// Track progress
scheduler.on(SchedulerEvent.STATS_UPDATED, ({ stats }) => {
  const progress = (stats.rangeProcessed / stats.rangeTotal) * 100;
  console.log(`Progress: ${progress.toFixed(1)}%`);
});

await scheduler.startJob(jobId);
```

What I love about this:
- Automatic splitting: Don't worry about chunk sizes
- Smart pruning: Skips already-migrated users efficiently
- Progress tracking: Real-time updates
- Fault tolerance: Retries on failure
- Resumable: If the server crashes, Redis has your state
The Hard Parts
1. Stale Node Detection
In a distributed system, workers can crash. A node might be marked "running" but the worker died. How do you detect this?
I implemented a heartbeat system:
- Each running node sends a heartbeat every 15 seconds
- A background process checks for stale nodes (no heartbeat for 5 minutes)
- Stale nodes are marked as failed and retried
This sounds simple, but the edge cases are brutal:
- What if the heartbeat process itself crashes?
- What if Redis is slow and heartbeats are delayed?
- What if a node finishes between heartbeat checks?
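A rough sketch of the heartbeat mechanics using Redis key TTLs (illustrative key names and helpers, not the actual implementation):

```typescript
import Redis from 'ioredis';

const redis = new Redis(process.env.REDIS_URL);

// Illustrative: consider a node stale if no heartbeat lands within 5 minutes.
const HEARTBEAT_TTL_SECONDS = 300;

// Worker side: refresh the heartbeat key every ~15 seconds while running.
async function heartbeat(jobId: string, nodeId: string): Promise<void> {
  await redis.set(
    `heartbeat:${jobId}:${nodeId}`,
    Date.now().toString(),
    'EX',
    HEARTBEAT_TTL_SECONDS,
  );
}

// Monitor side: a node marked "running" whose heartbeat key has expired is stale.
async function isStale(jobId: string, nodeId: string): Promise<boolean> {
  const exists = await redis.exists(`heartbeat:${jobId}:${nodeId}`);
  return exists === 0;
}
```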
2. Type Safety with Generics
Making the scheduler work with any type (number, BigInt, Date, string) while maintaining type safety was... challenging. TypeScript's generic constraints are powerful but the error messages are cryptic.
I eventually created a RangeAdapter<T> interface that handles type-specific operations:
- compare(a, b): Compare two values
- midpoint(start, end): Find the middle value
- size(range): Calculate range size
- isValid(range): Check if range is valid
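To make that concrete, here's roughly what a numeric adapter would look like (paraphrased from the methods above, not copied from the repo):

```typescript
// Paraphrased from the interface described above; details may differ.
type Range<T> = { start: T; end: T };

interface RangeAdapter<T> {
  compare(a: T, b: T): number;
  midpoint(start: T, end: T): T;
  size(range: Range<T>): number;
  isValid(range: Range<T>): boolean;
}

const numberAdapter: RangeAdapter<number> = {
  compare: (a, b) => a - b,
  midpoint: (start, end) => start + Math.floor((end - start) / 2),
  size: (range) => range.end - range.start,
  isValid: (range) => range.end > range.start,
};
```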
3. The Processing Loop
The main processing loop looks simple, but it's deceptively complex:
- Handle queue emptiness
- Check job status (might have been paused/cancelled)
- Process nodes while respecting concurrency limits
- Update stats efficiently (don't scan all nodes every time)
- Handle errors gracefully
- Emit events without blocking
Try It Yourself
If this sounds interesting, give parcelo a try:
```bash
npm install @harshmange44/parcelo
```

Start with the in-memory scheduler for development:
```typescript
import { InMemoryScheduler } from '@harshmange44/parcelo';

const scheduler = new InMemoryScheduler();

const jobId = await scheduler.createJob({
  range: { start: 0, end: 1000 },
  maxRangeSize: 100,
  work: async (range) => {
    console.log(`Processing ${range.start} to ${range.end}`);
  }
});

await scheduler.startJob(jobId);
```

Then move to the distributed scheduler for production:
```typescript
import { RangeScheduler } from '@harshmange44/parcelo';

const scheduler = new RangeScheduler({
  redisUrl: 'redis://localhost:6379',
  defaultMaxConcurrency: 10,
  enableMetrics: true,
});

// Same API, but with persistence and reliability
```

Wrapping Up
Building parcelo was a journey. The code is open source, well-tested, and production-ready. If you find it useful, great! If you find bugs, please report them. If you have ideas, I'd love to hear them.
Check out the GitHub repository for the full source code, more examples, and detailed documentation.