Tags: Backend, Kubernetes, WebSockets, Cloud IDE, DevOps, Node.js, AWS, Docker, Real-time Systems

Building a Cloud IDE from Scratch: Architecting 'Just Run It' with Kubernetes, WebSockets, and Real-Time Terminals

A deep dive into creating a production-grade cloud development environment that dynamically provisions isolated coding workspaces on demand.

Published: December 5, 2025
18 min read

Have you ever wondered what happens behind the scenes when you click "Create Project" on platforms like Replit, CodeSandbox, or Gitpod? How do they instantly spin up isolated development environments, provide real-time code editing, and deliver a fully functional terminal—all running seamlessly in your browser?

I spent months building Just Run It, a cloud-based IDE that does exactly that. This wasn't just a toy project—it's a production-grade platform that dynamically provisions Kubernetes pods, manages real-time file synchronization via WebSockets, and implements browser-based terminals using pseudo-TTY. In this article, I'll take you through the complete architecture, share the technical decisions I made, reveal the challenges I encountered, and document the hard-won lessons learned.

By the end of this deep dive, you'll understand:

  • How to dynamically provision isolated containers for each user project
  • How to implement real-time file synchronization with WebSockets
  • How to create browser-based terminals with pseudo-TTY
  • How to design a multi-tenant system with Kubernetes
  • The scalability considerations for serving thousands of concurrent users
  • The production gotchas that nobody tells you about

Let's dive in.

The Problem: Why Build a Cloud IDE?

I built Just Run It because I wanted to understand how platforms like Replit, CodeSandbox, and Gitpod actually work under the hood.

What happens when you click "Create Project"? How do they spin up isolated environments in seconds? How do they handle real-time file synchronization? How do they make terminals work in a browser?

These questions led me down a rabbit hole of infrastructure complexity that I was eager to explore:

  • Kubernetes orchestration — How do you dynamically provision containers for thousands of users?
  • Real-time communication — How do you sync file changes across WebSocket connections?
  • Process management — How do you create a real terminal experience in a browser using PTY?
  • Distributed storage — How do you ensure data persistence when containers are ephemeral?
  • Dynamic networking — How do you route traffic to the right container based on subdomains?
  • Multi-tenancy — How do you isolate users while sharing the same infrastructure?

Building a cloud IDE isn't just about creating a product—it's a crash course in distributed systems, container orchestration, and real-time architectures. Every component touches multiple layers of the stack, from the browser's WebSocket connection all the way down to Kubernetes API calls and the container runtime.

That complexity is exactly what I wanted to dive into. Just Run It became my vehicle for understanding how modern cloud platforms are architected, one Kubernetes manifest at a time.

Architecture Overview

Just Run It consists of three core microservices orchestrating a Kubernetes cluster, with AWS S3 providing persistent storage:

┌─────────────────────────────────────────────────────────────────────────┐
│                              USER BROWSER                               │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐               │
│  │ Landing  │  │  Monaco  │  │ xterm.js │  │  Output  │               │
│  │   Page   │  │  Editor  │  │ Terminal │  │  iframe  │               │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘               │
└───────┼─────────────┼─────────────┼─────────────┼───────────────────────┘
        │             │             │             │
        └─────────────┴──────┬──────┴─────────────┘
                             │
        ┌────────────────────┼────────────────────┐
        ▼                    ▼                    ▼
  ┌──────────┐      ┌──────────────┐      ┌──────────────┐
  │   Init   │      │ Orchestrator │      │    NGINX     │
  │ Service  │      │   Service    │      │   Ingress    │
  └────┬─────┘      └──────┬───────┘      └──────┬───────┘
       │                   │                     │
       ▼                   ▼                     ▼
  ┌──────────┐      ┌──────────────┐      ┌──────────────┐
  │  AWS S3  │◄─────│  Kubernetes  │─────►│Runner Pod    │
  │(Storage) │      │     API      │      │(Per Project) │
  └──────────┘      └──────────────┘      └──────────────┘

Each component plays a critical role. Let me break them down.

Service 1: The Init Service — Project Bootstrapping

The Problem: When a user clicks "Create New Project," they need a starting point. Nobody wants to stare at an empty directory, and manually setting up project structures is tedious.

The Solution: The Init Service copies language-specific templates from S3, giving users a fully configured starting point.

The Flow

User selects "Node.js" 
  → Init Service copies template from S3 
  → Project ready in seconds

Implementation

app.post("/project", async (req, res) => {
  const { projectId, language } = req.body;
  
  // Copy template files from S3
  // templates/node-js/* → projects/{projectId}/*
  await copyProjectFolder(
    `templates/${language}`,
    `projects/${projectId}`
  );
  
  return res.send("Project created!");
});

The magic happens in the S3 helper function:

import AWS from "aws-sdk";

const s3 = new AWS.S3();

// List all files in the template folder
const listedObjects = await s3.listObjectsV2({
  Bucket: "my-bucket",
  Prefix: "templates/node-js"
}).promise();
 
// Copy each file to the new project location
for (const object of listedObjects.Contents) {
  await s3.copyObject({
    Bucket: "my-bucket",
    CopySource: `my-bucket/${object.Key}`,
    Key: object.Key.replace("templates/node-js", `projects/${projectId}`)
  }).promise();
}
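
One caveat worth calling out: listObjectsV2 returns at most 1,000 keys per call, so larger templates need pagination. Here is a minimal sketch of how copyProjectFolder might handle that, assuming the same AWS SDK v2 client shown above:

// Sketch: copyProjectFolder with pagination. listObjectsV2 caps each
// response at 1,000 keys, so we loop on NextContinuationToken.
async function copyProjectFolder(sourcePrefix, destPrefix) {
  let continuationToken;

  do {
    const page = await s3.listObjectsV2({
      Bucket: "my-bucket",
      Prefix: sourcePrefix,
      ContinuationToken: continuationToken
    }).promise();

    for (const object of page.Contents ?? []) {
      await s3.copyObject({
        Bucket: "my-bucket",
        CopySource: `my-bucket/${object.Key}`,
        Key: object.Key.replace(sourcePrefix, destPrefix)
      }).promise();
    }

    continuationToken = page.NextContinuationToken;
  } while (continuationToken);
}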

Why S3 Over a Database?

I chose S3 for file storage because:

  • Cost-effective for large files — Pennies per GB versus expensive database storage
  • No size limits — Projects can grow to gigabytes without issues
  • Built-in versioning — Future feature potential without re-architecting
  • Kubernetes native integration — Init containers can pull project files straight from S3 with the AWS CLI before the app starts

Template Structure

S3 Bucket
├── templates/
│   ├── node-js/
│   │   ├── package.json
│   │   ├── index.js
│   │   └── README.md
│   ├── python/
│   │   ├── requirements.txt
│   │   └── main.py
│   └── react/
│       ├── package.json
│       ├── src/
│       └── public/
└── projects/
    ├── abc123/ ← User's project
    └── xyz789/ ← Another user's project

This structure makes adding new languages trivial—just upload a new template folder to S3.

Service 2: The Orchestrator — Kubernetes Wizardry

This is where the real magic happens. When a user opens their project, the Orchestrator dynamically creates Kubernetes resources to spin up an isolated development environment.

The Challenge

I needed to:

  1. Create a dedicated container for each project
  2. Pre-load project files before the application starts
  3. Expose two endpoints: WebSocket (IDE communication) and HTTP (app output)
  4. Route traffic based on subdomain (project-id.myplatform.com)

The Solution: Dynamic Kubernetes Manifests

Instead of manually creating YAML files for every project, I use a template with placeholders:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: service_name  # ← Placeholder
spec:
  replicas: 1
  template:
    spec:
      # Init container downloads files from S3 BEFORE main container starts
      initContainers:
        - name: copy-s3-resources
          image: amazon/aws-cli
          command: ["/bin/sh", "-c"]
          args:
            - aws s3 cp s3://my-bucket/projects/service_name/ /workspace/ --recursive
          volumeMounts:
            - name: workspace-volume
              mountPath: /workspace
      
      # Main container runs the development environment
      containers:
        - name: runner
          image: my-runner-image:latest
          ports:
            - containerPort: 3001  # WebSocket
            - containerPort: 3000  # HTTP
          volumeMounts:
            - name: workspace-volume
              mountPath: /workspace
          resources:
            requests:
              cpu: "1"
              memory: "1Gi"
            limits:
              cpu: "1"
              memory: "1Gi"

The Orchestrator reads this template, replaces service_name with the actual project ID, and applies it to Kubernetes:

import fs from "fs";
import yaml from "yaml";
import * as k8s from "@kubernetes/client-node";

// Kubernetes API clients used by the /start handler below
const kc = new k8s.KubeConfig();
kc.loadFromDefault();
const k8sAppsApi = kc.makeApiClient(k8s.AppsV1Api);
const k8sCoreApi = kc.makeApiClient(k8s.CoreV1Api);
const k8sNetworkingApi = kc.makeApiClient(k8s.NetworkingV1Api);

const readAndParseKubeYaml = (filePath, projectId) => {
  const fileContent = fs.readFileSync(filePath, 'utf8');
  
  // Parse multi-document YAML (Deployment + Service + Ingress)
  const docs = yaml.parseAllDocuments(fileContent).map((doc) => {
    let docString = doc.toString();
    // Replace placeholder with actual project ID
    docString = docString.replace(/service_name/g, projectId);
    return yaml.parse(docString);
  });
  
  return docs;
};
 
app.post("/start", async (req, res) => {
  const { projectId } = req.body;
  const manifests = readAndParseKubeYaml("./service.yaml", projectId);
  
  for (const manifest of manifests) {
    switch (manifest.kind) {
      case "Deployment":
        await k8sAppsApi.createNamespacedDeployment("default", manifest);
        break;
      case "Service":
        await k8sCoreApi.createNamespacedService("default", manifest);
        break;
      case "Ingress":
        await k8sNetworkingApi.createNamespacedIngress("default", manifest);
        break;
    }
  }
  
  res.send({ message: "Environment ready!" });
});
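
Teardown is the mirror image. The Orchestrator above only creates resources; a hypothetical /stop endpoint, reusing the same client objects, might look like this:

// Hypothetical /stop endpoint: deletes the per-project resources
// created by /start, in reverse order of creation.
app.post("/stop", async (req, res) => {
  const { projectId } = req.body;

  await k8sNetworkingApi.deleteNamespacedIngress(projectId, "default");
  await k8sCoreApi.deleteNamespacedService(projectId, "default");
  await k8sAppsApi.deleteNamespacedDeployment(projectId, "default");

  res.send({ message: "Environment stopped" });
});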

The Init Container Pattern

This is one of my favorite Kubernetes patterns. The init container runs before the main container and:

  1. Downloads project files from S3
  2. Places them in a shared volume (/workspace)
  3. Exits successfully
  4. Main container starts with files already in place

Pod Lifecycle:

┌─────────────────────────────────────────────────────────┐
│ 1. Init Container (aws-cli)                            │
│    └── aws s3 cp s3://bucket/projects/abc123/ /workspace│
│                                                         │
│ 2. Init Container exits (success)                      │
│                                                         │
│ 3. Main Container (runner) starts                      │
│    └── /workspace already has all project files!       │
└─────────────────────────────────────────────────────────┘

This pattern is elegant, reliable, and built into Kubernetes. No custom orchestration needed.

Ingress: The Routing Magic

Each project gets two subdomains:

| Domain | Port | Purpose |
|--------|------|---------|
| abc123.justrunit.work.gd | 3001 | WebSocket for IDE communication |
| abc123.justrunit.run.place | 3000 | HTTP for viewing app output |

The Ingress configuration makes this possible:

apiVersion: networking.k8s.io/v1
kind: Ingress
spec:
  rules:
    - host: abc123.justrunit.work.gd
      http:
        paths:
          - path: /
            backend:
              service:
                name: abc123
                port:
                  number: 3001  # WebSocket
    
    - host: abc123.justrunit.run.place
      http:
        paths:
          - path: /
            backend:
              service:
                name: abc123
                port:
                  number: 3000  # HTTP

Why two domains? Security isolation. The user's running application shouldn't have access to the IDE's WebSocket connection. Separate domains provide clean separation of concerns.

Service 3: The Runner — Where Code Comes Alive

The Runner is the heart of the platform. It runs inside each project's pod and handles:

  • Real-time file operations via WebSocket
  • Terminal emulation with PTY
  • Syncing changes back to S3

WebSocket Events

I use Socket.IO for real-time communication. Here's the event protocol:

| Event | Direction | Purpose |
|-------|-----------|---------|
| loaded | Server → Client | Initial file tree |
| fetchDir | Client → Server | List directory contents |
| fetchContent | Client → Server | Read file content |
| updateContent | Client → Server | Save file (+ S3 sync) |
| requestTerminal | Client → Server | Create terminal session |
| terminalData | Bidirectional | Terminal I/O |

Implementation

io.on("connection", async (socket) => {
  // Extract project ID from subdomain
  // "abc123.justrunit.work.gd" → "abc123"
  const host = socket.handshake.headers.host;
  const projectId = host?.split('.')[0];
  
  // Send initial file structure
  socket.emit("loaded", {
    rootContent: await fetchDir("/workspace", "")
  });
  
  // File operations
  socket.on("fetchContent", async ({ path }, callback) => {
    const content = await fs.readFile(`/workspace/${path}`, "utf8");
    callback(content);
  });
  
  socket.on("updateContent", async ({ path, content }) => {
    // Save locally (instant feedback)
    await fs.writeFile(`/workspace/${path}`, content);
    
    // Persist to S3 (survives pod restarts!)
    await s3.putObject({
      Bucket: "my-bucket",
      Key: `projects/${projectId}/${path}`,
      Body: content
    }).promise();
  });
});
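
The fetchDir helper used in the loaded event isn't shown here. A plausible sketch with fs/promises (readdir with withFileTypes) would be:

import { readdir } from "fs/promises";
import path from "path";

// Sketch of fetchDir: list a directory and tag each entry as a file
// or directory so the client can render the file tree.
async function fetchDir(baseDir, relativePath) {
  const entries = await readdir(path.join(baseDir, relativePath), {
    withFileTypes: true
  });

  return entries.map((entry) => ({
    name: entry.name,
    path: path.join(relativePath, entry.name),
    type: entry.isDirectory() ? "dir" : "file"
  }));
}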

The Dual-Write Strategy

Every file save triggers two writes:

  1. Local filesystem — Instant feedback for the user
  2. S3 — Durability across pod restarts

| Operation | Local Filesystem | S3 |
|-----------|------------------|----|
| Read file | ~1ms | ~50-200ms |
| Write file | ~1ms | ~100-300ms |
| List directory | ~1ms | ~50-150ms |

The local filesystem provides snappy UX, while S3 ensures data survives pod terminations.
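
One refinement worth considering, and not something the handler above does: if the editor fires updateContent on every keystroke, each save costs an S3 PUT. Debouncing the S3 side per file keeps the local write instant while collapsing bursts of saves into one upload. A sketch:

// Sketch: debounce the S3 sync per file so a burst of saves costs one
// PUT instead of one per keystroke. Local writes still happen instantly.
const pendingSyncs = new Map(); // file path → timeout handle

function scheduleS3Sync(projectId, filePath, content, delayMs = 2000) {
  clearTimeout(pendingSyncs.get(filePath));

  pendingSyncs.set(filePath, setTimeout(async () => {
    pendingSyncs.delete(filePath);
    await s3.putObject({
      Bucket: "my-bucket",
      Key: `projects/${projectId}/${filePath}`,
      Body: content
    }).promise();
  }, delayMs));
}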

The Terminal: PTY Magic

This was the trickiest part of the entire project. Browsers can't run bash directly, so I use node-pty to create pseudo-terminals.

What is a PTY?

A pseudo-terminal is a pair of virtual devices:

  • Master side: Controlled by our application
  • Slave side: Looks like a real terminal to programs (bash, vim, etc.)

When you run bash attached to a PTY, it behaves exactly like it would in a real terminal—supporting colors, cursor movement, job control, and more.

Architecture

┌───────────┐         ┌───────────┐         ┌──────────┐
│ xterm.js  │◄───────►│ Socket.IO │◄───────►│ node-pty │
│ (Browser) │ WebSocket│ (Server)  │   IPC    │  (PTY)   │
└───────────┘         └───────────┘         └────┬─────┘
                                                  │
                                                  ▼
                                            ┌───────────┐
                                            │   bash    │
                                            │ (process) │
                                            └───────────┘

Implementation

import { spawn, IPty } from 'node-pty';
 
class TerminalService {
  private sessions: Map<string, IPty> = new Map();
  
  createPty(socketId: string, onData: (data: string) => void) {
    // Spawn a real bash process
    const pty = spawn('bash', [], {
      name: 'xterm-256color',
      cols: 80,
      rows: 24,
      cwd: '/workspace',
      env: {
        ...process.env,
        PS1: '\\u@runner:\\w$ '  // Custom prompt
      }
    });
    
    // Stream output to client
    pty.onData((data) => onData(data));
    
    this.sessions.set(socketId, pty);
    return pty;
  }
  
  write(socketId: string, data: string) {
    // Forward keystrokes to bash
    this.sessions.get(socketId)?.write(data);
  }
}

On the frontend, xterm.js renders the terminal:

// Frontend
socket.emit("requestTerminal");
 
socket.on("terminal", ({ data }) => {
  // Render output in xterm.js
  terminal.write(data);
});
 
terminal.onData((data) => {
  // Send keystrokes to server
  socket.emit("terminalData", { data });
});
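
Resizing needs explicit wiring too (it resurfaces in the lessons below): xterm.js reports new dimensions, and node-pty's resize() passes them to the terminal driver so full-screen programs like vim redraw correctly. A sketch, assuming a resize event name and a resize(socketId, cols, rows) method on TerminalService, neither of which is part of the protocol above:

// Frontend: forward xterm.js dimension changes to the server
terminal.onResize(({ cols, rows }) => {
  socket.emit("resize", { cols, rows });
});

// Server: propagate to the PTY; resize(cols, rows) is node-pty's API.
// terminalService.resize is a hypothetical wrapper around it.
socket.on("resize", ({ cols, rows }) => {
  terminalService.resize(socket.id, cols, rows);
});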

The result? A fully functional bash terminal in the browser:

user@runner:/workspace$ npm install
added 150 packages in 3.2s
 
user@runner:/workspace$ node index.js
Server running on port 3000

Signal Handling

Real terminals support signals like Ctrl+C (SIGINT) and Ctrl+Z (SIGTSTP). These work automatically with PTY because the terminal driver handles them:

User presses Ctrl+C
      ↓
xterm.js sends: "\x03" (ASCII ETX)
      ↓
Socket.IO transmits to server
      ↓
node-pty writes "\x03" to PTY master
      ↓
Terminal driver interprets as SIGINT
      ↓
bash sends SIGINT to foreground process
      ↓
Process terminates (or handles signal)

The Complete Data Flow

Let me walk through what happens when a user creates and uses a project:

Phase 1: Project Creation

  1. User clicks "Create Node.js Project"
  2. Frontend → POST /project { projectId: "abc123", language: "node-js" }
  3. Init Service copies S3: templates/node-js/* → projects/abc123/*
  4. Frontend navigates to /coding?projectId=abc123

Phase 2: Environment Provisioning

  1. Frontend → POST /start { projectId: "abc123" }
  2. Orchestrator creates Kubernetes resources:
    • Deployment (with init container + runner)
    • Service (internal networking)
    • Ingress (domain routing)
  3. Kubernetes schedules pod on a node
  4. Init container runs: aws s3 cp s3://my-bucket/projects/abc123/ /workspace/ --recursive
  5. Runner container starts

Phase 3: Real-Time Coding

  1. Frontend connects: ws://abc123.justrunit.work.gd
  2. Runner sends file tree via loaded event
  3. User clicks file → fetchContent → Monaco Editor displays it
  4. User edits → updateContent → Local save + S3 sync
  5. User opens terminal → requestTerminal → PTY spawned
  6. User types "npm start" → terminalData → bash executes
  7. App runs on port 3000 → visible at abc123.justrunit.run.place

Scalability: How Many Users Can This Handle?

This is the million-dollar question. Let's break it down.

Resource Requirements Per Project

Each project pod requests:

  • 1 CPU core
  • 1 GB RAM

Cluster Capacity

| Cluster Size | Node Specs | Concurrent Projects | Use Case |
|--------------|------------|---------------------|----------|
| Small | 3 nodes × (4 CPU, 16GB) | ~30-40 | Development/Testing |
| Medium | 10 nodes × (8 CPU, 32GB) | ~150-200 | Small startup |
| Large | 50 nodes × (16 CPU, 64GB) | ~1,000+ | Growing platform |
| Enterprise | 200+ nodes | ~5,000+ | Full scale |

Bottlenecks & Solutions

| Bottleneck | Impact | Solution |
|------------|--------|----------|
| Ingress Controller | Single entry point | Deploy multiple replicas, use cloud LB |
| Orchestrator Service | K8s API calls are slow | Add caching, queue requests |
| S3 Rate Limits | 3,500 PUT/s per prefix | Shard by project ID prefix |
| Pod Startup Time | 10-30 seconds | Pre-warm pool of pods |

Cost Optimization

At scale, costs matter. Here's what I'd implement:

  • Idle Detection — Terminate pods after 30 minutes of inactivity (sketched after this list)
  • Spot Instances — Use preemptible nodes for 60-80% cost savings
  • Right-sizing — Offer different tiers (0.5 CPU for small projects)
  • Cold Storage — Archive inactive projects to S3 Glacier
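
Idle detection is the highest-leverage item, so here is a minimal sketch: stamp a timestamp on every WebSocket event and let a periodic sweeper ask the Orchestrator to tear down stale projects. The /stop endpoint and service URL are assumptions:

// Sketch: idle detection. Call touch(projectId) on every WebSocket
// event; a sweeper tears down projects idle for more than 30 minutes.
const lastActivity = new Map(); // projectId → epoch millis
const IDLE_LIMIT_MS = 30 * 60 * 1000;

function touch(projectId) {
  lastActivity.set(projectId, Date.now());
}

setInterval(async () => {
  for (const [projectId, seenAt] of lastActivity) {
    if (Date.now() - seenAt > IDLE_LIMIT_MS) {
      lastActivity.delete(projectId);
      // Orchestrator URL and /stop endpoint are assumptions
      await fetch("http://orchestrator-service/stop", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ projectId })
      });
    }
  }
}, 60_000);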

Networking Deep Dive

One of the most complex aspects is networking. Each project needs its own subdomain, and we need to handle both WebSocket and HTTP traffic differently.

Wildcard DNS: The Foundation

Instead of creating a DNS record for every project, I use wildcard DNS:

*.justrunit.work.gd → Load Balancer IP
*.justrunit.run.place → Load Balancer IP

This means abc123.justrunit.work.gd, xyz789.justrunit.work.gd, and any other subdomain all resolve to the same IP. The routing to the correct pod happens at the Ingress layer.

NGINX Ingress Controller: Traffic Cop

The NGINX Ingress Controller inspects the Host header to determine which pod to route to:

Request: GET / HTTP/1.1
Host: abc123.justrunit.work.gd
Connection: Upgrade
Upgrade: websocket

┌─────────────────────────────────────────────────────────┐
│ NGINX Ingress Controller                                │
├─────────────────────────────────────────────────────────┤
│ 1. TLS Termination (decrypt HTTPS)                      │
│ 2. Parse Host header: "abc123.justrunit.work.gd"        │
│ 3. Look up Ingress rules for this host                  │
│ 4. Find: route to Service "abc123" port 3001            │
│ 5. Detect WebSocket upgrade, maintain connection        │
│ 6. Forward to pod IP (from Service endpoints)           │
└─────────────────────────────────────────────────────────┘
                         ↓
              ┌─────────────────┐
              │  Pod: abc123    │
              │  Port: 3001     │
              └─────────────────┘

TLS Certificates at Scale

Managing SSL certificates for thousands of subdomains sounds nightmarish, but wildcard certificates make it simple:

spec:
  tls:
    - hosts:
        - "*.justrunit.work.gd"
      secretName: wildcard-work-gd-tls
    - hosts:
        - "*.justrunit.run.place"
      secretName: wildcard-run-place-tls

I use cert-manager with Let's Encrypt to automatically provision and renew these certificates.

Production Considerations

Building the core functionality is one thing. Running it in production is another.

Monitoring & Observability

A distributed system needs comprehensive monitoring. Key metrics I track:

# Resource usage
- container_cpu_usage_seconds_total
- container_memory_usage_bytes
- nginx_ingress_controller_requests_total

# Application metrics
- socket_io_connected_clients
- terminal_sessions_active
- s3_operations_total
- pod_startup_duration_seconds
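
The application-level metrics come from instrumenting the Runner directly. A sketch using prom-client (my assumption for the metrics library; the metric names match the list above):

import client from "prom-client";

// Sketch: expose two of the custom metrics listed above.
const connectedClients = new client.Gauge({
  name: "socket_io_connected_clients",
  help: "Currently connected Socket.IO clients"
});

const s3Operations = new client.Counter({
  name: "s3_operations_total",
  help: "Total S3 operations issued by the Runner",
  labelNames: ["operation"]
});

io.on("connection", (socket) => {
  connectedClients.inc();
  socket.on("disconnect", () => connectedClients.dec());
});

// ...then s3Operations.inc({ operation: "putObject" }) after each S3 call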

Error Handling

Every external call needs robust error handling:

socket.on("updateContent", async ({ path, content }) => {
  try {
    await fs.writeFile(`/workspace/${path}`, content);
    
    try {
      await s3.putObject({...}).promise();
    } catch (s3Error) {
      // S3 failure shouldn't break UX
      logger.error('s3_sync_failed', { path, error: s3Error.message });
      
      // Queue for retry
      retryQueue.add({ path, content, projectId });
    }
  } catch (fsError) {
    socket.emit('error', { message: 'Failed to save file' });
    logger.error('file_save_failed', { path, error: fsError.message });
  }
});
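
The retryQueue referenced here (and flushed during shutdown below) is doing real work. A minimal in-memory version could look like the following; it loses queued writes if the pod dies mid-retry, so a durable queue would be the production answer:

// Sketch: in-memory retry queue for failed S3 syncs.
class RetryQueue {
  constructor() {
    this.items = [];
    setInterval(() => this.flush(), 10_000); // retry every 10s
  }

  add(item) {
    this.items.push(item);
  }

  async flush() {
    const pending = this.items.splice(0); // take everything
    for (const { projectId, path, content } of pending) {
      try {
        await s3.putObject({
          Bucket: "my-bucket",
          Key: `projects/${projectId}/${path}`,
          Body: content
        }).promise();
      } catch {
        this.items.push({ projectId, path, content }); // try again later
      }
    }
  }
}

const retryQueue = new RetryQueue();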

Graceful Shutdown

When a pod is terminated, clean up gracefully:

process.on('SIGTERM', async () => {
  logger.info('shutdown_initiated', {});
  
  // Stop accepting new connections
  io.close();
  
  // Give existing operations time to complete
  await new Promise(resolve => setTimeout(resolve, 5000));
  
  // Close all terminal sessions
  terminalService.closeAll();
  
  // Flush any pending S3 writes
  await retryQueue.flush();
  
  process.exit(0);
});

Security Hardening

Security is non-negotiable for a platform that runs arbitrary user code.

Container Isolation:

securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  capabilities:
    drop:
      - ALL
  readOnlyRootFilesystem: true  # Except /workspace

Network Policies (prevent pods from communicating with each other):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 10.0.0.0/8       # Block internal network
              - 172.16.0.0/12
              - 192.168.0.0/16

Resource Limits (prevent resource exhaustion):

resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "1"
    memory: "1Gi"

Lessons Learned

1. Init Containers Are Underrated

The init container pattern solved my biggest challenge: pre-populating the filesystem before the app starts, using nothing beyond what Kubernetes ships out of the box.

2. WebSockets Need Careful Error Handling

Connections drop. Networks fail. I learned to implement:

  • Automatic reconnection with exponential backoff (see the sketch after this list)
  • Message queuing during disconnects
  • Heartbeat pings to detect dead connections
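
Socket.IO's client covers the first and third points out of the box; the message queue is on you. A sketch:

// Sketch: built-in socket.io-client reconnection options plus a
// simple outbox that replays messages queued while disconnected.
const socket = io("https://abc123.justrunit.work.gd", {
  reconnection: true,
  reconnectionAttempts: Infinity,
  reconnectionDelay: 1000,      // first retry after ~1s
  reconnectionDelayMax: 30000,  // back off up to 30s
  randomizationFactor: 0.5      // jitter to avoid thundering herds
});

const outbox = [];

function send(event, payload) {
  if (socket.connected) socket.emit(event, payload);
  else outbox.push([event, payload]); // queue while offline
}

socket.io.on("reconnect", () => {
  for (const [event, payload] of outbox.splice(0)) socket.emit(event, payload);
});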

3. PTY Is Not Just "Running Commands"

A real terminal needs:

  • Proper signal handling (Ctrl+C, Ctrl+Z)
  • Window resize events
  • ANSI escape code support
  • Session persistence

4. Multi-Tenancy Is Hard

Isolating users requires thinking about:

  • Resource limits (CPU, memory, disk)
  • Network policies (prevent cross-pod communication)
  • Filesystem isolation (each pod has its own /workspace)
  • Process isolation (containerization handles this)

5. Persistence Strategy Matters

I chose S3 because:

  • Pods are ephemeral—they can be killed anytime
  • S3 provides durability (11 9's)
  • Init containers make S3 → Pod sync seamless
  • Real-time sync keeps S3 updated

What I'd Do Differently

If I were starting over:

Use a Message Queue — Decouple the Orchestrator from synchronous K8s API calls. RabbitMQ or Redis Streams would make the system more resilient.

Implement Pod Pooling — Pre-create a pool of warm pods to reduce startup latency from 30 seconds to <2 seconds.
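
Pod pooling in sketch form: keep a handful of generic warm pods running, and on /start claim one by relabeling it so the project's Service selects it, rather than creating a Deployment from scratch. Everything below is hypothetical; the actual Orchestrator above always cold-starts:

// Hypothetical warm-pod pool. claim() relabels an idle pod so the
// project's Service (selector: app=<projectId>) picks it up.
class PodPool {
  constructor(k8sCoreApi, targetSize = 10) {
    this.k8sCoreApi = k8sCoreApi;
    this.targetSize = targetSize;
    this.idle = []; // names of warm, unclaimed pods
  }

  async claim(projectId) {
    const podName = this.idle.shift();
    if (!podName) return null; // pool empty: fall back to cold start

    // Strategic-merge patch; older client versions need the
    // content-type header passed explicitly.
    await this.k8sCoreApi.patchNamespacedPod(podName, "default", {
      metadata: { labels: { app: projectId } }
    });

    this.replenish(); // fire-and-forget: spawn a replacement warm pod
    return podName;
  }

  async replenish() {
    // create warm pods until this.idle.length reaches this.targetSize
  }
}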

Cost Analysis

Let's talk money. Running a cloud IDE isn't cheap.

Per-Project Costs (AWS, us-east-1)

| Resource | Specification | Monthly Cost |
|----------|---------------|--------------|
| EC2 (pod) | 1 vCPU, 1GB RAM | ~$7.50 |
| S3 Storage | 100MB project | ~$0.0023 |
| Data Transfer | ~1GB/month | ~$0.09 |

Total per active project: ~$7.50/month

Platform Costs (Fixed)

| Resource | Specification | Monthly Cost |
|----------|---------------|--------------|
| EKS Control Plane | Managed Kubernetes | $72 |
| Load Balancer | Network LB | $16 |
| NAT Gateway | Outbound traffic | $32 |
| Init/Orchestrator nodes | 2× t3.medium | $60 |

Fixed monthly cost: ~$180

Break-Even Analysis

Fixed costs: $180/month
Per-project cost: $7.50/month

At $10/user/month pricing:
Break-even = 180 / (10 - 7.50) = 72 users

At $15/user/month pricing:
Break-even = 180 / (15 - 7.50) = 24 users

Conclusion

Building Just Run It has been an incredible learning journey. What started as curiosity about "how does Replit work?" turned into a deep dive through:

  • Kubernetes orchestration and dynamic resource management
  • Real-time systems with WebSockets and event-driven architecture
  • Process management with pseudo-terminals
  • Distributed storage patterns with S3
  • Multi-tenant security and isolation

Tech Stack Summary

Frontend:

  • React
  • Monaco Editor (VS Code editor)
  • xterm.js (terminal emulation)
  • Socket.IO Client

Backend:

  • Node.js, Express, TypeScript
  • Socket.IO (real-time communication)
  • node-pty (pseudo-terminal)

Infrastructure:

  • Kubernetes (container orchestration)
  • NGINX Ingress Controller
  • Docker
  • AWS S3 (persistent storage)
  • @kubernetes/client-node (K8s API)

Written by Harsh Mange

Software Engineer passionate about building scalable backend systems and sharing knowledge through writing.
