Orchestrator: Building a Self-Organizing AI Task System

What We're Building

The Simplest Version: A Chat That Creates Chats

At its core, Orchestrator starts as a regular LLM chat interface. But with one crucial difference: this chat can spawn new chats.

Think of it like this:

  • You ask the AI to "build a todo app"
  • Instead of trying to do everything in one long conversation, the AI creates separate chats:
    • One chat for "design the UI"
    • One chat for "set up the database"
    • One chat for "implement the frontend"
    • One chat for "write tests"

Each chat stays focused on its specific task. No context pollution. No losing track of what we're doing.
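A minimal sketch of this parent-child chat model, in TypeScript (all names here are illustrative, not a real API):

```typescript
// Illustrative sketch: a chat that can spawn focused child chats.
type Message = { role: "user" | "assistant"; content: string };

class TaskChat {
  readonly title: string;
  readonly parent: TaskChat | null;
  readonly children: TaskChat[] = [];
  readonly messages: Message[] = [];

  constructor(title: string, parent: TaskChat | null = null) {
    this.title = title;
    this.parent = parent;
  }

  // Instead of growing one long transcript, spawn a focused child chat
  // with its own empty message history.
  spawn(title: string): TaskChat {
    const child = new TaskChat(title, this);
    this.children.push(child);
    return child;
  }
}

const root = new TaskChat("Build a todo app");
const ui = root.spawn("Design the UI");
const db = root.spawn("Set up the database");
```

Each child keeps its own message history, isolated from its siblings, while the parent retains the links needed to navigate the tree.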

Why This Matters

Research on multi-turn conversations suggests that LLMs can lose roughly 39% of their performance as a dialogue drags on: they make early assumptions, get confused, and struggle to recover from mistakes. By keeping each conversation short and focused, we maintain peak AI performance.

Evolution of the Design

Level 1: Basic Chat Splitting

Main Chat: "Build a todo app"
    ├── Chat 1: "Design UI"
    ├── Chat 2: "Backend API"
    └── Chat 3: "Frontend"

The user manually navigates between chats. Simple, but effective.

Level 2: Visual Task Tree

[Todo App Project]
    ├── [✓] Design Phase
    │   ├── [✓] User Stories
    │   └── [✓] Mockups
    ├── [●] Development (active)
    │   ├── [●] Backend API
    │   └── [ ] Frontend
    └── [ ] Testing

Now you see all tasks at once. Click on any node to jump into that conversation. The tree shows what's done, what's active, and what's pending.
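Under the hood, each node in this view only needs a title, a status, and children. A sketch (the status markers mirror the diagram above; the exact shape is an assumption):

```typescript
// Illustrative sketch of a task-tree node and a renderer matching the markers above.
type Status = "pending" | "active" | "completed" | "failed";

interface TaskNode {
  title: string;
  status: Status;
  children: TaskNode[];
}

// Marker per status; "failed" is our own addition to the diagram's set.
const marker: Record<Status, string> = {
  pending: "[ ]",
  active: "[●]",
  completed: "[✓]",
  failed: "[✗]",
};

// Render the tree as indented lines, one per node.
function render(node: TaskNode, depth = 0): string[] {
  const line = `${"  ".repeat(depth)}${marker[node.status]} ${node.title}`;
  return [line, ...node.children.flatMap((c) => render(c, depth + 1))];
}

const tree: TaskNode = {
  title: "Todo App Project",
  status: "active",
  children: [
    { title: "Design Phase", status: "completed", children: [] },
    { title: "Frontend", status: "pending", children: [] },
  ],
};
const lines = render(tree);
```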

Level 3: Smart Decomposition

The AI automatically knows when to split tasks:

User: "Add authentication to my app"

AI: "I'll break this down into focused tasks:"

├── Research auth providers (auto)
├── Design auth flow (interactive)
├── Implement login (interactive)
├── Add session management (auto)
└── Write auth tests (auto)

The system decides which tasks need your input and which can run automatically.

Level 4: Interactive + Autonomous Execution

Some tasks run by themselves, others wait for you:

  • Autonomous: Research, boilerplate generation, testing, validation
  • Interactive: Design decisions, complex logic, debugging

You can be coding the login component while the AI simultaneously researches auth providers and sets up test infrastructure.
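One way to sketch this split, assuming promise-based task runners (the `Task` shape and the `askUser` callback are hypothetical):

```typescript
// Illustrative sketch: autonomous tasks run concurrently while the user
// works through interactive ones.
type Mode = "interactive" | "autonomous";

interface Task {
  id: string;
  mode: Mode;
  run: () => Promise<string>;
}

async function execute(
  tasks: Task[],
  askUser: (t: Task) => Promise<string>,
): Promise<string[]> {
  // Kick off all autonomous tasks immediately; they run in the background.
  const auto = tasks.filter((t) => t.mode === "autonomous").map((t) => t.run());

  // Meanwhile, walk the user through the interactive tasks one at a time.
  const results: string[] = [];
  for (const t of tasks.filter((t) => t.mode === "interactive")) {
    results.push(await askUser(t));
  }

  // Collect whatever the background tasks produced while the user was busy.
  results.push(...(await Promise.all(auto)));
  return results;
}
```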

Level 5: The Full Vision

A complete development environment where:

  1. You describe what you want in natural language
  2. The AI creates a plan as a tree of tasks
  3. You can modify the plan before execution
  4. Tasks execute in parallel where possible
  5. You jump in when needed for decisions or guidance
  6. The system learns from successful patterns

Core Features

1. Task Tree Visualization

  • Horizontal tree showing all tasks and their relationships
  • Real-time status updates (pending, active, completed, failed)
  • Click any node to view/participate in that conversation
  • Drag and drop to reorganize tasks

2. Execution Modes

Every task can be:

  • Interactive: Waits for your input, you guide the conversation
  • Autonomous: Runs automatically, you can watch but don't need to participate
  • Hybrid: Starts automatically but can request your input when needed

3. Context Isolation

Each task maintains its own context:

  • No information bleed between tasks
  • Clear inputs and outputs
  • Parent tasks can access child results
  • Sibling tasks remain independent
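These isolation rules can be sketched as a context builder: a task sees its own messages plus the results of its completed children, and nothing from its siblings (the structure below is illustrative):

```typescript
// Illustrative sketch of context assembly under the isolation rules above.
interface Task {
  id: string;
  result?: string; // summary produced when a task completes
  children: Task[];
  messages: string[];
}

// A task's prompt context: its own messages plus completed child results.
// Sibling conversations never leak in, because we only ever look downward.
function buildContext(task: Task): string[] {
  const childResults = task.children
    .filter((c) => c.result !== undefined)
    .map((c) => `Result of "${c.id}": ${c.result}`);
  return [...task.messages, ...childResults];
}
```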

4. Smart Prompts

The system maintains a library of prompts that:

  • Know when to decompose vs execute
  • Understand task dependencies
  • Can identify when human input is needed
  • Improve over time based on success rates

User Interface

Split Screen Design

┌─────────────────────────┬───────────────────────┐
│                         │                       │
│     Task Tree View      │    Active Chat/       │
│                         │    Canvas Editor      │
│  [Project Root]         │                       │
│    ├─[✓] Setup          │  Current: Frontend    │
│    ├─[●] Frontend ←──── │                       │
│    │  ├─[●] Login       │  AI: Let's implement  │
│    │  └─[ ] Dashboard   │  the login component. │
│    └─[ ] Deploy         │                       │
│                         │  You: ...             │
└─────────────────────────┴───────────────────────┘

Key Interactions

  • Click a task node to switch to that conversation
  • Double-click to change execution mode
  • Right-click for options (delete, reorganize, clone)
  • Status indicators:
    • Gray: Pending
    • Blue: Active
    • Green: Completed
    • Red: Failed/Needs attention
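The indicator mapping is simple enough to express as a lookup table (the color values are illustrative UI choices):

```typescript
// Illustrative mapping from task status to indicator color, per the list above.
type Status = "pending" | "active" | "completed" | "failed";

const statusColor: Record<Status, string> = {
  pending: "gray",
  active: "blue",
  completed: "green",
  failed: "red",
};
```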

Technical Implementation

Starting Simple

  1. Phase 1: Basic chat that can spawn chats
    • Simple parent-child relationship
    • Manual task creation
    • Basic state management
  2. Phase 2: Add visual tree
    • Tree visualization component
    • Click to navigate
    • Status tracking
  3. Phase 3: Smart decomposition
    • Decomposition prompts
    • Automatic task creation
    • Dependency detection
  4. Phase 4: Parallel execution
    • Task queue system
    • Autonomous execution engine
    • Result aggregation
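Phase 4's task queue needs dependency-aware ordering. A sketch using Kahn's topological sort (the `QueuedTask` shape is an assumption, not a committed design):

```typescript
// Illustrative sketch: order tasks so every dependency runs before its
// dependents (Kahn's algorithm).
interface QueuedTask {
  id: string;
  dependsOn: string[];
}

function schedule(tasks: QueuedTask[]): string[] {
  // Track the unscheduled dependencies of each task.
  const remaining = new Map<string, Set<string>>();
  for (const t of tasks) remaining.set(t.id, new Set(t.dependsOn));

  const order: string[] = [];
  while (remaining.size > 0) {
    // Tasks whose dependencies have all been scheduled are ready to run.
    const ready = Array.from(remaining.entries())
      .filter(([, deps]) => deps.size === 0)
      .map(([id]) => id);
    if (ready.length === 0) throw new Error("Cycle detected in task dependencies");
    for (const id of ready) {
      order.push(id);
      remaining.delete(id);
      for (const deps of Array.from(remaining.values())) deps.delete(id);
    }
  }
  return order;
}
```

Tasks that become ready in the same pass have no ordering constraint between them, so they are exactly the ones a parallel executor could run concurrently.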

Technology Stack

  • Frontend: Electron + React (for rich UI)
  • AI Integration: Vercel AI SDK (for streaming)
  • State Management: Task tree as central state
  • Persistence: Local SQLite for task history

The Magic: Prompt Engineering

Decomposition Prompt Example

You are a task decomposition expert. Given a user request,
decide whether to:
1. Execute directly (if simple and atomic)
2. Decompose into subtasks (if complex)

For each subtask, specify:
- Clear objective
- Execution mode (interactive/autonomous)
- Dependencies
- Expected outputs

Request: [user input]
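If the model is asked to answer in JSON, the reply can be validated before it drives execution. A sketch with an assumed response shape (field names here are hypothetical):

```typescript
// Illustrative sketch: validate a JSON decomposition reply before trusting it.
type Mode = "interactive" | "autonomous";

interface Subtask {
  objective: string;
  mode: Mode;
  dependencies: string[];
  expectedOutput: string;
}

interface DecompositionPlan {
  action: "execute" | "decompose";
  subtasks: Subtask[];
}

function parsePlan(raw: string): DecompositionPlan {
  const plan = JSON.parse(raw) as DecompositionPlan;
  if (plan.action !== "execute" && plan.action !== "decompose") {
    throw new Error(`Unknown action: ${plan.action}`);
  }
  for (const task of plan.subtasks ?? []) {
    if (task.mode !== "interactive" && task.mode !== "autonomous") {
      throw new Error(`Unknown mode for "${task.objective}"`);
    }
  }
  return plan;
}
```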

Continuous Improvement

  • Track success rate of each prompt template
  • Identify patterns in failures
  • A/B test prompt variations
  • Build a library of proven patterns
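Tracking success rates per template can start very simply (the structure below is illustrative):

```typescript
// Illustrative sketch: per-template success tracking for prompt selection.
interface PromptTemplate {
  name: string;
  text: string;
  successes: number;
  failures: number;
}

function successRate(t: PromptTemplate): number {
  const total = t.successes + t.failures;
  return total === 0 ? 0 : t.successes / total;
}

// Pick the template with the best track record so far.
function bestTemplate(templates: PromptTemplate[]): PromptTemplate {
  return templates.reduce((best, t) =>
    successRate(t) > successRate(best) ? t : best,
  );
}
```

A real system would want per-task-type stats and some exploration of new variants, but a plain success ratio is enough to start A/B testing against.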

Why This Approach Wins

  1. Defeats Context Degradation: Each conversation stays short and focused
  2. Parallel Progress: Multiple tasks advance simultaneously
  3. User Control: Jump in exactly where your expertise is needed
  4. Transparency: See the entire plan and progress at a glance
  5. Learning System: Gets better at decomposition and execution over time

Example Workflow

User: "I need to add a shopping cart to my e-commerce site"

Orchestrator:

I'll help you add a shopping cart. Here's my plan:

[Shopping Cart Feature]
├── [Auto] Research best practices
├── [Interactive] Design cart UI
├── [Auto] Set up database schema
├── [Interactive] Implement cart logic
├── [Auto] Create API endpoints
├── [Interactive] Frontend integration
└── [Auto] Write tests

Shall I proceed? You can modify this plan or jump into any task.

The user approves, and multiple tasks begin:

  • Research runs automatically in the background
  • Database schema is generated
  • User is prompted to start the UI design session

While the user works on UI design, the system has already completed research and is working on the API endpoints. By the time they're done with design, much of the groundwork is complete.

Conclusion

Orchestrator transforms how we work with AI by embracing a simple truth: focused conversations work better than long, meandering ones. By building a system where chats can create and manage other chats, we create a new paradigm for AI-assisted work, one that's more efficient, more reliable, and more transparent than what came before.
