Orchestrator: Building a Self-Organizing AI Task System

What We're Building

The Simplest Version: A Chat That Creates Chats

At its core, Orchestrator starts as a regular LLM chat interface. But with one crucial difference: this chat can spawn new chats.

Think of it like this:

  • You ask the AI to "build a todo app"
  • Instead of trying to do everything in one long conversation, the AI creates separate chats:
    • One chat for "design the UI"
    • One chat for "set up the database"
    • One chat for "implement the frontend"
    • One chat for "write tests"

Each chat stays focused on its specific task. No context pollution. No losing track of what we're doing.
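A minimal sketch of this parent-child chat model, in TypeScript (all names here are illustrative, not a real API):

```typescript
// Illustrative sketch: a chat that can spawn focused child chats.
type Message = { role: "user" | "assistant"; content: string };

class TaskChat {
  readonly title: string;
  readonly parent: TaskChat | null;
  readonly children: TaskChat[] = [];
  readonly messages: Message[] = [];

  constructor(title: string, parent: TaskChat | null = null) {
    this.title = title;
    this.parent = parent;
  }

  // Instead of growing one long transcript, spawn a focused child chat
  // with its own empty message history.
  spawn(title: string): TaskChat {
    const child = new TaskChat(title, this);
    this.children.push(child);
    return child;
  }
}

const root = new TaskChat("Build a todo app");
const ui = root.spawn("Design the UI");
const db = root.spawn("Set up the database");
```

Each child keeps its own message history, isolated from its siblings, while the parent retains the links needed to navigate the tree.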

Why This Matters

Research on multi-turn conversations suggests that LLMs can lose roughly 39% of their performance as a dialogue drags on: they make early assumptions, get confused, and struggle to recover from mistakes. By keeping each conversation short and focused, we maintain peak AI performance.

Evolution of the Design

Level 1: Basic Chat Splitting

Main Chat: "Build a todo app"
    ├── Chat 1: "Design UI"
    ├── Chat 2: "Backend API"
    └── Chat 3: "Frontend"

The user manually navigates between chats. Simple, but effective.

Level 2: Visual Task Tree

[Todo App Project]
    ├── [✓] Design Phase
    │   ├── [✓] User Stories
    │   └── [✓] Mockups
    ├── [●] Development (active)
    │   ├── [●] Backend API
    │   └── [ ] Frontend
    └── [ ] Testing

Now you see all tasks at once. Click on any node to jump into that conversation. The tree shows what's done, what's active, and what's pending.
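Under the hood, each node in this view only needs a title, a status, and children. A sketch (the status markers mirror the diagram above; the exact shape is an assumption):

```typescript
// Illustrative sketch of a task-tree node and a renderer matching the markers above.
type Status = "pending" | "active" | "completed" | "failed";

interface TaskNode {
  title: string;
  status: Status;
  children: TaskNode[];
}

// Marker per status; "failed" is our own addition to the diagram's set.
const marker: Record<Status, string> = {
  pending: "[ ]",
  active: "[●]",
  completed: "[✓]",
  failed: "[✗]",
};

// Render the tree as indented lines, one per node.
function render(node: TaskNode, depth = 0): string[] {
  const line = `${"  ".repeat(depth)}${marker[node.status]} ${node.title}`;
  return [line, ...node.children.flatMap((c) => render(c, depth + 1))];
}

const tree: TaskNode = {
  title: "Todo App Project",
  status: "active",
  children: [
    { title: "Design Phase", status: "completed", children: [] },
    { title: "Frontend", status: "pending", children: [] },
  ],
};
const lines = render(tree);
```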

Level 3: Smart Decomposition

The AI automatically knows when to split tasks:

User: "Add authentication to my app"

AI: "I'll break this down into focused tasks:"

├── Research auth providers (auto)
├── Design auth flow (interactive)
├── Implement login (interactive)
├── Add session management (auto)
└── Write auth tests (auto)

The system decides which tasks need your input and which can run automatically.

Level 4: Interactive + Autonomous Execution

Some tasks run by themselves, others wait for you:

  • Autonomous: Research, boilerplate generation, testing, validation
  • Interactive: Design decisions, complex logic, debugging

You can be coding the login component while the AI simultaneously researches auth providers and sets up test infrastructure.
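One way to sketch this split, assuming promise-based task runners (the `Task` shape and the `askUser` callback are hypothetical):

```typescript
// Illustrative sketch: autonomous tasks run concurrently while the user
// works through interactive ones.
type Mode = "interactive" | "autonomous";

interface Task {
  id: string;
  mode: Mode;
  run: () => Promise<string>;
}

async function execute(
  tasks: Task[],
  askUser: (t: Task) => Promise<string>,
): Promise<string[]> {
  // Kick off all autonomous tasks immediately; they run in the background.
  const auto = tasks.filter((t) => t.mode === "autonomous").map((t) => t.run());

  // Meanwhile, walk the user through the interactive tasks one at a time.
  const results: string[] = [];
  for (const t of tasks.filter((t) => t.mode === "interactive")) {
    results.push(await askUser(t));
  }

  // Collect whatever the background tasks produced while the user was busy.
  results.push(...(await Promise.all(auto)));
  return results;
}
```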

Level 5: The Full Vision

A complete development environment where:

  1. You describe what you want in natural language
  2. The AI creates a plan as a tree of tasks
  3. You can modify the plan before execution
  4. Tasks execute in parallel where possible
  5. You jump in when needed for decisions or guidance
  6. The system learns from successful patterns

Core Features

1. Task Tree Visualization

  • Horizontal tree showing all tasks and their relationships
  • Real-time status updates (pending, active, completed, failed)
  • Click any node to view/participate in that conversation
  • Drag and drop to reorganize tasks

2. Execution Modes

Every task can be:

  • Interactive: Waits for your input, you guide the conversation
  • Autonomous: Runs automatically, you can watch but don't need to participate
  • Hybrid: Starts automatically but can request your input when needed

3. Context Isolation

Each task maintains its own context:

  • No information bleed between tasks
  • Clear inputs and outputs
  • Parent tasks can access child results
  • Sibling tasks remain independent
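These isolation rules can be sketched as a context builder: a task sees its own messages plus the results of its completed children, and nothing from its siblings (the structure below is illustrative):

```typescript
// Illustrative sketch of context assembly under the isolation rules above.
interface Task {
  id: string;
  result?: string; // summary produced when a task completes
  children: Task[];
  messages: string[];
}

// A task's prompt context: its own messages plus completed child results.
// Sibling conversations never leak in, because we only ever look downward.
function buildContext(task: Task): string[] {
  const childResults = task.children
    .filter((c) => c.result !== undefined)
    .map((c) => `Result of "${c.id}": ${c.result}`);
  return [...task.messages, ...childResults];
}
```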

4. Smart Prompts

The system maintains a library of prompts that:

  • Know when to decompose vs execute
  • Understand task dependencies
  • Can identify when human input is needed
  • Improve over time based on success rates

User Interface

Split Screen Design

┌─────────────────────────┬───────────────────────┐
│                         │                       │
│     Task Tree View      │    Active Chat/       │
│                         │    Canvas Editor      │
│  [Project Root]         │                       │
│    ├─[✓] Setup          │  Current: Frontend    │
│    ├─[●] Frontend ←──── │                       │
│    │  ├─[●] Login       │  AI: Let's implement  │
│    │  └─[ ] Dashboard   │  the login component. │
│    └─[ ] Deploy         │                       │
│                         │  You: ...             │
└─────────────────────────┴───────────────────────┘

Key Interactions

  • Click a task node to switch to that conversation
  • Double-click to change execution mode
  • Right-click for options (delete, reorganize, clone)
  • Status indicators:
    • Gray: Pending
    • Blue: Active
    • Green: Completed
    • Red: Failed/Needs attention
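The indicator mapping is simple enough to express as a lookup table (the color values are illustrative UI choices):

```typescript
// Illustrative mapping from task status to indicator color, per the list above.
type Status = "pending" | "active" | "completed" | "failed";

const statusColor: Record<Status, string> = {
  pending: "gray",
  active: "blue",
  completed: "green",
  failed: "red",
};
```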

Technical Implementation

Starting Simple

  1. Phase 1: Basic chat that can spawn chats
    • Simple parent-child relationship
    • Manual task creation
    • Basic state management
  2. Phase 2: Add visual tree
    • Tree visualization component
    • Click to navigate
    • Status tracking
  3. Phase 3: Smart decomposition
    • Decomposition prompts
    • Automatic task creation
    • Dependency detection
  4. Phase 4: Parallel execution
    • Task queue system
    • Autonomous execution engine
    • Result aggregation
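Phase 4's task queue needs dependency-aware ordering. A sketch using Kahn's topological sort (the `QueuedTask` shape is an assumption, not a committed design):

```typescript
// Illustrative sketch: order tasks so every dependency runs before its
// dependents (Kahn's algorithm).
interface QueuedTask {
  id: string;
  dependsOn: string[];
}

function schedule(tasks: QueuedTask[]): string[] {
  // Track the unscheduled dependencies of each task.
  const remaining = new Map<string, Set<string>>();
  for (const t of tasks) remaining.set(t.id, new Set(t.dependsOn));

  const order: string[] = [];
  while (remaining.size > 0) {
    // Tasks whose dependencies have all been scheduled are ready to run.
    const ready = Array.from(remaining.entries())
      .filter(([, deps]) => deps.size === 0)
      .map(([id]) => id);
    if (ready.length === 0) throw new Error("Cycle detected in task dependencies");
    for (const id of ready) {
      order.push(id);
      remaining.delete(id);
      for (const deps of Array.from(remaining.values())) deps.delete(id);
    }
  }
  return order;
}
```

Tasks that become ready in the same pass have no ordering constraint between them, so they are exactly the ones a parallel executor could run concurrently.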

Technology Stack

  • Frontend: Electron + React (for rich UI)
  • AI Integration: Vercel AI SDK (for streaming)
  • State Management: Task tree as central state
  • Persistence: Local SQLite for task history

The Magic: Prompt Engineering

Decomposition Prompt Example

You are a task decomposition expert. Given a user request,
decide whether to:
1. Execute directly (if simple and atomic)
2. Decompose into subtasks (if complex)

For each subtask, specify:
- Clear objective
- Execution mode (interactive/autonomous)
- Dependencies
- Expected outputs

Request: [user input]
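If the model is asked to answer in JSON, the reply can be validated before it drives execution. A sketch with an assumed response shape (field names here are hypothetical):

```typescript
// Illustrative sketch: validate a JSON decomposition reply before trusting it.
type Mode = "interactive" | "autonomous";

interface Subtask {
  objective: string;
  mode: Mode;
  dependencies: string[];
  expectedOutput: string;
}

interface DecompositionPlan {
  action: "execute" | "decompose";
  subtasks: Subtask[];
}

function parsePlan(raw: string): DecompositionPlan {
  const plan = JSON.parse(raw) as DecompositionPlan;
  if (plan.action !== "execute" && plan.action !== "decompose") {
    throw new Error(`Unknown action: ${plan.action}`);
  }
  for (const task of plan.subtasks ?? []) {
    if (task.mode !== "interactive" && task.mode !== "autonomous") {
      throw new Error(`Unknown mode for "${task.objective}"`);
    }
  }
  return plan;
}
```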

Continuous Improvement

  • Track success rate of each prompt template
  • Identify patterns in failures
  • A/B test prompt variations
  • Build a library of proven patterns
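Tracking success rates per template can start very simply (the structure below is illustrative):

```typescript
// Illustrative sketch: per-template success tracking for prompt selection.
interface PromptTemplate {
  name: string;
  text: string;
  successes: number;
  failures: number;
}

function successRate(t: PromptTemplate): number {
  const total = t.successes + t.failures;
  return total === 0 ? 0 : t.successes / total;
}

// Pick the template with the best track record so far.
function bestTemplate(templates: PromptTemplate[]): PromptTemplate {
  return templates.reduce((best, t) =>
    successRate(t) > successRate(best) ? t : best,
  );
}
```

A real system would want per-task-type stats and some exploration of new variants, but a plain success ratio is enough to start A/B testing against.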

Why This Approach Wins

  1. Defeats Context Degradation: Each conversation stays short and focused
  2. Parallel Progress: Multiple tasks advance simultaneously
  3. User Control: Jump in exactly where your expertise is needed
  4. Transparency: See the entire plan and progress at a glance
  5. Learning System: Gets better at decomposition and execution over time

Example Workflow

User: "I need to add a shopping cart to my e-commerce site"

Orchestrator:

I'll help you add a shopping cart. Here's my plan:

[Shopping Cart Feature]
├── [Auto] Research best practices
├── [Interactive] Design cart UI
├── [Auto] Set up database schema
├── [Interactive] Implement cart logic
├── [Auto] Create API endpoints
├── [Interactive] Frontend integration
└── [Auto] Write tests

Shall I proceed? You can modify this plan or jump into any task.

The user approves, and multiple tasks begin:

  • Research runs automatically in the background
  • Database schema is generated
  • User is prompted to start the UI design session

While the user works on UI design, the system has already completed research and is working on the API endpoints. By the time they're done with design, much of the groundwork is complete.

Conclusion

Orchestrator transforms how we work with AI by embracing a simple truth: focused conversations work better than long, meandering ones. By building a system where chats can create and manage other chats, we create a new paradigm for AI-assisted work, one that's more efficient, more reliable, and more transparent than what came before.
