ZERO OPERATORS

AUTONOMOUS AI SYSTEMS

You input a plan. Agents execute. The oracle verifies.

Created by Samyakh (Sam) Tukra

Autonomous Research & Engineering

Zero Operators is an autonomous AI research and engineering team. You give it a project — a GitHub repo, some source documents, and success criteria — and it builds, trains, validates, and delivers.

A coordinated team of 20 AI agents handles the full ML lifecycle: data engineering, model building, oracle validation, code review, testing, and explainability. You stay in the loop at human checkpoints. ZO remembers everything across sessions. It learns from its mistakes. And the delivery repo stays clean.

The difference

WITHOUT ZO
Context lost every session
"Where were we? Let me re-read everything..."
No verification
"It says it works. Does it?"
Same mistakes repeated
Bug fixed Friday. Same bug Monday.
Manual coordination
47 tabs, 12 prompts, copy-paste between tools.
No audit trail
"Who decided to drop that feature? When? Why?"
Infrastructure leaks
AI configs mixed into the delivery repo.
Ad-hoc workflow
"What's next? I guess we try training?"
WITH ZO
Memory persists across sessions
STATE.md + DECISION_LOG + semantic search. Resume exactly where you left off.
Oracle verifies every claim
Hard metrics, tiered criteria, statistical significance. Nothing ships unverified.
Self-evolution prevents recurrence
21 priors accumulated. Zero repeated failures.
One plan, autonomous execution
Write plan.md. Walk away. Come back to validated deliverables.
Complete audit trail
Every decision, every gate, every agent action — timestamped and searchable.
Clean repo separation
Delivery repo contains zero ZO artifacts. Always.
Structured 6-phase pipeline
Data → Features → Model → Training → Analysis → Packaging. Gated at every transition.

From data to delivery

Six phases. Gated at every transition. Oracle-validated.

01

Data Review

Audit, clean, validate, and version your data. Schema checks, outlier detection, class balance, drift baselines.

Data Engineer, Research Scout, Code Reviewer
AUTO GATE
02

Feature Engineering

Create derived features, statistical filtering, VIF pruning, domain validation. For DL: input representation design.

Data Engineer, Domain Evaluator, Research Scout
HUMAN GATE
03

Model Design

Architecture selection, loss function design, training strategy. Optimizer, LR schedule, mixed precision, checkpointing.

Model Builder, Research Scout, Code Reviewer
AUTO GATE
04

Training & Iteration

Baseline training, diagnostics, autonomous iteration loop. Cross-validation, ensemble exploration. Oracle validates each run.

Model Builder, Oracle / QA, Test Engineer
AUTO GATE
iterate until oracle passes
05

Analysis & Validation

Explainability (SHAP, GradCAM), error analysis, ablation studies, statistical significance, reproducibility verification.

XAI Agent, Domain Evaluator, Oracle / QA
HUMAN GATE
06

Packaging

Inference pipeline, model card, validation report, drift detection, test suite, clean delivery repo.

ML Engineer, Test Engineer, Documentation Agent
AUTO GATE
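
The six phases and their gates can be pictured as a simple ordered loop. The following is a minimal sketch under stated assumptions, not ZO's actual orchestration code: the `Phase` dataclass, `run_pipeline`, and the `run_phase` and `approve` hooks are hypothetical names introduced for illustration. Only the phase names and gate types are taken from the list above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Phase:
    name: str
    gate: str  # "auto" or "human", as listed for each phase above


# Phase names and gate types copied from the pipeline description.
PHASES = [
    Phase("Data Review", "auto"),
    Phase("Feature Engineering", "human"),
    Phase("Model Design", "auto"),
    Phase("Training & Iteration", "auto"),
    Phase("Analysis & Validation", "human"),
    Phase("Packaging", "auto"),
]


def run_pipeline(
    run_phase: Callable[[Phase], bool],
    approve: Callable[[Phase], bool],
) -> list[str]:
    """Run phases in order and stop at the first gate that does not pass.

    run_phase: hypothetical hook that executes a phase and returns True
               when its automated checks pass.
    approve:   hypothetical hook that asks a human to sign off.
    """
    completed = []
    for phase in PHASES:
        if not run_phase(phase):
            break  # automated checks failed, so the gate stays closed
        if phase.gate == "human" and not approve(phase):
            break  # human checkpoint declined
        completed.append(phase.name)
    return completed


if __name__ == "__main__":
    # Stub hooks that always pass, just to show the call shape.
    print(run_pipeline(run_phase=lambda p: True, approve=lambda p: True))
```

In this sketch a human gate is an extra condition on top of the automated checks, so a declined checkpoint halts the run exactly like a failed check.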

Every claim is verified

No deliverable is complete until the oracle confirms it. Hard metrics. Tiered criteria. Statistical significance.

MUST-PASS
Test accuracy ≥ 95% (measured: 99%)
All tests passing
Zero ZO artifacts in delivery
SHOULD-PASS
Coverage > 80%
Ruff lint clean
All phases complete
COULD-PASS
Statistical significance
Reproducibility verified
VALIDATED
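
One way to read the three tiers is as a small decision rule. The sketch below is illustrative only: the `Criterion` class, the `verdict` function, and the `VALIDATED_WITH_WARNINGS` label are hypothetical, and the policy that only must-pass failures block delivery is an assumption rather than something the source states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    tier: str      # "must", "should", or "could"
    passed: bool


def verdict(results: list[Criterion]) -> str:
    """Summarise tiered results.

    Assumed policy (not confirmed by the source): any failed must-pass
    criterion blocks delivery; failed should/could criteria are only flagged.
    """
    if any(c.tier == "must" and not c.passed for c in results):
        return "FAILED"
    if all(c.passed for c in results):
        return "VALIDATED"
    return "VALIDATED_WITH_WARNINGS"  # hypothetical label for this sketch


if __name__ == "__main__":
    measured_accuracy = 0.99  # the accuracy figure quoted in the example
    results = [
        Criterion("Test accuracy >= 95%", "must", measured_accuracy >= 0.95),
        Criterion("All tests passing", "must", True),
        Criterion("Coverage > 80%", "should", True),
        Criterion("Reproducibility verified", "could", True),
    ]
    print(verdict(results))
```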

The same mistake never happens twice

Persistent Memory

DECISION Architecture: Hybrid orchestration model T+0:15
GATE Phase 1 complete — 76 tests passing T+1:30
CHECKPOINT Human approved feature selection T+2:45
DECISION CNN architecture, custom loss function T+3:20
GATE Phase 4 — Oracle: 99% accuracy, PASS T+5:00
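
A log like the one above could be modelled as an append-only list of timestamped entries. This is a rough sketch, assuming a hypothetical `DecisionLog` class; the real DECISION_LOG format is not shown in the source, and the plain substring search here merely stands in for the semantic search ZO describes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogEntry:
    kind: str      # "DECISION", "GATE", or "CHECKPOINT"
    text: str
    elapsed: str   # time since project start, e.g. "0:15"


@dataclass
class DecisionLog:
    entries: list[LogEntry] = field(default_factory=list)

    def append(self, kind: str, text: str, elapsed: str) -> None:
        # Append-only: existing entries are never edited or removed.
        self.entries.append(LogEntry(kind, text, elapsed))

    def search(self, keyword: str) -> list[LogEntry]:
        # Case-insensitive substring match, a stand-in for semantic search.
        needle = keyword.lower()
        return [e for e in self.entries if needle in e.text.lower()]


def example_log() -> DecisionLog:
    """Rebuild the five entries shown in the example above."""
    log = DecisionLog()
    log.append("DECISION", "Architecture: Hybrid orchestration model", "0:15")
    log.append("GATE", "Phase 1 complete, 76 tests passing", "1:30")
    log.append("CHECKPOINT", "Human approved feature selection", "2:45")
    log.append("DECISION", "CNN architecture, custom loss function", "3:20")
    log.append("GATE", "Phase 4, Oracle: 99% accuracy, PASS", "5:00")
    return log


if __name__ == "__main__":
    for entry in example_log().search("architecture"):
        print(entry.kind, entry.text, f"T+{entry.elapsed}")
```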

Self-Evolution Protocol

ERROR Doc-codebase drift — 10 files stale
ROOT CAUSE missing_rule — no enforcement mechanism
PRIORS.md PR-005: Aspirational rules without enforcement are dead letters
RESOLVED Three-layer defense: validation script + hooks + cascade mappings

21 priors accumulated. Zero repeated failures.
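
The error, root cause, prior, resolution sequence above suggests a store that turns each diagnosed failure into a reusable rule. The sketch below is a guess at the shape of that idea, not the actual PRIORS.md mechanism: `Prior`, `PriorStore`, and the sequential ID scheme are hypothetical, with only the `PR-005` ID style and the `missing_rule` category taken from the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Prior:
    prior_id: str     # e.g. "PR-005", following the ID style shown above
    root_cause: str   # category such as "missing_rule"
    rule: str         # the lesson to apply in future sessions


@dataclass
class PriorStore:
    priors: list[Prior] = field(default_factory=list)

    def record(self, root_cause: str, rule: str) -> Prior:
        """Turn a diagnosed failure into a prior, skipping duplicate rules."""
        for existing in self.priors:
            if existing.rule == rule:
                return existing
        prior = Prior(f"PR-{len(self.priors) + 1:03d}", root_cause, rule)
        self.priors.append(prior)
        return prior

    def render(self) -> str:
        """Render priors as text an agent could read at session start."""
        return "\n".join(f"{p.prior_id}: {p.rule}" for p in self.priors)


if __name__ == "__main__":
    store = PriorStore()
    store.record("missing_rule", "Rules need an enforcement mechanism")
    print(store.render())
```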

Precision, not prompts

Every agent gets a precise contract: what to do, what to produce, how to know it's done.

Build a model
INPUTS: processed data at data/processed/, feature list from Phase 2
OUTPUTS: trained model at models/best.pt, training_metrics.jsonl, oracle evaluation
SUCCESS CRITERIA: accuracy ≥ 95%, inference < 100ms, reproducible with seed
PRECEDENT: DECISION_LOG: “Linear baseline scored 78%, try CNN next”
BUDGET: max 5 iterations, 30 min wall clock
TOOLS: PyTorch, CUDA, tensorboard
OFF-LIMITS: delivery repo configs, ZO infrastructure

Vague prompt ("Build a model") vs. full contract (above).

Vague prompts produce vague results. Precise contracts produce verified deliverables.

Not another coding assistant

Most AI tools help you write code. ZO replaces the need to coordinate it.

Dimension | Coding Assistants (Cursor, Copilot) | Agent Frameworks (CrewAI, AutoGen) | Zero Operators
Unit of work | Line / function | Task / step | Entire project
Human role | Pair programmer | Prompt engineer | Research director
Verification | "Looks right to me" | Optional checks | Oracle-mandated, tiered, statistical
Memory | Current session only | Basic state | STATE, DECISION_LOG, PRIORS, semantic search
Learning | None | None | Self-evolution: failures update rules
Delivery | Code in your editor | Output files | Clean repo, zero infrastructure artifacts
Workflow | Ad-hoc | User-defined DAG | 6-phase gated pipeline with human checkpoints


Coding assistants help you write lines. Agent frameworks help you chain tasks. ZO gives you a team that owns the project end-to-end.

Why Zero Operators

The plan is the only lever.

You don't prompt agents individually. You write one plan.md with objectives, metrics, and constraints. Agents decompose it, execute it, and verify it. Edit the plan — agents detect the delta and replan.
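
Detecting a plan delta could be as simple as hashing each section of plan.md and comparing the results. The sketch below is only an illustration of that idea: the source does not describe plan.md's layout or how ZO diffs it, so the `## ` heading convention and the `section_hashes` and `plan_delta` functions are assumptions.

```python
import hashlib


def section_hashes(plan_text: str) -> dict[str, str]:
    """Map each '## ' heading (an assumed convention) to a hash of its body."""
    sections: dict[str, list[str]] = {}
    current = "(preamble)"
    for line in plan_text.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections.setdefault(current, [])
        else:
            sections.setdefault(current, []).append(line)
    return {
        name: hashlib.sha256("\n".join(body).strip().encode("utf-8")).hexdigest()
        for name, body in sections.items()
    }


def plan_delta(old: str, new: str) -> dict[str, list[str]]:
    """List the sections that were added, removed, or changed between plans."""
    before, after = section_hashes(old), section_hashes(new)
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(
            k for k in before.keys() & after.keys() if before[k] != after[k]
        ),
    }


if __name__ == "__main__":
    old_plan = "## Objectives\nClassify digits\n## Metrics\naccuracy >= 95%\n"
    new_plan = old_plan.replace("95%", "97%") + "## Constraints\n30 min budget\n"
    print(plan_delta(old_plan, new_plan))
```

Working at section granularity means a changed metric triggers replanning only for the work that depends on it, rather than restarting the whole project.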

The oracle is the source of truth.

Not 'does the code compile?' Not 'did the agent say it's done?' The oracle runs hard metrics against actual output. 99% accuracy is either met or it isn't. No ambiguity. No hallucinated success.

The system learns from its own mistakes.

When something fails, ZO doesn't just fix it; it updates the rule that allowed the failure. PRIORS.md grows with every project, so each recorded failure becomes a rule that guards against its recurrence.

Zero operators means zero humans operating the agents.

Humans approve the plan. Humans approve gate checkpoints. Everything between is autonomous. The name is the promise.

What Zero Operators is not

It is NOT A coding assistant
It IS A coordinated agent team that owns entire projects
It is NOT A chatbot you prompt
It IS An autonomous system you brief with a plan
It is NOT A wrapper around LLMs
It IS An orchestration engine with memory, verification, and self-evolution
It is NOT A no-code tool
It IS A specification-driven system — plan.md is precise, not simplified
It is NOT A one-shot tool
It IS A multi-session system with persistent memory and session recovery
It is NOT Magic
It IS Engineering discipline applied to AI coordination

ZO is a digital research and engineering team that happens to express itself in code, models, reports, and data artifacts.

20 agents. 3 tiers. Coordinated.

Plus a custom agent library — create domain-specific specialists from your plan.

Lead Orchestrator

opus

Reads plan, decomposes phases, issues contracts, gates work

Data Engineer

sonnet

Data loading, cleaning, validation, schema design, drift detection

Model Builder

opus

Architecture design, loss functions, hyperparameter search, training

Oracle / QA

sonnet

Executes oracle criteria, reports pass/fail with evidence

Code Reviewer

sonnet

Reviews all code, enforces conventions, catches security issues

Test Engineer

sonnet

Writes and runs unit, integration, regression, and edge case tests

Research Scout

opus

Literature review, prior art survey, baseline identification

XAI Agent

sonnet

Feature importance, model interpretation, explainability analysis

Domain Evaluator

opus

Business/domain coherence, edge cases, distributional shift

ML Engineer

sonnet

Productionisation, containerisation, inference optimisation

Infra Engineer

haiku

Compute allocation, experiment tracking, artifact storage

Plan Architect

opus

Drafts compliant plan.md from source documents and data

Data Scout

sonnet

Inspects raw data during plan drafting for quality and schema

Init Architect

opus

Conversational project setup, environment detection, scaffolding

Platform Build Team

Software Architect opus

Decomposes platform into buildable modules, defines contracts

Backend Engineer opus

Builds core ZO infrastructure: memory, orchestration, comms

Frontend Engineer sonnet

Command dashboard, agent panel, live log viewer

Platform Test Engineer sonnet

Tests all ZO modules: unit, integration, end-to-end

Platform Code Reviewer sonnet

Reviews platform code, enforces coding conventions

Documentation Agent haiku

Maintains docstrings, README, API documentation
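
The roster above can be summarised as a mapping from agent to model, which makes the split across models easy to check. This is a small sketch, assuming the three tiers refer to the model named under each agent (opus, sonnet, haiku); the `AGENT_MODELS` dictionary and `agents_per_tier` helper are hypothetical names, with the assignments copied from the list.

```python
from collections import Counter

# Agent-to-model assignments copied from the roster above.
AGENT_MODELS = {
    "Lead Orchestrator": "opus",
    "Data Engineer": "sonnet",
    "Model Builder": "opus",
    "Oracle / QA": "sonnet",
    "Code Reviewer": "sonnet",
    "Test Engineer": "sonnet",
    "Research Scout": "opus",
    "XAI Agent": "sonnet",
    "Domain Evaluator": "opus",
    "ML Engineer": "sonnet",
    "Infra Engineer": "haiku",
    "Plan Architect": "opus",
    "Data Scout": "sonnet",
    "Init Architect": "opus",
    # Platform Build Team
    "Software Architect": "opus",
    "Backend Engineer": "opus",
    "Frontend Engineer": "sonnet",
    "Platform Test Engineer": "sonnet",
    "Platform Code Reviewer": "sonnet",
    "Documentation Agent": "haiku",
}


def agents_per_tier() -> dict[str, int]:
    """Count how many agents run on each model."""
    return dict(Counter(AGENT_MODELS.values()))


if __name__ == "__main__":
    print(len(AGENT_MODELS), "agents:", agents_per_tier())
```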

Five commands to launch

git clone https://github.com/SamPlvs/zero-operators.git
cd zero-operators && ./setup.sh        # Set up environment
zo init my-project                     # Scaffold project files
zo draft --project my-project          # Generate plan.md from your data
zo build plans/my-project.md           # Launch the agent team

Requires Python 3.11+ and the Claude Code CLI. See setup.sh for details.

Proven end-to-end

MNIST digit classifier built autonomously. 5 phases, 4 clean commits, zero human code.
