ainovel-cli's multi-agent harness architecture is the most serious attempt at long-form AI writing I've seen.
Why This Matters Right Now
Everyone's building AI writing tools. Most of them are glorified "continue this text" wrappers. They fall apart after chapter three, forget character names by chapter seven, and turn into incoherent soup by chapter twenty. Nobody has seriously solved the engineering problem of long-form coherence — until maybe now.
ainovel-cli is a 71-star Go project that quietly dropped a multi-agent novel generation engine with a design philosophy that's worth your attention even if you never write a single word of fiction. The architecture decisions here are a clinic in how to build reliable, long-running LLM pipelines.
What It Actually Does
You feed it one sentence. It produces a complete novel. That's the pitch. But the interesting part is how it refuses to let the process fall apart.
Four specialized agents divide the labor: Coordinator orchestrates everything; Architect handles premise, outline, character files, and world rules; Writer autonomously plans, drafts, self-reviews, and commits each chapter; Editor evaluates arcs across seven quality dimensions. Each has a constrained tool set — Writer gets plan_chapter, draft_chapter, check_consistency, commit_chapter. Editor gets read_chapter, save_review, save_arc_summary. Nobody does everything, which means nobody context-overflows trying.
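The constrained-tool-set idea is easy to sketch. Here's a minimal Go illustration (the `Agent` type, `allow` helper, and dispatch are my assumptions for clarity, not the project's actual API; only the tool names come from the README):

```go
package main

import "fmt"

// Agent pairs a role with the only tools it is allowed to call.
type Agent struct {
	Role  string
	Tools map[string]bool
}

// Call rejects any tool outside the agent's allow-list, so no
// single agent can do (or context-overflow on) everything.
func (a Agent) Call(tool string) error {
	if !a.Tools[tool] {
		return fmt.Errorf("%s: tool %q not permitted", a.Role, tool)
	}
	fmt.Printf("%s -> %s\n", a.Role, tool)
	return nil
}

// allow builds an allow-list from tool names.
func allow(names ...string) map[string]bool {
	m := make(map[string]bool, len(names))
	for _, n := range names {
		m[n] = true
	}
	return m
}

func main() {
	writer := Agent{"Writer", allow("plan_chapter", "draft_chapter", "check_consistency", "commit_chapter")}
	editor := Agent{"Editor", allow("read_chapter", "save_review", "save_arc_summary")}

	_ = writer.Call("draft_chapter")           // allowed
	fmt.Println(editor.Call("commit_chapter")) // rejected: only the Writer commits
}
```

The useful property is that the constraint lives in the host layer, not in the prompt: even a confused model literally cannot invoke a tool outside its role.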
The seven-dimension editorial review is genuinely ambitious: setting consistency, character behavior, pacing, narrative coherence, foreshadowing, hooks, and aesthetic quality — where aesthetic is further broken down into descriptive texture, narrative technique, dialogue differentiation, word quality, and emotional resonance. Every critique must cite the original text as evidence. That's not vibes-based editing.
The Technical Architecture Worth Stealing
The real gem here is the Scaffolding + Harness split, documented in the README's architecture section. Most agent frameworks conflate setup with runtime. This project separates them explicitly:
- Scaffolding — model selection, prompt assembly, tool binding, and sub-agent wiring all happen before the run starts
- Harness — once running, the host layer owns state transitions, checkpoint recovery, handoff packages, review gating, and commit consistency
Critically: the LLM never controls the control flow. State is driven by signal files. The Phase state machine follows a strict forward-only rule (init → premise → outline → writing → complete) with no backtracking. The Flow layer handles in-writing transitions (writing → reviewing → rewriting → polishing → steering). This is deterministic orchestration on top of non-deterministic generation — exactly the right separation.
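The forward-only Phase rule fits in a few lines of Go. This is a sketch of the pattern, not the project's code (the `nextPhase` function and its signature are mine; the phase names are from the README):

```go
package main

import "fmt"

// Phases in strict forward order. The harness, not the LLM,
// decides when to advance, and there is no way to move backward.
var phases = []string{"init", "premise", "outline", "writing", "complete"}

// nextPhase returns the phase after cur, or ok=false when cur is
// terminal or unknown. No transition ever points backward.
func nextPhase(cur string) (next string, ok bool) {
	for i, p := range phases[:len(phases)-1] {
		if p == cur {
			return phases[i+1], true
		}
	}
	return "", false
}

func main() {
	p := "init"
	for {
		next, ok := nextPhase(p)
		if !ok {
			break
		}
		fmt.Printf("%s -> %s\n", p, next)
		p = next
	}
}
```

Because the transition table is host-side data, a hallucinated "go back and redo the outline" from the model is simply unrepresentable; the Flow layer's in-writing loop sits inside the `writing` phase rather than replacing this machine.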
Chapter-level checkpoint recovery is table-stakes in production pipelines but almost nobody ships it in open-source tools. Here, Ctrl+C, crashes, or network drops all resume from the last committed chapter, covering all five phases: planning, writing, review, rewrite, and user intervention.
The rolling arc planning is clever. Instead of planning 500 chapters upfront (which produces hollow outlines), the Architect only plans the first 2-arc skeleton plus detailed chapters for arc 1. Subsequent arcs expand lazily, informed by save_arc_summary and character state snapshots. Far-future planning stays grounded because it's generated when it's needed, not when it's speculative.
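Lazy expansion is the whole trick, and it's small. A hedged Go sketch (the `Arc` type, `expand` method, and three-chapters-per-arc count are mine for illustration; only the skeleton-first, expand-later behavior is the project's):

```go
package main

import "fmt"

// Arc starts as a one-line skeleton; detailed chapter beats are
// filled in lazily, only when the arc is about to begin.
type Arc struct {
	Skeleton string
	Chapters []string // empty until expanded
}

// expand fills in chapter beats using what the story has actually
// become so far (prior arc summaries), not upfront speculation.
func (a *Arc) expand(priorSummaries []string) {
	if len(a.Chapters) > 0 {
		return // already expanded; planning is idempotent
	}
	for i := 1; i <= 3; i++ { // illustrative: 3 chapters per arc
		a.Chapters = append(a.Chapters,
			fmt.Sprintf("ch%d: advance %q given %d prior arc summaries",
				i, a.Skeleton, len(priorSummaries)))
	}
}

func main() {
	arcs := []Arc{{Skeleton: "the heist"}, {Skeleton: "the betrayal"}}
	arcs[0].expand(nil) // only arc 1 gets detail up front
	fmt.Println(len(arcs[0].Chapters), len(arcs[1].Chapters)) // 3 0
}
```

Arc 2 stays a skeleton until arc 1's `save_arc_summary` output exists to feed it, which is why far-future chapters never have to be invented from nothing.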
For context management, the novel_context tool loads a structured pack per chapter: prior summaries, timelines, active foreshadowing threads, character state, style rules, next-chapter forecast, and relevance-recommended historical chapters across four dimensions — foreshadowing, character appearances, state changes, and relationships. The adaptive strategy auto-switches between full-context, sliding window, and hierarchical summarization based on total chapter count, which is the right engineering answer to the 500-chapter problem.
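The auto-switching itself can be as simple as a threshold function. A sketch in Go, with the caveat that these cutoffs are invented for illustration; the project's actual thresholds and strategy names may differ:

```go
package main

import "fmt"

// contextStrategy picks how much history to feed the Writer,
// switching as the novel grows. Thresholds are illustrative.
func contextStrategy(totalChapters int) string {
	switch {
	case totalChapters <= 30:
		return "full-context" // everything still fits in the window
	case totalChapters <= 150:
		return "sliding-window" // recent chapters verbatim, older ones dropped
	default:
		return "hierarchical-summary" // arc summaries stand in for raw text
	}
}

func main() {
	for _, n := range []int{10, 80, 500} {
		fmt.Printf("%d chapters -> %s\n", n, contextStrategy(n))
	}
}
```

The point is that the strategy is chosen by the harness from a cheap, deterministic signal (chapter count), so a 500-chapter run degrades gracefully instead of silently truncating context.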
All state lives in JSON + Markdown files.