Avalon3 — Multi-AI Debate System

Avalon3 is a system in which multiple AIs debate using a collective-intelligence approach and then generate code from the results. This reduces the bias of any single AI and produces higher-quality results.

Overview

Why Avalon3?

When you rely on a single AI for code, you can be limited by that AI's biases and blind spots. Avalon3 solves this problem through multi-AI collective intelligence:

| Aspect | Single AI | Avalon3 |
| --- | --- | --- |
| Perspective | One viewpoint | All AIs analyze from 6 perspectives |
| Error Detection | Self-review | All debaters except the implementer review in parallel |
| Design Quality | AI-dependent | Optimized through multi-angle debate |
| Code Quality | Potentially biased | Includes a review + refinement process |

The Four Roles

Avalon3 has 4 distinct roles. A single AI can serve multiple roles:

| Role | Description | Selection | Active Stages |
| --- | --- | --- | --- |
| Debater | Analyzes the project from 6 perspectives and debates | Multiple selection (all AIs participate) | Research, Debate |
| Synthesizer | Consolidates debate results into a single design document | Single selection | Synthesis |
| Implementer | Writes all code based on the design | Single selection | Implementation, Refinement |
| Reviewer | Validates the quality of implemented code | Automatic (debaters minus implementer) | Review, Refinement |

Reviewers are determined automatically. All debaters except the implementer become reviewers, preventing any AI from reviewing its own code.
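This selection rule is simple enough to sketch in Python (a hypothetical helper, not Avalon3's actual API):

```python
# Hypothetical sketch of automatic reviewer selection: every debater
# except the implementer becomes a reviewer.
def select_reviewers(debaters, implementer):
    """Return all debaters except the implementer."""
    return [ai for ai in debaters if ai != implementer]

reviewers = select_reviewers(["Claude", "Gemini", "Ollama", "OpenCode"], "Claude")
print(reviewers)  # ['Gemini', 'Ollama', 'OpenCode']
```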

Example: 4 AIs Participating

When using Claude, Gemini, Ollama, and OpenCode:

| Stage | Participating AIs | Description |
| --- | --- | --- |
| Research | Claude + Gemini + Ollama + OpenCode | All 4 research in parallel |
| Debate | Claude + Gemini + Ollama + OpenCode | All 4 debate across 6 perspectives |
| Synthesis | Gemini (synthesizer) | Consolidates the debate into a single design |
| Implementation | Claude (implementer) | Writes all code based on the design |
| Review | Gemini + Ollama + OpenCode | 3 reviewers (excluding the implementer) review in parallel |
| Refinement | Claude fixes → 3 reviewers re-review | Only runs when critical issues are found |

6-Stage Pipeline

Research → Debate → Synthesis → Implementation → Review → Refinement
Execution panel — 6-stage progress bar
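The stage order above can be expressed as a simple ordered list (an illustrative sketch; the helper function is not Avalon3's real code):

```python
# The six pipeline stages in execution order, per the docs.
STAGES = ["Research", "Debate", "Synthesis", "Implementation", "Review", "Refinement"]

def next_stage(current):
    """Return the stage that follows `current`, or None after the last one."""
    i = STAGES.index(current)
    return STAGES[i + 1] if i + 1 < len(STAGES) else None

print(next_stage("Synthesis"))  # Implementation
```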

Stage 1: Research

All debaters investigate relevant information in parallel:

  • Analyze existing code and project structure
  • Research the tech stack and dependencies
  • Review best practices and design patterns

Each AI independently produces research results that serve as the foundation for the next debate stage.

How Each AI Researches Differently

During the Research stage, each AI leverages its own built-in tools to gather information. This means the depth and scope of research varies by AI:

| AI | Research Method | Characteristics |
| --- | --- | --- |
| Claude Code | Actively browses the web using built-in browser tools | Investigates latest docs, API references, and GitHub issues in real time |
| Gemini CLI | Google Search grounding support | Reflects up-to-date information based on Google search results |
| Ollama | Training-data-based analysis | No web access; relies on the pre-trained knowledge of the local model |
| OpenCode | Varies by backend model | Depends on the capabilities of the connected AI model |

Claude Code has built-in WebSearch and WebFetch tools, allowing it to autonomously open a browser during the Research stage and directly explore technical documentation, library release notes, community discussions, and more. This enables it to gather the latest information beyond its training data knowledge cutoff.

When AIs with diverse research capabilities participate together, deep analysis from training data combines with real-time web information to produce richer foundational material.

Stage 2: Debate

The core of Avalon3 is collective-intelligence debate. Instead of an "Architect vs. Critic" model, all AIs analyze every perspective together.

Why Do All AIs Analyze from the Same Perspective?

A common approach is to assign each AI a different role — for example, "AI-A handles security, AI-B designs architecture." Avalon3 deliberately takes a different approach: all AIs analyze together across every perspective:

| Problem | Role-Assignment Approach | Avalon3 Approach |
| --- | --- | --- |
| Uneven AI capability | A weaker AI produces lower-quality analysis for its assigned perspective | All AIs analyze every perspective, so no single perspective suffers from a weaker AI |
| Disconnected perspectives | Each AI only knows its own role and misses the bigger picture | All AIs go through all 6 perspectives, gaining a holistic understanding of the project |
| No cross-validation | Only one opinion exists per perspective | Multiple AIs validate and complement each other's analysis within the same perspective |
| Lack of progressive depth | Each AI analyzes independently, limiting depth | Sequential speaking allows each AI to build on previous contributions for deeper analysis |

For example, if security analysis is assigned solely to Claude, the result depends entirely on Claude's security expertise. In Avalon3, however, Claude, Gemini, Ollama, and OpenCode all speak sequentially on the Security perspective — so a vulnerability missed by one AI can be caught by another.

Debate in progress — 6-perspective sequential debate log

Six fixed perspectives are addressed sequentially:

| Order | Perspective | Focus |
| --- | --- | --- |
| 1 | 🏗️ Architecture | Structure, scalability, maintainability, design patterns |
| 2 | 🔒 Security | Vulnerabilities, edge cases, input validation, authentication |
| 3 | Practical | Performance, simplicity, pragmatism, MVP |
| 4 | 👤 UX & API | API usability, error messages, documentation |
| 5 | 😈 Challenge | Finding weaknesses, questioning assumptions, counterarguments |
| 6 | 🌟 Synthesis | Integrating all analyses, reaching final consensus |

All debaters speak sequentially within each perspective. Each speaker references previous contributions, enabling progressively deeper analysis. For example, with 4 AIs, a total of 24 debate messages are generated (4 AIs × 6 perspectives).
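The speaking order can be sketched as a nested loop over perspectives and debaters (illustrative names, not Avalon3's internals):

```python
# Sequential speaking order: for each of the six fixed perspectives,
# every debater speaks in turn, seeing prior contributions as context.
PERSPECTIVES = ["Architecture", "Security", "Practical", "UX & API", "Challenge", "Synthesis"]

def debate_schedule(debaters):
    """Return (perspective, speaker) pairs in speaking order."""
    return [(p, ai) for p in PERSPECTIVES for ai in debaters]

schedule = debate_schedule(["Claude", "Gemini", "Ollama", "OpenCode"])
print(len(schedule))  # 24 = 4 AIs x 6 perspectives
```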

Stage 3: Synthesis

The synthesizer analyzes the entire debate transcript and produces a single design document:

  • Directory structure
  • Per-file specifications (path, purpose, exports, imports, dependencies)
  • Interface contracts
  • Mandatory file list
  • Implementation guidelines

Design completeness is automatically validated, with up to 2 retries if the document is incomplete.
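The validate-and-retry behavior might look like this (a sketch with stand-in callables; `synthesize` and `is_complete` are hypothetical):

```python
# Hypothetical sketch of design validation with up to 2 retries.
def synthesize_with_retries(synthesize, is_complete, max_retries=2):
    design = synthesize()
    for _ in range(max_retries):
        if is_complete(design):
            break
        design = synthesize()  # discard the incomplete draft and retry
    return design
```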

Consensus result — synthesized design document

Stage 4: Implementation

A single implementer writes all code based on the design:

  • Files are generated sequentially in dependency order
  • Previously generated files are included as context
  • Ensures consistent coding style and naming conventions

A single implementer is used because splitting implementation across multiple AIs causes style inconsistencies and interface mismatches.
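Dependency-ordered generation with accumulated context can be sketched with the standard library's `graphlib` (function names are illustrative, not Avalon3's internals):

```python
from graphlib import TopologicalSorter

# Sketch: generate files in dependency order, feeding earlier files
# back in as context so style and interfaces stay consistent.
def generate_in_order(file_deps, generate_file):
    """file_deps maps each file to the set of files it depends on."""
    generated = {}
    for path in TopologicalSorter(file_deps).static_order():
        generated[path] = generate_file(path, context=dict(generated))
    return generated
```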

Stage 5: Review

All debaters except the implementer review the code in parallel:

| Review Category | Description |
| --- | --- |
| Design Compliance | File paths and function signatures match the design |
| Import Verification | No circular references or missing modules |
| Completeness | Detects `pass`, `TODO`, `NotImplementedError`, and other placeholders |
| Code Quality | Error handling, logging, and type hints |

Each reviewer independently produces a review report, classifying issues by severity:

| Severity | Meaning | Refinement |
| --- | --- | --- |
| Critical (red) | Security vulnerabilities, logic errors | Must be fixed |
| Warning (orange) | Potential issues | Selectively fixed |
| Info (blue) | Improvement suggestions | Reference only |

Review results — Critical/Warning/Info issue list

Stage 6: Refinement

This stage only runs when critical issues are found in the review:

  1. The implementer fixes issues grouped by related files
  2. Reviewers re-review the modified code
  3. Repeats until no critical issues remain (up to 3 iterations)

If there are only warnings and no critical issues, this stage is skipped.
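That loop can be sketched as follows (hypothetical helpers; `fix` and `re_review` stand in for the implementer and reviewer calls):

```python
# Hypothetical sketch of the refinement loop: fix critical issues,
# re-review, and repeat until clean or 3 iterations are reached.
def refine(issues, fix, re_review, max_iterations=3):
    critical = [i for i in issues if i["severity"] == "critical"]
    for _ in range(max_iterations):
        if not critical:
            break  # nothing critical left, so refinement stops (or never starts)
        fix(critical)
        critical = [i for i in re_review() if i["severity"] == "critical"]
    return critical  # any critical issues still unresolved
```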


Avalon3 UI

Full Avalon3 sidebar

Settings Tab

| Setting | Description | Options |
| --- | --- | --- |
| Task Title | Project/task name | Free text |
| Task Description | Detailed requirements | Free text (long form) |
| Domain | Project type | General, API, Web, CLI, Trading, ML/AI, Data, Game, Embedded |
| Scale | Project size | Single File, Multi File, Module, Project |
| Language | Programming language | Python, JS, TS, Rust, Go, Java, C#, etc. |
| Attachments | Reference documents | .txt, .md files |
Settings tab — Domain and Scale selection

Domain

The domain setting provides AI debate context and determines mandatory files for the project:

| Domain | Context for AIs | Auto-Added Mandatory Files |
| --- | --- | --- |
| General | General-purpose software | (common files only) |
| API | REST/GraphQL API server | schema.sql, migrations/, openapi.yaml |
| Web | Web frontend | index.html, public/ |
| CLI | Command-line tool | (common files only) |
| Trading | Trading/financial systems | config.yaml, secrets.example.yaml |
| ML/AI | Machine learning/AI projects | requirements-dev.txt, notebooks/, data/.gitkeep |
| Data | Data processing | (common files only) |
| Game | Game development | (common files only) |
| Embedded | Embedded systems | (common files only) |

Domain information is included in AI prompts during the Research, Debate, and Synthesis stages, guiding AIs to propose architectures and patterns appropriate for the specific domain.

Scale

Scale provides the AIs with a project size hint:

| Scale | Intended Scope | Effect on AI |
| --- | --- | --- |
| Single File | One file | Simple script/utility-level design |
| Multi File | Several files | Small multi-file project |
| Module (default) | Module-level | Modular architecture design |
| Project | Full project | Complete project structure with tests and docs |

Scale serves as a guideline that AIs reference during Debate and Synthesis to adjust design complexity. No programmatic limits (such as file count restrictions) are enforced.

Language

The programming language setting has a direct, practical impact on code generation and validation:

| Impact Area | Description | Example (Python) |
| --- | --- | --- |
| Project Config Files | Auto-generates language-specific config files | requirements.txt, pyproject.toml |
| Test File Patterns | Test file validation rules | tests/test_*.py, conftest.py |
| Verification Commands | Build/lint/test commands | pytest, mypy, flake8 |
| Syntax Checking | Grammar validation of generated code | compile() syntax check |

Per-language auto-generated config files:

| Language | Project Config Files | Verification Commands |
| --- | --- | --- |
| Python | requirements.txt, pyproject.toml | pytest, mypy, flake8 |
| JavaScript | package.json | npm install, eslint, npm test |
| TypeScript | package.json, tsconfig.json | tsc --noEmit, eslint, npm test |
| Go | go.mod | go build, go test |
| Rust | Cargo.toml | cargo check, cargo test, cargo clippy |
| Java | pom.xml or build.gradle | mvn compile, mvn test |
| C# | .csproj, .sln | dotnet build, dotnet test |
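A runner for these verification commands might be shaped like this (the command table mirrors the docs; the runner itself is an illustrative sketch, not Avalon3's implementation):

```python
import shlex
import subprocess

# Per-language verification commands, as listed in the table above.
VERIFY_COMMANDS = {
    "Python": ["pytest", "mypy", "flake8"],
    "TypeScript": ["tsc --noEmit", "eslint", "npm test"],
    "Go": ["go build", "go test"],
    "Rust": ["cargo check", "cargo test", "cargo clippy"],
}

def verify(language, runner=subprocess.run):
    """Run each command for `language`; return the ones that failed."""
    failed = []
    for cmd in VERIFY_COMMANDS.get(language, []):
        result = runner(shlex.split(cmd))
        if result.returncode != 0:
            failed.append(cmd)
    return failed
```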

AI Provider Settings

You can assign different AIs to each role:

AI provider selection UI

| Role | Description | Selection |
| --- | --- | --- |
| Debaters | AIs participating in research and debate | Multiple selection |
| Synthesizer | AI that consolidates debate results into a design document | Single selection |
| Implementer | AI that writes the code | Single selection |

Reviewers are not separately selected. All debaters except the implementer automatically become reviewers.

Supported AI Providers:

| Provider | Description | Notes |
| --- | --- | --- |
| Claude CLI | Anthropic Claude | Most stable |
| Gemini CLI | Google Gemini | Fast responses |
| Ollama | Local AI models | No network required |
| OpenCode | Open-source AI | Free |

Ollama also supports remote servers. Use the host:port|model format (e.g., 192.168.1.100:11434|llama3).
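Parsing that format is straightforward (the helper name and the no-model fallback are assumptions, not Avalon3's actual code):

```python
# Sketch of parsing the documented host:port|model format,
# e.g. "192.168.1.100:11434|llama3".
def parse_ollama_target(value):
    """Split 'host:port|model' into (host, port, model).

    If no '|' is present, the model is returned as None (an assumption;
    Avalon3 may apply its own default instead).
    """
    hostport, sep, model = value.partition("|")
    host, _, port = hostport.partition(":")
    return host, int(port), (model if sep else None)

print(parse_ollama_target("192.168.1.100:11434|llama3"))
# ('192.168.1.100', 11434, 'llama3')
```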

History Tab

View previous runs and reuse their settings:

History tab — previous run list

  • Run date, title, and task ID
  • Stage count, success/failure status
  • Click to reload previous settings

Execution Panel (Right)

Run Tab

Run tab — full progress view

  • Progress Bar — Current position within the 6 stages
  • Current Stage — Name of the running stage
  • Current Provider — Name of the AI in use
  • Log Output — Real-time log with timestamps

Results Tab

Results tab — generated code and review issues

  • Generated File List — File name, size, language
  • Review Issues — Categorized as Critical/Warning/Info
  • Statistics — File count, lines of code, token count, elapsed time
  • README.md — Auto-generated documentation

Completion Popup

Avalon3 completion popup — statistics and result buttons

When the pipeline finishes, a popup appears with the following information:

  • Task ID
  • Number of generated files
  • Lines of code
  • Elapsed time
  • View Results / Go to Files Tab buttons

Prompt Library Integration

You can load prompts from the Prompt Library into the task description in Avalon3 settings:

  1. Click the Load Prompt button in the Settings tab
  2. Search and select a prompt
  3. The selected prompt is automatically inserted into the task description

Prompt Library integration — search and selection

Output Location

```
{project}/
├── avalon_{timestamp}/              # Generated source code directory
│   ├── (source code files)
│   ├── README.md                    # Auto-generated documentation
│   └── avalon3_result.json          # Full result JSON
└── .projecthub/
    └── avalon3/
        ├── avalon3_result.json      # Result viewer copy (same content)
        └── history.json             # Run history
```

avalon3_result.json contains integrated results from all pipeline stages:

| Key | Content | Pipeline Stage |
| --- | --- | --- |
| research | Each AI's research findings and recommendations | Research |
| debate | Debate messages per round, participants, timestamps | Debate |
| architecture | Synthesized design document (tech stack, directory structure, file specs, interface contracts, implementation guidelines) | Synthesis |
| review | Issues per reviewer (Critical/Warning/Info), approval status | Review |
| solution | All generated source code (path, content, language per file) | Implementation |
| summary | Statistics (file count, debate messages, token usage, elapsed time) | Overall |

You can select per-stage filters in the result viewer to explore detailed data for each stage in a tree structure:
Result viewer — full pipeline result tree
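Reading the result file programmatically is simple; this sketch assumes the documented output layout and key names (the helper itself is hypothetical):

```python
import json
from pathlib import Path

# Load avalon3_result.json from the documented .projecthub location
# and pull out the per-stage sections listed above.
def load_result(project_dir):
    path = Path(project_dir) / ".projecthub" / "avalon3" / "avalon3_result.json"
    result = json.loads(path.read_text(encoding="utf-8"))
    keys = ("research", "debate", "architecture", "review", "solution", "summary")
    return {k: result.get(k) for k in keys}
```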

Use Cases

Architecture Design

When designing the architecture for a complex system, multiple AIs analyze from 6 perspectives (structure, security, practicality, UX, challenge, synthesis) to arrive at the optimal design.

Code Refactoring

AIs use collective intelligence to debate the best refactoring strategy for existing code.

New Feature Implementation

When implementing a new feature, the entire process from multi-angle debate to design, implementation, review, and refinement is handled in one go.

Next Steps

  • Colligi — AI collective intelligence analysis
  • Alliance — AI collaborative workflow
  • AI Agent — Single-AI code generation