Implementation Plan

Scope For This Implementation

  • Included:

    • Project scaffold (pyproject.toml, src/stoned_ai/, tests/)
    • TTS layer wrapping Arena's Kokoro backend
    • Cleaning engine for AI CLI output noise
    • AI backend abstraction supporting Codex and Gemini CLI backends
    • Web server with SSE delivery
    • Host view (/host) with text input, send, voice selection, session control
    • Broadcast view (/broadcast) styled for OBS browser source capture
    • WAV audio serving for both views
    • Per-speaker voice assignment (host voice + AI voice)
    • install.sh script and ~/.local/bin/stoned-web link
  • Excluded from initial build:

    • Claude API backend (Phase 5, post-launch)
    • Visual avatar or waveform animation overlay
    • YouTube chat integration
    • Persistent conversation logging (nice to have, not required for launch)
    • Mobile-responsive host view (desktop only for now)

Phases

Phase 1: Arena Core Enhancements

  • Objective: Update the Arena project to support human speakers and the /broadcast view.
  • Files affected (in the Arena project):
    • src/arena/agents.py — add human runner.
    • src/arena/web.py — add /broadcast route and UI message input.
    • src/arena/core.py — ensure run_conversation handles the human backend.
  • Risks: Breaking existing AI-vs-AI modes in Arena.
  • Exit criteria: Arena can pause the conversation loop and wait for human input from the web UI.
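
One way to picture the exit criterion is a runner that blocks on a thread-safe queue until the web handler delivers the host's message. The sketch below is illustrative only: `HumanRunner`, `EchoRunner`, and `run_turns` are hypothetical names, and `run_turns` is a simplified stand-in for Arena's real `run_conversation`, whose actual signature is not shown in this plan.

```python
import queue
from typing import List, Optional, Tuple

Turn = Tuple[str, str]  # (speaker label, text)


class HumanRunner:
    """Hypothetical runner: blocks until the web UI submits a message."""

    def __init__(self) -> None:
        self._inbox: "queue.Queue[str]" = queue.Queue()

    def submit(self, text: str) -> None:
        # Called from the web handler when the host presses send.
        self._inbox.put(text)

    def next_turn(self, history: List[Turn], timeout: Optional[float] = None) -> str:
        # Blocks the conversation loop; raises queue.Empty if the timeout expires.
        return self._inbox.get(timeout=timeout)


class EchoRunner:
    """Stand-in for an AI backend so the sketch runs without a CLI."""

    def next_turn(self, history: List[Turn], timeout: Optional[float] = None) -> str:
        last = history[-1][1] if history else ""
        return f"echo: {last}"


def run_turns(runners, turns: int, timeout: Optional[float] = None) -> List[Turn]:
    # Simplified conversation loop: speakers take turns in list order.
    history: List[Turn] = []
    for i in range(turns):
        speaker, runner = runners[i % len(runners)]
        history.append((speaker, runner.next_turn(history, timeout=timeout)))
    return history
```

Because the human runner exposes the same `next_turn` interface as the AI stand-in, the loop itself does not need a special case for human speakers, which may help limit the risk to the existing AI-vs-AI modes.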

Phase 2: Stoned Mode Presets

  • Objective: Define the specific prompt envelopes and cleaning rules for the Stoned.AI show.
  • Files likely affected:
    • src/arena/core.py — add MODE_STONED preset.
    • src/arena/clean.py — update regex for high-quality show output.
  • Exit criteria: Jason can select "Stoned Mode" in Arena and it loads the correct speaker labels and roles.
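
As a rough illustration of what the cleaning rules might look like, the sketch below strips ANSI colour/cursor escape sequences and drops noise lines before text reaches TTS. The `NOISE_LINES` patterns are placeholders, not the actual rules in `clean.py`; real patterns would need to come from captured Codex and Gemini output.

```python
import re

# CSI escape sequences (colours, cursor movement) emitted by terminal programs.
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")

# Placeholder noise patterns (assumptions); replace with rules derived from
# real CLI output fixtures.
NOISE_LINES = [
    re.compile(r"^\s*$"),                     # blank lines
    re.compile(r"^\[debug\]", re.IGNORECASE),  # hypothetical debug prefix
]


def clean(raw: str) -> str:
    """Return speakable text: escapes removed, noise lines dropped."""
    text = ANSI_CSI.sub("", raw)
    kept = [
        line.strip()
        for line in text.splitlines()
        if not any(pattern.search(line) for pattern in NOISE_LINES)
    ]
    return " ".join(kept)
```

Keeping the patterns in a list makes it straightforward to add a fixture string and a matching rule whenever a new kind of noise shows up.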

Phase 3: OBS Broadcast View

  • Objective: Build the read-only, styled broadcast page within Arena.
  • Files likely affected:
    • src/arena/web.py
  • Exit criteria: /broadcast is accessible and OBS captures audio/video end-to-end.

Phase 5: Claude API Backend (Post-Launch)

  • Objective: Add a Claude backend using the anthropic SDK as an alternative to Codex/Gemini.
  • Files likely affected:
    • src/stoned_ai/ai.py
    • pyproject.toml (add anthropic dependency)
  • Risks: Requires a valid ANTHROPIC_API_KEY environment variable on svc-ai. Must not break existing Codex/Gemini backends.
  • Exit criteria: The host view offers a Claude model option. A full conversation runs using the Claude API backend.
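
One possible way to satisfy both the risk and the exit criterion is to offer the Claude option only when the key is present, leaving the existing backends untouched otherwise. This is a sketch under that assumption; `claude_available` and `available_backends` are hypothetical helpers, not part of the current `ai.py`.

```python
import os
from typing import List, Mapping, Optional


def claude_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when a non-empty ANTHROPIC_API_KEY is set in the environment."""
    env = os.environ if env is None else env
    return bool(env.get("ANTHROPIC_API_KEY", "").strip())


def available_backends(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Backends the host view could list; Codex and Gemini are always offered."""
    names = ["codex", "gemini"]
    if claude_available(env):
        names.append("claude")
    return names
```

Note that this only checks that the variable is set, not that the key is valid; validity can only be confirmed by an actual API call.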

Order Of Operations

  1. Create pyproject.toml and package scaffold.
  2. Implement tts.py (Kokoro wrapper).
  3. Implement clean.py (noise stripping for Codex and Gemini).
  4. Implement ai.py (Codex and Gemini backends).
  5. Implement web.py — server core, session state, SSE stream, WAV serving.
  6. Implement /host view in web.py.
  7. Implement /broadcast view in web.py.
  8. Write scripts/install.sh.
  9. Smoke test: full end-to-end conversation from host view to broadcast view.
  10. Verify OBS browser source audio capture.
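
For step 5, the SSE stream is a plain-text protocol: each event is a block of `field: value` lines terminated by a blank line. The helper below is a minimal sketch of that framing; the event name and payload keys in the usage line are illustrative assumptions, not a fixed schema for this project.

```python
import json


def sse_event(event: str, payload: dict) -> bytes:
    """Frame one server-sent event with a named type and a JSON body."""
    # json.dumps output contains no raw newlines, so one data: line is enough.
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n".encode("utf-8")


# Hypothetical usage: announce a finished turn and where its audio lives.
frame = sse_event("turn", {"speaker": "host", "audio": "/audio/1.wav"})
```

Both `/host` and `/broadcast` could subscribe to the same stream and differ only in how they render each event.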

Testing Expectations

  • Unit tests: voice listing in tts.py; noise stripping in clean.py against fixture strings; CLI argument construction in ai.py (with subprocess mocked).
  • Integration tests: the full SSE event sequence, from host message submit to broadcast card render. These require a live Codex or Gemini CLI.
  • Manual verification: OBS audio capture, visual broadcast layout on stream, and per-speaker voice differentiation.
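
The CLI argument construction test can run without either CLI installed by patching `subprocess.run`. The sketch below shows the shape of such a test. `build_argv`, `ask`, and the `COMMAND_PREFIX` table are hypothetical, and the command prefixes are assumptions that should be checked against the installed Codex and Gemini CLIs.

```python
import subprocess
from typing import List
from unittest import mock

# Assumed command prefixes; verify against the installed CLI versions.
COMMAND_PREFIX = {
    "codex": ["codex", "exec"],
    "gemini": ["gemini", "-p"],
}


def build_argv(backend: str, prompt: str) -> List[str]:
    """Build the argument vector for one prompt (no shell involved)."""
    return [*COMMAND_PREFIX[backend], prompt]


def ask(backend: str, prompt: str) -> str:
    """Run the CLI once and return its raw stdout."""
    result = subprocess.run(
        build_argv(backend, prompt), capture_output=True, text=True, check=True
    )
    return result.stdout


def test_ask_builds_expected_argv() -> None:
    fake = subprocess.CompletedProcess(args=[], returncode=0, stdout="hi\n", stderr="")
    # Patch subprocess.run so no real process is started.
    with mock.patch("subprocess.run", return_value=fake) as run:
        assert ask("codex", "hello") == "hi\n"
    run.assert_called_once_with(
        ["codex", "exec", "hello"], capture_output=True, text=True, check=True
    )
```

Passing the prompt as a single list element, rather than building a shell string, avoids quoting problems when the host's message contains spaces or quotes.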

Documentation Expectations

  • README.md must be updated with usage instructions after Phase 2 is complete.
  • docs/09-PROJECT-STATUS.md must be updated after each phase completes.
  • docs/06-WORKER-HANDOFF.md must be updated before handing off to the implementation model.

Escalation Conditions

  • Stop and raise a change request if:
    • pykokoro cannot be imported without installing the full Arena package.
    • The Kokoro voice pipeline requires GPU on the current hardware and fails on CPU.
    • OBS cannot capture audio from a browser source pointing at svc-ai without additional configuration.
    • The Codex or Gemini CLI output format has changed in a way that breaks the cleaning engine.
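
The first escalation condition can be checked early with a small import probe run in a clean environment that does not have Arena installed. This is a sketch; `can_import` is a hypothetical helper, and it only tests that the import succeeds, not that the voice pipeline runs on CPU.

```python
import importlib


def can_import(module_name: str) -> bool:
    """Return True if the module imports cleanly in the current environment."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError and failures in the module's own imports.
        return False
    return True


if __name__ == "__main__":
    # Exit non-zero so an install script can stop and raise a change request.
    raise SystemExit(0 if can_import("pykokoro") else 1)
```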

Signature

  • Document role: governing
  • Created by: Claude (supervisor)
  • Created at: 2026-04-12
  • Revision status: initial
  • Future revision rule: this document may be revised only by the user or by an explicitly authorized supervisor revision