Implementation Plan
Scope For This Implementation
- Included:
  - Project scaffold (pyproject.toml, src/stoned_ai/, tests/)
  - TTS layer wrapping Arena's Kokoro backend
  - Cleaning engine for AI CLI output noise
  - AI backend abstraction supporting Codex and Gemini CLI backends
  - Web server with SSE delivery
  - Host view (/host) with text input, send, voice selection, session control
  - Broadcast view (/broadcast) styled for OBS browser source capture
  - WAV audio serving for both views
  - Per-speaker voice assignment (host voice + AI voice)
  - install.sh script and ~/.local/bin/stoned-web link
- Excluded from initial build:
  - Claude API backend (Phase 5)
  - Visual avatar or waveform animation overlay
  - YouTube chat integration
  - Persistent conversation logging (nice to have, not required for launch)
  - Mobile-responsive host view (desktop only for now)
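The AI backend abstraction listed in scope could take roughly the following shape. This is a minimal sketch, not the project's actual ai.py: the CliBackend class, its build_argv and ask methods, and the BACKENDS registry are hypothetical names, and the base_argv values are deliberate placeholders because this plan does not specify how the Codex or Gemini CLIs are invoked.

```python
import subprocess
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CliBackend:
    """Hypothetical wrapper around a command-line AI tool."""

    name: str
    # Base argv for the CLI. Placeholder values only: the real executable
    # names and flags must come from each tool's own documentation.
    base_argv: List[str] = field(default_factory=list)

    def build_argv(self, prompt: str) -> List[str]:
        # The prompt is passed as one argv element, so no shell quoting is needed.
        return [*self.base_argv, prompt]

    def ask(self, prompt: str, timeout: float = 120.0) -> str:
        # Run the CLI and return its raw stdout (still uncleaned at this point).
        result = subprocess.run(
            self.build_argv(prompt),
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
        return result.stdout


# Registry keyed by backend name; the argv entries are placeholders.
BACKENDS: Dict[str, CliBackend] = {
    "codex": CliBackend("codex", ["codex-cli-placeholder"]),
    "gemini": CliBackend("gemini", ["gemini-cli-placeholder"]),
}
```

Keeping argument construction in build_argv, separate from the subprocess call, is what would let the unit tests check CLI arguments without running a real CLI.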
Phases
Phase 1: Arena Core Enhancements
- Objective: Update the Arena project to support human speakers and the /broadcast view.
- Files affected (in the Arena project):
  - src/arena/agents.py: add human runner.
  - src/arena/web.py: add /broadcast route and UI message input.
  - src/arena/core.py: ensure run_conversation handles the human backend.
- Risks: Breaking existing AI-vs-AI modes in Arena.
- Exit criteria: Arena can successfully pause a conversation loop to wait for human input from the Web UI.
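One way to meet the Phase 1 exit criterion is to have the conversation loop block on a thread-safe queue that the web handler fills. The sketch below assumes the loop runs in its own thread; HumanTurn, submit, and wait_for_input are hypothetical names and are not taken from the Arena code.

```python
import queue
import threading
from typing import Optional


class HumanTurn:
    """Hypothetical hand-off point between the web UI and the conversation loop."""

    def __init__(self) -> None:
        self._inbox: "queue.Queue[str]" = queue.Queue()

    def submit(self, text: str) -> None:
        # Called by the web handler when the host sends a message.
        self._inbox.put(text)

    def wait_for_input(self, timeout: Optional[float] = None) -> str:
        # Called by the conversation loop; blocks until a message arrives.
        # Raises queue.Empty if the timeout expires first.
        return self._inbox.get(timeout=timeout)


if __name__ == "__main__":
    turn = HumanTurn()
    # Simulate the host submitting a message shortly after the loop starts waiting.
    threading.Timer(0.1, turn.submit, args=["hello from the host"]).start()
    print(turn.wait_for_input(timeout=5))
```

A timeout on the wait gives the loop a chance to notice a stopped session instead of blocking forever.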
Phase 2: Stoned Mode Presets
- Objective: Define the specific prompt envelopes and cleaning rules for the Stoned.AI show.
- Files likely affected:
  - src/arena/core.py: add MODE_STONED preset.
  - src/arena/clean.py: update regex for high-quality show output.
- Exit criteria: Jason can select "Stoned Mode" in Arena and it loads the correct speaker labels and roles.
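A preset that carries speaker labels and roles might be modeled as plain immutable data. The following is only an illustration: the Speaker and ModePreset types and every field value are assumptions, since this plan names MODE_STONED but does not define its contents.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Speaker:
    label: str    # name shown for this speaker
    role: str     # short role description for the prompt envelope
    backend: str  # "human" or the name of an AI backend


@dataclass(frozen=True)
class ModePreset:
    name: str
    speakers: Tuple[Speaker, ...]


# Placeholder values: the real labels and roles are defined by the show.
MODE_STONED = ModePreset(
    name="stoned",
    speakers=(
        Speaker(label="Host", role="human host", backend="human"),
        Speaker(label="AI", role="AI co-host", backend="codex"),
    ),
)
```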
Phase 3: OBS Broadcast View
- Objective: Build the read-only, styled broadcast page within Arena.
- Files likely affected:
  - src/arena/web.py
- Exit criteria: /broadcast is accessible and OBS captures audio/video end-to-end.
Phase 5: Claude API Backend (Post-Launch)
- Objective: Add a Claude backend using the anthropic SDK as an alternative to Codex/Gemini.
- Files likely affected:
  - src/stoned_ai/ai.py
  - pyproject.toml (add anthropic dependency)
- Risks: Requires a valid ANTHROPIC_API_KEY environment variable on svc-ai. Must not break existing Codex/Gemini backends.
- Exit criteria: The host view offers a Claude model option. A full conversation runs using the Claude API backend.
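The ANTHROPIC_API_KEY risk can be contained by offering the Claude option only when the key is present, which leaves the Codex and Gemini paths untouched. A minimal sketch, assuming a hypothetical available_backends helper that feeds the host view's backend selector:

```python
import os
from typing import List, Mapping, Optional


def available_backends(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the backend names to offer in the host view (hypothetical helper)."""
    env = os.environ if env is None else env
    # The CLI backends are always listed; they do not depend on the API key.
    names = ["codex", "gemini"]
    # Offer Claude only when the key is set and not blank.
    if env.get("ANTHROPIC_API_KEY", "").strip():
        names.append("claude")
    return names
```

This only checks that the variable is set. Whether the key is actually valid can be confirmed only by a real API call.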
Order Of Operations
- Create pyproject.toml and package scaffold.
- Implement tts.py (Kokoro wrapper).
- Implement clean.py (noise stripping for Codex and Gemini).
- Implement ai.py (Codex and Gemini backends).
- Implement web.py: server core, session state, SSE stream, WAV serving.
- Implement /host view in web.py.
- Implement /broadcast view in web.py.
- Write scripts/install.sh.
- Smoke test: full end-to-end conversation from host view to broadcast view.
- Verify OBS browser source audio capture.
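For the SSE stream step, each event sent to the host and broadcast views has to follow the Server-Sent Events wire format: optional "event:" and required "data:" fields, terminated by a blank line. The helper below is a sketch under assumptions; the format_sse name, the event name, and the payload fields are illustrative and not defined by this plan.

```python
import json


def format_sse(event: str, payload: dict) -> str:
    """Serialize one Server-Sent Events message."""
    data = json.dumps(payload)
    # Compact json.dumps output contains no raw newlines, so a single
    # "data:" line is enough; the trailing blank line ends the event.
    return f"event: {event}\ndata: {data}\n\n"


if __name__ == "__main__":
    # Hypothetical payload: speaker label, cleaned text, and a WAV URL.
    print(format_sse("message", {"speaker": "host", "text": "hi", "wav": "/audio/1.wav"}), end="")
```

Encoding the payload as JSON keeps multi-line AI replies on one data line, so the browser's EventSource receives each reply as a single event.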
Testing Expectations
- Unit tests: tts.py voice listing. clean.py noise stripping against fixture strings. ai.py CLI argument construction (mock subprocess).
- Integration tests: Full SSE event sequence from host message submit to broadcast card render. Requires a live Codex or Gemini CLI.
- Manual verification: OBS audio capture. Visual broadcast layout on stream. Per-speaker voice differentiation.
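The two unit-test styles named above, fixture strings for the cleaner and a mocked subprocess for CLI argument construction, can be sketched as follows. The strip_noise and run_cli functions are hypothetical stand-ins for clean.py and ai.py, and treating ANSI escape sequences as "noise" is an assumption; the real rules depend on actual Codex and Gemini output.

```python
import re
import subprocess
from typing import List
from unittest import mock

# Assumption: ANSI colour/cursor sequences are one kind of CLI noise.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_noise(raw: str) -> str:
    text = ANSI_CSI.sub("", raw)
    # Trim each line and drop the blank lines left behind.
    lines = [line.strip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line)


def run_cli(argv: List[str]) -> str:
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


def test_strip_noise_fixture():
    fixture = "\x1b[32mHello\x1b[0m\n\n  world  \n"
    assert strip_noise(fixture) == "Hello\nworld"


def test_run_cli_argv_with_mocked_subprocess():
    # No real CLI is executed; subprocess.run is replaced for the test.
    fake = subprocess.CompletedProcess(args=[], returncode=0, stdout="ok")
    with mock.patch("subprocess.run", return_value=fake) as run:
        assert run_cli(["some-cli", "prompt"]) == "ok"
    assert run.call_args.args[0] == ["some-cli", "prompt"]
```

Both tests run without a live Codex or Gemini CLI, which keeps them separate from the integration tests that need one.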
Documentation Expectations
- README.md must be updated with usage instructions after Phase 2 is complete.
- docs/09-PROJECT-STATUS.md must be updated after each phase completes.
- docs/06-WORKER-HANDOFF.md must be updated before handing off to the implementation model.
Escalation Conditions
- Stop and raise a change request if:
  - pykokoro cannot be imported without installing the full Arena package.
  - The Kokoro voice pipeline requires GPU on the current hardware and fails on CPU.
  - OBS cannot capture audio from a browser source pointing at svc-ai without additional configuration.
  - The Codex or Gemini CLI output format has changed in a way that breaks the cleaning engine.
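The first escalation condition can be probed early with a small check that reports whether a module is discoverable in the current environment without importing it. This is a sketch with a hypothetical can_import helper. It only shows whether the module can be found, not whether it works once loaded.

```python
import importlib.util


def can_import(module_name: str) -> bool:
    """Report whether a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Run inside the project's own environment, with Arena not installed.
    print("pykokoro importable:", can_import("pykokoro"))
```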
Signature
- Document role: governing
- Created by: Claude (supervisor)
- Created at: 2026-04-12
- Revision status: initial
- Future revision rule: this document may be revised only by the user or by an explicitly authorized supervisor revision