diff --git a/docs/02-ARCHITECTURE-PLAN.md b/docs/02-ARCHITECTURE-PLAN.md
index b424911..5258082 100644
--- a/docs/02-ARCHITECTURE-PLAN.md
+++ b/docs/02-ARCHITECTURE-PLAN.md
@@ -12,19 +12,20 @@
 
 ## Target State
 
-A lightweight Python web server (`stoned-web`) with two browser-facing views:
+A new "Stoned" mode implemented directly within the existing **Arena** project (`/home/svc-admin/ai-projects/projects/arena`).
 
-1. **Host view** (`/host`) — Jason's control panel. Text input box, send button, voice selection per speaker, session start/stop, status display.
-2. **Broadcast view** (`/broadcast`) — Clean, OBS-capturable page. Scrolling conversation cards only. No controls. Styled for stream.
+1. **Host view** (`https://arena.accursedbinkie.com`) — The existing Arena control panel, updated with a "Human Input" box and a "Stoned" mode preset.
+2. **Broadcast view** (`/broadcast`) — A new clean, OBS-capturable route added to the Arena web server.
 
-Both views receive conversation turns over Server-Sent Events. The broadcast view is the OBS browser source. The host view is what Jason operates on his own screen.
+Both views receive conversation turns over the existing Arena SSE stream.
 
 ## Design Principles
 
-- Principle 1: **Text-in, voice-out for both sides.** The host types; the system voices. The AI generates text; the system voices. No microphone dependency.
-- Principle 2: **Reuse Arena TTS infrastructure.** Do not reimplement Kokoro synthesis. Import and use `ArenaTTSManager` directly from the arena package or copy the relevant module.
-- Principle 3: **Broadcast view is read-only.** The `/broadcast` URL has zero interactive elements. It exists only for OBS to consume.
-- Principle 4: **One AI at a time.** The session has exactly one human speaker and one AI speaker. Multi-AI is not in scope.
+- Principle 1: **Text-in, voice-out for both sides.** (Unchanged)
+- Principle 2: **Direct integration into Arena.** No separate server. Leverage Arena's `ArenaHub` and `ArenaTTSManager` directly.
+- Principle 3: **Broadcast view is read-only.** (Unchanged)
+- Principle 4: **Human-in-the-loop support.** Add a `human` agent runner to Arena that waits for UI input.
+
 
 ## Major Components
 
diff --git a/docs/03-IMPLEMENTATION-PLAN.md b/docs/03-IMPLEMENTATION-PLAN.md
index 2008f30..126a4c8 100644
--- a/docs/03-IMPLEMENTATION-PLAN.md
+++ b/docs/03-IMPLEMENTATION-PLAN.md
@@ -23,42 +23,31 @@
 
 ## Phases
 
-### Phase 1: Project Scaffold and Core Backend
+### Phase 1: Arena Core Enhancements
 
-- Objective: Establish the Python package, TTS layer, cleaning engine, and AI backend abstraction.
+- Objective: Update the Arena project to support human speakers and the /broadcast view.
+- Files affected (in the Arena project):
+  - `src/arena/agents.py` — add `human` runner.
+  - `src/arena/web.py` — add `/broadcast` route and UI message input.
+  - `src/arena/core.py` — ensure `run_conversation` handles the human backend.
+- Risks: Breaking existing AI-vs-AI modes in Arena.
+- Exit criteria: Arena can successfully pause a conversation loop to wait for human input from the Web UI.
+
+### Phase 2: Stoned Mode Presets
+
+- Objective: Define the specific prompt envelopes and cleaning rules for the Stoned.AI show.
 - Files likely affected:
-  - `pyproject.toml`
-  - `src/stoned_ai/__init__.py`
-  - `src/stoned_ai/tts.py`
-  - `src/stoned_ai/clean.py`
-  - `src/stoned_ai/ai.py`
-  - `scripts/install.sh`
-- Risks: `pykokoro` import paths may differ slightly from Arena's. Verify import compatibility before writing TTS layer.
-- Exit criteria: `stoned_ai.tts` can synthesize a WAV from text using a Kokoro voice. `stoned_ai.ai` can call Codex or Gemini and return a clean string.
+  - `src/arena/core.py` — add `MODE_STONED` preset.
+  - `src/arena/clean.py` — update regex for high-quality show output.
+- Exit criteria: Jason can select "Stoned Mode" in Arena and it loads the correct speaker labels and roles.
 
-### Phase 2: Web Server and SSE Delivery
+### Phase 3: OBS Broadcast View
 
-- Objective: Build the HTTP server, session state management, SSE event stream, and WAV file serving.
+- Objective: Build the read-only, styled broadcast page within Arena.
 - Files likely affected:
-  - `src/stoned_ai/web.py`
-- Risks: Session state must be thread-safe. SSE connections from both `/host` and `/broadcast` must receive the same events.
-- Exit criteria: A session can be started. A host message can be submitted. The AI responds. Both turns are pushed over SSE. Both turns are voiced.
+  - `src/arena/web.py`
+- Exit criteria: `/broadcast` is accessible and OBS captures audio/video end-to-end.
 
-### Phase 3: Host View (`/host`)
-
-- Objective: Build the host's control panel HTML/CSS/JS page.
-- Files likely affected:
-  - `src/stoned_ai/web.py` (inline HTML or template)
-- Risks: Voice selection dropdown must populate from the live Kokoro voice list. If the voice list is slow to load, display a loading state.
-- Exit criteria: Jason can open `/host`, start a session, pick voices, type and send a message, hear his voice, hear the AI's voice, and stop the session.
-
-### Phase 4: Broadcast View (`/broadcast`)
-
-- Objective: Build the clean, OBS-capturable broadcast page.
-- Files likely affected:
-  - `src/stoned_ai/web.py` (inline HTML or template)
-- Risks: OBS browser source must auto-play audio. Verify OBS audio capture works with the WAV playback approach before marking complete.
-- Exit criteria: `/broadcast` shows only conversation cards. No controls are visible. OBS captures the page. Audio plays in OBS without manual permission prompts.
 
 ### Phase 5: Claude API Backend (Post-Launch)
 
diff --git a/docs/08-CHANGE-REQUEST.md b/docs/08-CHANGE-REQUEST.md
index 2d64bd3..6e87886 100644
--- a/docs/08-CHANGE-REQUEST.md
+++ b/docs/08-CHANGE-REQUEST.md
@@ -1,33 +1,27 @@
-# Change Request
+# Change Request — CR-001: UI Strategy Pivot
 
 ## Summary
-
-- Proposed change: (none pending)
+- **Proposed change**: Abandon the creation of a standalone `stoned-web` server and instead implement Stoned.AI functionality as a new "Human-in-the-Loop" mode within the existing **Arena** project (`/home/svc-admin/ai-projects/projects/arena`).
 
 ## Reason
-
-- Why is the current governing plan insufficient or wrong?
+- The user explicitly requested to "use the existing Arena WebUI."
+- Reusing the Arena infrastructure (SSE, TTS, Agent Runners) reduces implementation time and maintenance overhead.
+- Centralizing multi-agent and human-vs-AI orchestration on a single port (8765) and domain (`arena.accursedbinkie.com`) provides a more cohesive operator experience.
 
 ## Requested Document Changes
-
-- Document:
-- Proposed revision:
+- **02-ARCHITECTURE-PLAN.md**: Update to reflect that Arena is the primary runtime, not a standalone server.
+- **03-IMPLEMENTATION-PLAN.md**: Pivot phases to "Arena Integration" rather than "Greenfield Build."
 
 ## Impact
-
-- Scope impact:
-- Architecture impact:
-- Risk impact:
-- Testing impact:
-- Timeline impact:
+- **Scope impact**: Low (Functionality remains the same, delivery method changes).
+- **Architecture impact**: High (Stoned.AI becomes a feature of Arena).
+- **Risk impact**: Medium (Changes to Arena could affect existing AI-vs-AI workflows).
+- **Testing impact**: Requires regression testing of Arena's standard mode.
+- **Timeline impact**: Positive (Faster delivery).
 
 ## Recommendation
-
-- Approve
-- Reject
-- Defer
+- **Approve**: This aligns with the user's preference for a unified interface.
 
 ## Approval
-
-- User decision:
-- Supervisor recommendation:
+- **User decision**: Pending
+- **Supervisor recommendation**: Approve
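
Reviewer note on Phase 1: the exit criterion asks Arena to pause the conversation loop until the host submits text from the Web UI. One way this hand-off might look is a thread-safe queue between the web handler and the loop. This is only a sketch under that assumption; `HumanInputGate`, `submit`, and `wait_for_turn` are hypothetical names for illustration and are not existing Arena APIs.

```python
import queue
import threading
from typing import Optional


class HumanInputGate:
    """Hypothetical helper: passes host text from a web handler to a waiting loop."""

    def __init__(self) -> None:
        # queue.Queue is thread-safe, so the web thread and the
        # conversation thread can share it without extra locking.
        self._messages: "queue.Queue[str]" = queue.Queue()

    def submit(self, text: str) -> bool:
        """Called by the (assumed) UI message handler. Blank input is ignored."""
        cleaned = text.strip()
        if not cleaned:
            return False
        self._messages.put(cleaned)
        return True

    def wait_for_turn(self, timeout: Optional[float] = None) -> Optional[str]:
        """Block until the host submits a message; return None if the wait times out."""
        try:
            return self._messages.get(timeout=timeout)
        except queue.Empty:
            return None


if __name__ == "__main__":
    gate = HumanInputGate()
    # Simulate the host typing a message shortly after the loop starts waiting.
    threading.Timer(0.1, gate.submit, args=("hello from the host",)).start()
    print(gate.wait_for_turn(timeout=2.0))
```

A timeout on the wait lets the loop check for a session stop instead of blocking forever, which may help keep the existing AI-vs-AI modes unaffected when no human speaker is configured.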