AgoraIO-Community · digitallysavvy · Apr 24, 2026
diff --git a/AGENTS.md b/AGENTS.md
@@ -1,94 +1,101 @@
 # Agent Development Guide
 
-This document guides AI agents working on the Agora Conversational AI demo project.
+This guide is for coding agents making changes in `agent-quickstart-python`.
 
-## Project Overview
+## Start Here
 
-A real-time voice conversation application with AI agents, built with:
-- **Frontend**: Next.js 16 + React 19 + TypeScript + Agora Web SDK
-- **Backend**: Python FastAPI + Agora Conversational AI Agent SDK
+- Read [README.md](./README.md) for setup, supported run modes, and verification.
+- Use [ARCHITECTURE.md](./ARCHITECTURE.md) for system-level request flow.
+- Use module guides only when working inside that module:
+  - [web-client/AGENTS.md](./web-client/AGENTS.md)
+  - [server-python/AGENTS.md](./server-python/AGENTS.md)
 
-## Project Structure
+## Current System Shape
 
-```
-.
-├── web-client/           # Frontend application (Next.js + React)
-└── server-python/        # Backend service (FastAPI + Agora Agent SDK)
-```
+- Frontend: Next.js 16, React 19, TypeScript, `agora-rtc-react`, `agora-rtm`, `agora-agent-client-toolkit`, `agora-agent-uikit`
+- Local backend: Python FastAPI in `server-python`
+- Deployed web backend: Next route handlers in `web-client/app/api`
+- Auth: Token007 generated from `AGORA_APP_ID` and `AGORA_APP_CERTIFICATE`
+- Default agent config: managed Deepgram STT, OpenAI LLM, and MiniMax TTS
+
+## Supported Modes
+
+### Local Python-Backed Development
+
+- Run from the repo root with `bun run dev`
+- Root scripts start:
+  - FastAPI on `http://localhost:8000`
+  - Next.js on `http://localhost:3000`
+- In this mode, the web app still calls `/api/*`, but the Next route handlers proxy to the Python service through `AGENT_BACKEND_URL=http://localhost:8000`
+
+### Single-Target Web Deployment
+
+- Deploy `web-client` as a Next.js app
+- `/api/get_config`, `/api/v2/startAgent`, and `/api/v2/stopAgent` run inside the Next app
+- Do not assume a separate Python service exists in this mode
+
+## Routing Ownership
+
+- UI and RTC/RTM client lifecycle live in `web-client`
+- `/api/*` entrypoints for the web app live in `web-client/app/api`
+- Python agent lifecycle logic lives in `server-python/src`
+- For deployability changes, update both the README and architecture docs if the owner of `/api/*` changes
+
+## Key Files
+
+- `README.md`: setup, local vs deploy modes, troubleshooting, verification
+- `ARCHITECTURE.md`: top-level environment model
+- `web-client/src/components/app.tsx`: conversation UI shell
+- `web-client/src/hooks/useAgoraConnection.ts`: RTC, RTM, transcript, and token renewal lifecycle
+- `web-client/src/lib/server/agora.ts`: shared server-side token and agent helpers for Next route handlers
+- `server-python/src/server.py`: FastAPI entrypoints
+- `server-python/src/agent.py`: async Agora agent lifecycle wrapper
+
+## Working Rules
 
-## Quick Start
+- Prefer the smallest change that keeps local mode and deployed mode aligned.
+- Do not reintroduce `web-client/proxy.ts`; the current proxy fallback is route-local through `AGENT_BACKEND_URL`.
+- Do not assume Zustand or a separate client-side store exists.
+- Do not require third-party vendor API keys unless the code actually introduces a non-managed path.
+- Keep token expiry and renewal behavior aligned across the Python backend and Next route handlers.
+
+## Standard Commands
+
+From the repo root:
 
 ```bash
-# Install dependencies
 bun install
-
-# Start both frontend and backend
+bun run doctor
+bun run doctor:local
 bun run dev
+bun run verify
+bun run verify:local
+```
+
+Useful narrower checks:
+
+```bash
+bun run verify:web
+bun run verify:local:fastapi
+bun run verify:web:proxy
+bun run verify:backend
+```
 
-# Frontend only (port 3000)
-bun run frontend
+Inside `web-client/`, use:
 
-# Backend only (port 8000)
-bun run backend
+```bash
+bun run doctor
+bun run verify
 ```
 
-## Module-Specific Guides
-
-### Frontend (web-client/)
-- [web-client/AGENTS.md](./web-client/AGENTS.md) — AI assistant guide for frontend development
-- [web-client/ARCHITECTURE.md](./web-client/ARCHITECTURE.md) — Detailed frontend architecture
-
-### Backend (server-python/)
-- [server-python/AGENTS.md](./server-python/AGENTS.md) — AI assistant guide for backend development
-- [server-python/ARCHITECTURE.md](./server-python/ARCHITECTURE.md) — Backend architecture details
-- [server-python/README.md](./server-python/README.md) — Backend API documentation
-
-### System Architecture
-- [ARCHITECTURE.md](./ARCHITECTURE.md) — Overall system architecture and data flow
-
-## Key Technologies
-
-| Layer | Technologies |
-|-------|-------------|
-| Frontend | Next.js 16, React 19, TypeScript, Agora Web SDK (RTC + RTM), agora-agent-client-toolkit, Zustand, Tailwind CSS |
-| Backend | Python 3.8+, FastAPI, agora-agent-server-sdk, uvicorn |
-| Auth | Token007 (AccessToken2) — auto-generated from APP_ID + APP_CERTIFICATE |
-| Real-time | Agora RTC (audio) + RTM (messaging/transcription) |
-| AI Providers | Deepgram (ASR), OpenAI (LLM), ElevenLabs (TTS) |
-
-## Common Development Tasks
-
-### Working on Frontend
-See [web-client/AGENTS.md](./web-client/AGENTS.md) for:
-- UI component development
-- State management patterns (Zustand)
-- Agora SDK integration (RTC/RTM)
-- API client usage
-
-### Working on Backend
-See [server-python/AGENTS.md](./server-python/AGENTS.md) for:
-- API endpoint development
-- Agent lifecycle management (start/stop via AgentSession)
-- Token generation (`generate_convo_ai_token`)
-- ASR/LLM/TTS provider configuration
-
-### Cross-Module Changes
-1. Review [ARCHITECTURE.md](./ARCHITECTURE.md) for system overview and data flow
-2. Check both module-specific AGENTS.md files
-3. Verify API contracts — frontend calls `/api/*`, proxied to backend on port 8000
-4. Test token flow: backend generates Token007, frontend uses it for RTC/RTM
-
-## Important Notes
-
-- Never commit `.env.local` or credentials
-- Frontend proxies `/api/*` requests to backend via `web-client/proxy.ts`
-- Agent lifecycle is managed by backend (AgentSession), not frontend
-- All Agora SDK calls go through `useAgoraConnection.ts` hook on the frontend
-- Authentication uses Token007 (AccessToken2) — only `APP_ID` and `APP_CERTIFICATE` are needed
-- Backend uses `Agora(area=Area.US, ...)` client with auto Token007 auth
-
-## Reference Documentation
-
-- [Agora Conversational AI Docs](https://docs.agora.io/en/conversational-ai/overview)
-- [Next.js Docs](https://nextjs.org/docs)
-- [FastAPI Docs](https://fastapi.tiangolo.com/)
+## Done Criteria
+
+Before finishing a change:
+
+1. Run the narrowest relevant verification command.
+2. If the change affects the deployable web app, ensure `bun run verify:web` passes.
+3. If the change affects local Python-backed development, ensure `bun run verify:local` passes, or run the narrower command that matches the change: `bun run verify:local:fastapi`, `bun run verify:web:proxy`, or `bun run verify:backend`.
+4. Treat `server-python/.env.local` as CLI-managed by default. If you change required env vars or setup steps, update both the root README and the module README.
+5. Update `README.md` or architecture docs when the developer workflow or request flow changes.
+
+`bun run verify:local:fastapi` exercises the real FastAPI route layer through Next, but with a fake agent implementation so the check stays deterministic and does not depend on a live managed-agent start.
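
Reviewer sketch (not part of the patch): the routing rule described in the AGENTS.md changes above, where `/api/*` route handlers proxy to the Python service through `AGENT_BACKEND_URL` in local mode and otherwise run inside the Next app, could be expressed roughly as below. The names `resolveApiTarget` and `ApiTarget` are hypothetical. Two behaviors are assumptions: that an unset `AGENT_BACKEND_URL` selects in-process handling, and that the Python service serves the same routes without the `/api` prefix. The real helper in `web-client/src/lib/server/agora.ts` may be organized differently.

```typescript
// Hypothetical helper: decides where a /api/* request should be handled.
type ApiTarget =
  | { mode: "proxy"; url: string }
  | { mode: "in-process" };

function resolveApiTarget(
  apiPath: string,
  env: Record<string, string | undefined>,
): ApiTarget {
  const backend = env.AGENT_BACKEND_URL?.trim();

  // Assumption: no backend URL means the single-target web deployment,
  // so the Next route handler does the work itself.
  if (!backend) {
    return { mode: "in-process" };
  }

  // Assumption: the Python service exposes /get_config, /v2/startAgent,
  // and /v2/stopAgent, so the /api prefix is stripped before forwarding.
  const backendPath = apiPath.replace(/^\/api(?=\/)/, "");
  const base = backend.replace(/\/+$/, ""); // avoid a double slash
  return { mode: "proxy", url: `${base}${backendPath}` };
}

// Local Python-backed development, as started by `bun run dev`.
const localTarget = resolveApiTarget("/api/v2/startAgent", {
  AGENT_BACKEND_URL: "http://localhost:8000",
});

// Single-target web deployment: no separate Python service.
const deployedTarget = resolveApiTarget("/api/get_config", {});
```

Keeping the decision in one pure function is one way to keep local mode and deployed mode aligned: a check such as `bun run verify:web:proxy` could exercise the same rule without starting either server.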
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
@@ -1,81 +1,89 @@
 # Agora Conversational AI Demo — Architecture
 
-## System Architecture
+This quickstart supports two runtime environments. The UI is the same in both modes, but the owner of `/api/*` changes by environment.
+
+## Local Python-Backed Development
 
 ```
-┌─────────────────────────────────────────────────────────────┐
-│                         Frontend                             │
-│  Next.js 16 + React 19 + TypeScript + Agora Web SDK        │
-│  (Port 3000)                                                │
-└──────────────────┬──────────────────────────────────────────┘
-                   │ /api/* proxy (proxy.ts)
-                   ↓
-┌─────────────────────────────────────────────────────────────┐
-│                         Backend                              │
-│  Python FastAPI + Agora Agent SDK                           │
-│  (Port 8000)                                                │
-└──────────────────┬──────────────────────────────────────────┘
-                   │ REST API (Token007 auth)
-                   ↓
-┌─────────────────────────────────────────────────────────────┐
-│                    Agora Cloud Services                      │
-│  • RTC (Real-Time Communication — audio)                    │
-│  • RTM (Real-Time Messaging — subtitles/transcription)      │
-│  • Conversational AI Engine (ASR + LLM + TTS)               │
-└─────────────────────────────────────────────────────────────┘
+Browser
+  ↓
+Next.js app on :3000
+  ↓
+/api/* route handlers proxy through AGENT_BACKEND_URL
+  ↓
+FastAPI service on :8000
+  ↓
+Agora Cloud Services
 ```
 
-## Data Flow
+- `web-client` owns the browser UI and the `/api/*` entrypoints
+- `server-python` owns the actual token generation and agent start/stop logic
+- This is the mode used by `bun run dev`
+
+## Single-Target Web Deployment
+
+```
+Browser
+  ↓
+Next.js app
+  ↓
+/api/* route handlers run in-process
+  ↓
+Agora Cloud Services
+```
+
+- `web-client` owns both the UI and the deployed `/api/*` implementation
+- `server-python` is not required for this deployment path
+
+## Shared Conversation Flow
 
 ### 1. Connection
 
 ```
-User clicks "Start"
-  → Frontend: GET /api/get_config
-  → Backend: generate_convo_ai_token(app_id, app_certificate, channel, account)
-  → Frontend: Join RTC channel + Login RTM with token
+Frontend: GET /api/get_config
+  → Generate Token007 config for a user UID, agent UID, and channel
+  → Frontend joins RTC and logs into RTM
 ```
 
 ### 2. Agent Start
 
 ```
 Frontend: POST /api/v2/startAgent { channelName, rtcUid, userUid }
-  → Backend: Build AgoraAgent (Deepgram ASR + OpenAI LLM + ElevenLabs TTS)
-  → Backend: session.start() → agent_id
-  → Agent joins RTC channel → Frontend receives audio + RTM subtitles
+  → Build agent session
+  → Scope remote_uids to the requesting user
+  → Start session and return agent_id
 ```
 
 ### 3. Conversation
 
 ```
-User speaks → RTC audio → Agora Cloud
-  → Deepgram (ASR): audio → text
-  → OpenAI (LLM): text → response
-  → ElevenLabs (TTS): response → audio
-  → RTC audio + RTM subtitles → Frontend
+User audio → RTC
+  → Managed ASR, LLM, and TTS pipeline
+  → Agent audio + RTM transcript events
+  → UIKit transcript and visualizer in the web app
 ```
 
 ### 4. Agent Stop
 
 ```
 Frontend: POST /api/v2/stopAgent { agentId }
-  → Backend: session.stop()
-  → Agent leaves channel → Frontend cleanup
+  → Stop session directly or through stateless fallback
+  → Client cleans up RTC and RTM state
 ```
 
 ## API Endpoints
 
 | Endpoint | Method | Description |
 |----------|--------|-------------|
 | `/get_config` | GET | Generate connection config (Token007, channel, UIDs) |
-| `/v2/startAgent` | POST | Start AI agent |
-| `/v2/stopAgent` | POST | Stop agent by agent_id |
+| `/v2/startAgent` | POST | Start the agent session |
+| `/v2/stopAgent` | POST | Stop the agent by `agent_id` |
 
-Frontend calls these as `/api/*`, proxied to backend via `web-client/proxy.ts`.
+The frontend calls these as `/api/*`. In local Python mode, the Next handlers proxy to `AGENT_BACKEND_URL`; in a single-target web deployment (for example, on Vercel), they run in-process in the Next app.
 
 ## Authentication
 
-Token007 (AccessToken2) — generated from `APP_ID` + `APP_CERTIFICATE` only. No API_KEY/API_SECRET needed. The SDK handles token generation and API auth internally.
+Token007 (AccessToken2) is generated from `AGORA_APP_ID` and `AGORA_APP_CERTIFICATE` only. No API_KEY/API_SECRET is needed. The SDK handles token generation and API auth internally.
 
 ## Detailed Documentation