j.foucher 2169c58cd7 Update memory: latency HUD status + future server-side metrics TODO
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 18:47:11 +01:00

Project Memory: PS_AI_Agent

This file is committed to the repository so it is available on any machine. Claude Code reads it automatically at session start (via the auto-memory system) when the working directory is inside this repo. Keep it under ~180 lines; lines beyond 200 are truncated by the system.


Project Location

  • Repo root: <repo_root>/ (wherever this is cloned)
  • UE5 project: <repo_root>/Unreal/PS_AI_Agent/
  • .uproject: <repo_root>/Unreal/PS_AI_Agent/PS_AI_Agent.uproject
  • Engine: Unreal Engine 5.5 — Win64 primary target
  • Default test map: /Game/TestMap.TestMap

Plugins

| Plugin | Path | Purpose |
|---|---|---|
| Convai (reference) | <repo_root>/ConvAI/Convai/ | gRPC + protobuf streaming to Convai API. Used as architectural reference. |
| PS_AI_ConvAgent | <repo_root>/Unreal/PS_AI_Agent/Plugins/PS_AI_ConvAgent/ | Main plugin: ElevenLabs Conversational AI, posture, gaze, lip sync, facial expressions. |

User Preferences

  • Plugin naming: PS_AI_ConvAgent (renamed from PS_AI_Agent_ElevenLabs)
  • Save memory frequently during long sessions
  • Git remote is a private server — no public exposure risk
  • Full original ask + intent: see .claude/project_context.md

Current Branch & Work

  • Branch: main
  • Recent merges: feature/multi-player-shared-agent merged to main

Latency Debug HUD (just implemented)

  • Separate bDebugLatency property + CVar ps.ai.ConvAgent.Debug.Latency
  • All metrics anchored to GenerationStartTime (agent_response_started event)
  • Metrics: Gen>Audio (LLM+TTS), Pre-buffer, Gen>Ear (user-perceived)
  • Reset per turn in HandleAgentResponseStarted()
  • DrawLatencyHUD() separate from DrawDebugHUD()
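
The per-turn bookkeeping above could be sketched roughly as follows, in plain C++ without Unreal types. This is an illustration under assumptions, not the plugin's code: `TurnLatency`, its members, and the reading of each metric (Gen>Audio = generation start to first audio chunk, Pre-buffer = first audio chunk to playback start, Gen>Ear = generation start to playback start) are guesses based on the metric names.

```cpp
#include <cassert>

// Sentinel meaning "not measured yet this turn" (hypothetical convention).
constexpr double kUnset = -1.0;

// Hypothetical per-turn latency tracker; all times are in seconds.
struct TurnLatency {
    double GenerationStart = 0.0;  // when agent_response_started was received
    double FirstAudio = kUnset;    // when the first audio chunk arrived
    double PlaybackStart = kUnset; // when playback actually began

    // Called at the start of every agent turn: re-anchor and clear old marks.
    void Reset(double Now) {
        GenerationStart = Now;
        FirstAudio = kUnset;
        PlaybackStart = kUnset;
    }

    // Only the first call per turn is recorded.
    void MarkFirstAudio(double Now) {
        assert(Now >= GenerationStart);
        if (FirstAudio < 0.0) FirstAudio = Now;
    }

    void MarkPlaybackStart(double Now) {
        assert(Now >= GenerationStart);
        if (PlaybackStart < 0.0) PlaybackStart = Now;
    }

    // Each metric returns milliseconds, or kUnset if not yet measurable.
    double GenToAudioMs() const {
        return FirstAudio < 0.0 ? kUnset : (FirstAudio - GenerationStart) * 1000.0;
    }
    double PreBufferMs() const {
        if (FirstAudio < 0.0 || PlaybackStart < 0.0) return kUnset;
        return (PlaybackStart - FirstAudio) * 1000.0;
    }
    double GenToEarMs() const {
        return PlaybackStart < 0.0 ? kUnset : (PlaybackStart - GenerationStart) * 1000.0;
    }
};
```

Anchoring every metric to one timestamp keeps the three numbers mutually consistent: under these assumed definitions, Gen>Ear is always Gen>Audio plus Pre-buffer.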

Future: Server-Side Latency from ElevenLabs API

TODO — high-value improvement parked for later:

  • GET /v1/convai/conversations/{conversation_id} returns:
    • conversation_turn_metrics with elapsed_time per metric (STT, LLM, TTS breakdown!)
    • tool_latency_secs, step_latency_secs, rag_latency_secs
    • time_in_call_secs per message
  • ping WS event has ping_ms (network round-trip) — could display on HUD
  • vad_score WS event (0.0-1.0) — could detect real speech start client-side
  • Docs: https://elevenlabs.io/docs/api-reference/conversations/get

Multi-Player Shared Agent — Key Design

  • Old model: exclusive lock (one player per agent via NetConversatingPawn)
  • New model: shared array (NetConnectedPawns) + active speaker (NetActiveSpeakerPawn)
  • Speaker arbitration: server-side with SpeakerSwitchHysteresis (0.3s) + SpeakerIdleTimeout (3.0s)
  • In standalone (≤1 player): speaker arbitration bypassed, audio sent directly to WebSocket
  • Internal mic (WASAPI thread): direct WebSocket send, no game-thread state access
  • GetCurrentBlendshapes() thread-safe via ThreadSafeBlendshapes snapshot + BlendshapeLock
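
As a rough illustration of the arbitration described above, the sketch below shows one way hysteresis and an idle timeout can interact. The semantics are assumptions, not taken from the plugin: here a challenger must keep speaking for `SwitchHysteresis` seconds before taking over, and the floor is released once the active speaker has been silent for `IdleTimeout` seconds. `SpeakerArbiter`, `OnSpeech`, and the integer player IDs are hypothetical stand-ins for the pawn-based server logic.

```cpp
#include <cassert>

constexpr int kNoSpeaker = -1;

// Hypothetical server-side arbiter; times are in seconds.
struct SpeakerArbiter {
    double SwitchHysteresis = 0.3; // challenger must persist this long
    double IdleTimeout = 3.0;      // silence after which the floor is released

    int Active = kNoSpeaker;
    double LastActiveSpeech = 0.0;
    int Candidate = kNoSpeaker;
    double CandidateSince = 0.0;

    // Report speech from PlayerId at time Now; returns the active speaker.
    int OnSpeech(int PlayerId, double Now) {
        assert(PlayerId >= 0);

        // Release the floor if the active speaker has gone quiet.
        if (Active != kNoSpeaker && Now - LastActiveSpeech >= IdleTimeout) {
            Active = kNoSpeaker;
            Candidate = kNoSpeaker;
        }

        if (Active == kNoSpeaker) {
            // Nobody holds the floor: the first speaker takes it immediately.
            Active = PlayerId;
            LastActiveSpeech = Now;
        } else if (PlayerId == Active) {
            LastActiveSpeech = Now;
        } else if (Candidate != PlayerId) {
            // New challenger: start timing them.
            Candidate = PlayerId;
            CandidateSince = Now;
        } else if (Now - CandidateSince >= SwitchHysteresis) {
            // Challenger persisted long enough: hand over the floor.
            Active = PlayerId;
            LastActiveSpeech = Now;
            Candidate = kNoSpeaker;
        }
        return Active;
    }
};
```

The hysteresis window keeps short interjections or background noise from stealing the floor, while the idle timeout ensures a player who has stopped talking does not hold it indefinitely.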

Key UE5 Plugin Patterns

  • Settings object: UCLASS(config=Engine, defaultconfig) inheriting UObject, registered via ISettingsModule
  • WebSocket: FWebSocketsModule::Get().CreateWebSocket(URL, TEXT(""), Headers)
  • Audio capture: Audio::FAudioCapture::OpenAudioCaptureStream() (UE 5.3+)
    • Callback arrives on background thread — marshal to game thread
  • Procedural audio playback: USoundWaveProcedural + OnSoundWaveProceduralUnderflow
  • Resample mic audio to 16000 Hz mono before sending to ElevenLabs
  • TArray::RemoveAt(idx, count, EAllowShrinking::No) — bool overload deprecated in UE 5.5
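
The resampling step above can be illustrated with a minimal standalone sketch. It assumes interleaved float input, averages channels to mono, and uses naive linear interpolation with no anti-aliasing filter, so it shows only the shape of the operation and is not what the plugin necessarily does. `DownmixAndResampleTo16k` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int kTargetRate = 16000; // Hz, mono

// Downmix interleaved samples to mono, then linearly resample to 16 kHz.
// Naive sketch: no low-pass filter, so downsampling may alias.
std::vector<float> DownmixAndResampleTo16k(const std::vector<float>& Interleaved,
                                           int NumChannels, int InRate) {
    assert(NumChannels > 0 && InRate > 0);

    const std::size_t Channels = static_cast<std::size_t>(NumChannels);
    const std::size_t NumFrames = Interleaved.size() / Channels;
    if (NumFrames == 0) return {};

    // Average all channels of each frame into one mono sample.
    std::vector<float> Mono(NumFrames);
    for (std::size_t F = 0; F < NumFrames; ++F) {
        float Sum = 0.0f;
        for (std::size_t C = 0; C < Channels; ++C) {
            Sum += Interleaved[F * Channels + C];
        }
        Mono[F] = Sum / static_cast<float>(NumChannels);
    }

    const std::size_t OutFrames =
        NumFrames * static_cast<std::size_t>(kTargetRate) / static_cast<std::size_t>(InRate);
    const double Step = static_cast<double>(InRate) / kTargetRate;

    std::vector<float> Out(OutFrames);
    for (std::size_t I = 0; I < OutFrames; ++I) {
        const double Pos = static_cast<double>(I) * Step;
        const std::size_t I0 = std::min(static_cast<std::size_t>(Pos), NumFrames - 1);
        const std::size_t I1 = std::min(I0 + 1, NumFrames - 1);
        const float Frac = static_cast<float>(Pos - static_cast<double>(I0));
        // Interpolate between the two nearest input samples.
        Out[I] = Mono[I0] + (Mono[I1] - Mono[I0]) * Frac;
    }
    return Out;
}
```

Because the capture callback runs on a background thread, a function like this would need to operate only on the buffer it is handed, with the result marshalled onward rather than touching game-thread state.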

ElevenLabs WebSocket Protocol Notes

  • ALL frames are binary — bind ONLY OnRawMessage; NEVER bind OnMessage (text)
  • Binary frame discrimination: peek byte[0] → '{' (0x7B) = JSON, else = raw PCM audio
  • Pong: {"type":"pong","event_id":N}; event_id is top-level, NOT nested
  • user_transcript arrives AFTER agent_response_started in Server VAD mode
  • MUST send conversation_initiation_client_data immediately after WS connect

API Keys / Secrets

  • ElevenLabs API key: Project Settings → Plugins → ElevenLabs AI Agent
  • Saved to DefaultEngine.ini; stripped before every commit

Claude Memory Files in This Repo

| File | Contents |
|---|---|
| .claude/MEMORY.md | This file: project structure, patterns, status |
| .claude/elevenlabs_plugin.md | Plugin file map, ElevenLabs WS protocol, design decisions |
| .claude/elevenlabs_api_reference.md | Full ElevenLabs API reference (WS messages, REST, signed URL) |
| .claude/project_context.md | Original ask, intent, short/long-term goals |
| .claude/session_log_2026-02-19.md | Session record: steps, commits, technical decisions |
| .claude/PS_AI_Agent_ElevenLabs_Documentation.md | User-facing Markdown reference doc |