PS_AI_Agent

Author	SHA1	Message	Date
j.foucher	f7f0b0c45b	Fix voice input: resampler stereo bug, remove invalid turn mode, cleanup. Three bugs prevented voice input from working: (1) ResampleTo16000() treated NumFrames as total samples and divided by the channel count again, losing half the audio data with stereo input; the corrupted audio was unrecognizable to ElevenLabs VAD/STT. (2) The session init sent a nonexistent "client_vad" turn mode; the API has no turn.mode field, so it was replaced with the turn_timeout parameter. (3) user_activity was sent with every audio chunk, which resets the turn timeout timer and prevents the server from taking its turn. Also: send audio chunks as compact JSON, add message type debug logging, send conversation_initiation_client_data on connect. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 08:05:39 +01:00
j.foucher	91cf5b1bb4	Fix audio chunk size: accumulate mic audio to 100ms before sending. WASAPI fires mic callbacks every ~5ms (158 bytes at 16kHz 16-bit mono). ElevenLabs VAD/STT requires a minimum of ~100ms (3200 bytes) per chunk. Tiny fragments arrived at the server but were never processed, so the agent never transcribed or responded to user speech. Fix: OnMicrophoneDataCaptured now appends to MicAccumulationBuffer and only calls SendAudioChunk once >= 3200 bytes are accumulated. StopListening flushes any remaining bytes before sending UserTurnEnd so the final words of an utterance are never discarded. HandleDisconnected also clears the buffer to prevent stale data on reconnect. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 18:41:58 +01:00
j.foucher	c39c70e3e4	Add sln file + change to unreal blueprint for testing	2026-02-19 17:23:57 +01:00
j.foucher	99017f4067	Update test_AI_Actor blueprint asset. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 14:56:06 +01:00
j.foucher	483456728d	Fix: distinguish binary audio frames from binary JSON frames. ElevenLabs sends two kinds of binary WebSocket frames: (1) JSON control messages (start with '{'): decode as UTF-8, route to OnWsMessage; (2) raw PCM audio (binary, does not start with '{'): broadcast directly as audio. Previously all binary frames were decoded as UTF-8 JSON, causing "Failed to parse WebSocket message as JSON" for every audio frame. Fix: peek at the first byte of the assembled frame buffer: '{' → UTF-8 JSON path (null-terminated, routed to existing message handler); anything else → raw PCM path (broadcast directly to OnAudioReceived). Also: improved "Failed to parse JSON" log to show first 80 chars of message, and added verbose hex dump of binary audio frame prefix for diagnostics. Compiles cleanly on UE 5.5 Win64. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 13:54:34 +01:00
j.foucher	669c503d06	Fix: handle binary WebSocket frames from ElevenLabs. ElevenLabs sends all JSON messages as binary WS frames, not text frames. The OnRawMessage callback receives them; we were logging them as warnings and discarding the data entirely, causing no events to fire at all. Fix: accumulate binary frame fragments (BytesRemaining > 0 = more coming), reassemble into a complete buffer, decode as UTF-8 JSON string, then route through the existing OnWsMessage text handler unchanged. Added BinaryFrameBuffer (TArray<uint8>) to proxy header for accumulation. Compiles cleanly on UE 5.5 Win64. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 13:51:47 +01:00
j.foucher	b489d1174c	Add SendTextMessage to agent component and WebSocket proxy. Sends {"type":"user_message","text":"..."} to the ElevenLabs API. Agent responds with audio + text exactly as if it heard spoken input. Useful for testing without a microphone and for text-only NPC interactions. Available in Blueprint on UElevenLabsConversationalAgentComponent. Compiles cleanly on UE 5.5 Win64. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 13:49:03 +01:00
j.foucher	ae2c9b92e8	Fix 3 WebSocket protocol bugs found by API cross-check. Bug 1, transcript handler (wrong type string + wrong JSON fields): type was "transcript", API sends "user_transcript"; event key was "transcript_event", API uses "user_transcription_event"; field was "message", API uses "user_transcript"; removed non-existent "speaker"/"is_final" fields (speaker is always "user"). Bug 2, pong format (event_id must be top-level, not nested in pong_event): was {"type":"pong","pong_event":{"event_id":1}}, fixed to {"type":"pong","event_id":1}. Bug 3, client turn mode (user_turn_start/end don't exist in the API): SendUserTurnStart now sends {"type":"user_activity"} (correct API message); SendUserTurnEnd is now a no-op with log (no explicit end message in API); renamed constants in ElevenLabsDefinitions.h accordingly. Also added AgentResponseCorrection and ConversationClientData constants. Compiles cleanly on UE 5.5 Win64. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 13:43:08 +01:00
j.foucher	dbd61615a9	Add TestMap, test actor asset, update DefaultEngine.ini and memory. DefaultEngine.ini: set GameDefaultMap + EditorStartupMap to TestMap (API key stripped; set locally via Project Settings, not committed). Content/TestMap.umap: initial test level. Content/test_AI_Actor.uasset: initial test actor. .claude/MEMORY.md: document API key handling, add memory file index, note private git server and TestMap as default map. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 13:40:08 +01:00
j.foucher	bb1a857e86	Fix compile errors in PS_AI_Agent_ElevenLabs plugin. Remove WebSockets from .uplugin (it is a module, not a plugin); add AudioCapture plugin dependency to .uplugin; fix FOnAudioCaptureFunction: use OpenAudioCaptureStream (not deprecated OpenDefaultCaptureStream) and correct callback signature (const void* per UE 5.3+); cast void* to float* inside OnAudioGenerate for float sample processing; fix TArray::RemoveAt: use EAllowShrinking::No instead of deprecated bool overload. Plugin now compiles cleanly with no errors or warnings on UE 5.5 / Win64. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 13:02:11 +01:00
j.foucher	f0055e85ed	Add PS_AI_Agent_ElevenLabs plugin (initial implementation). Adds a new UE5.5 plugin integrating the ElevenLabs Conversational AI Agent via WebSocket. No gRPC or third-party libs required. Plugin components: UElevenLabsSettings (API key + Agent ID in Project Settings); UElevenLabsWebSocketProxy (full WS session lifecycle, JSON message handling, ping/pong keepalive, Base64 PCM audio send/receive); UElevenLabsConversationalAgentComponent (ActorComponent for NPC voice conversation, orchestrates mic capture -> WS -> procedural audio playback); UElevenLabsMicrophoneCaptureComponent (wraps Audio::FAudioCapture, resamples to 16kHz mono, dispatches on game thread). Also adds .claude/ memory files (project context, plugin notes, patterns) so Claude Code can restore full context on any machine after a git pull. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 12:57:48 +01:00
j.foucher	457cd803c1	Initial Import	2026-02-19 12:36:36 +01:00
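
The chunk-accumulation fix described in commit 91cf5b1bb4 can be illustrated with a minimal, engine-independent sketch. This is not the plugin's code: the class name MicChunkAccumulator, its methods, and the std::function callback are hypothetical stand-ins for OnMicrophoneDataCaptured, MicAccumulationBuffer, and SendAudioChunk. Only the 3200-byte threshold (about 100 ms of 16 kHz 16-bit mono audio) and the flush-on-stop and clear-on-disconnect behaviour come from the commit message.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the plugin's mic accumulation logic.
// All names here are illustrative, not taken from the plugin source.
class MicChunkAccumulator {
public:
    using SendFn = std::function<void(const std::vector<std::uint8_t>&)>;

    // 16000 samples/s * 2 bytes/sample * 0.1 s = 3200 bytes (~100 ms).
    static constexpr std::size_t kMinChunkBytes = 3200;

    explicit MicChunkAccumulator(SendFn send) : send_(std::move(send)) {}

    // Called for every small capture callback: append the fragment and
    // forward the buffer only once at least kMinChunkBytes are pending.
    void OnCaptured(const std::uint8_t* data, std::size_t size) {
        buffer_.insert(buffer_.end(), data, data + size);
        if (buffer_.size() >= kMinChunkBytes) {
            Emit();
        }
    }

    // Called when listening stops: send whatever is left so the tail of
    // the utterance is not discarded.
    void Flush() {
        if (!buffer_.empty()) {
            Emit();
        }
    }

    // Called on disconnect: drop pending bytes so stale audio is not
    // sent after a reconnect.
    void Clear() { buffer_.clear(); }

    std::size_t Pending() const { return buffer_.size(); }

private:
    // Hand the accumulated bytes to the callback and leave the buffer empty.
    void Emit() {
        std::vector<std::uint8_t> chunk;
        chunk.swap(buffer_);
        send_(chunk);
    }

    SendFn send_;
    std::vector<std::uint8_t> buffer_;
};
```

In this sketch, a caller would invoke OnCaptured from the capture callback, Flush when the user stops talking, and Clear when the socket drops. Swapping the buffer out before calling the callback keeps the accumulator in a consistent state even if the callback feeds more audio back in.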

62 Commits