The previous approach used TurnEndTime (from StopListening), which was never set in Server VAD mode. All latency measurements are now anchored to TurnStartTime, captured when the first user_transcript arrives from the ElevenLabs server, the earliest client-side confirmation of user speech.

Timeline:
  [user speaks] → STT → user_transcript (= T0) → agent_response_started → audio → playback

Metrics shown:
- Transcript>Gen: T0 → generation start (LLM think time)
- Gen>Audio: generation start → first audio chunk (LLM + TTS)
- Total: T0 → first audio chunk (full pipeline)
- Pre-buffer: wait before playback
- End-to-Ear: T0 → playback starts (user-perceived)

Removed UserSpeechMs (unmeasurable without client-side VAD).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
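
The metric definitions in the commit message can be sketched as plain timestamp differences anchored to T0. This is a minimal illustration in standard C++17, not the plugin's actual code. Apart from TurnStartTime, every type, field, and function name here is hypothetical. The sketch also assumes that Pre-buffer runs from the first audio chunk to playback start, which the message does not state explicitly.

```cpp
#include <cassert>
#include <optional>

// Hypothetical per-turn timestamps, in milliseconds on a single monotonic clock.
// Only TurnStartTime is named in the commit message; the other fields are
// illustrative assumptions.
struct TurnTimestamps {
    std::optional<double> TurnStartTime;    // T0: first user_transcript received
    std::optional<double> GenerationStart;  // agent_response_started received
    std::optional<double> FirstAudioChunk;  // first audio chunk received
    std::optional<double> PlaybackStart;    // playback actually begins
};

// One field per metric listed in the commit message.
struct TurnLatencies {
    std::optional<double> TranscriptToGenMs;  // Transcript>Gen
    std::optional<double> GenToAudioMs;       // Gen>Audio
    std::optional<double> TotalMs;            // Total
    std::optional<double> PreBufferMs;        // Pre-buffer (assumed definition)
    std::optional<double> EndToEarMs;         // End-to-Ear
};

// Returns To - From, or an empty optional if either timestamp was never set.
// This avoids reporting a bogus latency when an event did not arrive.
inline std::optional<double> ElapsedMs(const std::optional<double>& From,
                                       const std::optional<double>& To) {
    if (!From || !To) {
        return std::nullopt;
    }
    return *To - *From;
}

// Computes all five metrics; those starting at T0 are empty if T0 is missing.
inline TurnLatencies ComputeLatencies(const TurnTimestamps& T) {
    TurnLatencies Out;
    Out.TranscriptToGenMs = ElapsedMs(T.TurnStartTime, T.GenerationStart);
    Out.GenToAudioMs      = ElapsedMs(T.GenerationStart, T.FirstAudioChunk);
    Out.TotalMs           = ElapsedMs(T.TurnStartTime, T.FirstAudioChunk);
    Out.PreBufferMs       = ElapsedMs(T.FirstAudioChunk, T.PlaybackStart);
    Out.EndToEarMs        = ElapsedMs(T.TurnStartTime, T.PlaybackStart);
    return Out;
}
```

Using std::optional for each timestamp makes the failure mode described in the commit message visible: a timestamp that is never set (as TurnEndTime was in Server VAD mode) yields an empty metric instead of a misleading number.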
Description
Eleven Labs Integration for Unreal Engine 5.5
Languages
- C++: 90%
- Python: 4.8%
- HTML: 4.3%
- Batchfile: 0.6%
- C#: 0.3%