Anchor latency metrics to agent_response_started

In Server VAD mode, user_transcript arrives AFTER agent_response_started: the server detects end of speech via VAD, starts generating immediately, and STT completes later. This caused Transcript>Gen to show stale values (19s) and Total < Gen>Audio, which is impossible.

All metrics are now anchored to GenerationStartTime (the arrival of agent_response_started), which is the closest client-side proxy for "user stopped speaking":

- Gen>Audio: generation start → first audio chunk (LLM + TTS)
- Pre-buffer: wait between first audio chunk and playback start
- Gen>Ear: generation start → playback start (user-perceived latency)

Removed STTToGenMs, TotalMs, EndToEarMs, and UserSpeechMs, all of which depended on unreliable timestamps. The result is simpler and always correct: three clear metrics.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Description
Eleven Labs Integration for Unreal Engine 5.5