1. StartConversationWithSelectedAgent: remove early return when WebSocket
is already connected (persistent mode). Always call ServerJoinConversation
so the pawn is added to NetConnectedPawns and bNetIsConversing is set.
2. ServerSendMicAudioFromPlayer: bypass speaker arbitration in standalone
mode (<=1 connected pawn). Send audio directly to avoid silent drops
caused by the pawn not being in the NetConnectedPawns array. Add warning logs
for multi-player drops to aid debugging.
3. OnMicrophoneDataCaptured: restore direct WebSocketProxy->SendAudioChunk
on the server path. This callback runs on the WASAPI audio thread —
accessing game-thread state (NetConnectedPawns, LastSpeakTime) was
causing undefined behavior. The internal mic always belongs to the local
player, so no speaker arbitration is needed.
4. StopListening flush: send directly to WebSocket (active speaker already
established, no arbitration needed for the tail of the current turn).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GetCurrentBlendshapes() was copying CurrentBlendshapes on the anim worker
thread while the game thread mutated it (TSet::UnhashElements crash).
Use a snapshot pattern: game thread copies to ThreadSafeBlendshapes under
FCriticalSection at end of TickComponent, anim node reads the snapshot.
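A minimal standalone sketch of the snapshot pattern described above. std::map and
std::mutex stand in for TMap and FCriticalSection, and the BlendshapeSource class and
PublishSnapshot() name are hypothetical; only the CurrentBlendshapes /
ThreadSafeBlendshapes split comes from this change.

```cpp
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for the component (not the plugin's actual class).
class BlendshapeSource {
public:
    // Game thread only: mutate the working copy without locking.
    void SetBlendshape(const std::string& Name, float Value) {
        CurrentBlendshapes[Name] = Value;
    }

    // Game thread, end of tick: publish a consistent copy under the lock.
    void PublishSnapshot() {
        std::lock_guard<std::mutex> Lock(SnapshotMutex);
        ThreadSafeBlendshapes = CurrentBlendshapes;
    }

    // Anim worker thread: read the snapshot, never the working copy.
    std::map<std::string, float> GetCurrentBlendshapes() const {
        std::lock_guard<std::mutex> Lock(SnapshotMutex);
        return ThreadSafeBlendshapes;
    }

private:
    std::map<std::string, float> CurrentBlendshapes;     // game thread only
    std::map<std::string, float> ThreadSafeBlendshapes;  // guarded by SnapshotMutex
    mutable std::mutex SnapshotMutex;
};
```

The reader sees values that are at most one tick old, which is the trade-off for
never touching a container while it is being rehashed.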
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace exclusive single-player agent lock with shared multi-player model:
- NetConversatingPawn/Player → NetConnectedPawns array + NetActiveSpeakerPawn
- Server-side speaker arbitration with hysteresis (0.3s) prevents gaze ping-pong
- Speaker idle timeout (3.0s) clears active speaker after silence
- Agent gaze follows the active speaker via replicated OnRep_ActiveSpeaker
- New ServerJoinConversation/ServerLeaveConversation RPCs (idempotent join/leave)
- Backward-compatible: old ServerRequest/Release delegate to new Join/Leave
- InteractionComponent no longer skips occupied agents
- DrawDebugHUD shows connected player count and active speaker
- All mic audio paths (FeedExternalAudio, OnMicCapture, StopListening flush) route
through speaker arbitration
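A minimal standalone sketch of how the hysteresis and idle timeout above could
interact. The takeover rule (another pawn may take the floor only once the current
speaker has been quiet for the hysteresis window) is an assumption, as are the
SpeakerArbiter type and the integer pawn ids; only the 0.3s and 3.0s values come
from this change.

```cpp
// Values taken from the commit text.
constexpr double SwitchHysteresisSeconds = 0.3;
constexpr double SpeakerIdleTimeoutSeconds = 3.0;

// Hypothetical arbiter; -1 means "no active speaker".
struct SpeakerArbiter {
    int ActiveSpeaker = -1;
    double LastSpeakTime = 0.0;

    // Called when audio from PawnId arrives at time Now (seconds).
    // Returns true if the audio should be forwarded to the agent.
    bool OnAudio(int PawnId, double Now) {
        const bool bFloorIsFree = (ActiveSpeaker == -1);
        const bool bSameSpeaker = (ActiveSpeaker == PawnId);
        const bool bCurrentWentQuiet =
            (Now - LastSpeakTime) >= SwitchHysteresisSeconds;

        if (bFloorIsFree || bSameSpeaker || bCurrentWentQuiet) {
            ActiveSpeaker = PawnId;
            LastSpeakTime = Now;
            return true;
        }
        return false;  // someone else is still talking: drop this chunk
    }

    // Called periodically; clears the active speaker after prolonged silence.
    void Tick(double Now) {
        if (ActiveSpeaker != -1 &&
            (Now - LastSpeakTime) >= SpeakerIdleTimeoutSeconds) {
            ActiveSpeaker = -1;
        }
    }
};
```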
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- InteractionComponent: GetPawnViewPoint() now prefers UCameraComponent
over PlayerController::GetPlayerViewPoint() — fixes agent selection
in VR where the pawn root stays at spawn while the HMD moves freely
- GazeComponent: ResolveTargetPosition() uses camera first for locally
controlled pawns (VR HMD / FPS eye position), falls back to bone
chain for NPCs and remote players in multiplayer
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix DisplayTime=0.0f causing flicker on all debug HUDs (now 1.0f)
- Add per-component CVars (ps.ai.ConvAgent.Debug.*) for console debug toggle
- Add MicrophoneCapture debug HUD with VU meter (RMS/peak/dB bar)
- InteractionComponent reuses existing MicrophoneCaptureComponent on pawn
instead of always creating a new one
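A minimal standalone sketch of the RMS/peak/dB values a VU meter like the one above
would display, assuming 16-bit PCM input. The MicLevel type, MeasureMicLevel()
name, and the -96 dB display floor are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical result type for a debug VU meter.
struct MicLevel {
    float Rms;    // root-mean-square level, 0..1
    float Peak;   // largest absolute sample, 0..1
    float RmsDb;  // RMS in dBFS, floored at MinDb
};

// Assumed display floor so that silence does not produce -infinity.
constexpr float MinDb = -96.0f;

MicLevel MeasureMicLevel(const std::vector<int16_t>& Samples) {
    MicLevel Level{0.0f, 0.0f, MinDb};
    if (Samples.empty()) return Level;

    double SumSquares = 0.0;
    int PeakAbs = 0;
    for (int16_t S : Samples) {
        const int V = S;  // widen first so that -32768 has a valid absolute value
        SumSquares += static_cast<double>(V) * V;
        PeakAbs = std::max(PeakAbs, std::abs(V));
    }
    Level.Rms = static_cast<float>(std::sqrt(SumSquares / Samples.size()) / 32768.0);
    Level.Peak = PeakAbs / 32768.0f;
    if (Level.Rms > 0.0f) {
        Level.RmsDb = std::max(MinDb, 20.0f * std::log10(Level.Rms));
    }
    return Level;
}
```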
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Sync body animation with actual audio playback via new OnAudioPlaybackStarted
delegate instead of OnAgentStartedSpeaking (accounts for pre-buffer delay)
- Fix stale pre-buffer broadcasts by cancelling bPreBuffering on silence detection
and guarding pre-buffer timeout with bAgentSpeaking check
- Smooth body crossfade using FInterpTo instead of linear interpolation
- Add conversation lock in EvaluateBestAgent: keep agent selected during active
conversation regardless of view cone (distance-only check prevents deselect
flicker on fast camera turns)
- Broadcast OnAgentDisconnected in persistent session EndConversation so all
expression components (body, facial, lip sync, gaze) properly deactivate
when the player leaves the interaction zone
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace FInterpConstantTo with FInterpTo (exponential ease-out) in all 4 anim components
- Apply SmoothStep to crossfade alphas for smooth animation transitions
- Add SpeechBlendAlpha to LipSync: fades out mouth curves when not speaking,
letting FacialExpression emotion curves (including mouth) show through
- Remove LipSync AnimNode grace period zeroing to avoid overwriting emotion curves
- Revert BodyExpression to override mode (additive broken with full-pose anims)
- Default ExcludeBones = neck_01 to prevent Gaze/Posture conflicts
- Fix mid-crossfade animation pop in BodyExpression SwitchToNewAnim
- Add Neutral emotion fallback in Body and Facial expression components
- Add SendTextToSelectedAgent convenience method on InteractionComponent
- Add debug HUD display for BodyExpression component
- Update CoreRedirects for Posture → Gaze rename
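A minimal standalone sketch of the two easing steps named above. InterpToward()
approximates the behaviour of FInterpTo (each step covers a fraction of the
remaining distance, which gives an exponential ease-out) and SmoothStep01() is the
usual cubic smoothstep; both function names are hypothetical and this is not the
engine source.

```cpp
#include <algorithm>

// Ease-out toward Target: moves a fraction of the remaining distance per step.
float InterpToward(float Current, float Target, float DeltaTime, float Speed) {
    if (Speed <= 0.0f) return Target;  // non-positive speed: snap to target
    const float Fraction = std::min(std::max(DeltaTime * Speed, 0.0f), 1.0f);
    return Current + (Target - Current) * Fraction;
}

// Cubic smoothstep on a 0..1 alpha: zero slope at both ends of the crossfade.
float SmoothStep01(float Alpha) {
    const float X = std::min(std::max(Alpha, 0.0f), 1.0f);
    return X * X * (3.0f - 2.0f * X);
}
```

A crossfade would typically advance a linear alpha over time and feed
SmoothStep01(alpha) into the blend, so the transition starts and ends gently.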
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add BodyExpression system: emotion-driven body animations with per-emotion
anim lists (Idle/Normal/Medium/Extreme), random selection, auto-cycle on
loop complete, crossfade transitions, upper-body-only or full-body mode
- Replace bExternalMicManagement with auto-detecting ShouldUseExternalMic()
- Add bAutoTargetEyes to GazeComponent: auto-aim at target's eye bones
(MetaHuman), with fallback chain (eyes > head > FallbackEyeHeight)
- Hide bActive from Details panel on all 4 anim components (read-only,
code-managed): FacialExpression, Gaze, LipSync, BodyExpression
- Remove misleading mh_arkit_mapping_pose warning from LipSync
- Add bUpperBodyOnly toggle to BodyExpression AnimNode
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three issues prevented posture re-activation on re-entry:
- StartConversation skipped bNetIsConversing/NetConversatingPawn in standalone
(guarded by NM_Standalone check) so ApplyConversationPosture always deactivated
- SetSelectedAgent required !IsConnected() to auto-start conversation, but in
persistent mode the WS stays connected → StartConversation never called on re-entry
- AttachPostureTarget never set Posture->bActive = true, relying solely on
ApplyConversationPosture which could be skipped due to the above condition
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When true (default), the first StartConversation() opens the WebSocket
and EndConversation() only stops the mic/posture — the WebSocket stays
open until EndPlay. The agent remembers the full conversation context.
When false, each Start/EndConversation opens/closes the WebSocket (previous behavior).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Make bAgentGenerating, bWaitingForAgentResponse, bWaitingForResponse,
bFirstAudioResponseLogged, bAgentResponseStartedFired atomic (std::atomic<bool>)
and LastInterruptEventId atomic (std::atomic<int32>) for thread safety
- Add WebSocket auto-reconnection with exponential backoff (1s→30s cap,
max 5 attempts), distinguishing intentional vs unexpected disconnects
- Add FillCurrentEyeCurves() zero-allocation method using FindOrAdd()
to eliminate per-frame TMap heap allocation in anim thread
- Replace AudioQueue RemoveAt(0,N) with read-offset pattern — O(1) per
underflow callback, periodic compaction when offset > half buffer
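A minimal standalone sketch of the read-offset pattern from the last item above.
The class layout and method names are hypothetical; the idea is that a read only
advances an index instead of shifting the remaining bytes, and the consumed prefix
is erased once it exceeds half of the buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical byte queue (not the plugin's actual AudioQueue type).
class AudioQueue {
public:
    void Push(const uint8_t* Data, std::size_t Count) {
        Buffer.insert(Buffer.end(), Data, Data + Count);
    }

    // Copies up to Count bytes into Out and returns how many were copied.
    std::size_t Pop(uint8_t* Out, std::size_t Count) {
        const std::size_t N = std::min(Count, Available());
        std::copy(Buffer.begin() + static_cast<std::ptrdiff_t>(ReadOffset),
                  Buffer.begin() + static_cast<std::ptrdiff_t>(ReadOffset + N),
                  Out);
        ReadOffset += N;  // no front erase on the hot path

        // Periodic compaction: drop the consumed prefix once it dominates.
        if (ReadOffset > Buffer.size() / 2) {
            Buffer.erase(Buffer.begin(),
                         Buffer.begin() + static_cast<std::ptrdiff_t>(ReadOffset));
            ReadOffset = 0;
        }
        return N;
    }

    std::size_t Available() const { return Buffer.size() - ReadOffset; }
    std::size_t Offset() const { return ReadOffset; }

private:
    std::vector<uint8_t> Buffer;
    std::size_t ReadOffset = 0;
};
```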
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previous fix only smoothed cascade inputs (head/eyes) via SmoothedBodyYaw
but the body mesh still followed the raw replicated actor rotation which
jumped at ~30Hz network rate. Now the client calls SetActorRotation with
the smoothed yaw so the mesh visually interpolates. Replication overwrites
on next network update; SmoothedBodyYaw absorbs the correction smoothly.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Server-only AddActorWorldRotation (HasAuthority guard) prevents
client/server tug-of-war. Client interpolates toward replicated
rotation via SmoothedBodyYaw (angle-aware FInterpTo at 3x body
speed) to eliminate step artifacts from ~30Hz network updates.
Cascade (DeltaYaw, head, eyes) uses SmoothedBodyYaw on all machines.
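A minimal standalone sketch of angle-aware interpolation as described above: the
difference between the two yaws is wrapped to the shortest arc before easing, so a
step from 170 to -170 degrees turns 20 degrees instead of 340. Function names are
hypothetical; Speed would be three times the body turn speed per this change.

```cpp
#include <algorithm>
#include <cmath>

// Wraps an angle in degrees to the range [-180, 180].
float NormalizeYaw(float Degrees) {
    float Wrapped = std::fmod(Degrees, 360.0f);
    if (Wrapped > 180.0f) Wrapped -= 360.0f;
    if (Wrapped < -180.0f) Wrapped += 360.0f;
    return Wrapped;
}

// Eases Current toward Target along the shortest arc.
float InterpYaw(float Current, float Target, float DeltaTime, float Speed) {
    const float Delta = NormalizeYaw(Target - Current);
    const float Fraction = std::min(std::max(DeltaTime * Speed, 0.0f), 1.0f);
    return NormalizeYaw(Current + Delta * Fraction);
}
```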
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The pelvis bone rotation approach didn't work in practice. Reverts to
the previous AddActorWorldRotation() body tracking (replicated actor
rotation). Thread-safety fix and deprecated IsValid() removal are kept.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- FacialExpressionComponent: add FCriticalSection around emotion curves
to prevent race between TickComponent (game thread) and anim worker
thread — root cause of EXCEPTION_ACCESS_VIOLATION at Evaluate_AnyThread
- Remove deprecated FBlendedCurve::IsValid() guards (UE 5.5: always true)
- Body tracking: replace AddActorWorldRotation() with animation-only
pelvis bone rotation via AnimNode — eliminates replication tug-of-war
that caused client-side saccades when server overwrote local rotation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Create OpusEncoder on ALL machines (was Authority-only) — clients now
encode mic audio, server decodes it; server still encodes agent audio
- FeedExternalAudio / OnMicrophoneDataCaptured: Opus-encode accumulated
PCM buffer before sending via ServerRelayMicAudio RPC on client path
(~200 bytes per 100ms instead of 3200 bytes, i.e. ~16 Kbits/s vs 256 Kbits/s)
- ServerRelayMicAudio_Implementation: auto-detect Opus (size < raw chunk)
and decode back to PCM before forwarding to WebSocket
- Add public DecompressMicAudio() helper for clean API access from
InteractionComponent relay without exposing private Opus members
- Graceful fallback: if Opus unavailable, raw PCM is sent/received as before
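A minimal standalone sketch of the size-based detection and the bandwidth figures
above. The 3200-byte raw chunk per 100ms comes from this change; the enum, function
names, and the idea of comparing against exactly one raw chunk are assumptions.

```cpp
#include <cstddef>

// From the commit text: one raw PCM mic chunk is 3200 bytes per 100 ms.
constexpr std::size_t RawChunkBytes = 3200;
constexpr double ChunkMilliseconds = 100.0;

enum class MicPayload { Opus, RawPcm };

// Anything smaller than a raw chunk is assumed to be Opus-compressed.
inline MicPayload ClassifyMicPayload(std::size_t PayloadBytes) {
    return PayloadBytes < RawChunkBytes ? MicPayload::Opus : MicPayload::RawPcm;
}

// Bandwidth in kilobits per second for a given payload size per chunk.
constexpr double KbitsPerSecond(std::size_t BytesPerChunk) {
    return BytesPerChunk * 8.0 * (1000.0 / ChunkMilliseconds) / 1000.0;
}
```

With these numbers, a 200-byte Opus payload works out to 16 Kbits/s and a raw
3200-byte chunk to 256 Kbits/s.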
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Attach AudioPlaybackComponent to owner's root component for proper
3D world positioning (was unattached = stuck at origin)
- Enable default inline spatialization with 15m falloff distance when
no external SoundAttenuation asset is set
- External SoundAttenuation asset still overrides the default if set
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Auto-end conversation on deselection: when bAutoStartConversation is
true and the player walks out of range, EndConversation() is called
so the NPC becomes available for other players
- Server-side posture: add ApplyConversationPosture() helper called
from ServerRequestConversation, ServerReleaseConversation,
EndConversation (Authority path), and HandleDisconnected — fixes
NPC not tracking the client on the listen server (OnRep never fires
on Authority)
- Guard DetachPostureTarget: only clear TargetActor if it matches our
pawn, preventing the server IC from overwriting posture set by the
conversation system for a remote client
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Silence gate: skip sending silent mic audio over network RPCs on clients
(~256 Kbits/s saved when not speaking, fixes chaotic teleporting)
- Lazy init: defer InteractionComponent mic creation from BeginPlay to
TickComponent with IsLocallyControlled guard (fixes "No owning connection"
from server-side replicas of remote pawns)
- Body tracking: use bNetIsConversing as fallback for IsConnected() on
clients where WebSocket doesn't exist
- EvaluateBestAgent: null-check NetConversatingPawn before comparison
- MicCaptureComponent: use TWeakObjectPtr in AsyncTask lambda to prevent
FMRSWRecursiveAccessDetector race on component destruction
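A minimal standalone sketch of a silence gate like the one in the first item above.
How the plugin actually detects silence is not stated here; the peak-amplitude test,
the threshold value, and the function names are all assumptions.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Assumed threshold in raw 16-bit sample units (not taken from the plugin).
constexpr int SilencePeakThreshold = 200;

// True if no sample in the chunk exceeds the threshold.
bool IsSilentChunk(const std::vector<int16_t>& Samples) {
    for (int16_t S : Samples) {
        if (std::abs(static_cast<int>(S)) > SilencePeakThreshold) {
            return false;
        }
    }
    return true;
}

// Client-side gate: only non-silent chunks are worth an RPC.
bool ShouldSendMicChunk(const std::vector<int16_t>& Samples) {
    return !IsSilentChunk(Samples);
}
```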
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
In a listen server, the server-side copy of a remote client's pawn also
has an InteractionComponent that ticks. This caused a race condition:
the server-side tick would start conversations using GetFirstPlayerController()
(= server's PC), setting NetConversatingPawn to the server's pawn instead
of the client's. The client's relay RPC arrived too late and was rejected
because bNetIsConversing was already true.
Fix: disable tick and skip mic creation in BeginPlay for non-locally-controlled
pawns. The client handles all interaction locally via relay RPCs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
UE5 clients cannot call Server RPCs on actors they don't own. NPC actors
are server-owned, causing "No owning connection" errors when remote clients
try to start conversations, send mic audio, or interrupt agents.
Solution: relay all client→NPC RPCs through the InteractionComponent on
the player's pawn (which IS owned by the client). The relay forwards
commands to the NPC's ElevenLabsComponent on the server side.
Changes:
- InteractionComponent: add Server relay RPCs (Start/End conversation,
mic audio, text message, interrupt) and Client relay RPCs
(ConversationStarted/Failed) with GetLifetimeReplicatedProps
- ElevenLabsComponent: implement FindLocalRelayComponent(), route all
client-side calls through relay (StartConversation, EndConversation,
SendTextMessage, InterruptAgent, FeedExternalAudio, mic capture)
- Fix HandleConnected/ServerRequestConversation to route Client RPCs
through the player pawn's relay instead of the NPC (no owning connection)
- Fix StartListening/FeedExternalAudio/StopListening to accept
bNetIsConversing on clients (WebSocket only exists on server)
- Fix EvaluateBestAgent to use NetConversatingPawn instead of
NetConversatingPlayer (NULL on remote clients due to bOnlyRelevantToOwner)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PostureComponent starts with bActive=false and waits for
OnConversationConnected, which only fires on the server (WebSocket).
Remote clients never got bActive=true, so CurrentActiveAlpha stayed
at 0 and all head/eye rotation was zeroed out.
Now OnRep_ConversationState also sets Posture->bActive alongside
TargetActor, matching what was already done for FacialExpression
and LipSync components.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Logs NetConversatingPawn, PostureComponent availability, and TargetActor
assignment to diagnose why head tracking doesn't work on remote clients.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Raw PCM is split into 32000-byte sub-chunks for network transmission.
On the client, these sub-chunks arrive nearly simultaneously,
triggering the "second chunk arrived" fast-path which cancelled
the pre-buffer after ~10ms instead of the intended 2000ms.
Now the fast-path only applies on Authority (server) where
chunks represent genuine separate TTS batches. Clients always
wait the full pre-buffer duration via TickComponent timer.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PlayerControllers have bOnlyRelevantToOwner=true, so NetConversatingPlayer
was always nullptr on remote clients. Added NetConversatingPawn (APawn*)
which IS replicated to all clients. OnRep_ConversationState now uses
NetConversatingPawn as the posture TargetActor, enabling head/eye
tracking on all clients.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
UE5 limits replicated TArrays to 65535 elements. ElevenLabs sends
audio chunks up to ~72K bytes, exceeding this limit when sent as
raw PCM. Split into 32000-byte sub-chunks (1s of 16kHz 16-bit mono)
before calling MulticastReceiveAgentAudio.
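A minimal standalone sketch of the splitting step described above. The 32000-byte
limit comes from this change; SplitPcm() and its signature are hypothetical, and
each returned sub-chunk would be passed to its own multicast call.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// From the commit text: 32000 bytes = 1s of 16kHz 16-bit mono PCM.
constexpr std::size_t MaxSubChunkBytes = 32000;

// Splits a PCM buffer into consecutive pieces of at most MaxBytes each.
std::vector<std::vector<uint8_t>> SplitPcm(const std::vector<uint8_t>& Pcm,
                                           std::size_t MaxBytes = MaxSubChunkBytes) {
    std::vector<std::vector<uint8_t>> Chunks;
    if (MaxBytes == 0) return Chunks;  // guard against an endless loop

    for (std::size_t Offset = 0; Offset < Pcm.size(); Offset += MaxBytes) {
        const std::size_t Count = std::min(MaxBytes, Pcm.size() - Offset);
        const auto First = Pcm.begin() + static_cast<std::ptrdiff_t>(Offset);
        Chunks.emplace_back(First, First + static_cast<std::ptrdiff_t>(Count));
    }
    return Chunks;
}
```

A 72000-byte payload therefore becomes three sub-chunks of 32000, 32000, and 8000
bytes, each well under the 65535-element limit.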
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
UE 5.5's FVoiceModule returns NULL encoder/decoder, breaking
Opus-based audio replication. This adds a transparent fallback:
- Server sends raw PCM when OpusEncoder is null (~32KB/s, fine for LAN)
- Client accepts raw PCM when OpusDecoder is null
- Changed MulticastReceiveAgentAudio from Unreliable to Reliable
to handle larger uncompressed payloads without packet loss
- Added OnlineSubsystemNull config for future Opus compatibility
- Removed premature bAgentSpeaking=true from MulticastAgentStartedSpeaking
to fix race condition with audio initialization
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Log at InitOpusCodec: FVoiceModule availability, encoder/decoder creation, net role
- Log at HandleAudioReceived: warn once if encoder is null or role is not Authority
- These fire unconditionally (not behind bDebug) to catch the root cause
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Logs compressed audio size in HandleAudioReceived to diagnose
why MulticastReceiveAgentAudio (Unreliable) never reaches clients.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- OnRep_ConversationState now sets PostureComponent TargetActor from
replicated NetConversatingPlayer so remote clients see head/eyes tracking
- Activate FacialExpressionComponent and LipSyncComponent on remote clients
(OnAgentConnected never fires on clients since WebSocket is server-only)
- Fix audio race condition: MulticastAgentStartedSpeaking no longer sets
bAgentSpeaking prematurely, letting EnqueueAgentAudio handle the full
first-chunk initialization (pre-buffer, Play(), state reset)
- Add diagnostic logging to MulticastReceiveAgentAudio for silent failures
(OpusDecoder invalid, LOD culling, decode failure)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces UPS_AI_ConvAgent_AgentConfig_ElevenLabs data asset to encapsulate
full agent configuration (voice, LLM, prompt, language, emotions) with a
custom Detail Customization providing:
- Voice/TTS Model/LLM/Language pickers with Fetch buttons (ElevenLabs API)
- LLM latency hints in dropdown (~250ms, ~700ms, etc.)
- Create/Update/Fetch Agent buttons for REST API CRUD
- Auto-fetch on editor open, auto-select first voice for new assets
- Prompt fragment management (language, multilingual, emotion tool)
- Smart defaults: gemini-2.5-flash LLM, eleven_turbo_v2_5 TTS, English
- Speed range expanded to 0.7-1.95 (was 0.7-1.2)
- bAutoStartConversation + StartConversationWithSelectedAgent() on InteractionComponent
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Saved/ is not staged in packaged builds, so Content/Certificates/ is the
only reliable location. Simplified code by removing Android-specific
writable fallback (Content/ works on all platforms with NonUFS staging).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>