136 Commits

Author SHA1 Message Date
11c473f3ab Merge branch 'main' of https://git.polymorph.fr/j.foucher/PS_AI_Agent 2026-03-02 12:39:23 +01:00
0b07a0493f ini 2026-03-02 12:39:04 +01:00
82b134bcc3 Resolve some bugs 2026-03-02 12:37:28 +01:00
259a77f9f6 Add Agent Config data asset system with ElevenLabs editor integration
Introduces UPS_AI_ConvAgent_AgentConfig_ElevenLabs data asset to encapsulate
full agent configuration (voice, LLM, prompt, language, emotions) with a
custom Detail Customization providing:
- Voice/TTS Model/LLM/Language pickers with Fetch buttons (ElevenLabs API)
- LLM latency hints in dropdown (~250ms, ~700ms, etc.)
- Create/Update/Fetch Agent buttons for REST API CRUD
- Auto-fetch on editor open, auto-select first voice for new assets
- Prompt fragment management (language, multilingual, emotion tool)
- Smart defaults: gemini-2.5-flash LLM, eleven_turbo_v2_5 TTS, English
- Speed range expanded to 0.7-1.95 (was 0.7-1.2)
- bAutoStartConversation + StartConversationWithSelectedAgent() on InteractionComponent

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 20:51:05 +01:00
8175375c28 remove unwanted old plugins 2026-03-01 17:18:30 +01:00
275065f5aa Revert SSL cert path to Content/Certificates for packaged build staging
Saved/ is not staged in packaged builds, so Content/Certificates/ is the
only reliable location. Simplified code by removing Android-specific
writable fallback (Content/ works on all platforms with NonUFS staging).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 16:42:16 +01:00
33ec54150f Add bActive with smooth blend and auto-activation to all 3 AnimNode components
- Posture, FacialExpression, LipSync: bActive + ActivationBlendDuration for
  smooth alpha blend in/out (linear interp via FInterpConstantTo).
- Auto-activation: components bind to OnAgentConnected/OnAgentDisconnected,
  starting inactive and blending in when conversation begins.
- Without an agent component, bActive defaults to true (backward compatible).
- Add BlueprintFunctionLibrary with SetPostProcessAnimBlueprint helper
  (wraps UE5.5 SetOverridePostProcessAnimBP for per-instance BP setup).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 15:24:57 +01:00
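The smooth alpha blend this commit describes (linear interp via FInterpConstantTo toward 0 or 1) can be sketched engine-free as follows. The function names and the rate = 1/ActivationBlendDuration mapping are illustrative assumptions, not the plugin source:

```cpp
#include <algorithm>
#include <cmath>

// Constant-rate interpolation toward a target, mirroring the behavior of
// UE's FMath::FInterpConstantTo (illustrative reimplementation, not the
// engine source). Speed is in units per second.
float InterpConstantTo(float Current, float Target, float DeltaTime, float Speed)
{
    const float Step = Speed * DeltaTime;
    const float Delta = Target - Current;
    if (std::fabs(Delta) <= Step)
        return Target;                        // close enough: snap to target
    return Current + std::copysign(Step, Delta);
}

// Per-tick alpha update for a node that blends in/out over
// ActivationBlendDuration seconds when bActive flips (hypothetical names
// mirroring the commit message).
float UpdateNodeAlpha(float Alpha, bool bActive, float DeltaTime, float BlendDuration)
{
    const float Rate = (BlendDuration > 0.f) ? 1.f / BlendDuration : 1e9f;
    return InterpConstantTo(Alpha, bActive ? 1.f : 0.f, DeltaTime, Rate);
}
```

With a 0.5 s blend duration, a deactivated node at alpha 1.0 reaches 0.5 after a quarter second and 0.0 after half a second — a constant-rate ramp rather than an exponential ease.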
5fcd98ba73 Add Android platform support to plugin
Whitelist Android in .uplugin Runtime module and handle
read-only APK paths for SSL certificate copy on Android
(ProjectSavedDir fallback). No change to Win64 behavior.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 14:14:30 +01:00
8fcb8b6f30 Gate operational logs behind bDebug/bVerboseLogging
Only init, deinit, errors and critical warnings remain unconditional.
All turn-by-turn logs (mic open/close, agent generating/speaking/stopped,
emotion changes, audio chunks, text/viseme processing, latency, bone
resolution, selection, registration) are now gated behind component
bDebug or Settings->bVerboseLogging.

10 files, ~125 lines removed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 13:28:59 +01:00
09ca2215a0 Remove dead code across plugin: unused members, constants, includes
- EL_LOG() function + JsonUtilities.h include (WebSocketProxy)
- SilenceBuffer, bSpeculativeTurn (ElevenLabsComponent + proxy)
- bSignedURLMode, SignedURLEndpoint (plugin settings)
- bHasPendingAnalysis (LipSyncComponent)
- bForceSelectionActive (InteractionComponent)
- UserActivity, InternalTentativeAgent constants (Definitions)
- Engine/AssetManager.h includes (EmotionPoseMap, LipSyncPoseMap)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 13:03:20 +01:00
5323da6908 no message 2026-03-01 12:46:26 +01:00
350558ba50 v3.0.1: Packaged build fixes, conversation-triggered body tracking, diagnostic cleanup
Packaged build fixes:
- Use EvaluateCurveData() for emotion and lip sync curves (works with
  compressed/cooked data instead of raw FloatCurves)
- Add FMemMark wrapper for game-thread curve evaluation (FBlendedCurve
  uses TMemStackAllocator)
- Lazy binding in AnimNodes and LipSyncComponent for packaged build
  component initialization order
- SetIsReplicatedByDefault(true) instead of SetIsReplicated(true)
- Load AudioCapture module explicitly in plugin StartupModule
- Bundle cacert.pem for SSL in packaged builds
- Add DirectoriesToAlwaysStageAsNonUFS for certificates
- Add EmotionPoseMap and LipSyncPoseMap .cpp implementations

Body tracking:
- Body tracking now activates on conversation start (HandleAgentResponseStarted)
  instead of on selection, creating a natural notice→engage two-step:
  eyes+head track on selection, body turns when agent starts responding
- SendTextMessage also enables body tracking for text input

Cleanup:
- Remove all temporary [DIAG] and [BODY] debug logs
- Gate PostureComponent periodic debug log behind bDebug flag
- Remove obsolete editor-time curve caches (CachedCurveData, RebuildCurveCache,
  FPS_AI_ConvAgent_CachedEmotionCurves, FPS_AI_ConvAgent_CachedPoseCurves)
  from EmotionPoseMap and LipSyncPoseMap — no longer needed since
  EvaluateCurveData() reads compressed curves directly at runtime

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-01 12:45:11 +01:00
21298e01b0 Add LAN test scripts: Package, Host, Join batch files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:41:58 +01:00
765da966a2 Add Listen Server networking: exclusive NPC lock, Opus audio broadcast, LOD
- Replicate conversation state (bNetIsConversing, NetConversatingPlayer) for exclusive NPC locking
- Opus encode TTS audio on server, multicast to all clients for shared playback
- Replicate emotion state (OnRep) so clients compute facial expressions locally
- Multicast speaking/interrupted/text events so lip sync and posture run locally
- Route mic audio via Server RPC (client→server→ElevenLabs WebSocket)
- LOD: cull audio beyond 30m, skip lip sync beyond 15m for non-speaker clients
- Auto-detect player disconnection and release NPC on authority
- InteractionComponent: skip occupied NPCs, auto-start conversation on selection
- No changes to LipSync, Posture, FacialExpression, MicCapture or AnimNodes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 11:04:07 +01:00
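The LOD rule in this commit (cull audio beyond 30 m, skip lip sync beyond 15 m, but only for clients who aren't the conversing player) reduces to a small gate. A minimal sketch with illustrative names; thresholds are in centimeters as UE conventionally uses:

```cpp
// Distance-based LOD gate for non-speaker clients, as described in the
// commit. The conversing player always gets full fidelity regardless of
// distance. Names and defaults are illustrative, not the plugin's API.
struct AgentLod { bool bPlayAudio; bool bRunLipSync; };

AgentLod ComputeAgentLod(float DistanceCm, bool bIsSpeakingPlayer,
                         float AudioCullCm = 3000.f,    // 30 m
                         float LipSyncCullCm = 1500.f)  // 15 m
{
    if (bIsSpeakingPlayer)
        return {true, true};
    return { DistanceCm <= AudioCullCm, DistanceCm <= LipSyncCullCm };
}
```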
453450d7eb Commit Scene + BP 2026-02-27 18:38:16 +01:00
301efee982 Enable body tracking on both voice and text input
Body tracking is now activated by ElevenLabsComponent directly
in StartListening() and SendTextMessage(), instead of being
managed by InteractionComponent. This ensures the agent turns
its body toward the player on any form of conversation input.

InteractionComponent still disables body tracking on deselection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 18:33:08 +01:00
92a8b70a7f Split posture: eyes+head on selection, body on conversation start
PostureComponent gains bEnableBodyTracking flag. When false, only
head and eyes track the target — body stays frozen.

InteractionComponent now:
- On posture attach: sets TargetActor but disables body tracking
  (agent notices player with eyes+head only)
- On StartListening: enables body tracking (agent fully engages)
- On StopListening: disables body tracking

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 18:26:27 +01:00
23e216b211 Add bAutoManageListening to InteractionComponent
Automatically calls StartListening/StopListening on the agent's
ElevenLabsComponent on selection/deselection. Enabled by default.
Disable for manual control (e.g. push-to-talk).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 18:08:16 +01:00
6c081e1207 Fix: body keeps orientation on target loss, fix close-range selection
PostureComponent: body no longer returns to original yaw when
TargetActor is cleared — only head and eyes return to neutral.

InteractionComponent: add AgentEyeLevelOffset (default 150cm) so
the view cone check targets chest height instead of feet, preventing
selection loss at close range.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 17:49:03 +01:00
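The close-range fix above — aiming the view cone check at a point raised by AgentEyeLevelOffset instead of the actor root at the feet — can be sketched as a plain dot-product cone test (vector types and function names are illustrative):

```cpp
#include <cmath>

struct Vec3 { float X, Y, Z; };

static Vec3  Sub(Vec3 a, Vec3 b) { return {a.X - b.X, a.Y - b.Y, a.Z - b.Z}; }
static float Dot(Vec3 a, Vec3 b) { return a.X*b.X + a.Y*b.Y + a.Z*b.Z; }
static Vec3  Normalize(Vec3 v)
{
    const float L = std::sqrt(Dot(v, v));
    return (L > 1e-6f) ? Vec3{v.X/L, v.Y/L, v.Z/L} : Vec3{0.f, 0.f, 0.f};
}

// View-cone test against a point raised by EyeLevelOffset above the agent's
// root, so close-range checks aim at chest height rather than the feet.
// Illustrative sketch; the plugin's actual API may differ.
bool IsInViewCone(Vec3 ViewOrigin, Vec3 ViewDir, Vec3 AgentRoot,
                  float EyeLevelOffset, float HalfAngleDeg)
{
    Vec3 Target = AgentRoot;
    Target.Z += EyeLevelOffset;               // e.g. 150 cm: chest/eye height
    const Vec3 ToTarget = Normalize(Sub(Target, ViewOrigin));
    const float CosHalf = std::cos(HalfAngleDeg * 3.14159265f / 180.f);
    return Dot(Normalize(ViewDir), ToTarget) >= CosHalf;
}
```

At 1 m range with a 30-degree half-angle, looking at the feet puts the target roughly 56 degrees below the view axis (selection lost); raising the target to eye level keeps it inside the cone.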
c6aed472f9 Add auto posture management to InteractionComponent
InteractionComponent now automatically sets/clears the agent's
PostureComponent TargetActor on selection/deselection, with
configurable attach/detach delays and a master toggle for manual control.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-27 17:39:05 +01:00
b1bac66256 Move some files 2026-02-27 16:42:09 +01:00
88a824897e WIP multi-agent 2026-02-27 09:38:39 +01:00
f24c9355eb Fix: remove embedded worktree from index, add to .gitignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:38:41 +01:00
a3adf10e6f Add per-component debug system, fix BP categories, enable plugin Content
- Add bDebug + DebugVerbosity (0-3) to all 5 components
- Gate verbose/diagnostic logs behind debug checks; keep key events always visible
- Unify BP categories to "PS AI ConvAgent|..." (was "PS_AI_Agent|...")
- Enable CanContainContent in .uplugin for plugin-specific assets
- Remove unused template/animation content assets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 18:38:12 +01:00
b3bfabc52d Rename: unify prefix PS_AI_Agent_ → PS_AI_ConvAgent_
Aligns all class/struct/enum/delegate prefixes with the module name
PS_AI_ConvAgent. Removes redundant Conv_ from ElevenLabsComponent
(PS_AI_Agent_Conv_ElevenLabsComponent → PS_AI_ConvAgent_ElevenLabsComponent).
UI strings now use "PS AI ConvAgent". CoreRedirects updated for both
ElevenLabs* and PS_AI_Agent_* legacy references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:33:47 +01:00
eeb0b1172b Rename plugin ElevenLabs → PS_AI_ConvAgent for multi-backend support
Separates generic systems (lip sync, posture, facial expression, mic)
from ElevenLabs-specific ones (conv agent, websocket, settings).
Generic classes now use PS_AI_Agent_ prefix; ElevenLabs-specific keep
_ElevenLabs suffix. CoreRedirects in DefaultEngine.ini ensure existing
.uasset references load correctly. Binaries/Intermediate deleted for
full rebuild.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 17:19:35 +01:00
8a4622f348 Posture/LipSync: update default values from production tuning
- HeadAnimationCompensation: 1.0 → 0.9
- EyeAnimationCompensation: 1.0 → 0.6
- BodyDriftCompensation: 0.0 → 0.8
- AmplitudeScale: 0.75 → 1.0
- EnvelopeAttackMs: 15 → 10 (range recentered to 1–50)
- EnvelopeReleaseMs: 100 → 50 (range recentered to 10–200)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:42:09 +01:00
fd0e2a2d58 Posture: body drift compensation + debug gaze lines + ARKit eye range calibration
- Add BodyDriftCompensation parameter (0→1) to counter-rotate head when
  body animation moves the torso (bow, lean). Drift correction applied
  AFTER swing-twist on the first chain bone only, preventing the
  correction from being stripped by tilt decomposition or compounded
  through the chain.
- Add bDrawDebugGaze toggle: draws per-eye debug lines from Face mesh
  FACIAL_L/R_Eye bones (cyan = desired direction, green = actual bone
  Z-axis gaze) to visually verify eye contact accuracy.
- Cache Face mesh separately from Body mesh for correct eye bone transforms.
- Use eye bone midpoint (Face mesh) as EyeOrigin for pitch calculation
  instead of head bone, fixing vertical offset.
- Calibrate ARKit eye ranges: horizontal 30→40, vertical 20→35 to match
  MetaHuman actual eye deflection per curve unit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 16:37:52 +01:00
9eca5a3cfe Posture: smooth eye compensation blend + separate head/eye AnimationCompensation
- Fix eye compensation to use smooth Lerp blend instead of binary switch.
  Uses Curve.Get() to read animation's existing CTRL values, then
  Lerp(animCTRL, postureCTRL, Comp) for proportional blending.
- Split AnimationCompensation into HeadAnimationCompensation and
  EyeAnimationCompensation for independent control per layer.
- Fix CTRL curve naming: use correct MetaHuman format
  (CTRL_expressions_eyeLook{Dir}{L/R}) with proper In/Out→Left/Right mapping.
- Add eye diagnostic modes (disabled) for pipeline debugging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 15:00:12 +01:00
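The smooth compensation blend this commit introduces — Lerp(animCTRL, postureCTRL, Comp) instead of a binary switch — is a one-liner per curve. A minimal sketch (the function name is illustrative; curve names follow the MetaHuman CTRL_expressions_eyeLook{Dir}{L/R} convention mentioned above):

```cpp
// Proportional blend between the animation's existing CTRL curve value and
// the posture system's computed value: with Compensation = 0 the animation
// wins unchanged, with Compensation = 1 posture fully overrides.
float BlendCtrlCurve(float AnimValue, float PostureValue, float Compensation)
{
    const float T = (Compensation < 0.f) ? 0.f
                  : (Compensation > 1.f) ? 1.f
                  : Compensation;                         // clamp to [0, 1]
    return AnimValue + (PostureValue - AnimValue) * T;    // Lerp
}
```

Splitting HeadAnimationCompensation and EyeAnimationCompensation then just means calling this with a different T per layer.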
9d281c2e03 Posture: animation compensation, emotion lip sync, head jump fixes
Feature 1 — AnimationCompensation: posture overrides head/neck orientation
while body animations still play. Extracts animation contribution vs
reference pose, Slerps toward identity proportionally to compensation.

Feature 2 — Emotion-aware lip sync: expression curves (smile, frown,
sneer) blend additively with facial emotion during speech. Shape curves
(jaw, tongue) remain 100% lip sync. EmotionExpressionBlend property
controls blend ratio (default 0.5).

Bug fixes — Head rotation 1-2 frame jump prevention:
- Keep last cached rotation when PostureComponent is momentarily invalid
- Normalize composed quaternion (TurnQuat * NodQuat)
- Clamp DeltaTime to 50ms to prevent FInterpTo spikes during GC
- Guard against double cascade overflow (body + eye) in same frame
- EnforceShortestPath dot-product check before all Slerp operations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 12:43:18 +01:00
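Two of the head-jump guards above (the EnforceShortestPath dot-product check and the DeltaTime clamp) can be sketched without the engine. This is the standard quaternion hemisphere trick, not the plugin source; names are illustrative:

```cpp
#include <cmath>

struct Quat { float X, Y, Z, W; };

static float Dot(const Quat& A, const Quat& B)
{
    return A.X*B.X + A.Y*B.Y + A.Z*B.Z + A.W*B.W;
}

// q and -q encode the same rotation, so before any Slerp we flip B onto the
// same hemisphere as A. Without this, the interpolation can take the long
// way around and produce the 1-2 frame head snap the commit guards against.
Quat EnforceShortestPath(const Quat& A, Quat B)
{
    if (Dot(A, B) < 0.f) { B.X = -B.X; B.Y = -B.Y; B.Z = -B.Z; B.W = -B.W; }
    return B;
}

// Clamp DeltaTime so one interpolation step cannot spike after a GC hitch,
// mirroring the 50 ms clamp described in the commit.
float ClampDeltaTime(float DeltaTime, float MaxStep = 0.05f)
{
    return (DeltaTime > MaxStep) ? MaxStep : DeltaTime;
}
```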
d2e904144a begin posture with animation 2026-02-26 08:57:37 +01:00
cdb118a83e Add some debug resources 2026-02-25 15:07:59 +01:00
8df6967ee5 Posture: neck bone chain distribution + swing-twist axis fix + log cleanup
- Add NeckBoneChain config (FElevenLabsNeckBoneEntry USTRUCT) to distribute
  head rotation across multiple bones (e.g. neck_01=0.25, neck_02=0.35,
  head=0.40) for more natural neck arc movement.
- Swing-twist now uses bone's actual tilt axis (BoneRot.RotateVector) instead
  of hardcoded FVector::RightVector — accounts for MetaHuman ~11.5° reference
  pose rotation on head bone.
- Clean up diagnostic logging: reduced to Log level, removed verbose per-frame
  quaternion dumps.
- Backward compatible: empty NeckBoneChain falls back to single HeadBoneName.

KNOWN ISSUE: diagonal tilt is more visible with neck chain — needs investigation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 15:05:21 +01:00
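The swing-twist decomposition this commit fixes is a standard quaternion technique: project the rotation's vector part onto the twist axis. The commit's point is that the axis must be the bone's actual (reference-pose rotated) tilt axis, not a hardcoded world axis. A self-contained sketch under illustrative names:

```cpp
#include <cmath>

struct Vec3 { float X, Y, Z; };
struct Quat { float X, Y, Z, W; };   // (X,Y,Z) is the vector part

static float Dot(Vec3 a, Vec3 b) { return a.X*b.X + a.Y*b.Y + a.Z*b.Z; }

static Quat NormalizeQ(Quat q)
{
    const float L = std::sqrt(q.X*q.X + q.Y*q.Y + q.Z*q.Z + q.W*q.W);
    return {q.X/L, q.Y/L, q.Z/L, q.W/L};
}

// Standard swing-twist decomposition: extract the twist of Rotation about
// TwistAxis by projecting the quaternion's vector part onto the axis.
// The remaining swing is Rotation * Inverse(Twist). TwistAxis must be
// normalized — per the commit, use the bone's actual tilt axis
// (BoneRot.RotateVector), which absorbs MetaHuman's ~11.5° reference-pose
// rotation on the head bone.
Quat TwistAbout(const Quat& Rotation, Vec3 TwistAxis)
{
    const float P = Dot({Rotation.X, Rotation.Y, Rotation.Z}, TwistAxis);
    Quat Twist{TwistAxis.X * P, TwistAxis.Y * P, TwistAxis.Z * P, Rotation.W};
    const float L2 = Twist.X*Twist.X + Twist.Y*Twist.Y
                   + Twist.Z*Twist.Z + Twist.W*Twist.W;
    if (L2 < 1e-9f)                 // 180-degree swing: twist is undefined
        return {0.f, 0.f, 0.f, 1.f};
    return NormalizeQ(Twist);
}
```

A 90° rotation about Z has zero twist about X (the result collapses to identity) but full twist about Z itself.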
c3abc420cf map 2026-02-25 14:45:33 +01:00
81bcd67428 Merge posture fixes: cascade two-step, ARKit normalization, diagonal tilt
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 14:44:56 +01:00
250ae5e04d Posture: fix cascade two-step, ARKit normalization, diagonal tilt
- Fix two-step body animation: head overflow now checked against
  TargetBodyWorldYaw (body TARGET) instead of current body position,
  preventing head overcompensation during body interpolation.
- Fix ARKit eye curve normalization: curves now normalized by fixed
  physical range (30°/20°) instead of configurable MaxEye thresholds.
  MaxEye controls cascade threshold only, not visual deflection.
- Fix diagonal head tilt: full FQuat pipeline (no FRotator round-trip)
  + swing-twist decomposition using bone's actual tilt axis (accounts
  for MetaHuman ~11.5° reference pose rotation on head bone).
- Clean up diagnostic logging: reduced to Log level, removed verbose
  per-frame quaternion dumps from investigation phase.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 14:41:03 +01:00
7c235106d6 Add UAsset and fix for postures 2026-02-25 13:58:23 +01:00
780101d389 Posture: relative sticky cascade + thread safety + quaternion head rotation
Rewrite posture system as a relative cascade (Eyes → Head → Body) with
persistent sticky targets. Each layer stays put until the layer below
overflows its max angle, then realigns fully toward the target.

Key changes:
- Thread safety: FCriticalSection protects AnimNode shared data
- Persistent TargetHeadYaw/Pitch: overflow checked against target (not
  interpolating current) so head completes full realignment
- Persistent TargetBodyWorldYaw: body only moves when head+eyes combined
  range is exceeded (sticky, same pattern as head)
- Quaternion head rotation: compose independent NodQuat × TurnQuat to
  avoid diagonal coupling that FRotator causes
- Eye curves: negate CurrentEyeYaw for correct ARKit convention
- AnimNode: enhanced logging, axis diagnostic define (disabled)
- Remove old percentage/degree activation thresholds (max angles serve
  as natural thresholds in the relative cascade)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-25 13:28:08 +01:00
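One layer of the sticky cascade described above reduces to a small decision: stay put while the lower layer can still absorb the residual, realign fully once it overflows. Crucially, the overflow check compares against the persistent *target* yaw, not the interpolating current value, so a started realignment always completes. Sketch with illustrative names, degrees throughout:

```cpp
#include <cmath>

// One layer of the relative sticky cascade (Eyes -> Head -> Body): the
// layer keeps its current target until the residual left for the layer
// below exceeds that layer's max angle, then realigns fully toward the
// desired look-at yaw. Illustrative sketch, not the plugin source.
float UpdateStickyTarget(float LayerTargetYaw, float DesiredYaw,
                         float LowerLayerMaxYaw)
{
    const float Residual = DesiredYaw - LayerTargetYaw;  // left for lower layers
    if (std::fabs(Residual) > LowerLayerMaxYaw)
        return DesiredYaw;      // overflow: this layer realigns fully
    return LayerTargetYaw;      // sticky: stay put, lower layers absorb it
}
```

With a 30° lower-layer limit, a 20° look-at leaves the body frozen; a 45° look-at triggers a full body realignment — the max angles act as the natural activation thresholds the commit mentions.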
7373959d8b v2.4.0: Posture component — multi-layer look-at (body/head/eyes)
New ElevenLabsPostureComponent with 3-layer rotation distribution:
- Body (60%): rotates entire actor (yaw only) via SetActorRotation
- Head (20%): direct bone transform in AnimNode
- Eyes (10%): 8 ARKit eye look curves (eyeLookUp/Down/In/Out L/R)

Features:
- Configurable rotation percentages and angle limits per layer
- Smooth FInterpTo interpolation with per-layer speeds
- TargetActor + TargetOffset for any actor type (no skeleton required)
- Smooth return to neutral when TargetActor is cleared
- Blue AnimGraph node in ElevenLabs category

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 21:16:57 +01:00
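The 3-layer distribution above (configurable percentages plus per-layer angle limits) can be sketched as a simple split-and-clamp. Defaults follow the percentages in the commit; the limits and names are made-up illustrations, not the component's actual properties:

```cpp
#include <algorithm>

struct LookAtSplit { float BodyYaw, HeadYaw, EyeYaw; };

// Distribute a desired yaw across body/head/eyes by the configured
// percentages, clamping head and eyes to their angle limits. The body share
// is applied via actor rotation and left unclamped here.
LookAtSplit DistributeYaw(float DesiredYaw,
                          float BodyPct = 0.6f, float HeadPct = 0.2f,
                          float EyePct = 0.1f,
                          float HeadMax = 60.f, float EyeMax = 25.f)
{
    LookAtSplit Out;
    Out.HeadYaw = std::clamp(DesiredYaw * HeadPct, -HeadMax, HeadMax);
    Out.EyeYaw  = std::clamp(DesiredYaw * EyePct,  -EyeMax,  EyeMax);
    Out.BodyYaw = DesiredYaw * BodyPct;
    return Out;
}
```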
88c175909e v2.3.0: Separate emotion data asset + code cleanup
- Create dedicated UElevenLabsEmotionPoseMap data asset for emotions
- FacialExpressionComponent now uses EmotionPoseMap (not LipSyncPoseMap)
- Remove emotion data (FElevenLabsEmotionPoseSet, EmotionPoses TMap) from LipSyncPoseMap
- Add ElevenLabsEmotionPoseMapFactory for Content Browser asset creation
- Standardize log format to [T+Xs] [Turn N] in ConversationalAgentComponent
- Remove unused #include "Engine/World.h"
- Simplify collision log to single line

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 20:17:04 +01:00
f6541bd7e2 v2.2.0: Separate AnimNode for facial expressions + real-time emotion anim playback
- New AnimNode_ElevenLabsFacialExpression: independent AnimBP node for emotion expressions
- New AnimGraphNode (amber color) in ElevenLabs category for AnimBP editor
- Emotion AnimSequences now play in real-time (looping) instead of static pose at t=0
- Smooth crossfade between emotions with configurable duration
- LipSync AnimNode skips near-zero curves so emotion base layer shows through during silence
- Removed emotion merge from LipSyncComponent (now handled by AnimBP node ordering)
- Removed verbose per-tick VISEME debug log
- Two-layer AnimBP chain: FacialExpression → LipSync → mh_arkit_mapping_pose

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 19:13:18 +01:00
949efd5578 MetaHuman data 2026-02-24 18:13:56 +01:00
f57bb65297 WIP: Emotion facial expressions + client tool support
- ElevenLabs client tool call/result WebSocket support (set_emotion)
- EElevenLabsEmotion + EElevenLabsEmotionIntensity enums
- Emotion poses in PoseMap data asset (Normal/Medium/Extreme per emotion)
- Standalone ElevenLabsFacialExpressionComponent (separated from LipSync)
- Two-layer architecture: emotion base (eyes/brows/cheeks) + lip sync on top (mouth)
- Smooth emotion blending (~500ms configurable transitions)
- LipSync reads emotion curves from FacialExpressionComponent via GetCurrentEmotionCurves()

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 18:08:23 +01:00
e57be0a1d9 v2.1.0: Decoupled viseme timeline with quintic crossfade and tuned dead zone
Decouple viseme timing from 32ms audio chunks by introducing an independent
FVisemeTimelineEntry timeline evaluated at render framerate. Playback-time
envelope tracking from consumed queue frames replaces arrival-time-only
updates, with fast 40ms decay when queue is empty.

- Viseme subsampling caps at ~10/sec (100ms min) to prevent saccades
- Full-duration quintic smootherstep crossfade (C2 continuous, no hold phase)
- Dead zone lowered to 0.15 for cleaner silence transitions
- TotalActiveFramesSeen cumulative counter for accurate timeline scaling
- Absolute cursor preservation on timeline rebuild
- Moderate Lerp smoothing (attack 0.55, release 0.40)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 16:46:36 +01:00
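The quintic smootherstep crossfade above is a standard easing polynomial: 6t⁵ − 15t⁴ + 10t³ has zero first and second derivatives at both ends, which is exactly the C2 continuity the commit claims — chained viseme crossfades show no velocity or acceleration kinks. Only the surrounding usage names are hypothetical:

```cpp
#include <algorithm>

// Quintic "smootherstep": 6t^5 - 15t^4 + 10t^3, C2 continuous at t=0 and
// t=1. Used as the full-duration viseme crossfade weight (no hold phase).
float Smootherstep(float T)
{
    T = std::clamp(T, 0.f, 1.f);
    return T * T * T * (T * (T * 6.f - 15.f) + 10.f);
}

// Weights for the incoming/outgoing viseme pair at normalized crossfade
// time T in [0, 1]; they always sum to 1.
float CrossfadeIn(float T)  { return Smootherstep(T); }
float CrossfadeOut(float T) { return 1.f - Smootherstep(T); }
```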
aa3010bae8 v2.0.0: Syllable-core lip sync + pose-based phoneme mapping
Major rewrite of the lip sync system for natural, syllable-level
mouth animation using artist-crafted phoneme pose AnimSequences.

Pose system:
- ElevenLabsLipSyncPoseMap UDataAsset maps OVR visemes to AnimSequences
- Curve extraction at BeginPlay from MHF_AA, MHF_PBM, etc. poses
- 50+ CTRL_expressions curves per viseme for MetaHuman fidelity
- Editor factory for creating PoseMap assets via right-click menu

Syllable-core viseme filter:
- Only emit vowels (aa, E, ih, oh, ou) and lip-visible consonants
  (PP=P/B/M, FF=F/V, CH=CH/SH/J) — ~1-2 visemes per syllable
- Skip invisible tongue/throat consonants (DD, kk, SS, nn, RR, TH)
- Audio amplitude handles pauses naturally (no text-driven sil)
- Removes "frantic mouth" look from articulating every phoneme

French pronunciation fixes:
- Apostrophe handling: d'accord, s'il — char before ' not word-final
- E-muet fix for short words: je, de, le (≤2 chars) keep vowel
- Double consonant rules: cc, pp, ff, rr, bb, gg, dd

Silence detection (pose mode):
- Hybrid amplitude: per-window RMS for silence gating (32ms),
  chunk-level RMS for shape intensity (no jitter on 50+ curves)
- Intra-chunk pauses now visible (mouth properly closes during gaps)

Pose-mode optimizations:
- Skip spectral analysis (text visemes provide shape)
- Gentler viseme smoothing (attack ×0.7, release ×0.45)
- Skip blendshape double-smoothing (avoids vibration)
- Higher viseme weight threshold (0.05 vs 0.001)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 13:47:10 +01:00
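The syllable-core filter above is essentially a membership test: keep the five vowel visemes and the three lip-visible consonant groups, drop everything articulated by tongue or throat. A minimal sketch using the OVR viseme names the commit references (the function name is illustrative):

```cpp
#include <set>
#include <string>

// Keep only vowel visemes and lip-visible consonant groups; invisible
// tongue/throat consonants (DD, kk, SS, nn, RR, TH) and text-driven "sil"
// are dropped — audio amplitude handles pauses instead.
bool IsLipVisibleViseme(const std::string& Viseme)
{
    static const std::set<std::string> Kept = {
        "aa", "E", "ih", "oh", "ou",  // vowels: ~1 per syllable core
        "PP",                          // P / B / M   (lips close)
        "FF",                          // F / V       (lip-teeth contact)
        "CH",                          // CH / SH / J (lip rounding)
    };
    return Kept.count(Viseme) != 0;
}
```

Filtering the stream through this yields roughly 1-2 visemes per syllable, which is what removes the "frantic mouth" look of articulating every phoneme.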
6543bc6785 v1.9.1: Fix audio loss after interruption, instant audio stop, lip sync reset
- Fix event_id filtering bug: reset LastInterruptEventId when new generation
  starts, preventing all audio from being silently dropped after an interruption
- Match C++ sample API config: remove optimize_streaming_latency and
  custom_llm_extra_body overrides, send empty conversation_config_override
  in Server VAD mode (only send turn_timeout in Client mode)
- Instant audio stop on interruption: call ResetAudio() before Stop() to
  flush USoundWaveProcedural's internal ring buffer
- Lip sync reset on interruption/stop: bind OnAgentInterrupted (snap to
  neutral) and OnAgentStoppedSpeaking (clear queues) events
- Revert jitter buffer (replaced by pre-buffer approach, default 2000ms)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 09:48:56 +01:00
c2142f3e6b v1.9.0: Fix audio gaps, pre-buffer, and lip sync neutral pose
- Remove silence padding accumulation bug: QueueAudio'd silence was
  accumulating in USoundWaveProcedural's internal buffer during TTS gaps,
  delaying real audio by ~800ms. USoundWaveProcedural with
  INDEFINITELY_LOOPING_DURATION generates silence internally instead.
- Fix pre-buffer bypass: guard OnProceduralUnderflow with bPreBuffering
  check — the audio component never stops (INDEFINITELY_LOOPING_DURATION)
  so it was draining AudioQueue during pre-buffering, defeating it entirely.
- Audio pre-buffer default 2000ms (max 4000ms) to absorb ElevenLabs
  server-side TTS inter-chunk gaps (~2s between chunks confirmed).
- Add diagnostic timestamps [T+Xs] in HandleAudioReceived and
  AudioQueue DRY/recovered logs for debugging audio pipeline timing.
- Fix lip sync not returning to neutral: add snap-to-zero (< 0.01)
  in blendshape smoothing pass and clean up PreviousBlendshapes to
  prevent asymptotic Lerp residuals keeping mouth slightly open.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 20:37:23 +01:00
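The snap-to-zero fix above composes naturally with asymmetric attack/release smoothing: a plain Lerp toward zero never actually reaches it, so residuals under a small epsilon are forced to exactly 0. A sketch under illustrative names; the attack/release constants are plausible values from this log's range, not the shipped defaults:

```cpp
#include <cmath>

// Asymmetric blendshape smoothing with snap-to-zero: faster attack than
// release, and values under SnapEpsilon clamp to exactly 0 so an
// asymptotic Lerp residual can't hold the mouth slightly open.
float SmoothBlendshape(float Previous, float Target,
                       float Attack = 0.55f, float Release = 0.40f,
                       float SnapEpsilon = 0.01f)
{
    const float Alpha = (Target > Previous) ? Attack : Release;
    float Value = Previous + (Target - Previous) * Alpha;   // Lerp step
    if (std::fabs(Value) < SnapEpsilon)
        Value = 0.f;                                        // kill residuals
    return Value;
}
```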
7dfffdbad8 Lip sync v2: text persistence across TTS chunks, audio pre-buffering, smoothing fixes
- Fix text erasure between TTS audio chunks (bFullTextReceived guard):
  partial text now persists across all chunks of the same utterance instead
  of being erased after chunk 1's queue empties
- Add audio pre-buffering (AudioPreBufferMs, default 250ms) to absorb TTS
  inter-chunk gaps and eliminate mid-sentence audio pauses
- Lip sync pauses viseme queue consumption during pre-buffer to stay in sync
- Inter-frame interpolation (lerp between consumed and next queued frame)
  for smoother mouth transitions instead of 32ms step-wise jumps
- Reduce double-smoothing (blendshape smooth 0.8→0.4, release 0.5→0.65)
- Adjust duration weights (vowels 2.0/1.7, plosives 0.8, silence 1.0)
- UI range refinement (AmplitudeScale 0.5-1.0, SmoothingSpeed 35-65)
- Silence padding capped at 512 samples (32ms) to prevent buffer accumulation
- Audio playback restart on buffer underrun during speech
- Optimized log levels (most debug→Verbose, kept key diagnostics at Log)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 19:34:36 +01:00
ce7a146ce9 Commit Lipsync WIP 2026-02-22 15:54:53 +01:00
224af6a27b WIP: Add ElevenLabsLipSyncComponent with spectral analysis lip sync
Real-time lip sync component that performs client-side spectral analysis
on the agent's PCM audio stream (ElevenLabs doesn't provide viseme data).

Pipeline: 512-point FFT (16kHz) → 5 frequency bands → 15 OVR visemes
→ ARKit blendshapes (MetaHuman compatible) → auto-apply morph targets.

Currently uses SetMorphTarget() which may be overridden by MetaHuman's
Face AnimBP — face animation not yet working. Debug logs added to
diagnose: audio flow, spectrum energy, morph target name matching.

Next steps: verify debug output, fix MetaHuman morph target override
(likely needs AnimBP integration like Convai approach).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-22 11:23:34 +01:00
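The first stage of the spectral pipeline above — extracting an amplitude envelope from the 16 kHz PCM stream before the 512-point FFT and band grouping — can be sketched in isolation. This covers only the per-window RMS step (illustrative, not the plugin source):

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Per-window RMS of 16-bit PCM, normalized to [0, 1]. At 16 kHz a 512-sample
// window corresponds to the 32 ms analysis frames mentioned in this log.
// The real component follows this with an FFT and 5-band energy grouping.
float WindowRms(const std::vector<int16_t>& Pcm)
{
    if (Pcm.empty()) return 0.f;
    double Sum = 0.0;
    for (int16_t S : Pcm)
    {
        const double N = S / 32768.0;     // normalize sample to [-1, 1)
        Sum += N * N;
    }
    return static_cast<float>(std::sqrt(Sum / Pcm.size()));
}
```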