v2.0.0: Syllable-core lip sync + pose-based phoneme mapping

Major rewrite of the lip sync system for natural, syllable-level
mouth animation using artist-crafted phoneme pose AnimSequences.

Pose system:
- ElevenLabsLipSyncPoseMap UDataAsset maps OVR visemes to AnimSequences
- Curve extraction at BeginPlay from MHF_AA, MHF_PBM, etc. poses
- 50+ CTRL_expressions curves per viseme for MetaHuman fidelity
- Editor factory for creating PoseMap assets via right-click menu
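The pose-map lookup can be sketched outside Unreal with plain std containers; the class and curve names here are hypothetical stand-ins for the real UDataAsset and its extracted AnimSequence curves:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for one extracted pose: a single frame of
// CTRL_expressions curve values (the real asset holds 50+ per viseme).
using CurveSet = std::map<std::string, float>;

// Built once at BeginPlay: OVR viseme name -> curves sampled from its pose.
class PoseMap {
public:
    void AddPose(const std::string& viseme, CurveSet curves) {
        Visemes[viseme] = std::move(curves);
    }
    // Returns 0 when the viseme has no pose or the curve is absent,
    // so unmapped visemes simply contribute nothing to the face.
    float Curve(const std::string& viseme, const std::string& curve) const {
        auto v = Visemes.find(viseme);
        if (v == Visemes.end()) return 0.f;
        auto c = v->second.find(curve);
        return c == v->second.end() ? 0.f : c->second;
    }
private:
    std::map<std::string, CurveSet> Visemes;
};
```

Extracting once at BeginPlay trades a little startup cost for curve lookups that are allocation-free at runtime.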

Syllable-core viseme filter:
- Only emit vowels (aa, E, ih, oh, ou) and lip-visible consonants
  (PP=P/B/M, FF=F/V, CH=CH/SH/J) — ~1-2 visemes per syllable
- Skip invisible tongue/throat consonants (DD, kk, SS, nn, RR, TH)
- Audio amplitude handles pauses naturally (no text-driven sil)
- Removes the "frantic mouth" look caused by articulating every phoneme
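The whitelist above is small enough to express directly; a minimal sketch (function name is illustrative, not from the codebase):

```cpp
#include <cassert>
#include <set>
#include <string>

// Syllable-core filter: keep only the lip-visible visemes listed in the
// commit. Tongue/throat consonants (DD, kk, SS, nn, RR, TH) and the
// text-driven "sil" are dropped; silence comes from audio amplitude.
bool IsLipVisible(const std::string& viseme) {
    static const std::set<std::string> Visible = {
        "aa", "E", "ih", "oh", "ou",  // vowels
        "PP", "FF", "CH"              // P/B/M, F/V, CH/SH/J
    };
    return Visible.count(viseme) > 0;
}
```

Filtering at emission time (rather than post-smoothing) means the skipped consonants never pull the mouth shape at all, which is what yields roughly one or two visemes per syllable.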

French pronunciation fixes:
- Apostrophe handling: in d'accord, s'il the char before ' is not treated as word-final
- E-muet fix for short words: je, de, le (≤2 chars) keep vowel
- Double consonant rules: cc, pp, ff, rr, bb, gg, dd
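The apostrophe and e-muet rules can be sketched as two word-level predicates; these helpers are illustrative, assuming words arrive pre-tokenized with the apostrophe kept in place:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Apostrophe rule: in "d'accord" or "s'il" the character before the
// apostrophe is an elided onset, not a word-final letter, so final-letter
// pronunciation rules must not fire on it.
bool IsWordFinal(const std::string& word, std::size_t i) {
    if (i + 1 < word.size() && word[i + 1] == '\'') return false;
    return i + 1 == word.size();
}

// E-muet rule: a trailing 'e' is normally silent in French, but short
// function words ("je", "de", "le": <= 2 chars) keep the vowel.
bool KeepsFinalVowel(const std::string& word) {
    if (word.empty() || word.back() != 'e') return true;  // no e-muet at all
    return word.size() <= 2;  // je, de, le
}
```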

Silence detection (pose mode):
- Hybrid amplitude: per-window RMS for silence gating (32ms),
  chunk-level RMS for shape intensity (no jitter on 50+ curves)
- Intra-chunk pauses now visible (mouth properly closes during gaps)
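The hybrid split can be shown in one pass over a chunk: short windows gate silence, while a single chunk-level RMS drives intensity so the 50+ curves see one smooth value instead of per-window jitter. Names and the struct layout are assumptions, not the actual API:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct AmplitudeResult {
    std::vector<bool> WindowSilent;  // one flag per ~32 ms window
    float ChunkIntensity = 0.f;      // single smooth intensity for the chunk
};

AmplitudeResult AnalyzeChunk(const std::vector<float>& samples,
                             std::size_t windowSize, float silenceRms) {
    AmplitudeResult out;
    double chunkSum = 0.0;
    for (std::size_t start = 0; start < samples.size(); start += windowSize) {
        const std::size_t end = std::min(start + windowSize, samples.size());
        double winSum = 0.0;
        for (std::size_t i = start; i < end; ++i)
            winSum += double(samples[i]) * samples[i];
        chunkSum += winSum;
        // Per-window RMS: fine-grained enough to close the mouth in pauses.
        const float rms = float(std::sqrt(winSum / double(end - start)));
        out.WindowSilent.push_back(rms < silenceRms);
    }
    if (!samples.empty())  // chunk-level RMS: no jitter across windows
        out.ChunkIntensity = float(std::sqrt(chunkSum / double(samples.size())));
    return out;
}
```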

Pose-mode optimizations:
- Skip spectral analysis (text visemes provide shape)
- Gentler viseme smoothing (attack ×0.7, release ×0.45)
- Skip blendshape double-smoothing (avoids vibration)
- Higher viseme weight threshold (0.05 vs 0.001)
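The gentler smoothing and the raised threshold combine naturally in one asymmetric-rate update; this is a sketch under assumed base rates, using only the multipliers (attack x0.7, release x0.45) and the 0.05 cutoff stated above:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Asymmetric exponential-style smoothing toward a target viseme weight.
// Pose mode scales the base rates down (x0.7 attack, x0.45 release) and
// snaps near-zero weights to exactly zero (threshold 0.05 vs 0.001).
float SmoothVisemeWeight(float current, float target, float dt,
                         float attackRate, float releaseRate) {
    const float rate = (target > current) ? attackRate * 0.7f
                                          : releaseRate * 0.45f;
    const float alpha = std::min(rate * dt, 1.0f);
    const float next = current + (target - current) * alpha;
    // Raised weight threshold: cull tiny residual weights outright.
    return (next < 0.05f && target < 0.05f) ? 0.0f : next;
}
```

Skipping the second (blendshape-level) smoothing pass then leaves this as the only filter in the chain, which is what removes the visible vibration.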

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
j.foucher 2026-02-24 13:47:10 +01:00
parent 6543bc6785
commit aa3010bae8
232 changed files with 645 additions and 107 deletions
