Rename plugin ElevenLabs → PS_AI_ConvAgent for multi-backend support

Separates generic systems (lip sync, posture, facial expression, mic)
from ElevenLabs-specific ones (conv agent, websocket, settings).
Generic classes now use PS_AI_Agent_ prefix; ElevenLabs-specific keep
_ElevenLabs suffix. CoreRedirects in DefaultEngine.ini ensure existing
.uasset references load correctly. Binaries/Intermediate deleted for
full rebuild.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
j.foucher 2026-02-26 17:19:35 +01:00
parent 8a4622f348
commit eeb0b1172b
49 changed files with 1057 additions and 1004 deletions

View File

@ -1,5 +1,11 @@
# PS_AI_Agent — Notes for Claude
## Plugin Rename (completed)
- Plugin: `PS_AI_Agent_ElevenLabs` → `PS_AI_ConvAgent`
- Generic classes: `ElevenLabs*` → `PS_AI_Agent_*` (e.g. `UPS_AI_Agent_PostureComponent`)
- ElevenLabs-specific classes keep `_ElevenLabs` suffix (e.g. `UPS_AI_Agent_Conv_ElevenLabsComponent`)
- CoreRedirects in DefaultEngine.ini handle .uasset references
## Posture System — Diagonal Tilt Bug (to fix)
### Problem
@ -12,15 +18,17 @@ The swing-twist is done a single time on the tip bone (head), then the CleanOffs
Do the swing-twist **PER-BONE**: for each bone in the chain, compose the FractionalRot with the bone's own rotation, swing-twist around the bone's own tilt axis, then apply only the swing.
### Files involved
- `AnimNode_ElevenLabsPosture.cpp` — Evaluate_AnyThread, section multi-bone chain
- `AnimNode_PS_AI_Agent_Posture.cpp` — Evaluate_AnyThread, section multi-bone chain
- Current commit: `8df6967` on main
### Current neck chain config (BP_Taro)
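The per-bone fix hinges on swing-twist decomposition. A minimal self-contained sketch in plain C++ of that core operation (in the plugin this corresponds to UE's `FQuat::ToSwingTwist`; the `Quat`/`Vec3` types and all names here are illustrative, not the actual plugin API):

```cpp
#include <cmath>

struct Quat { float X, Y, Z, W; };
struct Vec3 { float X, Y, Z; };

static float Dot(Vec3 A, Vec3 B) { return A.X*B.X + A.Y*B.Y + A.Z*B.Z; }

// Decompose Q into Swing * Twist, where Twist rotates around the unit TwistAxis.
// Per-bone, Q would be FractionalRot composed with the bone's own rotation, and
// TwistAxis the bone's own tilt axis; only Swing is then applied to the bone.
void ToSwingTwist(const Quat& Q, const Vec3& TwistAxis, Quat& Swing, Quat& Twist)
{
    // Project the rotation axis of Q onto the twist axis.
    const float P = Dot({Q.X, Q.Y, Q.Z}, TwistAxis);
    Twist = { TwistAxis.X * P, TwistAxis.Y * P, TwistAxis.Z * P, Q.W };

    // Normalize the twist (falls back to identity near the 180° singularity).
    const float Len = std::sqrt(Twist.X*Twist.X + Twist.Y*Twist.Y +
                                Twist.Z*Twist.Z + Twist.W*Twist.W);
    if (Len > 1e-6f) { Twist.X /= Len; Twist.Y /= Len; Twist.Z /= Len; Twist.W /= Len; }
    else             { Twist = {0.0f, 0.0f, 0.0f, 1.0f}; }

    // Swing = Q * Twist^-1 (conjugate, since Twist is unit length).
    const Quat TInv = { -Twist.X, -Twist.Y, -Twist.Z, Twist.W };
    Swing = {
        Q.W*TInv.X + Q.X*TInv.W + Q.Y*TInv.Z - Q.Z*TInv.Y,
        Q.W*TInv.Y - Q.X*TInv.Z + Q.Y*TInv.W + Q.Z*TInv.X,
        Q.W*TInv.Z + Q.X*TInv.Y - Q.Y*TInv.X + Q.Z*TInv.W,
        Q.W*TInv.W - Q.X*TInv.X - Q.Y*TInv.Y - Q.Z*TInv.Z
    };
}
```

Doing this once per chain bone, around each bone's own axis, is what keeps the twist component from accumulating into the diagonal tilt.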
neck_01=0.25, neck_02=0.35, head=0.40
## Architecture Posture
- **ElevenLabsPostureComponent** (game thread): Eyes→Head→Body cascade, produces FQuat + eye curves
- **AnimNode_ElevenLabsPosture** (anim thread): applies rotation to bone(s) + injects curves
- **PS_AI_Agent_PostureComponent** (game thread): Eyes→Head→Body cascade, produces FQuat + eye curves
- **AnimNode_PS_AI_Agent_Posture** (anim thread): applies rotation to bone(s) + injects curves
- 100% FQuat pipeline (no FRotator round-trip)
- Thread safety via FCriticalSection
- ARKit eye curves normalized by fixed range (30°/20°), not by MaxEye threshold
- ARKit eye curves normalized by fixed range (40°/35°), not by MaxEye threshold
- Body drift compensation: applied ONLY on first chain bone, AFTER swing-twist
- Debug gaze: per-eye lines from Face mesh using Z-axis (GetAxisZ())
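The fixed-range normalization above can be sketched as a plain C++ helper: eye yaw/pitch deltas in degrees are divided by a constant range (40°/35°) and clamped, rather than scaled by the MaxEye threshold. The struct, function name, and sign conventions are illustrative, not the actual plugin code:

```cpp
#include <algorithm>

// Hypothetical ARKit look-curve output (one curve per gaze direction).
struct FEyeCurves { float LookLeft, LookRight, LookUp, LookDown; };

FEyeCurves NormalizeEyeCurves(float YawDeg, float PitchDeg)
{
    const float YawRange   = 40.0f; // fixed horizontal range (deg)
    const float PitchRange = 35.0f; // fixed vertical range (deg)

    auto Clamp01 = [](float V) { return std::clamp(V, 0.0f, 1.0f); };

    // Assumed conventions: negative yaw = look left, positive pitch = look up.
    FEyeCurves Out{};
    Out.LookLeft  = Clamp01(-YawDeg / YawRange);
    Out.LookRight = Clamp01( YawDeg / YawRange);
    Out.LookUp    = Clamp01( PitchDeg / PitchRange);
    Out.LookDown  = Clamp01(-PitchDeg / PitchRange);
    return Out;
}
```

Using a fixed range keeps curve magnitudes stable regardless of the MaxEye clamp configured on the component.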

View File

@ -80,6 +80,51 @@ FontDPI=72
+ActiveGameNameRedirects=(OldGameName="TP_Blank",NewGameName="/Script/PS_AI_Agent")
+ActiveGameNameRedirects=(OldGameName="/Script/TP_Blank",NewGameName="/Script/PS_AI_Agent")
[CoreRedirects]
; ── Plugin module rename ──
+PackageRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs", NewName="/Script/PS_AI_ConvAgent")
+PackageRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabsEditor", NewName="/Script/PS_AI_ConvAgentEditor")
; ── Generic classes (ElevenLabs → PS_AI_Agent) ──
+ClassRedirects=(OldName="ElevenLabsLipSyncComponent", NewName="PS_AI_Agent_LipSyncComponent")
+ClassRedirects=(OldName="ElevenLabsFacialExpressionComponent", NewName="PS_AI_Agent_FacialExpressionComponent")
+ClassRedirects=(OldName="ElevenLabsPostureComponent", NewName="PS_AI_Agent_PostureComponent")
+ClassRedirects=(OldName="ElevenLabsMicrophoneCaptureComponent", NewName="PS_AI_Agent_MicrophoneCaptureComponent")
+ClassRedirects=(OldName="ElevenLabsLipSyncPoseMap", NewName="PS_AI_Agent_LipSyncPoseMap")
+ClassRedirects=(OldName="ElevenLabsEmotionPoseMap", NewName="PS_AI_Agent_EmotionPoseMap")
; ── ElevenLabs-specific classes ──
+ClassRedirects=(OldName="ElevenLabsConversationalAgentComponent", NewName="PS_AI_Agent_Conv_ElevenLabsComponent")
+ClassRedirects=(OldName="ElevenLabsWebSocketProxy", NewName="PS_AI_Agent_WebSocket_ElevenLabsProxy")
+ClassRedirects=(OldName="ElevenLabsSettings", NewName="PS_AI_Agent_Settings_ElevenLabs")
; ── AnimNode structs ──
+StructRedirects=(OldName="AnimNode_ElevenLabsLipSync", NewName="AnimNode_PS_AI_Agent_LipSync")
+StructRedirects=(OldName="AnimNode_ElevenLabsFacialExpression", NewName="AnimNode_PS_AI_Agent_FacialExpression")
+StructRedirects=(OldName="AnimNode_ElevenLabsPosture", NewName="AnimNode_PS_AI_Agent_Posture")
; ── AnimGraphNode classes ──
+ClassRedirects=(OldName="AnimGraphNode_ElevenLabsLipSync", NewName="AnimGraphNode_PS_AI_Agent_LipSync")
+ClassRedirects=(OldName="AnimGraphNode_ElevenLabsFacialExpression", NewName="AnimGraphNode_PS_AI_Agent_FacialExpression")
+ClassRedirects=(OldName="AnimGraphNode_ElevenLabsPosture", NewName="AnimGraphNode_PS_AI_Agent_Posture")
; ── Factory classes ──
+ClassRedirects=(OldName="ElevenLabsLipSyncPoseMapFactory", NewName="PS_AI_Agent_LipSyncPoseMapFactory")
+ClassRedirects=(OldName="ElevenLabsEmotionPoseMapFactory", NewName="PS_AI_Agent_EmotionPoseMapFactory")
; ── Structs ──
+StructRedirects=(OldName="ElevenLabsEmotionPoseSet", NewName="PS_AI_Agent_EmotionPoseSet")
+StructRedirects=(OldName="ElevenLabsNeckBoneEntry", NewName="PS_AI_Agent_NeckBoneEntry")
+StructRedirects=(OldName="ElevenLabsConversationInfo", NewName="PS_AI_Agent_ConversationInfo_ElevenLabs")
+StructRedirects=(OldName="ElevenLabsTranscriptSegment", NewName="PS_AI_Agent_TranscriptSegment_ElevenLabs")
+StructRedirects=(OldName="ElevenLabsClientToolCall", NewName="PS_AI_Agent_ClientToolCall_ElevenLabs")
; ── Enums ──
+EnumRedirects=(OldName="EElevenLabsConnectionState", NewName="EPS_AI_Agent_ConnectionState_ElevenLabs")
+EnumRedirects=(OldName="EElevenLabsTurnMode", NewName="EPS_AI_Agent_TurnMode_ElevenLabs")
+EnumRedirects=(OldName="EElevenLabsEmotion", NewName="EPS_AI_Agent_Emotion")
+EnumRedirects=(OldName="EElevenLabsEmotionIntensity", NewName="EPS_AI_Agent_EmotionIntensity")
[/Script/AndroidFileServerEditor.AndroidFileServerRuntimeSettings]
bEnablePlugin=True
bAllowNetworkConnection=True

View File

@ -19,7 +19,7 @@
]
},
{
"Name": "PS_AI_Agent_ElevenLabs",
"Name": "PS_AI_ConvAgent",
"Enabled": true
},
{

View File

@ -1,50 +0,0 @@
// Copyright ASTERION. All Rights Reserved.
#include "PS_AI_Agent_ElevenLabs.h"
#include "Developer/Settings/Public/ISettingsModule.h"
#include "UObject/UObjectGlobals.h"
#include "UObject/Package.h"
IMPLEMENT_MODULE(FPS_AI_Agent_ElevenLabsModule, PS_AI_Agent_ElevenLabs)
#define LOCTEXT_NAMESPACE "PS_AI_Agent_ElevenLabs"
void FPS_AI_Agent_ElevenLabsModule::StartupModule()
{
Settings = NewObject<UElevenLabsSettings>(GetTransientPackage(), "ElevenLabsSettings", RF_Standalone);
Settings->AddToRoot();
if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
{
SettingsModule->RegisterSettings(
"Project", "Plugins", "ElevenLabsAIAgent",
LOCTEXT("SettingsName", "ElevenLabs AI Agent"),
LOCTEXT("SettingsDescription", "Configure the ElevenLabs Conversational AI Agent plugin"),
Settings);
}
}
void FPS_AI_Agent_ElevenLabsModule::ShutdownModule()
{
if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
{
SettingsModule->UnregisterSettings("Project", "Plugins", "ElevenLabsAIAgent");
}
if (!GExitPurge)
{
Settings->RemoveFromRoot();
}
else
{
Settings = nullptr;
}
}
UElevenLabsSettings* FPS_AI_Agent_ElevenLabsModule::GetSettings() const
{
check(Settings);
return Settings;
}
#undef LOCTEXT_NAMESPACE

View File

@ -1,32 +0,0 @@
// Copyright ASTERION. All Rights Reserved.
#include "AnimGraphNode_ElevenLabsFacialExpression.h"
#define LOCTEXT_NAMESPACE "AnimNode_ElevenLabsFacialExpression"
FText UAnimGraphNode_ElevenLabsFacialExpression::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
return LOCTEXT("NodeTitle", "ElevenLabs Facial Expression");
}
FText UAnimGraphNode_ElevenLabsFacialExpression::GetTooltipText() const
{
return LOCTEXT("Tooltip",
"Injects emotion expression curves from the ElevenLabs Facial Expression component.\n\n"
"Place this node BEFORE the ElevenLabs Lip Sync node in the MetaHuman Face AnimBP.\n"
"It outputs CTRL_expressions_* curves for eyes, eyebrows, cheeks, and mouth mood.\n"
"The Lip Sync node placed after will override mouth-area curves during speech.");
}
FString UAnimGraphNode_ElevenLabsFacialExpression::GetNodeCategory() const
{
return TEXT("ElevenLabs");
}
FLinearColor UAnimGraphNode_ElevenLabsFacialExpression::GetNodeTitleColor() const
{
// Warm amber to distinguish from Lip Sync (teal)
return FLinearColor(0.8f, 0.5f, 0.1f, 1.0f);
}
#undef LOCTEXT_NAMESPACE

View File

@ -1,32 +0,0 @@
// Copyright ASTERION. All Rights Reserved.
#include "AnimGraphNode_ElevenLabsLipSync.h"
#define LOCTEXT_NAMESPACE "AnimNode_ElevenLabsLipSync"
FText UAnimGraphNode_ElevenLabsLipSync::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
return LOCTEXT("NodeTitle", "ElevenLabs Lip Sync");
}
FText UAnimGraphNode_ElevenLabsLipSync::GetTooltipText() const
{
return LOCTEXT("Tooltip",
"Injects lip sync animation curves from the ElevenLabs Lip Sync component.\n\n"
"Place this node BEFORE the Control Rig in the MetaHuman Face AnimBP.\n"
"It auto-discovers the component and outputs CTRL_expressions_* curves\n"
"that the MetaHuman Control Rig uses to drive facial bones.");
}
FString UAnimGraphNode_ElevenLabsLipSync::GetNodeCategory() const
{
return TEXT("ElevenLabs");
}
FLinearColor UAnimGraphNode_ElevenLabsLipSync::GetNodeTitleColor() const
{
// ElevenLabs brand-ish teal color
return FLinearColor(0.1f, 0.7f, 0.6f, 1.0f);
}
#undef LOCTEXT_NAMESPACE

View File

@ -1,33 +0,0 @@
// Copyright ASTERION. All Rights Reserved.
#include "AnimGraphNode_ElevenLabsPosture.h"
#define LOCTEXT_NAMESPACE "AnimNode_ElevenLabsPosture"
FText UAnimGraphNode_ElevenLabsPosture::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
return LOCTEXT("NodeTitle", "ElevenLabs Posture");
}
FText UAnimGraphNode_ElevenLabsPosture::GetTooltipText() const
{
return LOCTEXT("Tooltip",
"Injects head rotation and eye gaze curves from the ElevenLabs Posture component.\n\n"
"Place this node AFTER the ElevenLabs Facial Expression node and\n"
"BEFORE the ElevenLabs Lip Sync node in the MetaHuman Face AnimBP.\n\n"
"The component distributes look-at rotation across body (actor yaw),\n"
"head (bone rotation), and eyes (ARKit curves) for a natural look-at effect.");
}
FString UAnimGraphNode_ElevenLabsPosture::GetNodeCategory() const
{
return TEXT("ElevenLabs");
}
FLinearColor UAnimGraphNode_ElevenLabsPosture::GetNodeTitleColor() const
{
// Cool blue to distinguish from Facial Expression (amber) and Lip Sync (teal)
return FLinearColor(0.2f, 0.4f, 0.9f, 1.0f);
}
#undef LOCTEXT_NAMESPACE

View File

@ -1,29 +0,0 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsEmotionPoseMapFactory.h"
#include "ElevenLabsEmotionPoseMap.h"
#include "AssetTypeCategories.h"
UElevenLabsEmotionPoseMapFactory::UElevenLabsEmotionPoseMapFactory()
{
SupportedClass = UElevenLabsEmotionPoseMap::StaticClass();
bCreateNew = true;
bEditAfterNew = true;
}
UObject* UElevenLabsEmotionPoseMapFactory::FactoryCreateNew(
UClass* Class, UObject* InParent, FName Name, EObjectFlags Flags,
UObject* Context, FFeedbackContext* Warn)
{
return NewObject<UElevenLabsEmotionPoseMap>(InParent, Class, Name, Flags);
}
FText UElevenLabsEmotionPoseMapFactory::GetDisplayName() const
{
return FText::FromString(TEXT("ElevenLabs Emotion Pose Map"));
}
uint32 UElevenLabsEmotionPoseMapFactory::GetMenuCategories() const
{
return EAssetTypeCategories::Misc;
}

View File

@ -1,29 +0,0 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsLipSyncPoseMapFactory.h"
#include "ElevenLabsLipSyncPoseMap.h"
#include "AssetTypeCategories.h"
UElevenLabsLipSyncPoseMapFactory::UElevenLabsLipSyncPoseMapFactory()
{
SupportedClass = UElevenLabsLipSyncPoseMap::StaticClass();
bCreateNew = true;
bEditAfterNew = true;
}
UObject* UElevenLabsLipSyncPoseMapFactory::FactoryCreateNew(
UClass* Class, UObject* InParent, FName Name, EObjectFlags Flags,
UObject* Context, FFeedbackContext* Warn)
{
return NewObject<UElevenLabsLipSyncPoseMap>(InParent, Class, Name, Flags);
}
FText UElevenLabsLipSyncPoseMapFactory::GetDisplayName() const
{
return FText::FromString(TEXT("ElevenLabs Lip Sync Pose Map"));
}
uint32 UElevenLabsLipSyncPoseMapFactory::GetMenuCategories() const
{
return EAssetTypeCategories::Misc;
}

View File

@ -1,16 +0,0 @@
// Copyright ASTERION. All Rights Reserved.
#include "Modules/ModuleManager.h"
/**
* Editor module for PS_AI_Agent_ElevenLabs plugin.
* Provides AnimGraph node(s) for the ElevenLabs Lip Sync system.
*/
class FPS_AI_Agent_ElevenLabsEditorModule : public IModuleInterface
{
public:
virtual void StartupModule() override {}
virtual void ShutdownModule() override {}
};
IMPLEMENT_MODULE(FPS_AI_Agent_ElevenLabsEditorModule, PS_AI_Agent_ElevenLabsEditor)

View File

@ -1,31 +0,0 @@
// Copyright ASTERION. All Rights Reserved.
#pragma once
#include "CoreMinimal.h"
#include "AnimGraphNode_Base.h"
#include "AnimNode_ElevenLabsFacialExpression.h"
#include "AnimGraphNode_ElevenLabsFacialExpression.generated.h"
/**
* AnimGraph editor node for the ElevenLabs Facial Expression AnimNode.
*
* This node appears in the AnimBP graph editor under the "ElevenLabs" category.
* Place it BEFORE the ElevenLabs Lip Sync node in the MetaHuman Face AnimBP.
* It auto-discovers the ElevenLabsFacialExpressionComponent on the owning Actor
* and injects CTRL_expressions_* curves for emotion-driven facial expressions.
*/
UCLASS()
class UAnimGraphNode_ElevenLabsFacialExpression : public UAnimGraphNode_Base
{
GENERATED_BODY()
UPROPERTY(EditAnywhere, Category = "Settings")
FAnimNode_ElevenLabsFacialExpression Node;
// UAnimGraphNode_Base interface
virtual FText GetNodeTitle(ENodeTitleType::Type TitleType) const override;
virtual FText GetTooltipText() const override;
virtual FString GetNodeCategory() const override;
virtual FLinearColor GetNodeTitleColor() const override;
};

View File

@ -2,12 +2,12 @@
"FileVersion": 3,
"Version": 1,
"VersionName": "1.0.0",
"FriendlyName": "PS AI Agent - ElevenLabs",
"Description": "Integrates ElevenLabs Conversational AI Agent into Unreal Engine 5.5. Supports real-time voice conversation via WebSocket, microphone capture, and audio playback.",
"FriendlyName": "PS AI Conversational Agent",
"Description": "Conversational AI Agent framework for Unreal Engine 5.5. Supports lip sync, facial expressions, posture/look-at, and pluggable backends (ElevenLabs, etc.).",
"Category": "AI",
"CreatedBy": "ASTERION",
"CreatedByURL": "",
"DocsURL": "https://elevenlabs.io/docs/conversational-ai",
"DocsURL": "",
"MarketplaceURL": "",
"SupportURL": "",
"CanContainContent": false,
@ -16,7 +16,7 @@
"Installed": false,
"Modules": [
{
"Name": "PS_AI_Agent_ElevenLabs",
"Name": "PS_AI_ConvAgent",
"Type": "Runtime",
"LoadingPhase": "PreDefault",
"PlatformAllowList": [
@ -26,7 +26,7 @@
]
},
{
"Name": "PS_AI_Agent_ElevenLabsEditor",
"Name": "PS_AI_ConvAgentEditor",
"Type": "UncookedOnly",
"LoadingPhase": "Default",
"PlatformAllowList": [

View File

@ -2,9 +2,9 @@
using UnrealBuildTool;
public class PS_AI_Agent_ElevenLabs : ModuleRules
public class PS_AI_ConvAgent : ModuleRules
{
public PS_AI_Agent_ElevenLabs(ReadOnlyTargetRules Target) : base(Target)
public PS_AI_ConvAgent(ReadOnlyTargetRules Target) : base(Target)
{
DefaultBuildSettings = BuildSettingsVersion.Latest;
PCHUsage = PCHUsageMode.UseExplicitOrSharedPCHs;
@ -18,7 +18,7 @@ public class PS_AI_Agent_ElevenLabs : ModuleRules
// JSON serialization for WebSocket message payloads
"Json",
"JsonUtilities",
// WebSocket for ElevenLabs Conversational AI real-time API
// WebSocket for conversational AI real-time API
"WebSockets",
// HTTP for REST calls (agent metadata, auth, etc.)
"HTTP",

View File

@ -1,22 +1,22 @@
// Copyright ASTERION. All Rights Reserved.
#include "AnimNode_ElevenLabsFacialExpression.h"
#include "ElevenLabsFacialExpressionComponent.h"
#include "AnimNode_PS_AI_Agent_FacialExpression.h"
#include "PS_AI_Agent_FacialExpressionComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "Animation/AnimInstanceProxy.h"
#include "GameFramework/Actor.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsFacialExprAnimNode, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_FacialExprAnimNode, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// FAnimNode_Base interface
// ─────────────────────────────────────────────────────────────────────────────
void FAnimNode_ElevenLabsFacialExpression::Initialize_AnyThread(const FAnimationInitializeContext& Context)
void FAnimNode_PS_AI_Agent_FacialExpression::Initialize_AnyThread(const FAnimationInitializeContext& Context)
{
BasePose.Initialize(Context);
// Find the ElevenLabsFacialExpressionComponent on the owning actor.
// Find the PS_AI_Agent_FacialExpressionComponent on the owning actor.
// This runs during initialization (game thread) so actor access is safe.
FacialExpressionComponent.Reset();
CachedEmotionCurves.Reset();
@ -27,19 +27,19 @@ void FAnimNode_ElevenLabsFacialExpression::Initialize_AnyThread(const FAnimation
{
if (AActor* Owner = SkelMesh->GetOwner())
{
UElevenLabsFacialExpressionComponent* Comp =
Owner->FindComponentByClass<UElevenLabsFacialExpressionComponent>();
UPS_AI_Agent_FacialExpressionComponent* Comp =
Owner->FindComponentByClass<UPS_AI_Agent_FacialExpressionComponent>();
if (Comp)
{
FacialExpressionComponent = Comp;
UE_LOG(LogElevenLabsFacialExprAnimNode, Log,
TEXT("ElevenLabs Facial Expression AnimNode bound to component on %s."),
UE_LOG(LogPS_AI_Agent_FacialExprAnimNode, Log,
TEXT("PS AI Agent Facial Expression AnimNode bound to component on %s."),
*Owner->GetName());
}
else
{
UE_LOG(LogElevenLabsFacialExprAnimNode, Warning,
TEXT("No ElevenLabsFacialExpressionComponent found on %s. "
UE_LOG(LogPS_AI_Agent_FacialExprAnimNode, Warning,
TEXT("No PS_AI_Agent_FacialExpressionComponent found on %s. "
"Add the component alongside the Conversational Agent."),
*Owner->GetName());
}
@ -48,12 +48,12 @@ void FAnimNode_ElevenLabsFacialExpression::Initialize_AnyThread(const FAnimation
}
}
void FAnimNode_ElevenLabsFacialExpression::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
void FAnimNode_PS_AI_Agent_FacialExpression::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
{
BasePose.CacheBones(Context);
}
void FAnimNode_ElevenLabsFacialExpression::Update_AnyThread(const FAnimationUpdateContext& Context)
void FAnimNode_PS_AI_Agent_FacialExpression::Update_AnyThread(const FAnimationUpdateContext& Context)
{
BasePose.Update(Context);
@ -68,7 +68,7 @@ void FAnimNode_ElevenLabsFacialExpression::Update_AnyThread(const FAnimationUpda
}
}
void FAnimNode_ElevenLabsFacialExpression::Evaluate_AnyThread(FPoseContext& Output)
void FAnimNode_PS_AI_Agent_FacialExpression::Evaluate_AnyThread(FPoseContext& Output)
{
// Evaluate the upstream pose (pass-through)
BasePose.Evaluate(Output);
@ -84,9 +84,9 @@ void FAnimNode_ElevenLabsFacialExpression::Evaluate_AnyThread(FPoseContext& Outp
}
}
void FAnimNode_ElevenLabsFacialExpression::GatherDebugData(FNodeDebugData& DebugData)
void FAnimNode_PS_AI_Agent_FacialExpression::GatherDebugData(FNodeDebugData& DebugData)
{
FString DebugLine = FString::Printf(TEXT("ElevenLabs Facial Expression (%d curves)"), CachedEmotionCurves.Num());
FString DebugLine = FString::Printf(TEXT("PS AI Agent Facial Expression (%d curves)"), CachedEmotionCurves.Num());
DebugData.AddDebugItem(DebugLine);
BasePose.GatherDebugData(DebugData);
}

View File

@ -1,22 +1,22 @@
// Copyright ASTERION. All Rights Reserved.
#include "AnimNode_ElevenLabsLipSync.h"
#include "ElevenLabsLipSyncComponent.h"
#include "AnimNode_PS_AI_Agent_LipSync.h"
#include "PS_AI_Agent_LipSyncComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "Animation/AnimInstanceProxy.h"
#include "GameFramework/Actor.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsAnimNode, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_LipSyncAnimNode, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// FAnimNode_Base interface
// ─────────────────────────────────────────────────────────────────────────────
void FAnimNode_ElevenLabsLipSync::Initialize_AnyThread(const FAnimationInitializeContext& Context)
void FAnimNode_PS_AI_Agent_LipSync::Initialize_AnyThread(const FAnimationInitializeContext& Context)
{
BasePose.Initialize(Context);
// Find the ElevenLabsLipSyncComponent on the owning actor.
// Find the PS_AI_Agent_LipSyncComponent on the owning actor.
// This runs during initialization (game thread) so actor access is safe.
LipSyncComponent.Reset();
CachedCurves.Reset();
@ -27,19 +27,19 @@ void FAnimNode_ElevenLabsLipSync::Initialize_AnyThread(const FAnimationInitializ
{
if (AActor* Owner = SkelMesh->GetOwner())
{
UElevenLabsLipSyncComponent* Comp =
Owner->FindComponentByClass<UElevenLabsLipSyncComponent>();
UPS_AI_Agent_LipSyncComponent* Comp =
Owner->FindComponentByClass<UPS_AI_Agent_LipSyncComponent>();
if (Comp)
{
LipSyncComponent = Comp;
UE_LOG(LogElevenLabsAnimNode, Log,
TEXT("ElevenLabs Lip Sync AnimNode bound to component on %s."),
UE_LOG(LogPS_AI_Agent_LipSyncAnimNode, Log,
TEXT("PS AI Agent Lip Sync AnimNode bound to component on %s."),
*Owner->GetName());
}
else
{
UE_LOG(LogElevenLabsAnimNode, Warning,
TEXT("No ElevenLabsLipSyncComponent found on %s. "
UE_LOG(LogPS_AI_Agent_LipSyncAnimNode, Warning,
TEXT("No PS_AI_Agent_LipSyncComponent found on %s. "
"Add the component alongside the Conversational Agent."),
*Owner->GetName());
}
@ -48,12 +48,12 @@ void FAnimNode_ElevenLabsLipSync::Initialize_AnyThread(const FAnimationInitializ
}
}
void FAnimNode_ElevenLabsLipSync::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
void FAnimNode_PS_AI_Agent_LipSync::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
{
BasePose.CacheBones(Context);
}
void FAnimNode_ElevenLabsLipSync::Update_AnyThread(const FAnimationUpdateContext& Context)
void FAnimNode_PS_AI_Agent_LipSync::Update_AnyThread(const FAnimationUpdateContext& Context)
{
BasePose.Update(Context);
@ -69,7 +69,7 @@ void FAnimNode_ElevenLabsLipSync::Update_AnyThread(const FAnimationUpdateContext
}
}
void FAnimNode_ElevenLabsLipSync::Evaluate_AnyThread(FPoseContext& Output)
void FAnimNode_PS_AI_Agent_LipSync::Evaluate_AnyThread(FPoseContext& Output)
{
// Evaluate the upstream pose (pass-through)
BasePose.Evaluate(Output);
@ -87,9 +87,9 @@ void FAnimNode_ElevenLabsLipSync::Evaluate_AnyThread(FPoseContext& Output)
}
}
void FAnimNode_ElevenLabsLipSync::GatherDebugData(FNodeDebugData& DebugData)
void FAnimNode_PS_AI_Agent_LipSync::GatherDebugData(FNodeDebugData& DebugData)
{
FString DebugLine = FString::Printf(TEXT("ElevenLabs Lip Sync (%d curves)"), CachedCurves.Num());
FString DebugLine = FString::Printf(TEXT("PS AI Agent Lip Sync (%d curves)"), CachedCurves.Num());
DebugData.AddDebugItem(DebugLine);
BasePose.GatherDebugData(DebugData);
}

View File

@ -1,12 +1,12 @@
// Copyright ASTERION. All Rights Reserved.
#include "AnimNode_ElevenLabsPosture.h"
#include "ElevenLabsPostureComponent.h"
#include "AnimNode_PS_AI_Agent_Posture.h"
#include "PS_AI_Agent_PostureComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "Animation/AnimInstanceProxy.h"
#include "GameFramework/Actor.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsPostureAnimNode, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_PostureAnimNode, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// ARKit → MetaHuman CTRL eye curve mapping.
@ -77,7 +77,7 @@ static const TMap<FName, FName>& GetARKitToCTRLEyeMap()
// FAnimNode_Base interface
// ─────────────────────────────────────────────────────────────────────────────
void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializeContext& Context)
void FAnimNode_PS_AI_Agent_Posture::Initialize_AnyThread(const FAnimationInitializeContext& Context)
{
BasePose.Initialize(Context);
@ -103,20 +103,20 @@ void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializ
{
if (AActor* Owner = SkelMesh->GetOwner())
{
UElevenLabsPostureComponent* Comp =
Owner->FindComponentByClass<UElevenLabsPostureComponent>();
UPS_AI_Agent_PostureComponent* Comp =
Owner->FindComponentByClass<UPS_AI_Agent_PostureComponent>();
if (Comp)
{
PostureComponent = Comp;
HeadBoneName = Comp->GetHeadBoneName();
// Cache neck bone chain configuration
const TArray<FElevenLabsNeckBoneEntry>& Chain = Comp->GetNeckBoneChain();
const TArray<FPS_AI_Agent_NeckBoneEntry>& Chain = Comp->GetNeckBoneChain();
if (Chain.Num() > 0)
{
ChainBoneNames.Reserve(Chain.Num());
ChainBoneWeights.Reserve(Chain.Num());
for (const FElevenLabsNeckBoneEntry& Entry : Chain)
for (const FPS_AI_Agent_NeckBoneEntry& Entry : Chain)
{
if (!Entry.BoneName.IsNone() && Entry.Weight > 0.0f)
{
@ -126,8 +126,8 @@ void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializ
}
}
UE_LOG(LogElevenLabsPostureAnimNode, Log,
TEXT("ElevenLabs Posture AnimNode: Owner=%s Mesh=%s Head=%s Eyes=%s Chain=%d HeadComp=%.1f EyeComp=%.1f DriftComp=%.1f"),
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
TEXT("PS AI Agent Posture AnimNode: Owner=%s Mesh=%s Head=%s Eyes=%s Chain=%d HeadComp=%.1f EyeComp=%.1f DriftComp=%.1f"),
*Owner->GetName(), *SkelMesh->GetName(),
bApplyHeadRotation ? TEXT("ON") : TEXT("OFF"),
bApplyEyeCurves ? TEXT("ON") : TEXT("OFF"),
@ -138,8 +138,8 @@ void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializ
}
else
{
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
TEXT("No ElevenLabsPostureComponent found on %s."),
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT("No PS_AI_Agent_PostureComponent found on %s."),
*Owner->GetName());
}
}
@ -147,7 +147,7 @@ void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializ
}
}
void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
void FAnimNode_PS_AI_Agent_Posture::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
{
BasePose.CacheBones(Context);
@ -187,7 +187,7 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
: FQuat::Identity;
ChainRefPoseRotations.Add(RefRot);
UE_LOG(LogElevenLabsPostureAnimNode, Log,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
TEXT(" Chain bone [%d] '%s' → index %d (weight=%.2f)"),
i, *ChainBoneNames[i].ToString(), CompactIdx.GetInt(),
ChainBoneWeights[i]);
@ -196,7 +196,7 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
{
ChainBoneIndices.Add(FCompactPoseBoneIndex(INDEX_NONE));
ChainRefPoseRotations.Add(FQuat::Identity);
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT(" Chain bone [%d] '%s' NOT FOUND in skeleton!"),
i, *ChainBoneNames[i].ToString());
}
@ -217,20 +217,20 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
? RefPose[MeshIndex].GetRotation()
: FQuat::Identity;
UE_LOG(LogElevenLabsPostureAnimNode, Log,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
TEXT("Head bone '%s' resolved to index %d."),
*HeadBoneName.ToString(), HeadBoneIndex.GetInt());
}
else
{
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT("Head bone '%s' NOT FOUND in skeleton. Available bones:"),
*HeadBoneName.ToString());
const int32 NumBones = FMath::Min(RefSkeleton.GetNum(), 10);
for (int32 i = 0; i < NumBones; ++i)
{
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT(" [%d] %s"), i, *RefSkeleton.GetBoneName(i).ToString());
}
}
@ -258,13 +258,13 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
? RefPose[MeshIdx].GetRotation()
: FQuat::Identity;
UE_LOG(LogElevenLabsPostureAnimNode, Log,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
TEXT("Eye bone '%s' resolved to index %d."),
*BoneName.ToString(), OutIndex.GetInt());
}
else
{
UE_LOG(LogElevenLabsPostureAnimNode, Log,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
TEXT("Eye bone '%s' not found in skeleton (OK for Body AnimBP)."),
*BoneName.ToString());
}
@ -311,7 +311,7 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
RefAccumAboveChain = RefAccumAboveChain * RefRot;
}
UE_LOG(LogElevenLabsPostureAnimNode, Log,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
TEXT("Body drift: %d ancestor bones above chain. RefAccum=(%s)"),
AncestorBoneIndices.Num(),
*RefAccumAboveChain.Rotator().ToCompactString());
@ -320,7 +320,7 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
}
}
void FAnimNode_ElevenLabsPosture::Update_AnyThread(const FAnimationUpdateContext& Context)
void FAnimNode_PS_AI_Agent_Posture::Update_AnyThread(const FAnimationUpdateContext& Context)
{
BasePose.Update(Context);
@ -383,7 +383,7 @@ static FQuat ComputeCompensatedBoneRot(
return (CompensatedContrib * RefPoseRot).GetNormalized();
}
void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
void FAnimNode_PS_AI_Agent_Posture::Evaluate_AnyThread(FPoseContext& Output)
{
// Evaluate the upstream pose (pass-through)
BasePose.Evaluate(Output);
@ -398,7 +398,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
const bool bHasEyeBones = (LeftEyeBoneIndex.GetInt() != INDEX_NONE);
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT("[%s] Posture Evaluate: HeadComp=%.2f EyeComp=%.2f DriftComp=%.2f Valid=%s HeadRot=(%s) Eyes=%d Chain=%d Ancestors=%d"),
NodeRole,
CachedHeadCompensation,
@ -420,7 +420,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
const FQuat Delta = AnimRot * RefRot.Inverse();
const FRotator DeltaRot = Delta.Rotator();
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT(" Chain[0] '%s' AnimDelta from RefPose: Y=%.2f P=%.2f R=%.2f (this gets removed at Comp=1)"),
*ChainBoneNames[0].ToString(),
DeltaRot.Yaw, DeltaRot.Pitch, DeltaRot.Roll);
@ -454,7 +454,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
}
if (++EyeDiagLogCounter % 300 == 1)
{
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT("[EYE DIAG MODE 1] Forcing CTRL_expressions_eyeLookUpL=1.0 | Left eye should look UP if Control Rig reads CTRL curves"));
}
@ -473,7 +473,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
}
if (++EyeDiagLogCounter % 300 == 1)
{
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT("[EYE DIAG MODE 2] Forcing ARKit eyeLookUpLeft=1.0 | Left eye should look UP if mh_arkit_mapping_pose drives eyes"));
}
@ -494,7 +494,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
Output.Curve.Set(ZeroCTRL, 0.0f);
if (++EyeDiagLogCounter % 300 == 1)
{
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT("[EYE DIAG MODE 3] Forcing FACIAL_L_Eye bone -25° pitch | Left eye should look UP if bone rotation drives eyes"));
}
#endif
@ -576,7 +576,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
#if !UE_BUILD_SHIPPING
if (EvalDebugFrameCounter % 300 == 1 && CachedEyeCurves.Num() > 0)
{
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT(" Eyes: Comp=%.2f → %s (anim weight=%.0f%%, posture weight=%.0f%%)"),
Comp,
Comp > 0.001f ? TEXT("BLEND") : TEXT("PASSTHROUGH"),
@ -633,7 +633,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
if (++DiagLogCounter % 90 == 0)
{
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT("DIAG Phase %d: %s | Timer=%.1f"), Phase, PhaseName, DiagTimer);
}
}
@ -695,7 +695,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
if (EvalDebugFrameCounter % 300 == 1)
{
const FRotator CorrRot = FullCorrection.Rotator();
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
TEXT(" DriftCorrection: Y=%.1f P=%.1f R=%.1f | Comp=%.2f"),
CorrRot.Yaw, CorrRot.Pitch, CorrRot.Roll,
CachedBodyDriftCompensation);
@ -784,11 +784,11 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
#endif
}
void FAnimNode_ElevenLabsPosture::GatherDebugData(FNodeDebugData& DebugData)
void FAnimNode_PS_AI_Agent_Posture::GatherDebugData(FNodeDebugData& DebugData)
{
const FRotator DebugRot = CachedHeadRotation.Rotator();
FString DebugLine = FString::Printf(
TEXT("ElevenLabs Posture (eyes: %d, head: Y=%.1f P=%.1f, chain: %d, headComp: %.1f, eyeComp: %.1f, driftComp: %.1f)"),
TEXT("PS AI Agent Posture (eyes: %d, head: Y=%.1f P=%.1f, chain: %d, headComp: %.1f, eyeComp: %.1f, driftComp: %.1f)"),
CachedEyeCurves.Num(),
DebugRot.Yaw, DebugRot.Pitch,
ChainBoneIndices.Num(),
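The posture node above blends a cached head rotation down a neck chain using swing-twist decomposition (UE provides this natively as `FQuat::ToSwingTwist`). As a reference for the per-bone fix described in the notes, here is a minimal standalone sketch of the decomposition in plain C++ — all names are hypothetical, not the plugin's actual types:

```cpp
#include <cmath>

struct Quat { float x, y, z, w; };

// Decompose q into swing * twist, where twist rotates only around `axis`
// (must be normalized) and swing carries the remaining rotation.
void ToSwingTwist(const Quat& q, const float axis[3], Quat& swing, Quat& twist)
{
    // Project the quaternion's vector part onto the twist axis.
    const float d = q.x * axis[0] + q.y * axis[1] + q.z * axis[2];
    twist = { axis[0] * d, axis[1] * d, axis[2] * d, q.w };

    // Normalize twist; near the 180-degree singularity fall back to identity.
    const float n = std::sqrt(twist.x * twist.x + twist.y * twist.y +
                              twist.z * twist.z + twist.w * twist.w);
    if (n < 1e-6f) { twist = { 0.0f, 0.0f, 0.0f, 1.0f }; }
    else { twist.x /= n; twist.y /= n; twist.z /= n; twist.w /= n; }

    // swing = q * conjugate(twist)
    const Quat c = { -twist.x, -twist.y, -twist.z, twist.w };
    swing = {
        q.w * c.x + q.x * c.w + q.y * c.z - q.z * c.y,
        q.w * c.y - q.x * c.z + q.y * c.w + q.z * c.x,
        q.w * c.z + q.x * c.y - q.y * c.x + q.z * c.w,
        q.w * c.w - q.x * c.x - q.y * c.y - q.z * c.z
    };
}
```

Applying only the swing part per bone (around each bone's own tilt axis) is what the "Posture System — Diagonal Tilt Bug" note prescribes, instead of a single swing-twist on the tip bone.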
View File

@ -1,19 +1,19 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsConversationalAgentComponent.h"
#include "ElevenLabsMicrophoneCaptureComponent.h"
#include "PS_AI_Agent_ElevenLabs.h"
#include "PS_AI_Agent_Conv_ElevenLabsComponent.h"
#include "PS_AI_Agent_MicrophoneCaptureComponent.h"
#include "PS_AI_ConvAgent.h"
#include "Components/AudioComponent.h"
#include "Sound/SoundWaveProcedural.h"
#include "GameFramework/Actor.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsAgent, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_Conv_ElevenLabs, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Constructor
// ─────────────────────────────────────────────────────────────────────────────
UElevenLabsConversationalAgentComponent::UElevenLabsConversationalAgentComponent()
UPS_AI_Agent_Conv_ElevenLabsComponent::UPS_AI_Agent_Conv_ElevenLabsComponent()
{
PrimaryComponentTick.bCanEverTick = true;
// Tick is used only to detect silence (agent stopped speaking).
@ -24,19 +24,19 @@ UElevenLabsConversationalAgentComponent::UElevenLabsConversationalAgentComponent
// ─────────────────────────────────────────────────────────────────────────────
// Lifecycle
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::BeginPlay()
void UPS_AI_Agent_Conv_ElevenLabsComponent::BeginPlay()
{
Super::BeginPlay();
InitAudioPlayback();
}
void UElevenLabsConversationalAgentComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
void UPS_AI_Agent_Conv_ElevenLabsComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
EndConversation();
Super::EndPlay(EndPlayReason);
}
void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELevelTick TickType,
void UPS_AI_Agent_Conv_ElevenLabsComponent::TickComponent(float DeltaTime, ELevelTick TickType,
FActorComponentTickFunction* ThisTickFunction)
{
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
@ -50,7 +50,7 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
{
bWaitingForAgentResponse = false;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Warning,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
TEXT("[T+%.2fs] [Turn %d] Response timeout — server did not start generating after %.1fs. Firing OnAgentResponseTimeout."),
T, LastClosedTurnIndex, WaitTime);
OnAgentResponseTimeout.Broadcast();
@ -69,7 +69,7 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
bAgentGenerating = false;
GeneratingTickCount = 0;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Warning,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
TEXT("[T+%.2fs] [Turn %d] Generating timeout (10s) — server generated but no audio arrived. Clearing bAgentGenerating."),
T, LastClosedTurnIndex);
}
@ -89,7 +89,7 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
{
bPreBuffering = false;
const double Tpb = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Pre-buffer timeout (%dms). Starting playback."),
Tpb, LastClosedTurnIndex, AudioPreBufferMs);
if (AudioPlaybackComponent && !AudioPlaybackComponent->IsPlaying())
@ -146,7 +146,7 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
if (bHardTimeoutFired)
{
const double Tht = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Warning,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
TEXT("[T+%.2fs] [Turn %d] Agent silence hard-timeout (10s) without agent_response — declaring agent stopped."),
Tht, LastClosedTurnIndex);
}
@ -157,31 +157,31 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
// ─────────────────────────────────────────────────────────────────────────────
// Control
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::StartConversation()
void UPS_AI_Agent_Conv_ElevenLabsComponent::StartConversation()
{
if (!WebSocketProxy)
{
WebSocketProxy = NewObject<UElevenLabsWebSocketProxy>(this);
WebSocketProxy = NewObject<UPS_AI_Agent_WebSocket_ElevenLabsProxy>(this);
WebSocketProxy->OnConnected.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleConnected);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleConnected);
WebSocketProxy->OnDisconnected.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleDisconnected);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleDisconnected);
WebSocketProxy->OnError.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleError);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleError);
WebSocketProxy->OnAudioReceived.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAudioReceived);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAudioReceived);
WebSocketProxy->OnTranscript.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleTranscript);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleTranscript);
WebSocketProxy->OnAgentResponse.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAgentResponse);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponse);
WebSocketProxy->OnInterrupted.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleInterrupted);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleInterrupted);
WebSocketProxy->OnAgentResponseStarted.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAgentResponseStarted);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponseStarted);
WebSocketProxy->OnAgentResponsePart.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAgentResponsePart);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponsePart);
WebSocketProxy->OnClientToolCall.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleClientToolCall);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleClientToolCall);
}
// Pass configuration to the proxy before connecting.
@ -191,7 +191,7 @@ void UElevenLabsConversationalAgentComponent::StartConversation()
WebSocketProxy->Connect(AgentID);
}
void UElevenLabsConversationalAgentComponent::EndConversation()
void UPS_AI_Agent_Conv_ElevenLabsComponent::EndConversation()
{
StopListening();
// ISSUE-4: StopListening() may set bWaitingForAgentResponse=true (normal turn end path).
@ -207,11 +207,11 @@ void UElevenLabsConversationalAgentComponent::EndConversation()
}
}
void UElevenLabsConversationalAgentComponent::StartListening()
void UPS_AI_Agent_Conv_ElevenLabsComponent::StartListening()
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsAgent, Warning, TEXT("StartListening: not connected."));
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning, TEXT("StartListening: not connected."));
return;
}
@ -230,14 +230,14 @@ void UElevenLabsConversationalAgentComponent::StartListening()
{
if (bAgentSpeaking && bAllowInterruption)
{
UE_LOG(LogElevenLabsAgent, Log, TEXT("StartListening: interrupting agent (speaking) to allow user to speak."));
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("StartListening: interrupting agent (speaking) to allow user to speak."));
InterruptAgent();
// InterruptAgent → StopAgentAudio clears bAgentSpeaking / bAgentGenerating,
// so we fall through and open the microphone immediately.
}
else
{
UE_LOG(LogElevenLabsAgent, Log, TEXT("StartListening ignored: agent is %s%s — will listen after agent finishes."),
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("StartListening ignored: agent is %s%s — will listen after agent finishes."),
bAgentGenerating ? TEXT("generating") : TEXT("speaking"),
(bAgentSpeaking && !bAllowInterruption) ? TEXT(" (interruption disabled)") : TEXT(""));
return;
@ -248,19 +248,19 @@ void UElevenLabsConversationalAgentComponent::StartListening()
bIsListening = true;
TurnStartTime = FPlatformTime::Seconds();
if (TurnMode == EElevenLabsTurnMode::Client)
if (TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Client)
{
WebSocketProxy->SendUserTurnStart();
}
// Find the microphone component on our owner actor, or create one.
UElevenLabsMicrophoneCaptureComponent* Mic =
GetOwner()->FindComponentByClass<UElevenLabsMicrophoneCaptureComponent>();
UPS_AI_Agent_MicrophoneCaptureComponent* Mic =
GetOwner()->FindComponentByClass<UPS_AI_Agent_MicrophoneCaptureComponent>();
if (!Mic)
{
Mic = NewObject<UElevenLabsMicrophoneCaptureComponent>(GetOwner(),
TEXT("ElevenLabsMicrophone"));
Mic = NewObject<UPS_AI_Agent_MicrophoneCaptureComponent>(GetOwner(),
TEXT("PS_AI_Agent_Microphone"));
Mic->RegisterComponent();
}
@ -268,13 +268,13 @@ void UElevenLabsConversationalAgentComponent::StartListening()
// up if StartListening is called more than once without a matching StopListening.
Mic->OnAudioCaptured.RemoveAll(this);
Mic->OnAudioCaptured.AddUObject(this,
&UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured);
&UPS_AI_Agent_Conv_ElevenLabsComponent::OnMicrophoneDataCaptured);
// Echo suppression: point the mic at our atomic bAgentSpeaking flag so it skips
// capture entirely (before resampling) while the agent is speaking.
// In Server VAD + interruption mode, disable echo suppression so the server
// receives the user's voice even during agent playback — the server's own VAD
// handles echo filtering and interruption detection.
if (TurnMode == EElevenLabsTurnMode::Server && bAllowInterruption)
if (TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Server && bAllowInterruption)
{
Mic->EchoSuppressFlag = nullptr;
}
@ -285,16 +285,16 @@ void UElevenLabsConversationalAgentComponent::StartListening()
Mic->StartCapture();
const double T = TurnStartTime - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log, TEXT("[T+%.2fs] [Turn %d] Mic opened — user speaking."), T, TurnIndex);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("[T+%.2fs] [Turn %d] Mic opened — user speaking."), T, TurnIndex);
}
void UElevenLabsConversationalAgentComponent::StopListening()
void UPS_AI_Agent_Conv_ElevenLabsComponent::StopListening()
{
if (!bIsListening) return;
bIsListening = false;
if (UElevenLabsMicrophoneCaptureComponent* Mic =
GetOwner() ? GetOwner()->FindComponentByClass<UElevenLabsMicrophoneCaptureComponent>() : nullptr)
if (UPS_AI_Agent_MicrophoneCaptureComponent* Mic =
GetOwner() ? GetOwner()->FindComponentByClass<UPS_AI_Agent_MicrophoneCaptureComponent>() : nullptr)
{
Mic->StopCapture();
Mic->OnAudioCaptured.RemoveAll(this);
@ -316,7 +316,7 @@ void UElevenLabsConversationalAgentComponent::StopListening()
{
if (MicAccumulationBuffer.Num() > 0)
{
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("StopListening: discarding %d bytes of accumulated mic audio (collision — server is mid-generation)."),
MicAccumulationBuffer.Num());
}
@ -328,7 +328,7 @@ void UElevenLabsConversationalAgentComponent::StopListening()
MicAccumulationBuffer.Reset();
}
if (WebSocketProxy && TurnMode == EElevenLabsTurnMode::Client)
if (WebSocketProxy && TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Client)
{
WebSocketProxy->SendUserTurnEnd();
}
@ -349,7 +349,7 @@ void UElevenLabsConversationalAgentComponent::StopListening()
const double TurnDuration = TurnStartTime > 0.0 ? TurnEndTime - TurnStartTime : 0.0;
if (bWaitingForAgentResponse)
{
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Mic closed — user spoke %.2fs. Waiting for server response (timeout %.0fs)..."),
T, TurnIndex, TurnDuration, ResponseTimeoutSeconds);
}
@ -357,23 +357,23 @@ void UElevenLabsConversationalAgentComponent::StopListening()
{
// Collision avoidance: StopListening was called from HandleAgentResponseStarted
// while server was already generating — no need to wait or time out.
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Mic closed (collision avoidance) — user spoke %.2fs. Server is already generating Turn %d response."),
T, TurnIndex, TurnDuration, LastClosedTurnIndex);
}
}
void UElevenLabsConversationalAgentComponent::SendTextMessage(const FString& Text)
void UPS_AI_Agent_Conv_ElevenLabsComponent::SendTextMessage(const FString& Text)
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsAgent, Warning, TEXT("SendTextMessage: not connected. Call StartConversation() first."));
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning, TEXT("SendTextMessage: not connected. Call StartConversation() first."));
return;
}
WebSocketProxy->SendTextMessage(Text);
}
void UElevenLabsConversationalAgentComponent::InterruptAgent()
void UPS_AI_Agent_Conv_ElevenLabsComponent::InterruptAgent()
{
bWaitingForAgentResponse = false; // Interrupting — no response expected from previous turn.
if (WebSocketProxy) WebSocketProxy->SendInterrupt();
@ -383,26 +383,26 @@ void UElevenLabsConversationalAgentComponent::InterruptAgent()
// ─────────────────────────────────────────────────────────────────────────────
// State queries
// ─────────────────────────────────────────────────────────────────────────────
bool UElevenLabsConversationalAgentComponent::IsConnected() const
bool UPS_AI_Agent_Conv_ElevenLabsComponent::IsConnected() const
{
return WebSocketProxy && WebSocketProxy->IsConnected();
}
const FElevenLabsConversationInfo& UElevenLabsConversationalAgentComponent::GetConversationInfo() const
const FPS_AI_Agent_ConversationInfo_ElevenLabs& UPS_AI_Agent_Conv_ElevenLabsComponent::GetConversationInfo() const
{
static FElevenLabsConversationInfo Empty;
static FPS_AI_Agent_ConversationInfo_ElevenLabs Empty;
return WebSocketProxy ? WebSocketProxy->GetConversationInfo() : Empty;
}
// ─────────────────────────────────────────────────────────────────────────────
// WebSocket event handlers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::HandleConnected(const FElevenLabsConversationInfo& Info)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleConnected(const FPS_AI_Agent_ConversationInfo_ElevenLabs& Info)
{
SessionStartTime = FPlatformTime::Seconds();
TurnIndex = 0;
LastClosedTurnIndex = 0;
UE_LOG(LogElevenLabsAgent, Log, TEXT("[T+0.00s] Agent connected. ConversationID=%s"), *Info.ConversationID);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("[T+0.00s] Agent connected. ConversationID=%s"), *Info.ConversationID);
OnAgentConnected.Broadcast(Info);
// In Client turn mode (push-to-talk), the user controls listening manually via
@ -410,15 +410,15 @@ void UElevenLabsConversationalAgentComponent::HandleConnected(const FElevenLabsC
// permanently and interfere with push-to-talk — the T-release StopListening()
// would close the mic that auto-start opened, leaving the user unable to speak.
// Only auto-start in Server VAD mode where the mic stays open the whole session.
if (bAutoStartListening && TurnMode == EElevenLabsTurnMode::Server)
if (bAutoStartListening && TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Server)
{
StartListening();
}
}
void UElevenLabsConversationalAgentComponent::HandleDisconnected(int32 StatusCode, const FString& Reason)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleDisconnected(int32 StatusCode, const FString& Reason)
{
UE_LOG(LogElevenLabsAgent, Log, TEXT("Agent disconnected. Code=%d Reason=%s"), StatusCode, *Reason);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("Agent disconnected. Code=%d Reason=%s"), StatusCode, *Reason);
// ISSUE-13: stop audio playback and clear the queue if the WebSocket drops while the
// agent is speaking. Without this the audio component kept playing buffered PCM after
@ -432,8 +432,8 @@ void UElevenLabsConversationalAgentComponent::HandleDisconnected(int32 StatusCod
GeneratingTickCount = 0;
TurnIndex = 0;
LastClosedTurnIndex = 0;
CurrentEmotion = EElevenLabsEmotion::Neutral;
CurrentEmotionIntensity = EElevenLabsEmotionIntensity::Medium;
CurrentEmotion = EPS_AI_Agent_Emotion::Neutral;
CurrentEmotionIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
{
FScopeLock Lock(&MicSendLock);
MicAccumulationBuffer.Reset();
@ -441,13 +441,13 @@ void UElevenLabsConversationalAgentComponent::HandleDisconnected(int32 StatusCod
OnAgentDisconnected.Broadcast(StatusCode, Reason);
}
void UElevenLabsConversationalAgentComponent::HandleError(const FString& ErrorMessage)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleError(const FString& ErrorMessage)
{
UE_LOG(LogElevenLabsAgent, Error, TEXT("Agent error: %s"), *ErrorMessage);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Error, TEXT("Agent error: %s"), *ErrorMessage);
OnAgentError.Broadcast(ErrorMessage);
}
void UElevenLabsConversationalAgentComponent::HandleAudioReceived(const TArray<uint8>& PCMData)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAudioReceived(const TArray<uint8>& PCMData)
{
const double T = FPlatformTime::Seconds() - SessionStartTime;
const int32 NumSamples = PCMData.Num() / sizeof(int16);
@ -457,7 +457,7 @@ void UElevenLabsConversationalAgentComponent::HandleAudioReceived(const TArray<u
FScopeLock Lock(&AudioQueueLock);
QueueBefore = AudioQueue.Num() / sizeof(int16);
}
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Audio chunk received: %d samples (%.0fms) | AudioQueue before: %d samples (%.0fms)"),
T, LastClosedTurnIndex, NumSamples, DurationMs,
QueueBefore, (static_cast<float>(QueueBefore) / 16000.0f) * 1000.0f);
@ -467,7 +467,7 @@ void UElevenLabsConversationalAgentComponent::HandleAudioReceived(const TArray<u
OnAgentAudioData.Broadcast(PCMData);
}
void UElevenLabsConversationalAgentComponent::HandleTranscript(const FElevenLabsTranscriptSegment& Segment)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleTranscript(const FPS_AI_Agent_TranscriptSegment_ElevenLabs& Segment)
{
if (bEnableUserTranscript)
{
@ -475,7 +475,7 @@ void UElevenLabsConversationalAgentComponent::HandleTranscript(const FElevenLabs
}
}
void UElevenLabsConversationalAgentComponent::HandleAgentResponse(const FString& ResponseText)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponse(const FString& ResponseText)
{
// The server sends agent_response when the full text response is complete.
// This is our reliable signal that no more TTS audio chunks will follow.
@ -488,14 +488,14 @@ void UElevenLabsConversationalAgentComponent::HandleAgentResponse(const FString&
}
}
void UElevenLabsConversationalAgentComponent::HandleInterrupted()
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleInterrupted()
{
bWaitingForAgentResponse = false; // Interrupted — no response expected from previous turn.
StopAgentAudio();
OnAgentInterrupted.Broadcast();
}
void UElevenLabsConversationalAgentComponent::HandleAgentResponseStarted()
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponseStarted()
{
// The server has started generating a response (first agent_chat_response_part).
// Set bAgentGenerating BEFORE StopListening so that any StartListening call
@ -512,29 +512,29 @@ void UElevenLabsConversationalAgentComponent::HandleAgentResponseStarted()
// In Server VAD + interruption mode, keep the mic open so the server can
// detect if the user speaks over the agent and send an interruption event.
// The server handles echo filtering and VAD — we just keep streaming audio.
if (TurnMode == EElevenLabsTurnMode::Server && bAllowInterruption)
if (TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Server && bAllowInterruption)
{
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Agent generating — mic stays open (Server VAD + interruption). (%.2fs after turn end)"),
T, LastClosedTurnIndex, LatencyFromTurnEnd);
}
else
{
// Collision: server generating while mic was open — stop mic without flushing.
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Collision — mic was open, stopping. (%.2fs after turn end)"),
T, LastClosedTurnIndex, LatencyFromTurnEnd);
StopListening();
}
}
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Agent generating. (%.2fs after turn end)"),
T, LastClosedTurnIndex, LatencyFromTurnEnd);
OnAgentStartedGenerating.Broadcast();
}
void UElevenLabsConversationalAgentComponent::HandleAgentResponsePart(const FString& PartialText)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponsePart(const FString& PartialText)
{
if (bEnableAgentPartialResponse)
{
@ -542,55 +542,55 @@ void UElevenLabsConversationalAgentComponent::HandleAgentResponsePart(const FStr
}
}
void UElevenLabsConversationalAgentComponent::HandleClientToolCall(const FElevenLabsClientToolCall& ToolCall)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleClientToolCall(const FPS_AI_Agent_ClientToolCall_ElevenLabs& ToolCall)
{
// Built-in handler for the "set_emotion" tool: parse emotion + intensity, auto-respond, broadcast.
if (ToolCall.ToolName == TEXT("set_emotion"))
{
// Parse emotion
EElevenLabsEmotion NewEmotion = EElevenLabsEmotion::Neutral;
EPS_AI_Agent_Emotion NewEmotion = EPS_AI_Agent_Emotion::Neutral;
const FString* EmotionStr = ToolCall.Parameters.Find(TEXT("emotion"));
if (EmotionStr)
{
const FString Lower = EmotionStr->ToLower();
if (Lower == TEXT("joy") || Lower == TEXT("happy") || Lower == TEXT("happiness"))
NewEmotion = EElevenLabsEmotion::Joy;
NewEmotion = EPS_AI_Agent_Emotion::Joy;
else if (Lower == TEXT("sadness") || Lower == TEXT("sad"))
NewEmotion = EElevenLabsEmotion::Sadness;
NewEmotion = EPS_AI_Agent_Emotion::Sadness;
else if (Lower == TEXT("anger") || Lower == TEXT("angry"))
NewEmotion = EElevenLabsEmotion::Anger;
NewEmotion = EPS_AI_Agent_Emotion::Anger;
else if (Lower == TEXT("surprise") || Lower == TEXT("surprised"))
NewEmotion = EElevenLabsEmotion::Surprise;
NewEmotion = EPS_AI_Agent_Emotion::Surprise;
else if (Lower == TEXT("fear") || Lower == TEXT("afraid") || Lower == TEXT("scared"))
NewEmotion = EElevenLabsEmotion::Fear;
NewEmotion = EPS_AI_Agent_Emotion::Fear;
else if (Lower == TEXT("disgust") || Lower == TEXT("disgusted"))
NewEmotion = EElevenLabsEmotion::Disgust;
NewEmotion = EPS_AI_Agent_Emotion::Disgust;
else if (Lower == TEXT("neutral"))
NewEmotion = EElevenLabsEmotion::Neutral;
NewEmotion = EPS_AI_Agent_Emotion::Neutral;
else
UE_LOG(LogElevenLabsAgent, Warning, TEXT("Unknown emotion '%s', defaulting to Neutral."), **EmotionStr);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning, TEXT("Unknown emotion '%s', defaulting to Neutral."), **EmotionStr);
}
// Parse intensity (default: medium)
EElevenLabsEmotionIntensity NewIntensity = EElevenLabsEmotionIntensity::Medium;
EPS_AI_Agent_EmotionIntensity NewIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
const FString* IntensityStr = ToolCall.Parameters.Find(TEXT("intensity"));
if (IntensityStr)
{
const FString Lower = IntensityStr->ToLower();
if (Lower == TEXT("low") || Lower == TEXT("subtle") || Lower == TEXT("light"))
NewIntensity = EElevenLabsEmotionIntensity::Low;
NewIntensity = EPS_AI_Agent_EmotionIntensity::Low;
else if (Lower == TEXT("medium") || Lower == TEXT("moderate") || Lower == TEXT("normal"))
NewIntensity = EElevenLabsEmotionIntensity::Medium;
NewIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
else if (Lower == TEXT("high") || Lower == TEXT("strong") || Lower == TEXT("extreme") || Lower == TEXT("intense"))
NewIntensity = EElevenLabsEmotionIntensity::High;
NewIntensity = EPS_AI_Agent_EmotionIntensity::High;
else
UE_LOG(LogElevenLabsAgent, Warning, TEXT("Unknown intensity '%s', defaulting to Medium."), **IntensityStr);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning, TEXT("Unknown intensity '%s', defaulting to Medium."), **IntensityStr);
}
CurrentEmotion = NewEmotion;
CurrentEmotionIntensity = NewIntensity;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log, TEXT("[T+%.2fs] Agent emotion changed to: %s (%s)"),
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("[T+%.2fs] Agent emotion changed to: %s (%s)"),
T, *UEnum::GetValueAsString(NewEmotion), *UEnum::GetValueAsString(NewIntensity));
OnAgentEmotionChanged.Broadcast(NewEmotion, NewIntensity);
@ -616,21 +616,21 @@ void UElevenLabsConversationalAgentComponent::HandleClientToolCall(const FEleven
// ─────────────────────────────────────────────────────────────────────────────
// Audio playback
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::InitAudioPlayback()
void UPS_AI_Agent_Conv_ElevenLabsComponent::InitAudioPlayback()
{
AActor* Owner = GetOwner();
if (!Owner) return;
// USoundWaveProcedural lets us push raw PCM data at runtime.
ProceduralSoundWave = NewObject<USoundWaveProcedural>(this);
ProceduralSoundWave->SetSampleRate(ElevenLabsAudio::SampleRate);
ProceduralSoundWave->NumChannels = ElevenLabsAudio::Channels;
ProceduralSoundWave->SetSampleRate(PS_AI_Agent_Audio_ElevenLabs::SampleRate);
ProceduralSoundWave->NumChannels = PS_AI_Agent_Audio_ElevenLabs::Channels;
ProceduralSoundWave->Duration = INDEFINITELY_LOOPING_DURATION;
ProceduralSoundWave->SoundGroup = SOUNDGROUP_Voice;
ProceduralSoundWave->bLooping = false;
// Create the audio component attached to the owner.
AudioPlaybackComponent = NewObject<UAudioComponent>(Owner, TEXT("ElevenLabsAudioPlayback"));
AudioPlaybackComponent = NewObject<UAudioComponent>(Owner, TEXT("PS_AI_Agent_Audio_ElevenLabsPlayback"));
AudioPlaybackComponent->RegisterComponent();
AudioPlaybackComponent->bAutoActivate = false;
AudioPlaybackComponent->SetSound(ProceduralSoundWave);
@ -638,10 +638,10 @@ void UElevenLabsConversationalAgentComponent::InitAudioPlayback()
// When the procedural sound wave needs more audio data, pull from our queue.
ProceduralSoundWave->OnSoundWaveProceduralUnderflow =
FOnSoundWaveProceduralUnderflow::CreateUObject(
this, &UElevenLabsConversationalAgentComponent::OnProceduralUnderflow);
this, &UPS_AI_Agent_Conv_ElevenLabsComponent::OnProceduralUnderflow);
}
void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
void UPS_AI_Agent_Conv_ElevenLabsComponent::OnProceduralUnderflow(
USoundWaveProcedural* InProceduralWave, const int32 SamplesRequired)
{
FScopeLock Lock(&AudioQueueLock);
@ -672,7 +672,7 @@ void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
{
bQueueWasDry = false;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] AudioQueue recovered — feeding real data again (%d bytes remaining)."),
T, LastClosedTurnIndex, AudioQueue.Num());
}
@ -684,7 +684,7 @@ void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
{
bQueueWasDry = true;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Warning,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
TEXT("[T+%.2fs] [Turn %d] AudioQueue DRY — waiting for next TTS chunk (requested %d samples)."),
T, LastClosedTurnIndex, SamplesRequired);
}
@ -702,7 +702,7 @@ void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
}
}
void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uint8>& PCMData)
void UPS_AI_Agent_Conv_ElevenLabsComponent::EnqueueAgentAudio(const TArray<uint8>& PCMData)
{
{
FScopeLock Lock(&AudioQueueLock);
@ -721,7 +721,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
const double T = AgentSpeakStart - SessionStartTime;
const double LatencyFromTurnEnd = TurnEndTime > 0.0 ? AgentSpeakStart - TurnEndTime : 0.0;
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Agent speaking — first audio chunk. (%.2fs after turn end)"),
T, LastClosedTurnIndex, LatencyFromTurnEnd);
@ -735,7 +735,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
bPreBuffering = true;
PreBufferStartTime = FPlatformTime::Seconds();
const double Tpb2 = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Pre-buffering %dms before starting playback."),
Tpb2, LastClosedTurnIndex, AudioPreBufferMs);
}
@ -752,7 +752,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
const double NowPb = FPlatformTime::Seconds();
const double BufferedMs = (NowPb - PreBufferStartTime) * 1000.0;
const double Tpb3 = NowPb - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Pre-buffer: second chunk arrived (%.0fms buffered). Starting playback."),
Tpb3, LastClosedTurnIndex, BufferedMs);
if (AudioPlaybackComponent && !AudioPlaybackComponent->IsPlaying())
@ -768,7 +768,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
if (AudioPlaybackComponent && !AudioPlaybackComponent->IsPlaying())
{
const double Tbr = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Warning,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
TEXT("[T+%.2fs] [Turn %d] Audio component stopped during speech (buffer underrun). Restarting playback."),
Tbr, LastClosedTurnIndex);
AudioPlaybackComponent->Play();
@ -778,7 +778,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
}
}
void UElevenLabsConversationalAgentComponent::StopAgentAudio()
void UPS_AI_Agent_Conv_ElevenLabsComponent::StopAgentAudio()
{
// Flush the ProceduralSoundWave's internal buffer BEFORE stopping.
// QueueAudio() pushes data into the wave's internal ring buffer during
@ -824,7 +824,7 @@ void UElevenLabsConversationalAgentComponent::StopAgentAudio()
const double T = Now - SessionStartTime;
const double AgentSpokeDuration = AgentSpeakStart > 0.0 ? Now - AgentSpeakStart : 0.0;
const double TotalTurnDuration = TurnEndTime > 0.0 ? Now - TurnEndTime : 0.0;
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Agent stopped speaking (spoke %.2fs, full turn round-trip %.2fs)."),
T, LastClosedTurnIndex, AgentSpokeDuration, TotalTurnDuration);
@ -835,7 +835,7 @@ void UElevenLabsConversationalAgentComponent::StopAgentAudio()
// ─────────────────────────────────────────────────────────────────────────────
// Microphone → WebSocket
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured(const TArray<float>& FloatPCM)
void UPS_AI_Agent_Conv_ElevenLabsComponent::OnMicrophoneDataCaptured(const TArray<float>& FloatPCM)
{
if (!IsConnected() || !bIsListening) return;
@ -844,7 +844,7 @@ void UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured(const TAr
// which would confuse the server's VAD and STT.
// In Server VAD + interruption mode, keep sending audio so the server can
// detect the user speaking over the agent and trigger an interruption.
if (bAgentSpeaking && !(TurnMode == EElevenLabsTurnMode::Server && bAllowInterruption))
if (bAgentSpeaking && !(TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Server && bAllowInterruption))
{
return;
}
@ -867,7 +867,7 @@ void UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured(const TAr
}
}
TArray<uint8> UElevenLabsConversationalAgentComponent::FloatPCMToInt16Bytes(const TArray<float>& FloatPCM)
TArray<uint8> UPS_AI_Agent_Conv_ElevenLabsComponent::FloatPCMToInt16Bytes(const TArray<float>& FloatPCM)
{
TArray<uint8> Out;
Out.Reserve(FloatPCM.Num() * 2);


@@ -1,18 +1,18 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsFacialExpressionComponent.h"
#include "ElevenLabsConversationalAgentComponent.h"
#include "ElevenLabsEmotionPoseMap.h"
#include "PS_AI_Agent_FacialExpressionComponent.h"
#include "PS_AI_Agent_Conv_ElevenLabsComponent.h"
#include "PS_AI_Agent_EmotionPoseMap.h"
#include "Animation/AnimSequence.h"
#include "Animation/AnimData/IAnimationDataModel.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsFacialExpr, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_FacialExpr, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Construction
// ─────────────────────────────────────────────────────────────────────────────
UElevenLabsFacialExpressionComponent::UElevenLabsFacialExpressionComponent()
UPS_AI_Agent_FacialExpressionComponent::UPS_AI_Agent_FacialExpressionComponent()
{
PrimaryComponentTick.bCanEverTick = true;
PrimaryComponentTick.TickGroup = TG_PrePhysics;
@@ -22,32 +22,32 @@ UElevenLabsFacialExpressionComponent::UElevenLabsFacialExpressionComponent()
// BeginPlay / EndPlay
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsFacialExpressionComponent::BeginPlay()
void UPS_AI_Agent_FacialExpressionComponent::BeginPlay()
{
Super::BeginPlay();
AActor* Owner = GetOwner();
if (!Owner)
{
UE_LOG(LogElevenLabsFacialExpr, Warning, TEXT("No owner actor — facial expressions disabled."));
UE_LOG(LogPS_AI_Agent_FacialExpr, Warning, TEXT("No owner actor — facial expressions disabled."));
return;
}
// Find and bind to agent component
auto* Agent = Owner->FindComponentByClass<UElevenLabsConversationalAgentComponent>();
auto* Agent = Owner->FindComponentByClass<UPS_AI_Agent_Conv_ElevenLabsComponent>();
if (Agent)
{
AgentComponent = Agent;
Agent->OnAgentEmotionChanged.AddDynamic(
this, &UElevenLabsFacialExpressionComponent::OnEmotionChanged);
this, &UPS_AI_Agent_FacialExpressionComponent::OnEmotionChanged);
UE_LOG(LogElevenLabsFacialExpr, Log,
UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
TEXT("Facial expression bound to agent component on %s."), *Owner->GetName());
}
else
{
UE_LOG(LogElevenLabsFacialExpr, Warning,
TEXT("No ElevenLabsConversationalAgentComponent found on %s — "
UE_LOG(LogPS_AI_Agent_FacialExpr, Warning,
TEXT("No PS_AI_Agent_Conv_ElevenLabsComponent found on %s — "
"facial expression will not respond to emotion changes."),
*Owner->GetName());
}
@@ -63,17 +63,17 @@ void UElevenLabsFacialExpressionComponent::BeginPlay()
{
ActivePlaybackTime = 0.0f;
CrossfadeAlpha = 1.0f; // No crossfade needed on startup
UE_LOG(LogElevenLabsFacialExpr, Log,
UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
TEXT("Auto-started default emotion anim: %s"), *ActiveAnim->GetName());
}
}
void UElevenLabsFacialExpressionComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
void UPS_AI_Agent_FacialExpressionComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
if (AgentComponent.IsValid())
{
AgentComponent->OnAgentEmotionChanged.RemoveDynamic(
this, &UElevenLabsFacialExpressionComponent::OnEmotionChanged);
this, &UPS_AI_Agent_FacialExpressionComponent::OnEmotionChanged);
}
Super::EndPlay(EndPlayReason);
@@ -83,11 +83,11 @@ void UElevenLabsFacialExpressionComponent::EndPlay(const EEndPlayReason::Type En
// Validation
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsFacialExpressionComponent::ValidateEmotionPoses()
void UPS_AI_Agent_FacialExpressionComponent::ValidateEmotionPoses()
{
if (!EmotionPoseMap || EmotionPoseMap->EmotionPoses.Num() == 0)
{
UE_LOG(LogElevenLabsFacialExpr, Log,
UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
TEXT("No emotion poses assigned in EmotionPoseMap — facial expressions disabled."));
return;
}
@@ -95,13 +95,13 @@ void UElevenLabsFacialExpressionComponent::ValidateEmotionPoses()
int32 AnimCount = 0;
for (const auto& EmotionPair : EmotionPoseMap->EmotionPoses)
{
const FElevenLabsEmotionPoseSet& PoseSet = EmotionPair.Value;
const FPS_AI_Agent_EmotionPoseSet& PoseSet = EmotionPair.Value;
if (PoseSet.Normal) ++AnimCount;
if (PoseSet.Medium) ++AnimCount;
if (PoseSet.Extreme) ++AnimCount;
}
UE_LOG(LogElevenLabsFacialExpr, Log,
UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
TEXT("=== Emotion poses: %d emotions, %d anim slots available ==="),
EmotionPoseMap->EmotionPoses.Num(), AnimCount);
}
@@ -110,21 +110,21 @@ void UElevenLabsFacialExpressionComponent::ValidateEmotionPoses()
// Find AnimSequence for emotion + intensity (with fallback)
// ─────────────────────────────────────────────────────────────────────────────
UAnimSequence* UElevenLabsFacialExpressionComponent::FindAnimForEmotion(
EElevenLabsEmotion Emotion, EElevenLabsEmotionIntensity Intensity) const
UAnimSequence* UPS_AI_Agent_FacialExpressionComponent::FindAnimForEmotion(
EPS_AI_Agent_Emotion Emotion, EPS_AI_Agent_EmotionIntensity Intensity) const
{
if (!EmotionPoseMap) return nullptr;
const FElevenLabsEmotionPoseSet* PoseSet = EmotionPoseMap->EmotionPoses.Find(Emotion);
const FPS_AI_Agent_EmotionPoseSet* PoseSet = EmotionPoseMap->EmotionPoses.Find(Emotion);
if (!PoseSet) return nullptr;
// Direct match
UAnimSequence* Anim = nullptr;
switch (Intensity)
{
case EElevenLabsEmotionIntensity::Low: Anim = PoseSet->Normal; break;
case EElevenLabsEmotionIntensity::Medium: Anim = PoseSet->Medium; break;
case EElevenLabsEmotionIntensity::High: Anim = PoseSet->Extreme; break;
case EPS_AI_Agent_EmotionIntensity::Low: Anim = PoseSet->Normal; break;
case EPS_AI_Agent_EmotionIntensity::Medium: Anim = PoseSet->Medium; break;
case EPS_AI_Agent_EmotionIntensity::High: Anim = PoseSet->Extreme; break;
}
if (Anim) return Anim;
@@ -141,7 +141,7 @@ UAnimSequence* UElevenLabsFacialExpressionComponent::FindAnimForEmotion(
// Evaluate all FloatCurves from an AnimSequence at a given time
// ─────────────────────────────────────────────────────────────────────────────
TMap<FName, float> UElevenLabsFacialExpressionComponent::EvaluateAnimCurves(
TMap<FName, float> UPS_AI_Agent_FacialExpressionComponent::EvaluateAnimCurves(
UAnimSequence* AnimSeq, float Time) const
{
TMap<FName, float> CurveValues;
@@ -167,8 +167,8 @@ TMap<FName, float> UElevenLabsFacialExpressionComponent::EvaluateAnimCurves(
// Emotion change handler
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsFacialExpressionComponent::OnEmotionChanged(
EElevenLabsEmotion Emotion, EElevenLabsEmotionIntensity Intensity)
void UPS_AI_Agent_FacialExpressionComponent::OnEmotionChanged(
EPS_AI_Agent_Emotion Emotion, EPS_AI_Agent_EmotionIntensity Intensity)
{
if (Emotion == ActiveEmotion && Intensity == ActiveEmotionIntensity)
return; // No change
@@ -190,7 +190,7 @@ void UElevenLabsFacialExpressionComponent::OnEmotionChanged(
// Begin crossfade
CrossfadeAlpha = 0.0f;
UE_LOG(LogElevenLabsFacialExpr, Log,
UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
TEXT("Emotion changed: %s (%s) — anim: %s, crossfading over %.1fs..."),
*UEnum::GetValueAsString(Emotion), *UEnum::GetValueAsString(Intensity),
NewAnim ? *NewAnim->GetName() : TEXT("(none)"), EmotionBlendDuration);
@@ -200,7 +200,7 @@ void UElevenLabsFacialExpressionComponent::OnEmotionChanged(
// Tick — play emotion animation and crossfade
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsFacialExpressionComponent::TickComponent(
void UPS_AI_Agent_FacialExpressionComponent::TickComponent(
float DeltaTime, ELevelTick TickType, FActorComponentTickFunction* ThisTickFunction)
{
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
@@ -286,7 +286,7 @@ void UElevenLabsFacialExpressionComponent::TickComponent(
// Mouth curve classification
// ─────────────────────────────────────────────────────────────────────────────
bool UElevenLabsFacialExpressionComponent::IsMouthCurve(const FName& CurveName)
bool UPS_AI_Agent_FacialExpressionComponent::IsMouthCurve(const FName& CurveName)
{
const FString Name = CurveName.ToString().ToLower();
return Name.Contains(TEXT("jaw"))


@@ -1,9 +1,9 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsLipSyncComponent.h"
#include "ElevenLabsFacialExpressionComponent.h"
#include "ElevenLabsLipSyncPoseMap.h"
#include "ElevenLabsConversationalAgentComponent.h"
#include "PS_AI_Agent_LipSyncComponent.h"
#include "PS_AI_Agent_FacialExpressionComponent.h"
#include "PS_AI_Agent_LipSyncPoseMap.h"
#include "PS_AI_Agent_Conv_ElevenLabsComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "Engine/SkeletalMesh.h"
#include "Animation/MorphTarget.h"
@@ -11,13 +11,13 @@
#include "Animation/AnimData/IAnimationDataModel.h"
#include "GameFramework/Actor.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsLipSync, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_LipSync, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Static data
// ─────────────────────────────────────────────────────────────────────────────
const TArray<FName> UElevenLabsLipSyncComponent::VisemeNames = {
const TArray<FName> UPS_AI_Agent_LipSyncComponent::VisemeNames = {
FName("sil"), FName("PP"), FName("FF"), FName("TH"), FName("DD"),
FName("kk"), FName("CH"), FName("SS"), FName("nn"), FName("RR"),
FName("aa"), FName("E"), FName("ih"), FName("oh"), FName("ou")
@@ -43,7 +43,7 @@ static const TSet<FName> ExpressionCurveNames = {
// OVR Viseme → ARKit blendshape mapping.
// Each viseme activates a combination of ARKit morph targets with specific weights.
// These values are tuned for MetaHuman faces and can be adjusted per project.
TMap<FName, TMap<FName, float>> UElevenLabsLipSyncComponent::CreateVisemeToBlendshapeMap()
TMap<FName, TMap<FName, float>> UPS_AI_Agent_LipSyncComponent::CreateVisemeToBlendshapeMap()
{
TMap<FName, TMap<FName, float>> Map;
@@ -189,14 +189,14 @@ TMap<FName, TMap<FName, float>> UElevenLabsLipSyncComponent::CreateVisemeToBlend
return Map;
}
const TMap<FName, TMap<FName, float>> UElevenLabsLipSyncComponent::VisemeToBlendshapeMap =
UElevenLabsLipSyncComponent::CreateVisemeToBlendshapeMap();
const TMap<FName, TMap<FName, float>> UPS_AI_Agent_LipSyncComponent::VisemeToBlendshapeMap =
UPS_AI_Agent_LipSyncComponent::CreateVisemeToBlendshapeMap();
// ─────────────────────────────────────────────────────────────────────────────
// Constructor / Destructor
// ─────────────────────────────────────────────────────────────────────────────
UElevenLabsLipSyncComponent::UElevenLabsLipSyncComponent()
UPS_AI_Agent_LipSyncComponent::UPS_AI_Agent_LipSyncComponent()
{
PrimaryComponentTick.bCanEverTick = true;
PrimaryComponentTick.TickInterval = 1.0f / 60.0f; // 60 fps for smooth animation
@@ -211,13 +211,13 @@ UElevenLabsLipSyncComponent::UElevenLabsLipSyncComponent()
SmoothedVisemes.FindOrAdd(FName("sil")) = 1.0f;
}
UElevenLabsLipSyncComponent::~UElevenLabsLipSyncComponent() = default;
UPS_AI_Agent_LipSyncComponent::~UPS_AI_Agent_LipSyncComponent() = default;
// ─────────────────────────────────────────────────────────────────────────────
// Lifecycle
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::BeginPlay()
void UPS_AI_Agent_LipSyncComponent::BeginPlay()
{
Super::BeginPlay();
@@ -226,54 +226,54 @@ void UElevenLabsLipSyncComponent::BeginPlay()
Settings.FFTSize = Audio::FSpectrumAnalyzerSettings::EFFTSize::Medium_512;
Settings.WindowType = Audio::EWindowType::Hann;
SpectrumAnalyzer = MakeUnique<Audio::FSpectrumAnalyzer>(
Settings, static_cast<float>(ElevenLabsAudio::SampleRate));
Settings, static_cast<float>(PS_AI_Agent_Audio_ElevenLabs::SampleRate));
// Auto-discover the agent component on the same actor
AActor* Owner = GetOwner();
if (!Owner) return;
UElevenLabsConversationalAgentComponent* Agent =
Owner->FindComponentByClass<UElevenLabsConversationalAgentComponent>();
UPS_AI_Agent_Conv_ElevenLabsComponent* Agent =
Owner->FindComponentByClass<UPS_AI_Agent_Conv_ElevenLabsComponent>();
if (Agent)
{
AgentComponent = Agent;
AudioDataHandle = Agent->OnAgentAudioData.AddUObject(
this, &UElevenLabsLipSyncComponent::OnAudioChunkReceived);
this, &UPS_AI_Agent_LipSyncComponent::OnAudioChunkReceived);
// Bind to text response delegates for text-driven lip sync.
// Partial text (streaming) provides text BEFORE audio arrives.
// Full text provides the complete sentence (arrives just after audio).
Agent->OnAgentPartialResponse.AddDynamic(
this, &UElevenLabsLipSyncComponent::OnPartialTextReceived);
this, &UPS_AI_Agent_LipSyncComponent::OnPartialTextReceived);
Agent->OnAgentTextResponse.AddDynamic(
this, &UElevenLabsLipSyncComponent::OnTextResponseReceived);
this, &UPS_AI_Agent_LipSyncComponent::OnTextResponseReceived);
// Bind to interruption/stop events so lip sync resets immediately
// when the agent is cut off or finishes speaking.
Agent->OnAgentInterrupted.AddDynamic(
this, &UElevenLabsLipSyncComponent::OnAgentInterrupted);
this, &UPS_AI_Agent_LipSyncComponent::OnAgentInterrupted);
Agent->OnAgentStoppedSpeaking.AddDynamic(
this, &UElevenLabsLipSyncComponent::OnAgentStopped);
this, &UPS_AI_Agent_LipSyncComponent::OnAgentStopped);
// Enable partial response streaming if not already enabled
Agent->bEnableAgentPartialResponse = true;
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Lip sync bound to agent component on %s (audio + text + interruption)."), *Owner->GetName());
}
else
{
UE_LOG(LogElevenLabsLipSync, Warning,
TEXT("No ElevenLabsConversationalAgentComponent found on %s. Lip sync will not work."),
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
TEXT("No PS_AI_Agent_Conv_ElevenLabsComponent found on %s. Lip sync will not work."),
*Owner->GetName());
}
// Cache the facial expression component for emotion-aware blending.
CachedFacialExprComp = Owner->FindComponentByClass<UElevenLabsFacialExpressionComponent>();
CachedFacialExprComp = Owner->FindComponentByClass<UPS_AI_Agent_FacialExpressionComponent>();
if (CachedFacialExprComp.IsValid())
{
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Facial expression component found — emotion blending enabled (blend=%.2f)."),
EmotionExpressionBlend);
}
@@ -286,7 +286,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
Owner->GetComponents<USkeletalMeshComponent>(SkeletalMeshes);
// Log all found skeletal mesh components for debugging
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Found %d SkeletalMeshComponent(s) on %s:"),
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Found %d SkeletalMeshComponent(s) on %s:"),
SkeletalMeshes.Num(), *Owner->GetName());
for (const USkeletalMeshComponent* Mesh : SkeletalMeshes)
{
@@ -297,7 +297,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
{
MorphCount = Mesh->GetSkeletalMeshAsset()->GetMorphTargets().Num();
}
UE_LOG(LogElevenLabsLipSync, Log, TEXT(" - '%s' (%d morph targets)"),
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT(" - '%s' (%d morph targets)"),
*Mesh->GetName(), MorphCount);
}
}
@@ -308,7 +308,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
if (Mesh && Mesh->GetFName().ToString().Contains(TEXT("Face")))
{
TargetMesh = Mesh;
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Auto-detected face mesh by name: %s"), *Mesh->GetName());
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Auto-detected face mesh by name: %s"), *Mesh->GetName());
break;
}
}
@@ -322,7 +322,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
&& Mesh->GetSkeletalMeshAsset()->GetMorphTargets().Num() > 0)
{
TargetMesh = Mesh;
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Auto-detected face mesh by morph targets: %s (%d morphs)"),
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Auto-detected face mesh by morph targets: %s (%d morphs)"),
*Mesh->GetName(), Mesh->GetSkeletalMeshAsset()->GetMorphTargets().Num());
break;
}
@@ -337,7 +337,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
if (Mesh)
{
TargetMesh = Mesh;
UE_LOG(LogElevenLabsLipSync, Warning,
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
TEXT("No mesh with morph targets found. Using fallback: %s (lip sync may not work)"),
*Mesh->GetName());
break;
@@ -347,7 +347,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
if (!TargetMesh)
{
UE_LOG(LogElevenLabsLipSync, Warning,
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
TEXT("No SkeletalMeshComponent found on %s. Set TargetMesh manually or use GetCurrentBlendshapes() in Blueprint."),
*Owner->GetName());
}
@@ -357,7 +357,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
if (TargetMesh && TargetMesh->GetSkeletalMeshAsset())
{
const TArray<UMorphTarget*>& MorphTargets = TargetMesh->GetSkeletalMeshAsset()->GetMorphTargets();
UE_LOG(LogElevenLabsLipSync, Log, TEXT("TargetMesh '%s' has %d morph targets."),
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("TargetMesh '%s' has %d morph targets."),
*TargetMesh->GetName(), MorphTargets.Num());
// Log first 20 morph target names to verify ARKit naming
@@ -374,7 +374,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
}
if (Count > 0)
{
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Morph target sample: %s"), *Names);
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Morph target sample: %s"), *Names);
}
// Verify our blendshape names exist as morph targets on this mesh
@@ -390,7 +390,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
break;
}
}
UE_LOG(LogElevenLabsLipSync, Log, TEXT(" Morph target '%s': %s"),
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT(" Morph target '%s': %s"),
*TestName.ToString(), bFound ? TEXT("FOUND") : TEXT("NOT FOUND"));
}
}
@@ -403,12 +403,12 @@ void UElevenLabsLipSyncComponent::BeginPlay()
if (MorphCount == 0)
{
bUseCurveMode = true;
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("No morph targets found — switching to MetaHuman curve mode (CTRL_expressions_*)."));
}
else
{
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Found %d morph targets — using standard SetMorphTarget mode."), MorphCount);
}
}
@@ -422,14 +422,14 @@ void UElevenLabsLipSyncComponent::BeginPlay()
// Phoneme pose curve extraction
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAnimSequence* AnimSeq)
void UPS_AI_Agent_LipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAnimSequence* AnimSeq)
{
if (!AnimSeq) return;
const IAnimationDataModel* DataModel = AnimSeq->GetDataModel();
if (!DataModel)
{
UE_LOG(LogElevenLabsLipSync, Warning,
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
TEXT("Pose '%s' (%s): No DataModel available — skipping."),
*VisemeName.ToString(), *AnimSeq->GetName());
return;
@@ -452,7 +452,7 @@ void UElevenLabsLipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAn
if (!bPosesUseCTRLNaming && PoseExtractedCurveMap.Num() == 0 && CurveValues.Num() == 1)
{
bPosesUseCTRLNaming = CurveName.ToString().StartsWith(TEXT("CTRL_"));
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Pose curve naming detected: %s (from curve '%s')"),
bPosesUseCTRLNaming ? TEXT("CTRL_expressions_*") : TEXT("ARKit / other"),
*CurveName.ToString());
@@ -462,7 +462,7 @@ void UElevenLabsLipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAn
if (CurveValues.Num() > 0)
{
PoseExtractedCurveMap.Add(VisemeName, MoveTemp(CurveValues));
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Pose '%s' (%s): Extracted %d non-zero curves."),
*VisemeName.ToString(), *AnimSeq->GetName(),
PoseExtractedCurveMap[VisemeName].Num());
@@ -471,25 +471,25 @@ void UElevenLabsLipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAn
{
// Still add an empty map so we know this viseme was assigned (silence pose)
PoseExtractedCurveMap.Add(VisemeName, TMap<FName, float>());
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Pose '%s' (%s): All curves are zero — neutral/silence pose."),
*VisemeName.ToString(), *AnimSeq->GetName());
}
}
void UElevenLabsLipSyncComponent::InitializePoseMappings()
void UPS_AI_Agent_LipSyncComponent::InitializePoseMappings()
{
PoseExtractedCurveMap.Reset();
bPosesUseCTRLNaming = false;
if (!PoseMap)
{
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("No PoseMap assigned — using hardcoded ARKit mapping."));
return;
}
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Loading pose map: %s"), *PoseMap->GetName());
// Map each PoseMap property to its corresponding OVR viseme name
@@ -524,18 +524,18 @@ void UElevenLabsLipSyncComponent::InitializePoseMappings()
if (AssignedCount > 0)
{
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("=== Pose-based lip sync: %d/%d poses assigned, %d curve mappings extracted ==="),
AssignedCount, Mappings.Num(), PoseExtractedCurveMap.Num());
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Curve naming: %s"),
bPosesUseCTRLNaming ? TEXT("CTRL_expressions_* (MetaHuman native)") : TEXT("ARKit / standard"));
if (bPosesUseCTRLNaming)
{
UE_LOG(LogElevenLabsLipSync, Warning,
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
TEXT("IMPORTANT: Poses use CTRL_expressions_* curves. "
"Move the ElevenLabs Lip Sync AnimNode AFTER mh_arkit_mapping_pose "
"Move the PS AI Agent Lip Sync AnimNode AFTER mh_arkit_mapping_pose "
"in the Face AnimBP for correct results."));
}
@@ -553,7 +553,7 @@ void UElevenLabsLipSyncComponent::InitializePoseMappings()
*Pair.Key.ToString(), Pair.Value);
if (++Count >= 8) { CurveList += TEXT(" ..."); break; }
}
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT(" Sample '%s': [%s]"),
*PoseEntry.Key.ToString(), *CurveList);
break; // Only log one sample
@@ -562,13 +562,13 @@ void UElevenLabsLipSyncComponent::InitializePoseMappings()
}
else
{
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("No phoneme pose AnimSequences assigned — using hardcoded ARKit mapping."));
}
}
void UElevenLabsLipSyncComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
void UPS_AI_Agent_LipSyncComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
// Unbind from agent component
if (AgentComponent.IsValid())
@@ -579,13 +579,13 @@ void UElevenLabsLipSyncComponent::EndPlay(const EEndPlayReason::Type EndPlayReas
AudioDataHandle.Reset();
}
AgentComponent->OnAgentPartialResponse.RemoveDynamic(
this, &UElevenLabsLipSyncComponent::OnPartialTextReceived);
this, &UPS_AI_Agent_LipSyncComponent::OnPartialTextReceived);
AgentComponent->OnAgentTextResponse.RemoveDynamic(
this, &UElevenLabsLipSyncComponent::OnTextResponseReceived);
this, &UPS_AI_Agent_LipSyncComponent::OnTextResponseReceived);
AgentComponent->OnAgentInterrupted.RemoveDynamic(
this, &UElevenLabsLipSyncComponent::OnAgentInterrupted);
this, &UPS_AI_Agent_LipSyncComponent::OnAgentInterrupted);
AgentComponent->OnAgentStoppedSpeaking.RemoveDynamic(
this, &UElevenLabsLipSyncComponent::OnAgentStopped);
this, &UPS_AI_Agent_LipSyncComponent::OnAgentStopped);
}
AgentComponent.Reset();
SpectrumAnalyzer.Reset();
@@ -597,7 +597,7 @@ void UElevenLabsLipSyncComponent::EndPlay(const EEndPlayReason::Type EndPlayReas
// Tick — smooth visemes and apply morph targets
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::TickComponent(float DeltaTime, ELevelTick TickType,
void UPS_AI_Agent_LipSyncComponent::TickComponent(float DeltaTime, ELevelTick TickType,
FActorComponentTickFunction* ThisTickFunction)
{
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
@@ -626,7 +626,7 @@ void UElevenLabsLipSyncComponent::TickComponent(float DeltaTime, ELevelTick Tick
// Timeout — start playback with spectral shapes as fallback
bWaitingForText = false;
PlaybackTimer = 0.0f;
UE_LOG(LogElevenLabsLipSync, Warning,
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
TEXT("Text wait timeout (%.0fms). Starting lip sync with spectral shapes (Queue=%d)."),
WaitElapsed * 1000.0, VisemeQueue.Num());
}
@@ -913,18 +913,18 @@ void UElevenLabsLipSyncComponent::TickComponent(float DeltaTime, ELevelTick Tick
// Interruption / stop handlers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::OnAgentInterrupted()
void UPS_AI_Agent_LipSyncComponent::OnAgentInterrupted()
{
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Agent interrupted — resetting lip sync to neutral."));
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Agent interrupted — resetting lip sync to neutral."));
ResetToNeutral();
}
void UElevenLabsLipSyncComponent::OnAgentStopped()
void UPS_AI_Agent_LipSyncComponent::OnAgentStopped()
{
// Don't clear text state here — it's already handled by TickComponent's
// "queue runs dry" logic which checks bFullTextReceived.
// Just clear the queues so the mouth returns to neutral immediately.
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Agent stopped speaking — clearing lip sync queues."));
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Agent stopped speaking — clearing lip sync queues."));
VisemeQueue.Reset();
AmplitudeQueue.Reset();
PlaybackTimer = 0.0f;
@@ -938,7 +938,7 @@ void UElevenLabsLipSyncComponent::OnAgentStopped()
TotalActiveFramesSeen = 0;
}
void UElevenLabsLipSyncComponent::ResetToNeutral()
void UPS_AI_Agent_LipSyncComponent::ResetToNeutral()
{
// Clear all queued viseme and amplitude data
VisemeQueue.Reset();
@@ -978,7 +978,7 @@ void UElevenLabsLipSyncComponent::ResetToNeutral()
// Audio analysis
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMData)
void UPS_AI_Agent_LipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMData)
{
if (!SpectrumAnalyzer) return;
@@ -989,7 +989,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
static bool bFirstChunkLogged = false;
if (!bFirstChunkLogged)
{
UE_LOG(LogElevenLabsLipSync, Verbose, TEXT("First audio chunk received: %d bytes (%d samples)"), PCMData.Num(), NumSamples);
UE_LOG(LogPS_AI_Agent_LipSync, Verbose, TEXT("First audio chunk received: %d bytes (%d samples)"), PCMData.Num(), NumSamples);
bFirstChunkLogged = true;
}
@@ -1275,7 +1275,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
if (PseudoCount > 0)
{
UE_LOG(LogElevenLabsLipSync, Verbose,
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
TEXT("Pseudo-speech: %d active frames (%d syllables)"),
PseudoCount, (PseudoCount + SyllableFrames - 1) / SyllableFrames);
}
@@ -1309,7 +1309,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
if (FixedCount > 0)
{
UE_LOG(LogElevenLabsLipSync, Verbose,
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
TEXT("Late start fix: overrode %d leading silent frames with min amplitude %.2f"),
FixedCount, MinStartAmplitude);
}
@@ -1325,7 +1325,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
else
ApplyTextVisemesToQueue();
PlaybackTimer = 0.0f;
UE_LOG(LogElevenLabsLipSync, Verbose,
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
TEXT("Text already available (%d visemes). Starting lip sync immediately."),
TextVisemeSequence.Num());
}
@@ -1335,7 +1335,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
bWaitingForText = true;
WaitingForTextStartTime = FPlatformTime::Seconds();
PlaybackTimer = 0.0f;
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Waiting for text before starting lip sync (%d frames queued)."),
WindowsQueued);
}
@@ -1349,7 +1349,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
ApplyTextVisemesToQueue();
}
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Audio chunk: %d samples → %d windows | Amp=[%.2f-%.2f] | Queue=%d (%.1fs) | TextVisemes=%d"),
NumSamples, WindowsQueued,
MinAmp, MaxAmp, VisemeQueue.Num(),
@@ -1360,7 +1360,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
// Text-driven lip sync
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::OnPartialTextReceived(const FString& PartialText)
void UPS_AI_Agent_LipSyncComponent::OnPartialTextReceived(const FString& PartialText)
{
// If the previous utterance's full text was already received,
// this partial text belongs to a NEW utterance — start fresh.
@@ -1378,7 +1378,7 @@ void UElevenLabsLipSyncComponent::OnPartialTextReceived(const FString& PartialTe
// Convert accumulated text to viseme sequence progressively
ConvertTextToVisemes(AccumulatedText);
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Partial text: \"%s\" → %d visemes (accumulated: \"%s\")"),
*PartialText, TextVisemeSequence.Num(), *AccumulatedText);
@@ -1397,20 +1397,20 @@ void UElevenLabsLipSyncComponent::OnPartialTextReceived(const FString& PartialTe
bWaitingForText = false;
PlaybackTimer = 0.0f; // Start consuming now
const double WaitElapsed = FPlatformTime::Seconds() - WaitingForTextStartTime;
UE_LOG(LogElevenLabsLipSync, Verbose,
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
TEXT("Text arrived after %.0fms wait. Starting lip sync playback."),
WaitElapsed * 1000.0);
}
}
void UElevenLabsLipSyncComponent::OnTextResponseReceived(const FString& ResponseText)
void UPS_AI_Agent_LipSyncComponent::OnTextResponseReceived(const FString& ResponseText)
{
// Full text arrived — use it as the definitive source
bFullTextReceived = true;
AccumulatedText = ResponseText;
ConvertTextToVisemes(ResponseText);
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Full text: \"%s\" → %d visemes"), *ResponseText, TextVisemeSequence.Num());
// Apply to any remaining queued frames (or extend timeline in pose mode)
@@ -1428,7 +1428,7 @@ void UElevenLabsLipSyncComponent::OnTextResponseReceived(const FString& Response
bWaitingForText = false;
PlaybackTimer = 0.0f;
const double WaitElapsed = FPlatformTime::Seconds() - WaitingForTextStartTime;
UE_LOG(LogElevenLabsLipSync, Verbose,
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
TEXT("Full text arrived after %.0fms wait. Starting lip sync playback."),
WaitElapsed * 1000.0);
}
@@ -1443,7 +1443,7 @@ void UElevenLabsLipSyncComponent::OnTextResponseReceived(const FString& Response
VisSeq += V.ToString();
if (++Count >= 50) { VisSeq += TEXT(" ..."); break; }
}
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Viseme sequence (%d): [%s]"),
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Viseme sequence (%d): [%s]"),
TextVisemeSequence.Num(), *VisSeq);
}
@@ -1453,7 +1453,7 @@ void UElevenLabsLipSyncComponent::OnTextResponseReceived(const FString& Response
// persists and corrupts the next utterance's partial text accumulation.
}
void UElevenLabsLipSyncComponent::ConvertTextToVisemes(const FString& Text)
void UPS_AI_Agent_LipSyncComponent::ConvertTextToVisemes(const FString& Text)
{
TextVisemeSequence.Reset();
@@ -1835,7 +1835,7 @@ void UElevenLabsLipSyncComponent::ConvertTextToVisemes(const FString& Text)
}
}
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Syllable filter: %d raw → %d visible (removed %d invisible consonants/sil)"),
RawCount, Filtered.Num(), RawCount - Filtered.Num());
@@ -1866,7 +1866,7 @@ static float GetVisemeDurationWeight(const FName& Viseme)
return 1.0f;
}
void UElevenLabsLipSyncComponent::ApplyTextVisemesToQueue()
void UPS_AI_Agent_LipSyncComponent::ApplyTextVisemesToQueue()
{
if (TextVisemeSequence.Num() == 0 || VisemeQueue.Num() == 0) return;
@@ -1980,7 +1980,7 @@ void UElevenLabsLipSyncComponent::ApplyTextVisemesToQueue()
const float FinalRatio = static_cast<float>(ActiveFrames) / static_cast<float>(SeqNum);
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Applied %d syllable-visemes to %d active frames (of %d total) — %.1f frames/viseme (%.0fms/viseme)"),
SeqNum, ActiveFrames, VisemeQueue.Num(),
FinalRatio, FinalRatio * 32.0f);
@@ -1990,7 +1990,7 @@ void UElevenLabsLipSyncComponent::ApplyTextVisemesToQueue()
// Decoupled viseme timeline (pose mode)
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::BuildVisemeTimeline()
void UPS_AI_Agent_LipSyncComponent::BuildVisemeTimeline()
{
if (TextVisemeSequence.Num() == 0) return;
@ -2063,14 +2063,14 @@ void UElevenLabsLipSyncComponent::BuildVisemeTimeline()
bVisemeTimelineActive = true;
bTextVisemesApplied = true;
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("Built viseme timeline: %d entries (from %d, max %d), audio=%.0fms, scale=%.2f → %.0fms/viseme avg"),
FinalSequence.Num(), TextVisemeSequence.Num(), MaxVisemes,
AudioDurationSec * 1000.0f, Scale,
(FinalSequence.Num() > 0) ? (CursorSec * 1000.0f / FinalSequence.Num()) : 0.0f);
}
void UElevenLabsLipSyncComponent::AnalyzeSpectrum()
void UPS_AI_Agent_LipSyncComponent::AnalyzeSpectrum()
{
if (!SpectrumAnalyzer) return;
@ -2086,14 +2086,14 @@ void UElevenLabsLipSyncComponent::AnalyzeSpectrum()
const float TotalEnergy = VoiceEnergy + F1Energy + F2Energy + F3Energy + SibilantEnergy;
UE_LOG(LogElevenLabsLipSync, Verbose,
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
TEXT("Spectrum: Total=%.4f F1=%.4f F2=%.4f F3=%.4f Sibilant=%.4f"),
TotalEnergy, F1Energy, F2Energy, F3Energy, SibilantEnergy);
EstimateVisemes(TotalEnergy, F1Energy, F2Energy, F3Energy, SibilantEnergy);
}
float UElevenLabsLipSyncComponent::GetBandEnergy(float LowFreq, float HighFreq, int32 NumSamples) const
float UPS_AI_Agent_LipSyncComponent::GetBandEnergy(float LowFreq, float HighFreq, int32 NumSamples) const
{
if (!SpectrumAnalyzer || NumSamples <= 0) return 0.0f;
@ -2113,7 +2113,7 @@ float UElevenLabsLipSyncComponent::GetBandEnergy(float LowFreq, float HighFreq,
// Viseme estimation from spectral analysis
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::EstimateVisemes(float TotalEnergy,
void UPS_AI_Agent_LipSyncComponent::EstimateVisemes(float TotalEnergy,
float F1Energy, float F2Energy, float F3Energy, float SibilantEnergy)
{
// Reset all visemes to zero
@ -2226,7 +2226,7 @@ void UElevenLabsLipSyncComponent::EstimateVisemes(float TotalEnergy,
// Viseme → ARKit blendshape mapping
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::MapVisemesToBlendshapes()
void UPS_AI_Agent_LipSyncComponent::MapVisemesToBlendshapes()
{
CurrentBlendshapes.Reset();
@ -2261,7 +2261,7 @@ void UElevenLabsLipSyncComponent::MapVisemesToBlendshapes()
if (!WarnedVisemes.Contains(VisemeName))
{
WarnedVisemes.Add(VisemeName);
UE_LOG(LogElevenLabsLipSync, Warning,
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
TEXT("No pose assigned for viseme '%s' — falling back to hardcoded ARKit mapping. "
"Mixing pose and hardcoded visemes may produce inconsistent results."),
*VisemeName.ToString());
@ -2325,7 +2325,7 @@ void UElevenLabsLipSyncComponent::MapVisemesToBlendshapes()
// Morph target application
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsLipSyncComponent::ApplyMorphTargets()
void UPS_AI_Agent_LipSyncComponent::ApplyMorphTargets()
{
if (!TargetMesh) return;
@ -2350,14 +2350,14 @@ void UElevenLabsLipSyncComponent::ApplyMorphTargets()
}
if (DebugStr.Len() > 0)
{
UE_LOG(LogElevenLabsLipSync, Verbose, TEXT("%s: %s"),
UE_LOG(LogPS_AI_Agent_LipSync, Verbose, TEXT("%s: %s"),
bUseCurveMode ? TEXT("Curves") : TEXT("Blendshapes"), *DebugStr);
}
}
if (bUseCurveMode)
{
// MetaHuman mode: curves are injected by the ElevenLabs Lip Sync AnimNode
// MetaHuman mode: curves are injected by the PS AI Agent Lip Sync AnimNode
// placed in the Face AnimBP (AnimGraph evaluation context).
// OverrideCurveValue() does NOT work because the AnimGraph resets curves
// before the Control Rig reads them. The custom AnimNode injects curves
@ -2365,9 +2365,9 @@ void UElevenLabsLipSyncComponent::ApplyMorphTargets()
static bool bLoggedOnce = false;
if (!bLoggedOnce)
{
UE_LOG(LogElevenLabsLipSync, Log,
UE_LOG(LogPS_AI_Agent_LipSync, Log,
TEXT("MetaHuman curve mode: lip sync curves will be injected by the "
"ElevenLabs Lip Sync AnimNode in the Face AnimBP. "
"PS AI Agent Lip Sync AnimNode in the Face AnimBP. "
"Ensure the node is placed before the Control Rig in the AnimGraph."));
bLoggedOnce = true;
}
@ -2387,7 +2387,7 @@ void UElevenLabsLipSyncComponent::ApplyMorphTargets()
// ARKit → MetaHuman curve name conversion
// ─────────────────────────────────────────────────────────────────────────────
FName UElevenLabsLipSyncComponent::ARKitToMetaHumanCurveName(const FName& ARKitName) const
FName UPS_AI_Agent_LipSyncComponent::ARKitToMetaHumanCurveName(const FName& ARKitName) const
{
// Check cache first to avoid per-frame string allocations
if (const FName* Cached = CurveNameCache.Find(ARKitName))
@ -2411,10 +2411,10 @@ FName UElevenLabsLipSyncComponent::ARKitToMetaHumanCurveName(const FName& ARKitN
FName Result = FName(*(TEXT("CTRL_expressions_") + Name));
// Cache for next frame
const_cast<UElevenLabsLipSyncComponent*>(this)->CurveNameCache.Add(ARKitName, Result);
const_cast<UPS_AI_Agent_LipSyncComponent*>(this)->CurveNameCache.Add(ARKitName, Result);
// Log once per curve name for debugging
UE_LOG(LogElevenLabsLipSync, Verbose, TEXT("Curve mapping: %s → %s"), *ARKitName.ToString(), *Result.ToString());
UE_LOG(LogPS_AI_Agent_LipSync, Verbose, TEXT("Curve mapping: %s → %s"), *ARKitName.ToString(), *Result.ToString());
return Result;
}

@ -1,17 +1,17 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsMicrophoneCaptureComponent.h"
#include "ElevenLabsDefinitions.h"
#include "PS_AI_Agent_MicrophoneCaptureComponent.h"
#include "PS_AI_Agent_Definitions.h"
#include "AudioCaptureCore.h"
#include "Async/Async.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsMic, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_Mic, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Constructor
// ─────────────────────────────────────────────────────────────────────────────
UElevenLabsMicrophoneCaptureComponent::UElevenLabsMicrophoneCaptureComponent()
UPS_AI_Agent_MicrophoneCaptureComponent::UPS_AI_Agent_MicrophoneCaptureComponent()
{
PrimaryComponentTick.bCanEverTick = false;
}
@ -19,7 +19,7 @@ UElevenLabsMicrophoneCaptureComponent::UElevenLabsMicrophoneCaptureComponent()
// ─────────────────────────────────────────────────────────────────────────────
// Lifecycle
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsMicrophoneCaptureComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
void UPS_AI_Agent_MicrophoneCaptureComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
StopCapture();
Super::EndPlay(EndPlayReason);
@ -28,11 +28,11 @@ void UElevenLabsMicrophoneCaptureComponent::EndPlay(const EEndPlayReason::Type E
// ─────────────────────────────────────────────────────────────────────────────
// Capture control
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsMicrophoneCaptureComponent::StartCapture()
void UPS_AI_Agent_MicrophoneCaptureComponent::StartCapture()
{
if (bCapturing)
{
UE_LOG(LogElevenLabsMic, Warning, TEXT("StartCapture called while already capturing."));
UE_LOG(LogPS_AI_Agent_Mic, Warning, TEXT("StartCapture called while already capturing."));
return;
}
@ -47,7 +47,7 @@ void UElevenLabsMicrophoneCaptureComponent::StartCapture()
if (!AudioCapture.OpenAudioCaptureStream(DeviceParams, MoveTemp(CaptureCallback), 1024))
{
UE_LOG(LogElevenLabsMic, Error, TEXT("Failed to open default audio capture stream."));
UE_LOG(LogPS_AI_Agent_Mic, Error, TEXT("Failed to open default audio capture stream."));
return;
}
@ -57,36 +57,36 @@ void UElevenLabsMicrophoneCaptureComponent::StartCapture()
{
DeviceSampleRate = DeviceInfo.PreferredSampleRate;
DeviceChannels = DeviceInfo.InputChannels;
UE_LOG(LogElevenLabsMic, Log, TEXT("Capture device: %s | Rate=%d | Channels=%d"),
UE_LOG(LogPS_AI_Agent_Mic, Log, TEXT("Capture device: %s | Rate=%d | Channels=%d"),
*DeviceInfo.DeviceName, DeviceSampleRate, DeviceChannels);
}
AudioCapture.StartStream();
bCapturing = true;
UE_LOG(LogElevenLabsMic, Log, TEXT("Audio capture started."));
UE_LOG(LogPS_AI_Agent_Mic, Log, TEXT("Audio capture started."));
}
void UElevenLabsMicrophoneCaptureComponent::StopCapture()
void UPS_AI_Agent_MicrophoneCaptureComponent::StopCapture()
{
if (!bCapturing) return;
AudioCapture.StopStream();
AudioCapture.CloseStream();
bCapturing = false;
UE_LOG(LogElevenLabsMic, Log, TEXT("Audio capture stopped."));
UE_LOG(LogPS_AI_Agent_Mic, Log, TEXT("Audio capture stopped."));
}
// ─────────────────────────────────────────────────────────────────────────────
// Audio callback (background thread)
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsMicrophoneCaptureComponent::OnAudioGenerate(
void UPS_AI_Agent_MicrophoneCaptureComponent::OnAudioGenerate(
const void* InAudio, int32 NumFrames,
int32 InNumChannels, int32 InSampleRate,
double StreamTime, bool bOverflow)
{
if (bOverflow)
{
UE_LOG(LogElevenLabsMic, Verbose, TEXT("Audio capture buffer overflow."));
UE_LOG(LogPS_AI_Agent_Mic, Verbose, TEXT("Audio capture buffer overflow."));
}
// Echo suppression: skip resampling + broadcasting entirely when agent is speaking.
@ -129,11 +129,11 @@ void UElevenLabsMicrophoneCaptureComponent::OnAudioGenerate(
// ─────────────────────────────────────────────────────────────────────────────
// Resampling
// ─────────────────────────────────────────────────────────────────────────────
TArray<float> UElevenLabsMicrophoneCaptureComponent::ResampleTo16000(
TArray<float> UPS_AI_Agent_MicrophoneCaptureComponent::ResampleTo16000(
const float* InAudio, int32 NumFrames,
int32 InChannels, int32 InSampleRate)
{
const int32 TargetRate = ElevenLabsAudio::SampleRate; // 16000
const int32 TargetRate = PS_AI_Agent_Audio_ElevenLabs::SampleRate; // 16000
// --- Step 1: Downmix to mono ---
// NOTE: NumFrames is the number of audio frames (not total samples).

@ -1,12 +1,12 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsPostureComponent.h"
#include "PS_AI_Agent_PostureComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "GameFramework/Actor.h"
#include "Math/UnrealMathUtility.h"
#include "DrawDebugHelpers.h"
DEFINE_LOG_CATEGORY(LogElevenLabsPosture);
DEFINE_LOG_CATEGORY(LogPS_AI_Agent_Posture);
// ── ARKit eye curve names ────────────────────────────────────────────────────
static const FName EyeLookUpLeft(TEXT("eyeLookUpLeft"));
@ -30,7 +30,7 @@ static constexpr float ARKitEyeRangeVertical = 35.0f;
// Construction
// ─────────────────────────────────────────────────────────────────────────────
UElevenLabsPostureComponent::UElevenLabsPostureComponent()
UPS_AI_Agent_PostureComponent::UPS_AI_Agent_PostureComponent()
{
PrimaryComponentTick.bCanEverTick = true;
PrimaryComponentTick.TickGroup = TG_PrePhysics;
@ -46,14 +46,14 @@ UElevenLabsPostureComponent::UElevenLabsPostureComponent()
// BeginPlay
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsPostureComponent::BeginPlay()
void UPS_AI_Agent_PostureComponent::BeginPlay()
{
Super::BeginPlay();
AActor* Owner = GetOwner();
if (!Owner)
{
UE_LOG(LogElevenLabsPosture, Warning, TEXT("No owner actor — posture disabled."));
UE_LOG(LogPS_AI_Agent_Posture, Warning, TEXT("No owner actor — posture disabled."));
return;
}
@ -82,11 +82,11 @@ void UElevenLabsPostureComponent::BeginPlay()
}
if (!CachedMesh.IsValid())
{
UE_LOG(LogElevenLabsPosture, Warning,
UE_LOG(LogPS_AI_Agent_Posture, Warning,
TEXT("No SkeletalMeshComponent found on %s — head bone lookup will be unavailable."),
*Owner->GetName());
}
UE_LOG(LogElevenLabsPosture, Log,
UE_LOG(LogPS_AI_Agent_Posture, Log,
TEXT("Mesh cache: Body=%s Face=%s"),
CachedMesh.IsValid() ? *CachedMesh->GetName() : TEXT("NONE"),
CachedFaceMesh.IsValid() ? *CachedFaceMesh->GetName() : TEXT("NONE"));
@ -96,7 +96,7 @@ void UElevenLabsPostureComponent::BeginPlay()
OriginalActorYaw = Owner->GetActorRotation().Yaw + MeshForwardYawOffset;
TargetBodyWorldYaw = Owner->GetActorRotation().Yaw;
UE_LOG(LogElevenLabsPosture, Log,
UE_LOG(LogPS_AI_Agent_Posture, Log,
TEXT("Posture initialized on %s. MeshOffset=%.0f OriginalYaw=%.0f MaxEye=%.0f/%.0f MaxHead=%.0f/%.0f"),
*Owner->GetName(), MeshForwardYawOffset, OriginalActorYaw,
MaxEyeHorizontal, MaxEyeVertical, MaxHeadYaw, MaxHeadPitch);
@ -106,7 +106,7 @@ void UElevenLabsPostureComponent::BeginPlay()
// Map eye angles to 8 ARKit eye curves
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsPostureComponent::UpdateEyeCurves(float EyeYaw, float EyePitch)
void UPS_AI_Agent_PostureComponent::UpdateEyeCurves(float EyeYaw, float EyePitch)
{
CurrentEyeCurves.Reset();
@ -176,7 +176,7 @@ void UElevenLabsPostureComponent::UpdateEyeCurves(float EyeYaw, float EyePitch)
// point shifts so small movements don't re-trigger higher layers.
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsPostureComponent::TickComponent(
void UPS_AI_Agent_PostureComponent::TickComponent(
float DeltaTime, ELevelTick TickType, FActorComponentTickFunction* ThisTickFunction)
{
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
@ -446,7 +446,7 @@ void UElevenLabsPostureComponent::TickComponent(
const float TgtYaw = FVector(Dir.X, Dir.Y, 0.0f).Rotation().Yaw;
const float Delta = FMath::FindDeltaAngleDegrees(FacingYaw, TgtYaw);
UE_LOG(LogElevenLabsPosture, Log,
UE_LOG(LogPS_AI_Agent_Posture, Log,
TEXT("Posture [%s -> %s]: Delta=%.1f | Head=%.1f/%.1f | Eyes=%.1f/%.1f | EyeGap=%.1f"),
*Owner->GetName(), *TargetActor->GetName(),
Delta,

@ -1,7 +1,7 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsWebSocketProxy.h"
#include "PS_AI_Agent_ElevenLabs.h"
#include "PS_AI_Agent_WebSocket_ElevenLabsProxy.h"
#include "PS_AI_ConvAgent.h"
#include "WebSocketsModule.h"
#include "IWebSocket.h"
@ -10,7 +10,7 @@
#include "JsonUtilities.h"
#include "Misc/Base64.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsWS, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_WS_ElevenLabs, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Helpers
@ -24,18 +24,18 @@ static void EL_LOG(bool bVerbose, const TCHAR* Format, ...)
TCHAR Buffer[2048];
FCString::GetVarArgs(Buffer, UE_ARRAY_COUNT(Buffer), Format, Args);
va_end(Args);
UE_LOG(LogElevenLabsWS, Verbose, TEXT("%s"), Buffer);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("%s"), Buffer);
}
// ─────────────────────────────────────────────────────────────────────────────
// Connect / Disconnect
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::Connect(const FString& AgentIDOverride, const FString& APIKeyOverride)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::Connect(const FString& AgentIDOverride, const FString& APIKeyOverride)
{
if (ConnectionState == EElevenLabsConnectionState::Connected ||
ConnectionState == EElevenLabsConnectionState::Connecting)
if (ConnectionState == EPS_AI_Agent_ConnectionState_ElevenLabs::Connected ||
ConnectionState == EPS_AI_Agent_ConnectionState_ElevenLabs::Connecting)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("Connect called but already connecting/connected. Ignoring."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("Connect called but already connecting/connected. Ignoring."));
return;
}
@ -48,19 +48,19 @@ void UElevenLabsWebSocketProxy::Connect(const FString& AgentIDOverride, const FS
if (URL.IsEmpty())
{
const FString Msg = TEXT("Cannot connect: no Agent ID configured. Set it in Project Settings or pass it to Connect().");
UE_LOG(LogElevenLabsWS, Error, TEXT("%s"), *Msg);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Error, TEXT("%s"), *Msg);
OnError.Broadcast(Msg);
ConnectionState = EElevenLabsConnectionState::Error;
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Error;
return;
}
UE_LOG(LogElevenLabsWS, Log, TEXT("Connecting to ElevenLabs: %s"), *URL);
ConnectionState = EElevenLabsConnectionState::Connecting;
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Connecting to ElevenLabs: %s"), *URL);
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Connecting;
// Headers: the ElevenLabs Conversational AI WS endpoint accepts the
// xi-api-key header on the initial HTTP upgrade request.
TMap<FString, FString> UpgradeHeaders;
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
const FString ResolvedKey = APIKeyOverride.IsEmpty() ? Settings->API_Key : APIKeyOverride;
if (!ResolvedKey.IsEmpty())
{
@ -69,36 +69,36 @@ void UElevenLabsWebSocketProxy::Connect(const FString& AgentIDOverride, const FS
WebSocket = FWebSocketsModule::Get().CreateWebSocket(URL, TEXT(""), UpgradeHeaders);
WebSocket->OnConnected().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsConnected);
WebSocket->OnConnectionError().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsConnectionError);
WebSocket->OnClosed().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsClosed);
WebSocket->OnConnected().AddUObject(this, &UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsConnected);
WebSocket->OnConnectionError().AddUObject(this, &UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsConnectionError);
WebSocket->OnClosed().AddUObject(this, &UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsClosed);
// NOTE: We bind ONLY OnRawMessage (binary frames), NOT OnMessage (text frames).
// UE's WebSocket implementation fires BOTH callbacks for the same frame when using
// the libwebsockets backend — binding both causes every audio packet to be decoded
// and played twice. OnRawMessage handles all frame types: raw binary audio AND
// text-framed JSON (detected by peeking first byte for '{').
WebSocket->OnRawMessage().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsBinaryMessage);
WebSocket->OnRawMessage().AddUObject(this, &UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsBinaryMessage);
WebSocket->Connect();
}
void UElevenLabsWebSocketProxy::Disconnect()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::Disconnect()
{
if (WebSocket.IsValid() && WebSocket->IsConnected())
{
WebSocket->Close(1000, TEXT("Client disconnected"));
}
ConnectionState = EElevenLabsConnectionState::Disconnected;
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Disconnected;
}
// ─────────────────────────────────────────────────────────────────────────────
// Audio & turn control
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::SendAudioChunk(const TArray<uint8>& PCMData)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendAudioChunk(const TArray<uint8>& PCMData)
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("SendAudioChunk: not connected (state=%d). Audio dropped."),
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("SendAudioChunk: not connected (state=%d). Audio dropped."),
(int32)ConnectionState);
return;
}
@ -118,7 +118,7 @@ void UElevenLabsWebSocketProxy::SendAudioChunk(const TArray<uint8>& PCMData)
const FString AudioJson = FString::Printf(TEXT("{\"user_audio_chunk\":\"%s\"}"), *Base64Audio);
// Per-chunk log at Verbose only — Log level is too spammy (10+ lines per second).
UE_LOG(LogElevenLabsWS, Verbose, TEXT("SendAudioChunk: %d bytes"), PCMData.Num());
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("SendAudioChunk: %d bytes"), PCMData.Num());
{
FScopeLock Lock(&WebSocketSendLock);
@ -129,9 +129,9 @@ void UElevenLabsWebSocketProxy::SendAudioChunk(const TArray<uint8>& PCMData)
}
}
void UElevenLabsWebSocketProxy::SendUserTurnStart()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendUserTurnStart()
{
// No-op: the ElevenLabs API does not require a "start speaking" signal.
// No-op: the PS_AI_Agent|ElevenLabs API does not require a "start speaking" signal.
// The server's VAD detects speech from the audio chunks we send.
// user_activity is a keep-alive/timeout-reset message and should NOT be
// sent here — it would delay the agent's turn after the user stops.
@ -145,12 +145,12 @@ void UElevenLabsWebSocketProxy::SendUserTurnStart()
bAgentResponseStartedFired = false;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+%.2fs] User turn started — mic open, audio chunks will follow."), T);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+%.2fs] User turn started — mic open, audio chunks will follow."), T);
}
void UElevenLabsWebSocketProxy::SendUserTurnEnd()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendUserTurnEnd()
{
// No explicit "end turn" message exists in the ElevenLabs API.
// No explicit "end turn" message exists in the PS_AI_Agent|ElevenLabs API.
// The server detects end-of-speech via VAD when we stop sending audio chunks.
UserTurnEndTime = FPlatformTime::Seconds();
bWaitingForResponse = true;
@ -162,42 +162,42 @@ void UElevenLabsWebSocketProxy::SendUserTurnEnd()
// The flag is only reset in SendUserTurnStart() at the beginning of a new user turn.
const double T = UserTurnEndTime - SessionStartTime;
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+%.2fs] User turn ended — server VAD silence detection started (turn_timeout=1s)."), T);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+%.2fs] User turn ended — server VAD silence detection started (turn_timeout=1s)."), T);
}
void UElevenLabsWebSocketProxy::SendTextMessage(const FString& Text)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendTextMessage(const FString& Text)
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("SendTextMessage: not connected."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("SendTextMessage: not connected."));
return;
}
if (Text.IsEmpty()) return;
// API: { "type": "user_message", "text": "Hello agent" }
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::UserMessage);
Msg->SetStringField(TEXT("type"), PS_AI_Agent_MessageType_ElevenLabs::UserMessage);
Msg->SetStringField(TEXT("text"), Text);
SendJsonMessage(Msg);
}
void UElevenLabsWebSocketProxy::SendInterrupt()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendInterrupt()
{
if (!IsConnected()) return;
UE_LOG(LogElevenLabsWS, Log, TEXT("Sending interrupt."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Sending interrupt."));
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::Interrupt);
Msg->SetStringField(TEXT("type"), PS_AI_Agent_MessageType_ElevenLabs::Interrupt);
SendJsonMessage(Msg);
}
// ─────────────────────────────────────────────────────────────────────────────
// WebSocket callbacks
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::OnWsConnected()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsConnected()
{
UE_LOG(LogElevenLabsWS, Log, TEXT("WebSocket connected. Sending conversation_initiation_client_data..."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("WebSocket connected. Sending conversation_initiation_client_data..."));
// State stays Connecting until we receive conversation_initiation_metadata from the server.
// ElevenLabs requires this message immediately after the WebSocket handshake to
@ -231,7 +231,7 @@ void UElevenLabsWebSocketProxy::OnWsConnected()
// In Server VAD mode, the config override is empty (matches C++ sample exactly).
TSharedPtr<FJsonObject> ConversationConfigOverride = MakeShareable(new FJsonObject());
if (TurnMode == EElevenLabsTurnMode::Client)
if (TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Client)
{
// turn_timeout: how long the server waits after VAD detects silence before
// processing the user's turn. Default is ~3s. In push-to-talk mode this
@ -267,7 +267,7 @@ void UElevenLabsWebSocketProxy::OnWsConnected()
// agent's configured custom_llm_extra_body with nothing. Omit entirely.
TSharedPtr<FJsonObject> InitMsg = MakeShareable(new FJsonObject());
InitMsg->SetStringField(TEXT("type"), ElevenLabsMessageType::ConversationClientData);
InitMsg->SetStringField(TEXT("type"), PS_AI_Agent_MessageType_ElevenLabs::ConversationClientData);
InitMsg->SetObjectField(TEXT("conversation_config_override"), ConversationConfigOverride);
// NOTE: We bypass SendJsonMessage() here intentionally.
@ -277,38 +277,38 @@ void UElevenLabsWebSocketProxy::OnWsConnected()
FString InitJson;
TSharedRef<TJsonWriter<>> InitWriter = TJsonWriterFactory<>::Create(&InitJson);
FJsonSerializer::Serialize(InitMsg.ToSharedRef(), InitWriter);
UE_LOG(LogElevenLabsWS, Log, TEXT("Sending initiation: %s"), *InitJson);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Sending initiation: %s"), *InitJson);
WebSocket->Send(InitJson);
}
void UElevenLabsWebSocketProxy::OnWsConnectionError(const FString& Error)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsConnectionError(const FString& Error)
{
UE_LOG(LogElevenLabsWS, Error, TEXT("WebSocket connection error: %s"), *Error);
ConnectionState = EElevenLabsConnectionState::Error;
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Error, TEXT("WebSocket connection error: %s"), *Error);
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Error;
OnError.Broadcast(Error);
}
void UElevenLabsWebSocketProxy::OnWsClosed(int32 StatusCode, const FString& Reason, bool bWasClean)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsClosed(int32 StatusCode, const FString& Reason, bool bWasClean)
{
UE_LOG(LogElevenLabsWS, Log, TEXT("WebSocket closed. Code=%d Reason=%s Clean=%d"), StatusCode, *Reason, bWasClean);
ConnectionState = EElevenLabsConnectionState::Disconnected;
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("WebSocket closed. Code=%d Reason=%s Clean=%d"), StatusCode, *Reason, bWasClean);
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Disconnected;
WebSocket.Reset();
OnDisconnected.Broadcast(StatusCode, Reason);
}
void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsMessage(const FString& Message)
{
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT(">> %s"), *Message);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT(">> %s"), *Message);
}
TSharedPtr<FJsonObject> Root;
TSharedRef<TJsonReader<>> Reader = TJsonReaderFactory<>::Create(Message);
if (!FJsonSerializer::Deserialize(Reader, Root) || !Root.IsValid())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("Failed to parse WebSocket message as JSON (first 80 chars): %.80s"), *Message);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("Failed to parse WebSocket message as JSON (first 80 chars): %.80s"), *Message);
return;
}
@ -318,26 +318,26 @@ void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
{
// Fallback: some messages use the top-level key as the type
// e.g. { "user_audio_chunk": "..." } from ourselves (shouldn't arrive)
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Message has no 'type' field, ignoring."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Message has no 'type' field, ignoring."));
return;
}
// Suppress ping from the visible log — they arrive every ~2s and flood the output.
// Handle ping early before the generic type log.
if (MsgType == ElevenLabsMessageType::PingEvent)
if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::PingEvent)
{
HandlePing(Root);
return;
}
// Log every non-ping message type received from the server.
UE_LOG(LogElevenLabsWS, Log, TEXT("Received message type: %s"), *MsgType);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Received message type: %s"), *MsgType);
if (MsgType == ElevenLabsMessageType::ConversationInitiation)
if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::ConversationInitiation)
{
HandleConversationInitiation(Root);
}
else if (MsgType == ElevenLabsMessageType::AudioResponse)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::AudioResponse)
{
// Log time-to-first-audio: latency between end of user turn and first agent audio.
if (bWaitingForResponse && !bFirstAudioResponseLogged)
@ -346,14 +346,14 @@ void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
const double T = Now - SessionStartTime;
const double LatencyFromTurnEnd = (Now - UserTurnEndTime) * 1000.0;
const double LatencyFromLastChunk = (Now - LastAudioChunkSentTime) * 1000.0;
UE_LOG(LogElevenLabsWS, Warning,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning,
TEXT("[T+%.2fs] [LATENCY] First audio: %.0f ms after turn end (%.0f ms after last chunk)"),
T, LatencyFromTurnEnd, LatencyFromLastChunk);
bFirstAudioResponseLogged = true;
}
HandleAudioResponse(Root);
}
else if (MsgType == ElevenLabsMessageType::UserTranscript)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::UserTranscript)
{
// Log transcription latency.
if (bWaitingForResponse)
@ -361,14 +361,14 @@ void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
const double Now = FPlatformTime::Seconds();
const double T = Now - SessionStartTime;
const double LatencyFromTurnEnd = (Now - UserTurnEndTime) * 1000.0;
UE_LOG(LogElevenLabsWS, Warning,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning,
TEXT("[T+%.2fs] [LATENCY] User transcript: %.0f ms after turn end"),
T, LatencyFromTurnEnd);
bWaitingForResponse = false;
}
HandleTranscript(Root);
}
else if (MsgType == ElevenLabsMessageType::AgentResponse)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::AgentResponse)
{
// Log agent text response latency.
if (UserTurnEndTime > 0.0)
@ -376,36 +376,36 @@ void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
const double Now = FPlatformTime::Seconds();
const double T = Now - SessionStartTime;
const double LatencyFromTurnEnd = (Now - UserTurnEndTime) * 1000.0;
UE_LOG(LogElevenLabsWS, Warning,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning,
TEXT("[T+%.2fs] [LATENCY] Agent text response: %.0f ms after turn end"),
T, LatencyFromTurnEnd);
}
HandleAgentResponse(Root);
}
else if (MsgType == ElevenLabsMessageType::AgentChatResponsePart)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::AgentChatResponsePart)
{
HandleAgentChatResponsePart(Root);
}
else if (MsgType == ElevenLabsMessageType::AgentResponseCorrection)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::AgentResponseCorrection)
{
// Silently ignore — corrected text after interruption.
UE_LOG(LogElevenLabsWS, Verbose, TEXT("agent_response_correction received (ignored)."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("agent_response_correction received (ignored)."));
}
else if (MsgType == ElevenLabsMessageType::ClientToolCall)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::ClientToolCall)
{
HandleClientToolCall(Root);
}
else if (MsgType == ElevenLabsMessageType::InterruptionEvent)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::InterruptionEvent)
{
HandleInterruption(Root);
}
else
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Unhandled message type: %s"), *MsgType);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Unhandled message type: %s"), *MsgType);
}
}
void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size, SIZE_T BytesRemaining)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size, SIZE_T BytesRemaining)
{
// Accumulate fragments until BytesRemaining == 0.
const uint8* Bytes = static_cast<const uint8*>(Data);
@ -430,10 +430,10 @@ void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size,
reinterpret_cast<const char*>(BinaryFrameBuffer.GetData())));
BinaryFrameBuffer.Reset();
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Binary JSON frame (%d bytes): %.120s"), TotalSize, *JsonString);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Binary JSON frame (%d bytes): %.120s"), TotalSize, *JsonString);
}
OnWsMessage(JsonString);
@@ -442,7 +442,7 @@ void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size,
{
// Raw binary audio frame — PCM bytes sent directly without Base64/JSON wrapper.
// Log first few bytes as hex to help diagnose the format.
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
FString HexPreview;
@@ -451,7 +451,7 @@ void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size,
{
HexPreview += FString::Printf(TEXT("%02X "), BinaryFrameBuffer[i]);
}
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Binary audio frame: %d bytes | first bytes: %s"), TotalSize, *HexPreview);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Binary audio frame: %d bytes | first bytes: %s"), TotalSize, *HexPreview);
}
// Broadcast raw PCM bytes directly to the audio queue.
@@ -464,7 +464,7 @@ void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size,
// ─────────────────────────────────────────────────────────────────────────────
// Message handlers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::HandleConversationInitiation(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleConversationInitiation(const TSharedPtr<FJsonObject>& Root)
{
// Expected structure:
// { "type": "conversation_initiation_metadata",
@@ -480,12 +480,12 @@ void UElevenLabsWebSocketProxy::HandleConversationInitiation(const TSharedPtr<FJ
}
SessionStartTime = FPlatformTime::Seconds();
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+0.00s] Conversation initiated. ID=%s"), *ConversationInfo.ConversationID);
ConnectionState = EElevenLabsConnectionState::Connected;
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+0.00s] Conversation initiated. ID=%s"), *ConversationInfo.ConversationID);
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Connected;
OnConnected.Broadcast(ConversationInfo);
}
void UElevenLabsWebSocketProxy::HandleAudioResponse(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleAudioResponse(const TSharedPtr<FJsonObject>& Root)
{
// Expected structure:
// { "type": "audio",
@@ -494,7 +494,7 @@ void UElevenLabsWebSocketProxy::HandleAudioResponse(const TSharedPtr<FJsonObject
const TSharedPtr<FJsonObject>* AudioEvent = nullptr;
if (!Root->TryGetObjectField(TEXT("audio_event"), AudioEvent) || !AudioEvent)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("audio message missing 'audio_event' field."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("audio message missing 'audio_event' field."));
return;
}
@@ -505,28 +505,28 @@
(*AudioEvent)->TryGetNumberField(TEXT("event_id"), EventId);
if (EventId > 0 && EventId <= LastInterruptEventId)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Discarding audio event_id=%d (interrupted at %d)."), EventId, LastInterruptEventId);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Discarding audio event_id=%d (interrupted at %d)."), EventId, LastInterruptEventId);
return;
}
FString Base64Audio;
if (!(*AudioEvent)->TryGetStringField(TEXT("audio_base_64"), Base64Audio))
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("audio_event missing 'audio_base_64' field."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("audio_event missing 'audio_base_64' field."));
return;
}
TArray<uint8> PCMData;
if (!FBase64::Decode(Base64Audio, PCMData))
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("Failed to Base64-decode audio data."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("Failed to Base64-decode audio data."));
return;
}
OnAudioReceived.Broadcast(PCMData);
}
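The audio_event flow above (look up the event object, drop frames whose event_id predates the last interruption, Base64-decode the PCM) can be mirrored outside Unreal for protocol testing. A minimal Python sketch of the same wire-format handling, for illustration only (the function name is hypothetical, not plugin code):

```python
import base64
import json

def decode_audio_event(raw: str, last_interrupt_event_id: int = 0):
    """Mirrors HandleAudioResponse: returns raw PCM bytes (16 kHz mono
    int16 per the wire format) or None for malformed/stale frames."""
    msg = json.loads(raw)
    event = msg.get("audio_event")
    if event is None:
        return None  # matches the "missing 'audio_event'" warning path
    event_id = event.get("event_id", 0)
    if 0 < event_id <= last_interrupt_event_id:
        return None  # stale audio generated before an interruption
    b64 = event.get("audio_base_64")
    if b64 is None:
        return None  # matches the "missing 'audio_base_64'" warning path
    return base64.b64decode(b64)
```

As in the C++ handler, the event_id filter is what keeps interrupted speech from leaking into the audio queue.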
void UElevenLabsWebSocketProxy::HandleTranscript(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleTranscript(const TSharedPtr<FJsonObject>& Root)
{
// API structure:
// { "type": "user_transcript",
@@ -536,11 +536,11 @@ void UElevenLabsWebSocketProxy::HandleTranscript(const TSharedPtr<FJsonObject>&
const TSharedPtr<FJsonObject>* TranscriptEvent = nullptr;
if (!Root->TryGetObjectField(TEXT("user_transcription_event"), TranscriptEvent) || !TranscriptEvent)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("user_transcript message missing 'user_transcription_event' field."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("user_transcript message missing 'user_transcription_event' field."));
return;
}
FElevenLabsTranscriptSegment Segment;
FPS_AI_Agent_TranscriptSegment_ElevenLabs Segment;
Segment.Speaker = TEXT("user");
(*TranscriptEvent)->TryGetStringField(TEXT("user_transcript"), Segment.Text);
// user_transcript messages are always final (interim results are not sent for user speech)
@@ -549,7 +549,7 @@ void UElevenLabsWebSocketProxy::HandleTranscript(const TSharedPtr<FJsonObject>&
OnTranscript.Broadcast(Segment);
}
void UElevenLabsWebSocketProxy::HandleAgentResponse(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleAgentResponse(const TSharedPtr<FJsonObject>& Root)
{
// ISSUE-22: reset bAgentResponseStartedFired so OnAgentResponseStarted fires again on
// the next turn. In Server VAD mode SendUserTurnStart() is never called — it is the only
@@ -574,7 +574,7 @@ void UElevenLabsWebSocketProxy::HandleAgentResponse(const TSharedPtr<FJsonObject
OnAgentResponse.Broadcast(ResponseText);
}
void UElevenLabsWebSocketProxy::HandleAgentChatResponsePart(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleAgentChatResponsePart(const TSharedPtr<FJsonObject>& Root)
{
// agent_chat_response_part = the server is actively generating a response (LLM token stream).
// Fire OnAgentResponseStarted once per turn so the component can auto-stop the microphone
@@ -591,7 +591,7 @@ void UElevenLabsWebSocketProxy::HandleAgentChatResponsePart(const TSharedPtr<FJs
// of the previous response).
if (LastInterruptEventId > 0)
{
UE_LOG(LogElevenLabsWS, Log,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log,
TEXT("New generation started — resetting LastInterruptEventId (was %d)."),
LastInterruptEventId);
LastInterruptEventId = 0;
@@ -600,7 +600,7 @@ void UElevenLabsWebSocketProxy::HandleAgentChatResponsePart(const TSharedPtr<FJs
const double Now = FPlatformTime::Seconds();
const double T = Now - SessionStartTime;
const double LatencyFromTurnEnd = UserTurnEndTime > 0.0 ? (Now - UserTurnEndTime) * 1000.0 : 0.0;
UE_LOG(LogElevenLabsWS, Log,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log,
TEXT("[T+%.2fs] Agent started generating (%.0f ms after turn end — includes VAD silence timeout + LLM start)."),
T, LatencyFromTurnEnd);
OnAgentResponseStarted.Broadcast();
@@ -643,7 +643,7 @@ void UElevenLabsWebSocketProxy::HandleAgentChatResponsePart(const TSharedPtr<FJs
}
}
void UElevenLabsWebSocketProxy::HandleInterruption(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleInterruption(const TSharedPtr<FJsonObject>& Root)
{
// Extract the interrupt event_id so we can filter stale audio frames.
// { "type": "interruption", "interruption_event": { "event_id": 42 } }
@@ -658,11 +658,11 @@ void UElevenLabsWebSocketProxy::HandleInterruption(const TSharedPtr<FJsonObject>
}
}
UE_LOG(LogElevenLabsWS, Log, TEXT("Agent interrupted (server ack, LastInterruptEventId=%d)."), LastInterruptEventId);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Agent interrupted (server ack, LastInterruptEventId=%d)."), LastInterruptEventId);
OnInterrupted.Broadcast();
}
void UElevenLabsWebSocketProxy::HandleClientToolCall(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleClientToolCall(const TSharedPtr<FJsonObject>& Root)
{
// Incoming: { "type": "client_tool_call", "client_tool_call": {
// "tool_name": "set_emotion", "tool_call_id": "abc123",
@@ -670,11 +670,11 @@ void UElevenLabsWebSocketProxy::HandleClientToolCall(const TSharedPtr<FJsonObjec
const TSharedPtr<FJsonObject>* ToolCallObj = nullptr;
if (!Root->TryGetObjectField(TEXT("client_tool_call"), ToolCallObj) || !ToolCallObj)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("client_tool_call: missing client_tool_call object."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("client_tool_call: missing client_tool_call object."));
return;
}
FElevenLabsClientToolCall ToolCall;
FPS_AI_Agent_ClientToolCall_ElevenLabs ToolCall;
(*ToolCallObj)->TryGetStringField(TEXT("tool_name"), ToolCall.ToolName);
(*ToolCallObj)->TryGetStringField(TEXT("tool_call_id"), ToolCall.ToolCallId);
@@ -698,29 +698,29 @@ void UElevenLabsWebSocketProxy::HandleClientToolCall(const TSharedPtr<FJsonObjec
}
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+%.2fs] Client tool call: %s (id=%s, %d params)"),
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+%.2fs] Client tool call: %s (id=%s, %d params)"),
T, *ToolCall.ToolName, *ToolCall.ToolCallId, ToolCall.Parameters.Num());
OnClientToolCall.Broadcast(ToolCall);
}
void UElevenLabsWebSocketProxy::SendClientToolResult(const FString& ToolCallId, const FString& Result, bool bIsError)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendClientToolResult(const FString& ToolCallId, const FString& Result, bool bIsError)
{
// Outgoing: { "type": "client_tool_result", "tool_call_id": "abc123",
// "result": "emotion set to surprise", "is_error": false }
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::ClientToolResult);
Msg->SetStringField(TEXT("type"), PS_AI_Agent_MessageType_ElevenLabs::ClientToolResult);
Msg->SetStringField(TEXT("tool_call_id"), ToolCallId);
Msg->SetStringField(TEXT("result"), Result);
Msg->SetBoolField(TEXT("is_error"), bIsError);
SendJsonMessage(Msg);
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+%.2fs] Sent client_tool_result for %s: %s (error=%s)"),
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+%.2fs] Sent client_tool_result for %s: %s (error=%s)"),
T, *ToolCallId, *Result, bIsError ? TEXT("true") : TEXT("false"));
}
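The tool-call round trip documented in the comments above (incoming `client_tool_call`, outgoing `client_tool_result`) is plain JSON. A hypothetical Python sketch of both message shapes, assuming only the fields shown in this file:

```python
import json

def parse_client_tool_call(raw: str):
    """Mirrors HandleClientToolCall: extract tool name, call id, parameters."""
    call = json.loads(raw).get("client_tool_call") or {}
    return call.get("tool_name"), call.get("tool_call_id"), call.get("parameters", {})

def build_client_tool_result(tool_call_id: str, result: str, is_error: bool = False) -> str:
    """Mirrors SendClientToolResult: the acknowledgement that every
    client_tool_call must receive."""
    return json.dumps({
        "type": "client_tool_result",
        "tool_call_id": tool_call_id,
        "result": result,
        "is_error": is_error,
    })
```

Note that the component auto-handles only "set_emotion"; any other tool name is broadcast via OnClientToolCall and the game code is responsible for sending the matching result.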
void UElevenLabsWebSocketProxy::HandlePing(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandlePing(const TSharedPtr<FJsonObject>& Root)
{
// Reply with a pong to keep the connection alive.
// Incoming: { "type": "ping", "ping_event": { "event_id": 1, "ping_ms": 150 } }
@@ -741,11 +741,11 @@ void UElevenLabsWebSocketProxy::HandlePing(const TSharedPtr<FJsonObject>& Root)
// ─────────────────────────────────────────────────────────────────────────────
// Helpers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& JsonObj)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& JsonObj)
{
if (!WebSocket.IsValid() || !WebSocket->IsConnected())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("SendJsonMessage: WebSocket not connected."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("SendJsonMessage: WebSocket not connected."));
return;
}
@@ -753,10 +753,10 @@ void UElevenLabsWebSocketProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& J
TSharedRef<TJsonWriter<>> Writer = TJsonWriterFactory<>::Create(&Out);
FJsonSerializer::Serialize(JsonObj.ToSharedRef(), Writer);
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("<< %s"), *Out);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("<< %s"), *Out);
}
{
@@ -765,9 +765,9 @@ void UElevenLabsWebSocketProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& J
}
}
FString UElevenLabsWebSocketProxy::BuildWebSocketURL(const FString& AgentIDOverride, const FString& APIKeyOverride) const
FString UPS_AI_Agent_WebSocket_ElevenLabsProxy::BuildWebSocketURL(const FString& AgentIDOverride, const FString& APIKeyOverride) const
{
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
// Custom URL override takes full precedence
if (!Settings->CustomWebSocketURL.IsEmpty())

View File

@@ -0,0 +1,50 @@
// Copyright ASTERION. All Rights Reserved.
#include "PS_AI_ConvAgent.h"
#include "Developer/Settings/Public/ISettingsModule.h"
#include "UObject/UObjectGlobals.h"
#include "UObject/Package.h"
IMPLEMENT_MODULE(FPS_AI_ConvAgentModule, PS_AI_ConvAgent)
#define LOCTEXT_NAMESPACE "PS_AI_ConvAgent"
void FPS_AI_ConvAgentModule::StartupModule()
{
Settings = NewObject<UPS_AI_Agent_Settings_ElevenLabs>(GetTransientPackage(), "PS_AI_Agent_Settings_ElevenLabs", RF_Standalone);
Settings->AddToRoot();
if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
{
SettingsModule->RegisterSettings(
"Project", "Plugins", "PS_AI_ConvAgent_ElevenLabs",
LOCTEXT("SettingsName", "PS AI Agent - ElevenLabs"),
LOCTEXT("SettingsDescription", "Configure the PS AI Agent - ElevenLabs plugin"),
Settings);
}
}
void FPS_AI_ConvAgentModule::ShutdownModule()
{
if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
{
SettingsModule->UnregisterSettings("Project", "Plugins", "PS_AI_ConvAgent_ElevenLabs");
}
if (!GExitPurge)
{
Settings->RemoveFromRoot();
}
else
{
Settings = nullptr;
}
}
UPS_AI_Agent_Settings_ElevenLabs* FPS_AI_ConvAgentModule::GetSettings() const
{
check(Settings);
return Settings;
}
#undef LOCTEXT_NAMESPACE
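The commit relies on CoreRedirects so existing .uasset references resolve to the renamed classes. The entries below are a hypothetical sketch inferred from the renames visible in this diff (the actual DefaultEngine.ini changes are not part of this hunk; the old module name is taken from the PS_AI_AGENT_ELEVENLABS_API macro):

```ini
; Hypothetical sketch, not the committed file.
[CoreRedirects]
+PackageRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs",NewName="/Script/PS_AI_ConvAgent")
+ClassRedirects=(OldName="ElevenLabsLipSyncComponent",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_LipSyncComponent")
+ClassRedirects=(OldName="ElevenLabsConversationalAgentComponent",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_Conv_ElevenLabsComponent")
+StructRedirects=(OldName="AnimNode_ElevenLabsLipSync",NewName="/Script/PS_AI_ConvAgent.AnimNode_PS_AI_Agent_LipSync")
+EnumRedirects=(OldName="EElevenLabsTurnMode",NewName="/Script/PS_AI_ConvAgent.EPS_AI_Agent_TurnMode_ElevenLabs")
```

Without the package redirect, Blueprints saved against the old /Script/PS_AI_Agent_ElevenLabs package path would fail to load after the module rename.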

View File

@@ -4,28 +4,28 @@
#include "CoreMinimal.h"
#include "Animation/AnimNodeBase.h"
#include "AnimNode_ElevenLabsFacialExpression.generated.h"
#include "AnimNode_PS_AI_Agent_FacialExpression.generated.h"
class UElevenLabsFacialExpressionComponent;
class UPS_AI_Agent_FacialExpressionComponent;
/**
* Animation node that injects ElevenLabs facial expression curves into the AnimGraph.
* Animation node that injects facial expression curves into the AnimGraph.
*
* Place this node in the MetaHuman Face AnimBP BEFORE the ElevenLabs Lip Sync node.
* Place this node in the MetaHuman Face AnimBP BEFORE the PS AI Agent Lip Sync node.
* It reads emotion curves (eyes, eyebrows, cheeks, mouth mood) from the
* ElevenLabsFacialExpressionComponent on the same Actor and outputs them as
* PS_AI_Agent_FacialExpressionComponent on the same Actor and outputs them as
* CTRL_expressions_* animation curves.
*
* The Lip Sync node placed AFTER this one will override mouth-area curves
* during speech, while non-mouth emotion curves (eyes, brows) pass through.
*
* Graph layout:
* [Live Link Pose] → [ElevenLabs Facial Expression] → [ElevenLabs Lip Sync] → [mh_arkit_mapping_pose] → ...
* [Live Link Pose] → [PS AI Agent Facial Expression] → [PS AI Agent Lip Sync] → [mh_arkit_mapping_pose] → ...
*
* The node auto-discovers the FacialExpressionComponent — no manual wiring needed.
*/
USTRUCT(BlueprintInternalUseOnly)
struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsFacialExpression : public FAnimNode_Base
struct PS_AI_CONVAGENT_API FAnimNode_PS_AI_Agent_FacialExpression : public FAnimNode_Base
{
GENERATED_USTRUCT_BODY()
@@ -43,7 +43,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsFacialExpression : public
private:
/** Cached reference to the facial expression component on the owning actor. */
TWeakObjectPtr<UElevenLabsFacialExpressionComponent> FacialExpressionComponent;
TWeakObjectPtr<UPS_AI_Agent_FacialExpressionComponent> FacialExpressionComponent;
/** Emotion expression curves to inject (CTRL_expressions_* format).
* Copied from the component during Update (game thread safe). */

View File

@@ -4,26 +4,26 @@
#include "CoreMinimal.h"
#include "Animation/AnimNodeBase.h"
#include "AnimNode_ElevenLabsLipSync.generated.h"
#include "AnimNode_PS_AI_Agent_LipSync.generated.h"
class UElevenLabsLipSyncComponent;
class UPS_AI_Agent_LipSyncComponent;
/**
* Animation node that injects ElevenLabs lip sync curves into the AnimGraph.
* Animation node that injects lip sync curves into the AnimGraph.
*
* Place this node in the MetaHuman Face AnimBP BEFORE the mh_arkit_mapping_pose
* node. It reads ARKit blendshape weights from the ElevenLabsLipSyncComponent
* node. It reads ARKit blendshape weights from the PS_AI_Agent_LipSyncComponent
* on the same Actor and outputs them as animation curves (jawOpen, mouthFunnel,
* etc.). The downstream mh_arkit_mapping_pose node then converts these ARKit
* names to CTRL_expressions_* curves that drive the MetaHuman facial bones.
*
* Graph layout:
* [Live Link Pose] → [ElevenLabs Lip Sync] → [mh_arkit_mapping_pose] → ...
* [Live Link Pose] → [PS AI Agent Lip Sync] → [mh_arkit_mapping_pose] → ...
*
* The node auto-discovers the LipSyncComponent — no manual wiring needed.
*/
USTRUCT(BlueprintInternalUseOnly)
struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsLipSync : public FAnimNode_Base
struct PS_AI_CONVAGENT_API FAnimNode_PS_AI_Agent_LipSync : public FAnimNode_Base
{
GENERATED_USTRUCT_BODY()
@@ -41,7 +41,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsLipSync : public FAnimNode
private:
/** Cached reference to the lip sync component on the owning actor. */
TWeakObjectPtr<UElevenLabsLipSyncComponent> LipSyncComponent;
TWeakObjectPtr<UPS_AI_Agent_LipSyncComponent> LipSyncComponent;
/** ARKit blendshape curves to inject (jawOpen, mouthFunnel, etc.).
* Copied from the component during Update (game thread safe). */

View File

@@ -5,12 +5,12 @@
#include "CoreMinimal.h"
#include "Animation/AnimNodeBase.h"
#include "BoneContainer.h"
#include "AnimNode_ElevenLabsPosture.generated.h"
#include "AnimNode_PS_AI_Agent_Posture.generated.h"
class UElevenLabsPostureComponent;
class UPS_AI_Agent_PostureComponent;
/**
* Animation node that injects ElevenLabs posture data into the AnimGraph.
* Animation node that injects posture data into the AnimGraph.
*
* Handles two types of output (each can be toggled independently):
* 1. Head bone rotation (yaw + pitch) applied directly to the bone transform
@@ -25,10 +25,10 @@ class UElevenLabsPostureComponent;
* Injects ARKit eye curves before mh_arkit_mapping_pose.
* Graph: [Source] → [Facial Expression] → [Posture] → [Lip Sync] → [mh_arkit] → ...
*
* The node auto-discovers the ElevenLabsPostureComponent — no manual wiring needed.
* The node auto-discovers the PS_AI_Agent_PostureComponent — no manual wiring needed.
*/
USTRUCT(BlueprintInternalUseOnly)
struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsPosture : public FAnimNode_Base
struct PS_AI_CONVAGENT_API FAnimNode_PS_AI_Agent_Posture : public FAnimNode_Base
{
GENERATED_USTRUCT_BODY()
@@ -56,7 +56,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsPosture : public FAnimNode
private:
/** Cached reference to the posture component on the owning actor. */
TWeakObjectPtr<UElevenLabsPostureComponent> PostureComponent;
TWeakObjectPtr<UPS_AI_Agent_PostureComponent> PostureComponent;
/** Eye gaze curves to inject (8 ARKit eye look curves).
* Copied from the component during Update (game thread safe). */

View File

@@ -4,20 +4,20 @@
#include "CoreMinimal.h"
#include "Components/ActorComponent.h"
#include "ElevenLabsDefinitions.h"
#include "ElevenLabsWebSocketProxy.h"
#include "PS_AI_Agent_Definitions.h"
#include "PS_AI_Agent_WebSocket_ElevenLabsProxy.h"
#include "Sound/SoundWaveProcedural.h"
#include <atomic>
#include "ElevenLabsConversationalAgentComponent.generated.h"
#include "PS_AI_Agent_Conv_ElevenLabsComponent.generated.h"
class UAudioComponent;
class UElevenLabsMicrophoneCaptureComponent;
class UPS_AI_Agent_MicrophoneCaptureComponent;
// ─────────────────────────────────────────────────────────────────────────────
// Delegates exposed to Blueprint
// ─────────────────────────────────────────────────────────────────────────────
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentConnected,
const FElevenLabsConversationInfo&, ConversationInfo);
const FPS_AI_Agent_ConversationInfo_ElevenLabs&, ConversationInfo);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnAgentDisconnected,
int32, StatusCode, const FString&, Reason);
@@ -26,7 +26,7 @@ DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentError,
const FString&, ErrorMessage);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentTranscript,
const FElevenLabsTranscriptSegment&, Segment);
const FPS_AI_Agent_TranscriptSegment_ElevenLabs&, Segment);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentTextResponse,
const FString&, ResponseText);
@@ -68,8 +68,8 @@ DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnAgentResponseTimeout);
* The emotion changes BEFORE the corresponding audio arrives, giving time to blend.
*/
DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnAgentEmotionChanged,
EElevenLabsEmotion, Emotion,
EElevenLabsEmotionIntensity, Intensity);
EPS_AI_Agent_Emotion, Emotion,
EPS_AI_Agent_EmotionIntensity, Intensity);
/**
* Fired for any client tool call that is NOT automatically handled (i.e. not "set_emotion").
@@ -77,14 +77,14 @@ DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnAgentEmotionChanged,
* You MUST call SendClientToolResult on the WebSocketProxy to acknowledge the call.
*/
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentClientToolCall,
const FElevenLabsClientToolCall&, ToolCall);
const FPS_AI_Agent_ClientToolCall_ElevenLabs&, ToolCall);
// Non-dynamic delegate for raw agent audio (high-frequency, C++ consumers only).
// Delivers PCM chunks as int16, 16kHz mono, little-endian.
DECLARE_MULTICAST_DELEGATE_OneParam(FOnAgentAudioData, const TArray<uint8>& /*PCMData*/);
// ─────────────────────────────────────────────────────────────────────────────
// UElevenLabsConversationalAgentComponent
// UPS_AI_Agent_Conv_ElevenLabsComponent
//
// Attach this to any Actor (e.g. a character NPC) to give it a voice powered by
// the ElevenLabs Conversational AI API.
@@ -96,60 +96,60 @@ DECLARE_MULTICAST_DELEGATE_OneParam(FOnAgentAudioData, const TArray<uint8>& /*PC
// 4. React to events (OnAgentTranscript, OnAgentTextResponse, etc.) in Blueprint.
// 5. Call EndConversation() when done.
// ─────────────────────────────────────────────────────────────────────────────
UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
DisplayName = "ElevenLabs Conversational Agent")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsConversationalAgentComponent : public UActorComponent
UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
DisplayName = "PS AI Agent - ElevenLabs Conv")
class PS_AI_CONVAGENT_API UPS_AI_Agent_Conv_ElevenLabsComponent : public UActorComponent
{
GENERATED_BODY()
public:
UElevenLabsConversationalAgentComponent();
UPS_AI_Agent_Conv_ElevenLabsComponent();
// ── Configuration ─────────────────────────────────────────────────────────
/** ElevenLabs Agent ID used for this conversation. Leave empty to use the default from Project Settings > ElevenLabs. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
/** ElevenLabs Agent ID used for this conversation. Leave empty to use the default from Project Settings > PS AI Agent - ElevenLabs. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
meta = (ToolTip = "ElevenLabs Agent ID. Leave empty to use the project default from Project Settings."))
FString AgentID;
/** How turn-taking is managed between the user and the agent.\n- Server VAD (recommended): ElevenLabs automatically detects when the user stops speaking.\n- Client Controlled: You manually call StartListening/StopListening (push-to-talk with a key). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
meta = (ToolTip = "Turn-taking mode.\n- Server VAD: ElevenLabs detects end-of-speech automatically (hands-free).\n- Client Controlled: You call StartListening/StopListening manually (push-to-talk)."))
EElevenLabsTurnMode TurnMode = EElevenLabsTurnMode::Server;
EPS_AI_Agent_TurnMode_ElevenLabs TurnMode = EPS_AI_Agent_TurnMode_ElevenLabs::Server;
/** Automatically open the microphone as soon as the WebSocket connection is established. Only applies in Server VAD mode. In Client (push-to-talk) mode, you must call StartListening manually. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
meta = (ToolTip = "Auto-open the microphone when the conversation starts.\nOnly applies in Server VAD mode. In push-to-talk mode, call StartListening() manually."))
bool bAutoStartListening = true;
/** Let the LLM start generating a response during silence, before the VAD is fully confident the user has finished speaking. Saves 200-500ms of latency but may be unstable in long multi-turn sessions. Disabled by default. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Latency",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Latency",
meta = (ToolTip = "Speculative turn: the LLM begins generating during silence before full turn-end confidence.\nReduces latency by 200-500ms. May be unstable in long sessions — test before enabling in production."))
bool bSpeculativeTurn = false;
/** How many milliseconds of microphone audio to accumulate before sending a chunk to ElevenLabs. Lower values reduce latency but may degrade voice detection accuracy. Higher values are more reliable but add delay. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Latency",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Latency",
meta = (ClampMin = "20", ClampMax = "500",
ToolTip = "Mic audio chunk duration sent to ElevenLabs.\n- 50-80ms: lower latency, less reliable voice detection.\n- 100ms (default): good balance.\n- 150-250ms: more reliable, higher latency."))
int32 MicChunkDurationMs = 100;
/** Allow the user to interrupt the agent while it is speaking.\n- In Server VAD mode: the microphone stays open during agent speech and the server detects interruptions automatically.\n- In Client (push-to-talk) mode: pressing the talk key while the agent speaks sends an interrupt signal.\n- When disabled: the user must wait for the agent to finish speaking before talking. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
meta = (ToolTip = "Allow the user to interrupt the agent while it speaks.\n- Server VAD: mic stays open, server detects user voice automatically.\n- Push-to-talk: pressing the talk key interrupts the agent.\n- Disabled: user must wait for the agent to finish."))
bool bAllowInterruption = true;
/** Enable the OnAgentTranscript event, which provides real-time speech-to-text of what the user is saying. Disable if you don't need to display user speech to reduce processing overhead. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Events",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fire OnAgentTranscript with real-time speech-to-text of user speech.\nDisable if you don't need to display what the user said."))
bool bEnableUserTranscript = true;
/** Enable the OnAgentTextResponse event, which provides the agent's complete text response once fully generated. Disable if you only need the audio output. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Events",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fire OnAgentTextResponse with the agent's complete text once fully generated.\nDisable if you only need the audio output."))
bool bEnableAgentTextResponse = true;
/** Enable the OnAgentPartialResponse event, which streams the agent's text word-by-word as the LLM generates it. Use this for real-time subtitles that appear while the agent speaks, rather than waiting for the full response. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Events",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fire OnAgentPartialResponse with streaming text fragments as the LLM generates them.\nIdeal for real-time subtitles. Each event gives one text chunk, not the accumulated text."))
bool bEnableAgentPartialResponse = false;
@@ -157,13 +157,13 @@ public:
* Delays playback start so early TTS chunks can accumulate, preventing
* mid-sentence pauses when the second chunk hasn't arrived yet.
* Set to 0 for immediate playback. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Latency",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Latency",
meta = (ClampMin = "0", ClampMax = "4000",
ToolTip = "Pre-buffer delay in ms before starting audio playback.\nHigher values reduce mid-sentence pauses but add initial latency.\n0 = immediate playback."))
int32 AudioPreBufferMs = 2000;
/** Safety timeout: if the server does not start generating a response within this many seconds after the user stops speaking, fire OnAgentResponseTimeout. Set to 0 to disable. A normal response starts within 0.1-0.8s. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
meta = (ClampMin = "0.0",
ToolTip = "Seconds to wait for a server response after the user stops speaking.\nFires OnAgentResponseTimeout if exceeded. Normal latency is 0.1-0.8s.\nSet to 0 to disable. Default: 10s."))
float ResponseTimeoutSeconds = 10.0f;
@@ -171,81 +171,81 @@ public:
// ── Events ────────────────────────────────────────────────────────────────
/** Fired when the WebSocket connection is established and the conversation session is ready. Provides the ConversationID and AgentID. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires when the connection to ElevenLabs is established and the conversation is ready to begin."))
FOnAgentConnected OnAgentConnected;
/** Fired when the WebSocket connection is closed (gracefully or due to an error). Provides the status code and reason. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires when the connection to ElevenLabs is closed. Check StatusCode and Reason for details."))
FOnAgentDisconnected OnAgentDisconnected;
/** Fired on any connection or protocol error. The error message describes what went wrong. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires on connection or protocol errors. The ErrorMessage describes the issue."))
FOnAgentError OnAgentError;
/** Fired with real-time speech-to-text of the user's voice. Includes both tentative (in-progress) and final transcripts. Requires bEnableUserTranscript to be true. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Real-time speech-to-text of the user's voice.\nIncludes tentative and final transcripts. Enable with bEnableUserTranscript."))
FOnAgentTranscript OnAgentTranscript;
/** Fired once when the agent's complete text response is available. This is the full text that corresponds to the audio the agent speaks. Requires bEnableAgentTextResponse to be true. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "The agent's complete text response (matches the spoken audio).\nFires once when the full text is ready. Enable with bEnableAgentTextResponse."))
FOnAgentTextResponse OnAgentTextResponse;
/** Fired repeatedly as the LLM generates text, providing one word/fragment at a time. Use for real-time subtitles. Each call gives a new fragment, NOT the accumulated text. Requires bEnableAgentPartialResponse to be true. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Streaming text fragments as the LLM generates them (word by word).\nIdeal for real-time subtitles. Enable with bEnableAgentPartialResponse."))
FOnAgentPartialResponse OnAgentPartialResponse;
/** Fired when the agent begins playing audio (first audio chunk received). Use this to trigger speech animations or UI indicators. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires when the agent starts speaking (first audio chunk). Use for lip-sync or UI feedback."))
FOnAgentStartedSpeaking OnAgentStartedSpeaking;
/** Fired when the agent finishes playing all audio. Use this to re-open the microphone (in Server VAD mode without interruption) or update UI. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires when the agent finishes speaking. Use to re-open the mic or update UI."))
FOnAgentStoppedSpeaking OnAgentStoppedSpeaking;
/** Fired when the agent's speech is interrupted (either by the user speaking over it, or by a manual InterruptAgent call). The audio playback is automatically stopped. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires when the agent is interrupted mid-speech. Audio is automatically stopped."))
FOnAgentInterrupted OnAgentInterrupted;
/** Fired when the server starts generating a response (before any audio arrives). Use this for "thinking..." UI feedback. In push-to-talk mode, the microphone is automatically closed when this fires. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires when the server starts generating (before audio arrives).\nUse for 'thinking...' UI. Mic is auto-closed in push-to-talk mode."))
FOnAgentStartedGenerating OnAgentStartedGenerating;
/** Fired if the server does not start generating a response within ResponseTimeoutSeconds after the user stops speaking. Use this to show a "try again" message or automatically re-open the microphone. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires if the server doesn't respond within ResponseTimeoutSeconds.\nUse to show 'try again' or re-open the mic automatically."))
FOnAgentResponseTimeout OnAgentResponseTimeout;
/** Fired when the agent changes emotion via the "set_emotion" client tool. The emotion is set BEFORE the corresponding audio arrives, giving you time to smoothly blend facial expressions. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires when the agent sets an emotion (joy, sadness, surprise, fear, anger, disgust).\nDriven by the 'set_emotion' client tool. Arrives before the audio."))
FOnAgentEmotionChanged OnAgentEmotionChanged;
/** Fired for client tool calls that are NOT automatically handled (i.e. not "set_emotion"). You must call GetWebSocketProxy()->SendClientToolResult() to respond. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
meta = (ToolTip = "Fires for custom client tool calls (not set_emotion).\nYou must respond via GetWebSocketProxy()->SendClientToolResult()."))
FOnAgentClientToolCall OnAgentClientToolCall;
/** The current emotion of the agent, as set by the "set_emotion" client tool. Defaults to Neutral. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
EElevenLabsEmotion CurrentEmotion = EElevenLabsEmotion::Neutral;
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
EPS_AI_Agent_Emotion CurrentEmotion = EPS_AI_Agent_Emotion::Neutral;
/** The current emotion intensity. Defaults to Medium. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
EElevenLabsEmotionIntensity CurrentEmotionIntensity = EElevenLabsEmotionIntensity::Medium;
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
EPS_AI_Agent_EmotionIntensity CurrentEmotionIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
// ── Raw audio data (C++ only, used by LipSync component) ────────────────
/** Raw PCM audio from the agent (int16, 16kHz mono). Fires for each WebSocket audio chunk.
* Used internally by UElevenLabsLipSyncComponent for spectral analysis. */
* Used internally by UPS_AI_Agent_LipSyncComponent for spectral analysis. */
FOnAgentAudioData OnAgentAudioData;
// ── Control ───────────────────────────────────────────────────────────────
@@ -254,25 +254,25 @@ public:
* Open the WebSocket connection and start the conversation.
* If bAutoStartListening is true, microphone capture also starts once connected.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void StartConversation();
/** Close the WebSocket and stop all audio. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void EndConversation();
/**
* Start capturing microphone audio and streaming it to ElevenLabs.
* In Client turn mode, also sends a UserTurnStart signal.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void StartListening();
/**
* Stop capturing microphone audio.
* In Client turn mode, also sends a UserTurnEnd signal.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void StopListening();
/**
@@ -280,35 +280,35 @@ public:
* The agent will respond with audio and text just as if it heard you speak.
* Useful for testing in the Editor or for text-based interaction.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void SendTextMessage(const FString& Text);
/** Interrupt the agent's current utterance. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void InterruptAgent();
// ── State queries ─────────────────────────────────────────────────────────
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
bool IsConnected() const;
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
bool IsListening() const { return bIsListening; }
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
bool IsAgentSpeaking() const { return bAgentSpeaking; }
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
const FElevenLabsConversationInfo& GetConversationInfo() const;
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
const FPS_AI_Agent_ConversationInfo_ElevenLabs& GetConversationInfo() const;
/** True while audio is being pre-buffered (playback hasn't started yet).
* Used by the LipSync component to pause viseme queue consumption. */
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
bool IsPreBuffering() const { return bPreBuffering; }
/** Access the underlying WebSocket proxy (advanced use). */
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
UElevenLabsWebSocketProxy* GetWebSocketProxy() const { return WebSocketProxy; }
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
UPS_AI_Agent_WebSocket_ElevenLabsProxy* GetWebSocketProxy() const { return WebSocketProxy; }
// ─────────────────────────────────────────────────────────────────────────
// UActorComponent overrides
@@ -321,7 +321,7 @@ public:
private:
// ── Internal event handlers ───────────────────────────────────────────────
UFUNCTION()
void HandleConnected(const FElevenLabsConversationInfo& Info);
void HandleConnected(const FPS_AI_Agent_ConversationInfo_ElevenLabs& Info);
UFUNCTION()
void HandleDisconnected(int32 StatusCode, const FString& Reason);
@@ -333,7 +333,7 @@ private:
void HandleAudioReceived(const TArray<uint8>& PCMData);
UFUNCTION()
void HandleTranscript(const FElevenLabsTranscriptSegment& Segment);
void HandleTranscript(const FPS_AI_Agent_TranscriptSegment_ElevenLabs& Segment);
UFUNCTION()
void HandleAgentResponse(const FString& ResponseText);
@@ -348,7 +348,7 @@ private:
void HandleAgentResponsePart(const FString& PartialText);
UFUNCTION()
void HandleClientToolCall(const FElevenLabsClientToolCall& ToolCall);
void HandleClientToolCall(const FPS_AI_Agent_ClientToolCall_ElevenLabs& ToolCall);
// ── Audio playback ────────────────────────────────────────────────────────
void InitAudioPlayback();
@@ -364,7 +364,7 @@ private:
// ── Sub-objects ───────────────────────────────────────────────────────────
UPROPERTY()
UElevenLabsWebSocketProxy* WebSocketProxy = nullptr;
UPS_AI_Agent_WebSocket_ElevenLabsProxy* WebSocketProxy = nullptr;
UPROPERTY()
UAudioComponent* AudioPlaybackComponent = nullptr;


@@ -3,13 +3,13 @@
#pragma once
#include "CoreMinimal.h"
#include "ElevenLabsDefinitions.generated.h"
#include "PS_AI_Agent_Definitions.generated.h"
// ─────────────────────────────────────────────────────────────────────────────
// Connection state
// ─────────────────────────────────────────────────────────────────────────────
UENUM(BlueprintType)
enum class EElevenLabsConnectionState : uint8
enum class EPS_AI_Agent_ConnectionState_ElevenLabs : uint8
{
Disconnected UMETA(DisplayName = "Disconnected"),
Connecting UMETA(DisplayName = "Connecting"),
@@ -21,7 +21,7 @@ enum class EElevenLabsConnectionState : uint8
// Agent turn mode
// ─────────────────────────────────────────────────────────────────────────────
UENUM(BlueprintType)
enum class EElevenLabsTurnMode : uint8
enum class EPS_AI_Agent_TurnMode_ElevenLabs : uint8
{
/** ElevenLabs server decides when the user has finished speaking (default). */
Server UMETA(DisplayName = "Server VAD"),
@@ -32,7 +32,7 @@ enum class EElevenLabsTurnMode : uint8
// ─────────────────────────────────────────────────────────────────────────────
// WebSocket message type helpers (internal, not exposed to Blueprint)
// ─────────────────────────────────────────────────────────────────────────────
namespace ElevenLabsMessageType
namespace PS_AI_Agent_MessageType_ElevenLabs
{
// Client → Server
static const FString AudioChunk = TEXT("user_audio_chunk");
@@ -62,7 +62,7 @@ namespace ElevenLabsMessageType
// Audio format exchanged with ElevenLabs
// PCM 16-bit signed, 16000 Hz, mono, little-endian.
// ─────────────────────────────────────────────────────────────────────────────
namespace ElevenLabsAudio
namespace PS_AI_Agent_Audio_ElevenLabs
{
static constexpr int32 SampleRate = 16000;
static constexpr int32 Channels = 1;
@@ -75,16 +75,16 @@ namespace ElevenLabsAudio
// Conversation metadata received on successful connection
// ─────────────────────────────────────────────────────────────────────────────
USTRUCT(BlueprintType)
struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsConversationInfo
struct PS_AI_CONVAGENT_API FPS_AI_Agent_ConversationInfo_ElevenLabs
{
GENERATED_BODY()
/** Unique ID of this conversation session assigned by ElevenLabs. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
FString ConversationID;
/** Agent ID that is responding. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
FString AgentID;
};
@@ -92,20 +92,20 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsConversationInfo
// Transcript segment
// ─────────────────────────────────────────────────────────────────────────────
USTRUCT(BlueprintType)
struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsTranscriptSegment
struct PS_AI_CONVAGENT_API FPS_AI_Agent_TranscriptSegment_ElevenLabs
{
GENERATED_BODY()
/** Transcribed text. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
FString Text;
/** "user" or "agent". */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
FString Speaker;
/** Whether this is a final transcript or a tentative (in-progress) one. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
bool bIsFinal = false;
};
@@ -113,7 +113,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsTranscriptSegment
// Agent emotion (driven by client tool "set_emotion" from the LLM)
// ─────────────────────────────────────────────────────────────────────────────
UENUM(BlueprintType)
enum class EElevenLabsEmotion : uint8
enum class EPS_AI_Agent_Emotion : uint8
{
Neutral UMETA(DisplayName = "Neutral"),
Joy UMETA(DisplayName = "Joy"),
@@ -128,7 +128,7 @@ enum class EElevenLabsEmotion : uint8
// Emotion intensity (maps to Normal/Medium/Extreme pose variants)
// ─────────────────────────────────────────────────────────────────────────────
UENUM(BlueprintType)
enum class EElevenLabsEmotionIntensity : uint8
enum class EPS_AI_Agent_EmotionIntensity : uint8
{
Low UMETA(DisplayName = "Low (Normal)"),
Medium UMETA(DisplayName = "Medium"),
@@ -139,19 +139,19 @@ enum class EElevenLabsEmotionIntensity : uint8
// Client tool call received from ElevenLabs server
// ─────────────────────────────────────────────────────────────────────────────
USTRUCT(BlueprintType)
struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsClientToolCall
struct PS_AI_CONVAGENT_API FPS_AI_Agent_ClientToolCall_ElevenLabs
{
GENERATED_BODY()
/** Name of the tool the agent wants to invoke (e.g. "set_emotion"). */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
FString ToolName;
/** Unique ID for this tool invocation — must be echoed back in client_tool_result. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
FString ToolCallId;
/** Raw JSON parameters as key-value string pairs. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
TMap<FString, FString> Parameters;
};
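
The commit message says CoreRedirects in DefaultEngine.ini keep existing .uasset references loading after the type renames above. The ini entries themselves are not part of this hunk; below is a hedged sketch of what they could look like for the definitions in this file, assuming the module itself was renamed `PS_AI_Agent_ElevenLabs` → `PS_AI_ConvAgent` (inferred from the `*_API` macros — the actual script paths are not shown in this diff):

```ini
[CoreRedirects]
; Enum/struct renames from this header (paths inferred, not taken from the diff).
; Note: struct redirects omit the F prefix, enum redirects keep the E prefix.
+EnumRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.EElevenLabsEmotion",NewName="/Script/PS_AI_ConvAgent.EPS_AI_Agent_Emotion")
+EnumRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.EElevenLabsEmotionIntensity",NewName="/Script/PS_AI_ConvAgent.EPS_AI_Agent_EmotionIntensity")
+StructRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsConversationInfo",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_ConversationInfo_ElevenLabs")
+StructRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsTranscriptSegment",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_TranscriptSegment_ElevenLabs")
+StructRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsClientToolCall",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_ClientToolCall_ElevenLabs")
```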


@@ -5,8 +5,8 @@
#include "CoreMinimal.h"
#include "Engine/DataAsset.h"
#include "Engine/AssetManager.h"
#include "ElevenLabsDefinitions.h"
#include "ElevenLabsEmotionPoseMap.generated.h"
#include "PS_AI_Agent_Definitions.h"
#include "PS_AI_Agent_EmotionPoseMap.generated.h"
class UAnimSequence;
@@ -14,7 +14,7 @@ class UAnimSequence;
// Emotion pose set: 3 intensity levels (Normal / Medium / Extreme)
// ─────────────────────────────────────────────────────────────────────────────
USTRUCT(BlueprintType)
struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsEmotionPoseSet
struct PS_AI_CONVAGENT_API FPS_AI_Agent_EmotionPoseSet
{
GENERATED_BODY()
@@ -38,16 +38,16 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsEmotionPoseSet
* Reusable data asset that maps emotions to facial expression AnimSequences.
*
* Create ONE instance of this asset in the Content Browser
* (right-click → Miscellaneous → Data Asset → ElevenLabsEmotionPoseMap),
* (right-click → Miscellaneous → Data Asset → PS_AI_Agent_EmotionPoseMap),
* assign your emotion AnimSequences, then reference this asset
* on the ElevenLabs Facial Expression component.
* on the PS AI Agent Facial Expression component.
*
* The component plays the AnimSequence in real-time (looping) to drive
* emotion-based facial expressions (eyes, eyebrows, cheeks, mouth mood).
* Lip sync overrides the mouth-area curves on top.
*/
UCLASS(BlueprintType, Blueprintable, DisplayName = "ElevenLabs Emotion Pose Map")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsEmotionPoseMap : public UPrimaryDataAsset
UCLASS(BlueprintType, Blueprintable, DisplayName = "PS AI Agent Emotion Pose Map")
class PS_AI_CONVAGENT_API UPS_AI_Agent_EmotionPoseMap : public UPrimaryDataAsset
{
GENERATED_BODY()
@@ -57,5 +57,5 @@ public:
* Neutral is recommended; it plays by default at startup (blinking, breathing). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Emotion Poses",
meta = (ToolTip = "Emotion → AnimSequence mapping with 3 intensity levels.\nThese drive the base facial expression (eyes, brows, cheeks).\nLip sync overrides the mouth area on top."))
TMap<EElevenLabsEmotion, FElevenLabsEmotionPoseSet> EmotionPoses;
TMap<EPS_AI_Agent_Emotion, FPS_AI_Agent_EmotionPoseSet> EmotionPoses;
};


@@ -4,18 +4,18 @@
#include "CoreMinimal.h"
#include "Components/ActorComponent.h"
#include "ElevenLabsDefinitions.h"
#include "ElevenLabsFacialExpressionComponent.generated.h"
#include "PS_AI_Agent_Definitions.h"
#include "PS_AI_Agent_FacialExpressionComponent.generated.h"
class UElevenLabsConversationalAgentComponent;
class UElevenLabsEmotionPoseMap;
class UPS_AI_Agent_Conv_ElevenLabsComponent;
class UPS_AI_Agent_EmotionPoseMap;
class UAnimSequence;
// ─────────────────────────────────────────────────────────────────────────────
// UElevenLabsFacialExpressionComponent
// UPS_AI_Agent_FacialExpressionComponent
//
// Drives emotion-based facial expressions on a MetaHuman (or any skeletal mesh)
// as a BASE layer. Lip sync (from ElevenLabsLipSyncComponent) modulates on top,
// as a BASE layer. Lip sync (from PS_AI_Agent_LipSyncComponent) modulates on top,
// overriding only mouth-area curves during speech.
//
// Emotion AnimSequences are played back in real-time (looping), not sampled
@@ -24,31 +24,31 @@ class UAnimSequence;
//
// Workflow:
// 1. Assign a PoseMap data asset with Emotion Poses filled in.
// 2. Add the AnimNode "ElevenLabs Facial Expression" in the Face AnimBP
// BEFORE the "ElevenLabs Lip Sync" node.
// 2. Add the AnimNode "PS AI Agent Facial Expression" in the Face AnimBP
// BEFORE the "PS AI Agent Lip Sync" node.
// 3. The component listens to OnAgentEmotionChanged from the agent component.
// 4. Emotion animations crossfade smoothly (configurable duration).
// 5. The AnimNode reads GetCurrentEmotionCurves() and injects them into the pose.
// ─────────────────────────────────────────────────────────────────────────────
UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
DisplayName = "ElevenLabs Facial Expression")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsFacialExpressionComponent : public UActorComponent
UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
DisplayName = "PS AI Agent Facial Expression")
class PS_AI_CONVAGENT_API UPS_AI_Agent_FacialExpressionComponent : public UActorComponent
{
GENERATED_BODY()
public:
UElevenLabsFacialExpressionComponent();
UPS_AI_Agent_FacialExpressionComponent();
// ── Configuration ─────────────────────────────────────────────────────────
/** Emotion pose map asset containing emotion AnimSequences (Normal / Medium / Extreme per emotion).
* Create a dedicated ElevenLabsEmotionPoseMap asset in the Content Browser. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|FacialExpression",
meta = (ToolTip = "Dedicated Emotion Pose Map asset.\nRight-click Content Browser → Miscellaneous → ElevenLabs Emotion Pose Map."))
TObjectPtr<UElevenLabsEmotionPoseMap> EmotionPoseMap;
* Create a dedicated PS_AI_Agent_EmotionPoseMap asset in the Content Browser. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|FacialExpression",
meta = (ToolTip = "Dedicated Emotion Pose Map asset.\nRight-click Content Browser → Miscellaneous → PS AI Agent Emotion Pose Map."))
TObjectPtr<UPS_AI_Agent_EmotionPoseMap> EmotionPoseMap;
/** Emotion crossfade duration in seconds. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|FacialExpression",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|FacialExpression",
meta = (ClampMin = "0.1", ClampMax = "3.0",
ToolTip = "How long (seconds) to crossfade between emotions.\n0.5 = snappy, 1.5 = smooth."))
float EmotionBlendDuration = 0.5f;
@@ -56,19 +56,19 @@ public:
// ── Getters ───────────────────────────────────────────────────────────────
/** Get the current emotion curves evaluated from the playing AnimSequence. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs|FacialExpression")
UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|FacialExpression")
const TMap<FName, float>& GetCurrentEmotionCurves() const { return CurrentEmotionCurves; }
/** Get the active emotion. */
UFUNCTION(BlueprintPure, Category = "ElevenLabs|FacialExpression")
EElevenLabsEmotion GetActiveEmotion() const { return ActiveEmotion; }
UFUNCTION(BlueprintPure, Category = "PS_AI_Agent|FacialExpression")
EPS_AI_Agent_Emotion GetActiveEmotion() const { return ActiveEmotion; }
/** Get the active emotion intensity. */
UFUNCTION(BlueprintPure, Category = "ElevenLabs|FacialExpression")
EElevenLabsEmotionIntensity GetActiveIntensity() const { return ActiveEmotionIntensity; }
UFUNCTION(BlueprintPure, Category = "PS_AI_Agent|FacialExpression")
EPS_AI_Agent_EmotionIntensity GetActiveIntensity() const { return ActiveEmotionIntensity; }
/** Check if a curve name belongs to the mouth area (overridden by lip sync). */
UFUNCTION(BlueprintPure, Category = "ElevenLabs|FacialExpression")
UFUNCTION(BlueprintPure, Category = "PS_AI_Agent|FacialExpression")
static bool IsMouthCurve(const FName& CurveName);
// ── UActorComponent overrides ─────────────────────────────────────────────
@@ -82,7 +82,7 @@ private:
/** Called when the agent changes emotion via client tool. */
UFUNCTION()
void OnEmotionChanged(EElevenLabsEmotion Emotion, EElevenLabsEmotionIntensity Intensity);
void OnEmotionChanged(EPS_AI_Agent_Emotion Emotion, EPS_AI_Agent_EmotionIntensity Intensity);
// ── Helpers ───────────────────────────────────────────────────────────────
@@ -90,7 +90,7 @@ private:
void ValidateEmotionPoses();
/** Find the best AnimSequence for a given emotion + intensity (with fallback). */
UAnimSequence* FindAnimForEmotion(EElevenLabsEmotion Emotion, EElevenLabsEmotionIntensity Intensity) const;
UAnimSequence* FindAnimForEmotion(EPS_AI_Agent_Emotion Emotion, EPS_AI_Agent_EmotionIntensity Intensity) const;
/** Evaluate all FloatCurves from an AnimSequence at a given time. */
TMap<FName, float> EvaluateAnimCurves(UAnimSequence* AnimSeq, float Time) const;
@@ -118,9 +118,9 @@ private:
TMap<FName, float> CurrentEmotionCurves;
/** Active emotion (for change detection). */
EElevenLabsEmotion ActiveEmotion = EElevenLabsEmotion::Neutral;
EElevenLabsEmotionIntensity ActiveEmotionIntensity = EElevenLabsEmotionIntensity::Medium;
EPS_AI_Agent_Emotion ActiveEmotion = EPS_AI_Agent_Emotion::Neutral;
EPS_AI_Agent_EmotionIntensity ActiveEmotionIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
/** Cached reference to the agent component on the same Actor. */
TWeakObjectPtr<UElevenLabsConversationalAgentComponent> AgentComponent;
TWeakObjectPtr<UPS_AI_Agent_Conv_ElevenLabsComponent> AgentComponent;
};


@@ -5,11 +5,11 @@
#include "CoreMinimal.h"
#include "Components/ActorComponent.h"
#include "DSP/SpectrumAnalyzer.h"
#include "ElevenLabsLipSyncComponent.generated.h"
#include "PS_AI_Agent_LipSyncComponent.generated.h"
class UElevenLabsConversationalAgentComponent;
class UElevenLabsFacialExpressionComponent;
class UElevenLabsLipSyncPoseMap;
class UPS_AI_Agent_Conv_ElevenLabsComponent;
class UPS_AI_Agent_FacialExpressionComponent;
class UPS_AI_Agent_LipSyncPoseMap;
class USkeletalMeshComponent;
/** A single entry in the decoupled viseme timeline.
@@ -22,10 +22,10 @@ struct FVisemeTimelineEntry
};
// Fired every tick when viseme/blendshape data has been updated.
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsVisemesReady);
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnPS_AI_Agent_VisemesReady);
/**
* Real-time lip sync component for ElevenLabs Conversational AI.
* Real-time lip sync component for conversational AI agents.
*
* Attaches to the same Actor as the Conversational Agent component.
* Receives the agent's audio stream, performs spectral analysis,
@@ -38,26 +38,26 @@ DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsVisemesReady);
* 3. Conversation starts → lip sync works automatically.
* 4. (Optional) Bind OnVisemesReady for custom Blueprint handling.
*/
UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
DisplayName = "ElevenLabs Lip Sync")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsLipSyncComponent : public UActorComponent
UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
DisplayName = "PS AI Agent Lip Sync")
class PS_AI_CONVAGENT_API UPS_AI_Agent_LipSyncComponent : public UActorComponent
{
GENERATED_BODY()
public:
UElevenLabsLipSyncComponent();
~UElevenLabsLipSyncComponent();
UPS_AI_Agent_LipSyncComponent();
~UPS_AI_Agent_LipSyncComponent();
// ── Configuration ─────────────────────────────────────────────────────────
/** Target skeletal mesh to auto-apply morph targets. Leave empty to handle
* visemes manually via OnVisemesReady + GetCurrentBlendshapes(). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
meta = (ToolTip = "Skeletal mesh to drive morph targets on.\nLeave empty to read values manually via GetCurrentBlendshapes()."))
TObjectPtr<USkeletalMeshComponent> TargetMesh;
/** Overall mouth movement intensity multiplier. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
meta = (ClampMin = "0.0", ClampMax = "3.0",
ToolTip = "Lip sync intensity.\n1.0 = normal, higher = more expressive, lower = subtler."))
float LipSyncStrength = 1.0f;
@@ -65,26 +65,26 @@ public:
/** Scales the audio amplitude driving mouth movement.
* Lower values produce subtler animation, higher values are more pronounced.
* Use this to tone down overly strong lip movement without changing the shape. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
meta = (ClampMin = "0.5", ClampMax = "1.0",
ToolTip = "Audio amplitude scale.\n0.5 = subtle, 0.75 = balanced, 1.0 = full.\nReduces overall mouth movement without affecting viseme shape."))
float AmplitudeScale = 1.0f;
/** How quickly viseme weights interpolate towards new values each frame. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
meta = (ClampMin = "35.0", ClampMax = "65.0",
ToolTip = "Smoothing speed for viseme transitions.\n35 = smooth and soft, 50 = balanced, 65 = sharp and responsive."))
float SmoothingSpeed = 50.0f;
// ── Emotion Expression Blend ─────────────────────────────────────────────
/** How much facial emotion (from ElevenLabsFacialExpressionComponent) bleeds through
/** How much facial emotion (from PS_AI_Agent_FacialExpressionComponent) bleeds through
* during speech on expression curves (smile, frown, sneer, dimple, etc.).
* Shape curves (jaw, tongue, lip shape) are always 100% lip sync.
* 0.0 = lip sync completely overrides expression curves (old behavior).
* 1.0 = emotion fully additive on expression curves during speech.
* 0.5 = balanced blend (recommended). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
meta = (ClampMin = "0.0", ClampMax = "1.0",
ToolTip = "Emotion bleed-through during speech.\n0 = pure lip sync, 0.5 = balanced, 1 = full emotion on expression curves.\nOnly affects expression curves (smile, frown, etc.), not lip shape."))
float EmotionExpressionBlend = 0.5f;
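
The actual compositing lives in the component's .cpp (not shown in this diff); a hypothetical standalone reading of the documented endpoints of `EmotionExpressionBlend` — shape curves always take the lip-sync value, expression curves let emotion bleed through in proportion to the blend factor — could look like this (`ComposeCurve` and its parameters are illustrative names, not plugin API):

```cpp
#include <algorithm>
#include <cmath>

// Illustrative only: composes one facial curve during speech.
// bIsShapeCurve  -> jaw/tongue/lip-shape curves, always 100% lip sync.
// Otherwise      -> expression curves (smile, frown, ...), where the
//                   emotion value is added proportionally to Blend.
float ComposeCurve(bool bIsShapeCurve, float LipSync, float Emotion, float Blend)
{
    if (bIsShapeCurve)
    {
        return LipSync; // lip shape is never contaminated by emotion
    }
    // Blend = 0 reproduces the documented "old behavior" (pure lip sync),
    // Blend = 1 makes emotion fully additive; clamp keeps curves in [0, 1].
    return std::clamp(LipSync + Blend * Emotion, 0.0f, 1.0f);
}
```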
@@ -94,7 +94,7 @@ public:
/** Envelope attack time in milliseconds.
* Controls how fast the mouth opens when speech starts or gets louder.
* Lower = snappier onset, higher = gentler opening. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
meta = (ClampMin = "1.0", ClampMax = "50.0",
ToolTip = "Envelope attack (ms).\n5 = snappy, 10 = balanced, 25 = gentle.\nHow fast the mouth opens on speech onset."))
float EnvelopeAttackMs = 10.0f;
@@ -102,7 +102,7 @@ public:
/** Envelope release time in milliseconds.
* Controls how slowly the mouth closes when speech gets quieter.
* Higher = smoother, more natural decay. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
meta = (ClampMin = "10.0", ClampMax = "200.0",
ToolTip = "Envelope release (ms).\n25 = responsive, 50 = balanced, 120 = smooth/cinematic.\nHow slowly the mouth closes between syllables."))
float EnvelopeReleaseMs = 50.0f;
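
The attack/release millisecond values above typically map to per-sample one-pole smoothing coefficients. A minimal standalone sketch (not the plugin's actual code) of such an envelope follower, assuming the plugin's 16 kHz mono audio format:

```cpp
#include <cassert>
#include <cmath>

// One-pole envelope follower: a shorter attack opens the mouth quickly on
// speech onset, a longer release lets it close slowly between syllables.
// Parameter names mirror EnvelopeAttackMs / EnvelopeReleaseMs above.
struct EnvelopeFollower
{
    double AttackCoef;
    double ReleaseCoef;
    double Value = 0.0;

    EnvelopeFollower(double AttackMs, double ReleaseMs, double SampleRate)
        // coef = exp(-1 / (time_in_seconds * sample_rate)); after one time
        // constant the envelope covers ~63% of the distance to the input.
        : AttackCoef(std::exp(-1.0 / (AttackMs * 0.001 * SampleRate))),
          ReleaseCoef(std::exp(-1.0 / (ReleaseMs * 0.001 * SampleRate))) {}

    double Process(double Input)
    {
        // Rising input uses the attack coefficient, falling uses release.
        const double Coef = (Input > Value) ? AttackCoef : ReleaseCoef;
        Value = Input + Coef * (Value - Input);
        return Value;
    }
};
```

With the defaults (attack 10 ms, release 50 ms at 16 kHz), a step input reaches roughly 63% after 10 ms, and decays noticeably more slowly once the input drops — which is the asymmetry the two tooltips describe.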
@@ -110,31 +110,31 @@ public:
// ── Phoneme Pose Map ─────────────────────────────────────────────────────
/** Optional pose map asset mapping OVR visemes to phoneme AnimSequences.
* Create once (right-click → Miscellaneous → Data Asset → ElevenLabsLipSyncPoseMap),
* Create once (right-click → Miscellaneous → Data Asset → PS_AI_Agent_LipSyncPoseMap),
* assign your MHF_* poses, then reference it on every MetaHuman.
* When assigned, curve data is extracted at BeginPlay and replaces the
* hardcoded ARKit blendshape mapping with artist-crafted poses.
* Leave empty to use the built-in ARKit mapping (backward compatible). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
meta = (ToolTip = "Reusable pose map asset.\nAssign your MHF_* AnimSequences once, reuse on all MetaHumans.\nLeave empty for built-in ARKit mapping."))
TObjectPtr<UElevenLabsLipSyncPoseMap> PoseMap;
TObjectPtr<UPS_AI_Agent_LipSyncPoseMap> PoseMap;
// ── Events ────────────────────────────────────────────────────────────────
/** Fires every tick when viseme data has been updated.
* Use GetCurrentVisemes() or GetCurrentBlendshapes() to read values. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|LipSync",
UPROPERTY(BlueprintAssignable, Category = "PS_AI_Agent|LipSync",
meta = (ToolTip = "Fires each frame with updated viseme data.\nCall GetCurrentVisemes() or GetCurrentBlendshapes() to read values."))
FOnElevenLabsVisemesReady OnVisemesReady;
FOnPS_AI_Agent_VisemesReady OnVisemesReady;
// ── Getters ───────────────────────────────────────────────────────────────
/** Get current OVR viseme weights (15 values: sil, PP, FF, TH, DD, kk, CH, SS, nn, RR, aa, E, ih, oh, ou). */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs|LipSync")
UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|LipSync")
TMap<FName, float> GetCurrentVisemes() const { return SmoothedVisemes; }
/** Get current ARKit blendshape weights (MetaHuman compatible: jawOpen, mouthFunnel, mouthClose, etc.). */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs|LipSync")
UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|LipSync")
TMap<FName, float> GetCurrentBlendshapes() const { return CurrentBlendshapes; }
// ── UActorComponent overrides ─────────────────────────────────────────────
@@ -285,11 +285,11 @@ private:
double WaitingForTextStartTime = 0.0;
// Cached reference to the agent component on the same Actor
TWeakObjectPtr<UElevenLabsConversationalAgentComponent> AgentComponent;
TWeakObjectPtr<UPS_AI_Agent_Conv_ElevenLabsComponent> AgentComponent;
FDelegateHandle AudioDataHandle;
// Cached reference to the facial expression component for emotion blending
TWeakObjectPtr<UElevenLabsFacialExpressionComponent> CachedFacialExprComp;
TWeakObjectPtr<UPS_AI_Agent_FacialExpressionComponent> CachedFacialExprComp;
// ── Pose-extracted curve data ────────────────────────────────────────────


@@ -5,7 +5,7 @@
#include "CoreMinimal.h"
#include "Engine/DataAsset.h"
#include "Engine/AssetManager.h"
#include "ElevenLabsLipSyncPoseMap.generated.h"
#include "PS_AI_Agent_LipSyncPoseMap.generated.h"
class UAnimSequence;
@@ -13,16 +13,16 @@ class UAnimSequence;
* Reusable data asset that maps OVR visemes to phoneme pose AnimSequences.
*
* Create ONE instance of this asset in the Content Browser
* (right-click → Miscellaneous → Data Asset → ElevenLabsLipSyncPoseMap),
* (right-click → Miscellaneous → Data Asset → PS_AI_Agent_LipSyncPoseMap),
* assign your MHF_* AnimSequences once, then reference this single asset
* on every MetaHuman's ElevenLabs Lip Sync component.
* on every MetaHuman's PS AI Agent Lip Sync component.
*
* The component extracts curve data from each pose at BeginPlay and uses it
* to drive lip sync, replacing the hardcoded ARKit blendshape mapping with
* artist-crafted poses that coordinate dozens of facial curves.
*/
UCLASS(BlueprintType, Blueprintable, DisplayName = "ElevenLabs Lip Sync Pose Map")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsLipSyncPoseMap : public UPrimaryDataAsset
UCLASS(BlueprintType, Blueprintable, DisplayName = "PS AI Agent Lip Sync Pose Map")
class PS_AI_CONVAGENT_API UPS_AI_Agent_LipSyncPoseMap : public UPrimaryDataAsset
{
GENERATED_BODY()


@@ -6,30 +6,30 @@
#include "Components/ActorComponent.h"
#include "AudioCapture.h"
#include <atomic>
#include "ElevenLabsMicrophoneCaptureComponent.generated.h"
#include "PS_AI_Agent_MicrophoneCaptureComponent.generated.h"
// Delivers captured float PCM samples (16000 Hz mono, resampled from device rate).
DECLARE_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAudioCaptured, const TArray<float>& /*FloatPCM*/);
DECLARE_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_AudioCaptured, const TArray<float>& /*FloatPCM*/);
/**
* Lightweight microphone capture component.
* Captures from the default audio input device, resamples to 16000 Hz mono,
* and delivers chunks via FOnElevenLabsAudioCaptured.
* and delivers chunks via FOnPS_AI_Agent_AudioCaptured.
*
* Modelled after Convai's ConvaiAudioCaptureComponent but stripped to the
* minimal functionality needed for the ElevenLabs Conversational AI API.
* minimal functionality needed for conversational AI agents.
*/
UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
DisplayName = "ElevenLabs Microphone Capture")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsMicrophoneCaptureComponent : public UActorComponent
UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
DisplayName = "PS AI Agent Microphone Capture")
class PS_AI_CONVAGENT_API UPS_AI_Agent_MicrophoneCaptureComponent : public UActorComponent
{
GENERATED_BODY()
public:
UElevenLabsMicrophoneCaptureComponent();
UPS_AI_Agent_MicrophoneCaptureComponent();
/** Multiplier applied to the microphone input volume before sending to ElevenLabs. Increase if the agent has trouble hearing you, decrease if your audio is clipping. Default: 1.0 (no change). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Microphone",
/** Multiplier applied to the microphone input volume before sending to the agent. Increase if the agent has trouble hearing you, decrease if your audio is clipping. Default: 1.0 (no change). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|MicCapture",
meta = (ClampMin = "0.0", ClampMax = "4.0",
ToolTip = "Microphone volume multiplier.\n1.0 = no change. Increase if the agent can't hear you, decrease if audio clips."))
float VolumeMultiplier = 1.0f;
@@ -40,21 +40,21 @@ public:
* Audio is captured on a WASAPI background thread, resampled there (with
* echo suppression), then dispatched to the game thread for this broadcast.
*/
FOnElevenLabsAudioCaptured OnAudioCaptured;
FOnPS_AI_Agent_AudioCaptured OnAudioCaptured;
/** Optional pointer to an atomic bool that suppresses capture when true.
* Set by the agent component for echo suppression (skip mic while agent speaks). */
std::atomic<bool>* EchoSuppressFlag = nullptr;
/** Open the default capture device and begin streaming audio. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|MicCapture")
void StartCapture();
/** Stop streaming and close the capture device. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|MicCapture")
void StopCapture();
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
UFUNCTION(BlueprintPure, Category = "PS_AI_Agent|MicCapture")
bool IsCapturing() const { return bCapturing; }
// ─────────────────────────────────────────────────────────────────────────
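The microphone component above documents "resamples to 16000 Hz mono" on the capture thread. A minimal sketch of that step with a plain linear-interpolation resampler (standalone, no Unreal types; the function name is illustrative, and the real path also applies echo suppression before dispatching chunks):

```cpp
#include <cstddef>
#include <vector>

// Linear-interpolation resampler: converts mono float PCM from SrcRate to
// DstRate (e.g. 48000 -> 16000). Illustrative only; production resamplers
// usually add low-pass filtering before decimating.
std::vector<float> ResampleLinear(const std::vector<float>& In,
                                  int SrcRate, int DstRate)
{
    if (In.empty() || SrcRate <= 0 || DstRate <= 0) return {};
    const std::size_t OutLen = In.size() * static_cast<std::size_t>(DstRate)
                               / static_cast<std::size_t>(SrcRate);
    std::vector<float> Out(OutLen);
    const double Step = static_cast<double>(SrcRate) / DstRate;
    for (std::size_t i = 0; i < OutLen; ++i)
    {
        const double Pos = i * Step;                      // source position
        const std::size_t I0 = static_cast<std::size_t>(Pos);
        const std::size_t I1 = (I0 + 1 < In.size()) ? I0 + 1 : I0;
        const float Frac = static_cast<float>(Pos - static_cast<double>(I0));
        Out[i] = In[I0] * (1.0f - Frac) + In[I1] * Frac;  // lerp neighbors
    }
    return Out;
}
```

Each resampled chunk would then be broadcast through `OnAudioCaptured` on the game thread, as the delegate comment above describes.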


@@ -5,18 +5,18 @@
#include "CoreMinimal.h"
#include "Components/ActorComponent.h"
#include "HAL/CriticalSection.h"
#include "ElevenLabsPostureComponent.generated.h"
#include "PS_AI_Agent_PostureComponent.generated.h"
class USkeletalMeshComponent;
DECLARE_LOG_CATEGORY_EXTERN(LogElevenLabsPosture, Log, All);
DECLARE_LOG_CATEGORY_EXTERN(LogPS_AI_Agent_Posture, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Neck bone chain entry for distributing head rotation across multiple bones
// ─────────────────────────────────────────────────────────────────────────────
USTRUCT(BlueprintType)
struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsNeckBoneEntry
struct PS_AI_CONVAGENT_API FPS_AI_Agent_NeckBoneEntry
{
GENERATED_BODY()
@@ -31,7 +31,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsNeckBoneEntry
};
// ─────────────────────────────────────────────────────────────────────────────
// UElevenLabsPostureComponent
// UPS_AI_Agent_PostureComponent
//
// Chase-based multi-layer look-at system for MetaHuman characters.
// Smoothly orients the character's body, head, and eyes toward a TargetActor.
@@ -47,35 +47,35 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsNeckBoneEntry
//
// Workflow:
// 1. Add this component to the character Blueprint.
// 2. Add the AnimNode "ElevenLabs Posture" in the Body AnimBP
// 2. Add the AnimNode "PS AI Agent Posture" in the Body AnimBP
// with bApplyHeadRotation = true, bApplyEyeCurves = false.
// 3. Add the AnimNode "ElevenLabs Posture" in the Face AnimBP
// 3. Add the AnimNode "PS AI Agent Posture" in the Face AnimBP
// with bApplyHeadRotation = false, bApplyEyeCurves = true
// (between "Facial Expression" and "Lip Sync" nodes).
// 4. Set TargetActor to any actor (player pawn, a prop, etc.).
// 5. Set TargetOffset for actors without a skeleton (e.g. (0,0,160) for
// eye-level on a simple actor).
// ─────────────────────────────────────────────────────────────────────────────
UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
DisplayName = "ElevenLabs Posture")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsPostureComponent : public UActorComponent
UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
DisplayName = "PS AI Agent Posture")
class PS_AI_CONVAGENT_API UPS_AI_Agent_PostureComponent : public UActorComponent
{
GENERATED_BODY()
public:
UElevenLabsPostureComponent();
UPS_AI_Agent_PostureComponent();
// ── Target ───────────────────────────────────────────────────────────────
/** The actor to look at. Can be any actor (player, prop, etc.).
* Set to null to smoothly return to neutral. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ToolTip = "Target actor to look at.\nSet to null to return to neutral."))
TObjectPtr<AActor> TargetActor;
/** Offset from the target actor's origin to aim at.
* Useful for actors without a skeleton (e.g. (0,0,160) for eye-level). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ToolTip = "Offset from target actor origin.\nE.g. (0,0,160) for eye-level."))
FVector TargetOffset = FVector(0.0f, 0.0f, 0.0f);
@@ -89,44 +89,44 @@ public:
// so small movements around the target don't re-trigger higher layers.
/** Maximum head yaw rotation in degrees. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0", ClampMax = "90"))
float MaxHeadYaw = 40.0f;
/** Maximum head pitch rotation in degrees. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0", ClampMax = "90"))
float MaxHeadPitch = 30.0f;
/** Maximum horizontal eye angle in degrees. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0", ClampMax = "90"))
float MaxEyeHorizontal = 15.0f;
/** Maximum vertical eye angle in degrees. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0", ClampMax = "90"))
float MaxEyeVertical = 10.0f;
// ── Smoothing speeds ─────────────────────────────────────────────────────
/** Body rotation interpolation speed (lower = slower, more natural). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0.1", ClampMax = "20"))
float BodyInterpSpeed = 4.0f;
/** Head rotation interpolation speed. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0.1", ClampMax = "20"))
float HeadInterpSpeed = 4.0f;
/** Eye movement interpolation speed (higher = snappier). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0.1", ClampMax = "20"))
float EyeInterpSpeed = 5.0f;
/** Interpolation speed when returning to neutral (TargetActor is null). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0.1", ClampMax = "20"))
float ReturnToNeutralSpeed = 3.0f;
@@ -140,7 +140,7 @@ public:
* 1.0 = head always points at target regardless of animation.
* 0.0 = posture is additive on top of animation (old behavior).
* Default: 1.0 for conversational AI (always look at who you talk to). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0", ClampMax = "1"))
float HeadAnimationCompensation = 0.9f;
@@ -148,7 +148,7 @@ public:
* 1.0 = eyes frozen on posture target, animation's eye movement removed.
* 0.0 = animation's eyes play through, posture is additive.
* Intermediate (e.g. 0.5) = smooth 50/50 blend. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0", ClampMax = "1"))
float EyeAnimationCompensation = 0.6f;
@@ -157,7 +157,7 @@ public:
* counter-rotates the posture to keep the head pointing at the target.
* 1.0 = full compensation: head stays locked on target even during bows.
* 0.0 = no compensation: head follows body movement naturally. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "0", ClampMax = "1"))
float BodyDriftCompensation = 0.8f;
@@ -167,7 +167,7 @@ public:
* Green = head bone target (desired).
* Cyan = computed gaze direction (body yaw + head + eyes).
* Useful for verifying that eye contact is accurate. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture")
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture")
bool bDrawDebugGaze = false;
// ── Forward offset ──────────────────────────────────────────────────────
@@ -177,14 +177,14 @@ public:
* 0 = mesh faces +X (default UE convention)
* 90 = mesh faces +Y
* -90 = mesh faces -Y */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
meta = (ClampMin = "-180", ClampMax = "180"))
float MeshForwardYawOffset = 90.0f;
// ── Head bone ────────────────────────────────────────────────────────────
/** Name of the head bone on the skeletal mesh (used for eye origin calculation). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture")
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture")
FName HeadBoneName = FName(TEXT("head"));
// ── Neck bone chain ─────────────────────────────────────────────────────
@@ -193,14 +193,14 @@ public:
* Order: root-to-tip (e.g. neck_01 → neck_02 → head).
* Weights should sum to ~1.0.
* If empty, falls back to single-bone behavior (HeadBoneName, weight 1.0). */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture")
TArray<FElevenLabsNeckBoneEntry> NeckBoneChain;
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture")
TArray<FPS_AI_Agent_NeckBoneEntry> NeckBoneChain;
// ── Getters (read by AnimNode) ───────────────────────────────────────────
/** Get current eye gaze curves (8 ARKit eye look curves).
* Returns a COPY safe to call from any thread. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs|Posture")
UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|Posture")
TMap<FName, float> GetCurrentEyeCurves() const
{
FScopeLock Lock(&PostureDataLock);
@@ -220,7 +220,7 @@ public:
FName GetHeadBoneName() const { return HeadBoneName; }
/** Get the neck bone chain (used by AnimNode to resolve bone indices). */
const TArray<FElevenLabsNeckBoneEntry>& GetNeckBoneChain() const { return NeckBoneChain; }
const TArray<FPS_AI_Agent_NeckBoneEntry>& GetNeckBoneChain() const { return NeckBoneChain; }
/** Get head animation compensation factor (0 = additive, 1 = full override). */
float GetHeadAnimationCompensation() const { return HeadAnimationCompensation; }


@@ -4,63 +4,63 @@
#include "CoreMinimal.h"
#include "UObject/NoExportTypes.h"
#include "ElevenLabsDefinitions.h"
#include "PS_AI_Agent_Definitions.h"
#include "IWebSocket.h"
#include "ElevenLabsWebSocketProxy.generated.h"
#include "PS_AI_Agent_WebSocket_ElevenLabsProxy.generated.h"
// ─────────────────────────────────────────────────────────────────────────────
// Delegates (all Blueprint-assignable)
// ─────────────────────────────────────────────────────────────────────────────
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsConnected,
const FElevenLabsConversationInfo&, ConversationInfo);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_Connected,
const FPS_AI_Agent_ConversationInfo_ElevenLabs&, ConversationInfo);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnElevenLabsDisconnected,
DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnPS_AI_Agent_ElevenLabs_Disconnected,
int32, StatusCode, const FString&, Reason);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsError,
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_Error,
const FString&, ErrorMessage);
/** Fired when a PCM audio chunk arrives from the agent. Raw bytes, 16-bit signed 16kHz mono. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAudioReceived,
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_AudioReceived,
const TArray<uint8>&, PCMData);
/** Fired for user or agent transcript segments. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsTranscript,
const FElevenLabsTranscriptSegment&, Segment);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_Transcript,
const FPS_AI_Agent_TranscriptSegment_ElevenLabs&, Segment);
/** Fired with the final text response from the agent. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAgentResponse,
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_AgentResponse,
const FString&, ResponseText);
/** Fired when the agent interrupts the user. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsInterrupted);
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnPS_AI_Agent_ElevenLabs_Interrupted);
/**
* Fired when the server starts generating a response (first agent_chat_response_part received).
* This fires BEFORE audio arrives; useful to detect that the server is processing
* the previous turn while the client may have restarted listening (auto-restart scenario).
*/
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsAgentResponseStarted);
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnPS_AI_Agent_ElevenLabs_ResponseStarted);
/** Fired for every agent_chat_response_part — streams the LLM text as it is generated.
* PartialText is the text fragment from this individual part (NOT accumulated). */
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAgentResponsePart,
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_ResponsePart,
const FString&, PartialText);
/** Fired when the server sends a client_tool_call — the agent wants the client to execute a tool. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsClientToolCall,
const FElevenLabsClientToolCall&, ToolCall);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_ClientToolCall,
const FPS_AI_Agent_ClientToolCall_ElevenLabs&, ToolCall);
// ─────────────────────────────────────────────────────────────────────────────
// WebSocket Proxy
// Manages the lifecycle of a single ElevenLabs Conversational AI WebSocket session.
// Instantiate via UElevenLabsConversationalAgentComponent (the component manages
// Instantiate via UPS_AI_Agent_Conv_ElevenLabsComponent (the component manages
// one proxy at a time), or create manually through Blueprints.
// ─────────────────────────────────────────────────────────────────────────────
UCLASS(BlueprintType, Blueprintable)
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsWebSocketProxy : public UObject
class PS_AI_CONVAGENT_API UPS_AI_Agent_WebSocket_ElevenLabsProxy : public UObject
{
GENERATED_BODY()
@@ -68,48 +68,48 @@ public:
// ── Events ────────────────────────────────────────────────────────────────
/** Called once the WebSocket handshake succeeds and the agent sends its initiation metadata. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsConnected OnConnected;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_Connected OnConnected;
/** Called when the WebSocket closes (graceful or remote). */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsDisconnected OnDisconnected;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_Disconnected OnDisconnected;
/** Called on any connection or protocol error. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsError OnError;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_Error OnError;
/** Raw PCM audio coming from the agent — feed this into your audio component. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsAudioReceived OnAudioReceived;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_AudioReceived OnAudioReceived;
/** User or agent transcript (may be tentative while the conversation is ongoing). */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsTranscript OnTranscript;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_Transcript OnTranscript;
/** Final text response from the agent (complements audio). */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsAgentResponse OnAgentResponse;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_AgentResponse OnAgentResponse;
/** The agent was interrupted by new user speech. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsInterrupted OnInterrupted;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_Interrupted OnInterrupted;
/**
* Fired on the first agent_chat_response_part per turn, i.e. the moment the server
* starts generating. Fires well before audio. The component uses this to stop the
* microphone if it was restarted before the server finished processing the previous turn.
*/
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsAgentResponseStarted OnAgentResponseStarted;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_ResponseStarted OnAgentResponseStarted;
/** Fired for every agent_chat_response_part with the streaming text fragment. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsAgentResponsePart OnAgentResponsePart;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_ResponsePart OnAgentResponsePart;
/** Fired when the agent invokes a client tool. Handle the call and reply with SendClientToolResult. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsClientToolCall OnClientToolCall;
UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
FOnPS_AI_Agent_ElevenLabs_ClientToolCall OnClientToolCall;
// ── Lifecycle ─────────────────────────────────────────────────────────────
@@ -120,22 +120,22 @@ public:
* @param AgentID ElevenLabs agent ID. Overrides the project-level default when non-empty.
* @param APIKey API key. Overrides the project-level default when non-empty.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void Connect(const FString& AgentID = TEXT(""), const FString& APIKey = TEXT(""));
/**
* Gracefully close the WebSocket connection.
* OnDisconnected will fire after the server acknowledges.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void Disconnect();
/** Current connection state. */
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
EElevenLabsConnectionState GetConnectionState() const { return ConnectionState; }
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
EPS_AI_Agent_ConnectionState_ElevenLabs GetConnectionState() const { return ConnectionState; }
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
bool IsConnected() const { return ConnectionState == EElevenLabsConnectionState::Connected; }
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
bool IsConnected() const { return ConnectionState == EPS_AI_Agent_ConnectionState_ElevenLabs::Connected; }
// ── Audio sending ─────────────────────────────────────────────────────────
@@ -147,7 +147,7 @@ public:
*
* @param PCMData Raw PCM bytes (16-bit LE, 16kHz, mono).
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void SendAudioChunk(const TArray<uint8>& PCMData);
// ── Turn control (only relevant in Client turn mode) ──────────────────────
@@ -157,7 +157,7 @@ public:
* Sends a { "type": "user_activity" } message to the server.
* Call this periodically while the user is speaking (e.g. every audio chunk).
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void SendUserTurnStart();
/**
@@ -165,7 +165,7 @@ public:
* No explicit API message; simply stop sending user_activity.
* The server detects silence and hands the turn to the agent.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void SendUserTurnEnd();
/**
@@ -173,11 +173,11 @@ public:
* Useful for testing or text-only interaction.
* Sends: { "type": "user_message", "text": "..." }
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void SendTextMessage(const FString& Text);
/** Ask the agent to stop the current utterance. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void SendInterrupt();
/**
@@ -188,13 +188,13 @@ public:
* @param Result A string result to return to the agent.
* @param bIsError True if the tool execution failed.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
void SendClientToolResult(const FString& ToolCallId, const FString& Result, bool bIsError = false);
// ── Info ──────────────────────────────────────────────────────────────────
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
const FElevenLabsConversationInfo& GetConversationInfo() const { return ConversationInfo; }
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
const FPS_AI_Agent_ConversationInfo_ElevenLabs& GetConversationInfo() const { return ConversationInfo; }
// ─────────────────────────────────────────────────────────────────────────
// Internal
@@ -222,8 +222,8 @@ private:
FString BuildWebSocketURL(const FString& AgentID, const FString& APIKey) const;
TSharedPtr<IWebSocket> WebSocket;
EElevenLabsConnectionState ConnectionState = EElevenLabsConnectionState::Disconnected;
FElevenLabsConversationInfo ConversationInfo;
EPS_AI_Agent_ConnectionState_ElevenLabs ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Disconnected;
FPS_AI_Agent_ConversationInfo_ElevenLabs ConversationInfo;
// Serializes WebSocket->Send() calls — needed because SendAudioChunk can now be
// called from the WASAPI background thread while SendJsonMessage runs on game thread.
@@ -258,9 +258,9 @@ private:
int32 LastInterruptEventId = 0;
public:
// Set by UElevenLabsConversationalAgentComponent before calling Connect().
// Set by UPS_AI_Agent_Conv_ElevenLabsComponent before calling Connect().
// Controls turn_timeout in conversation_initiation_client_data.
EElevenLabsTurnMode TurnMode = EElevenLabsTurnMode::Server;
EPS_AI_Agent_TurnMode_ElevenLabs TurnMode = EPS_AI_Agent_TurnMode_ElevenLabs::Server;
// Speculative turn: start LLM generation during silence before full turn confidence.
bool bSpeculativeTurn = true;


@@ -4,18 +4,18 @@
#include "CoreMinimal.h"
#include "Modules/ModuleManager.h"
#include "PS_AI_Agent_ElevenLabs.generated.h"
#include "PS_AI_ConvAgent.generated.h"
// ─────────────────────────────────────────────────────────────────────────────
// Settings object exposed in Project Settings → Plugins → ElevenLabs AI Agent
// Settings object exposed in Project Settings → Plugins → PS AI Agent - ElevenLabs
// ─────────────────────────────────────────────────────────────────────────────
UCLASS(config = Engine, defaultconfig)
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsSettings : public UObject
class PS_AI_CONVAGENT_API UPS_AI_Agent_Settings_ElevenLabs : public UObject
{
GENERATED_BODY()
public:
UElevenLabsSettings(const FObjectInitializer& ObjectInitializer)
UPS_AI_Agent_Settings_ElevenLabs(const FObjectInitializer& ObjectInitializer)
: Super(ObjectInitializer)
{
API_Key = TEXT("");
@@ -28,14 +28,14 @@ public:
* Obtain from https://elevenlabs.io; used to authenticate WebSocket connections.
* Keep this secret; do not ship with the key hard-coded in a shipping build.
*/
UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API")
UPROPERTY(Config, EditAnywhere, Category = "PS_AI_Agent|ElevenLabs API")
FString API_Key;
/**
* The default ElevenLabs Conversational Agent ID to use when none is specified
* The default ElevenLabs Agent ID to use when none is specified
* on the component. Create agents at https://elevenlabs.io/app/conversational-ai
*/
UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API")
UPROPERTY(Config, EditAnywhere, Category = "PS_AI_Agent|ElevenLabs API")
FString AgentID;
/**
@@ -43,7 +43,7 @@ public:
* before connecting, so the API key is never exposed in the client.
* Set SignedURLEndpoint to point to your server that returns the signed URL.
*/
UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API | Security")
UPROPERTY(Config, EditAnywhere, Category = "PS_AI_Agent|ElevenLabs API|Security")
bool bSignedURLMode;
/**
@@ -51,7 +51,7 @@ public:
* Only used when bSignedURLMode = true.
* Expected response body: { "signed_url": "wss://..." }
*/
UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API | Security",
UPROPERTY(Config, EditAnywhere, Category = "PS_AI_Agent|ElevenLabs API|Security",
meta = (EditCondition = "bSignedURLMode"))
FString SignedURLEndpoint;
@@ -59,11 +59,11 @@ public:
* Override the ElevenLabs WebSocket base URL. Leave empty to use the default:
* wss://api.elevenlabs.io/v1/convai/conversation
*/
UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "ElevenLabs API")
UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "PS_AI_Agent|ElevenLabs API")
FString CustomWebSocketURL;
/** Log verbose WebSocket messages to the Output Log (useful during development). */
UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "ElevenLabs API")
UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "PS_AI_Agent|ElevenLabs API")
bool bVerboseLogging = false;
};
@@ -71,7 +71,7 @@ public:
// ─────────────────────────────────────────────────────────────────────────────
// Module
// ─────────────────────────────────────────────────────────────────────────────
class PS_AI_AGENT_ELEVENLABS_API FPS_AI_Agent_ElevenLabsModule : public IModuleInterface
class PS_AI_CONVAGENT_API FPS_AI_ConvAgentModule : public IModuleInterface
{
public:
/** IModuleInterface implementation */
@@ -81,19 +81,19 @@ public:
virtual bool IsGameModule() const override { return true; }
/** Singleton access */
static inline FPS_AI_Agent_ElevenLabsModule& Get()
static inline FPS_AI_ConvAgentModule& Get()
{
return FModuleManager::LoadModuleChecked<FPS_AI_Agent_ElevenLabsModule>("PS_AI_Agent_ElevenLabs");
return FModuleManager::LoadModuleChecked<FPS_AI_ConvAgentModule>("PS_AI_ConvAgent");
}
static inline bool IsAvailable()
{
return FModuleManager::Get().IsModuleLoaded("PS_AI_Agent_ElevenLabs");
return FModuleManager::Get().IsModuleLoaded("PS_AI_ConvAgent");
}
/** Access the settings object at runtime */
UElevenLabsSettings* GetSettings() const;
UPS_AI_Agent_Settings_ElevenLabs* GetSettings() const;
private:
UElevenLabsSettings* Settings = nullptr;
UPS_AI_Agent_Settings_ElevenLabs* Settings = nullptr;
};
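Since the module and the classes above were renamed, existing .uasset references are kept loadable through CoreRedirects in DefaultEngine.ini. A sketch of what such entries could look like for a few of the renames in this diff (illustrative only; the shipped config may list more entries, and CoreRedirects names omit the U/F prefixes):

```ini
[CoreRedirects]
+PackageRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs",NewName="/Script/PS_AI_ConvAgent")
+ClassRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsSettings",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_Settings_ElevenLabs")
+ClassRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsPostureComponent",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_PostureComponent")
+StructRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsNeckBoneEntry",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_NeckBoneEntry")
```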


@@ -2,9 +2,9 @@
using UnrealBuildTool;
public class PS_AI_Agent_ElevenLabsEditor : ModuleRules
public class PS_AI_ConvAgentEditor : ModuleRules
{
public PS_AI_Agent_ElevenLabsEditor(ReadOnlyTargetRules Target) : base(Target)
public PS_AI_ConvAgentEditor(ReadOnlyTargetRules Target) : base(Target)
{
DefaultBuildSettings = BuildSettingsVersion.Latest;
PCHUsage = PCHUsageMode.UseExplicitOrSharedPCHs;
@@ -19,8 +19,8 @@ public class PS_AI_Agent_ElevenLabsEditor : ModuleRules
// AnimGraph editor node base class
"AnimGraph",
"BlueprintGraph",
// Runtime module containing FAnimNode_ElevenLabsLipSync
"PS_AI_Agent_ElevenLabs",
// Runtime module containing FAnimNode_PS_AI_Agent_LipSync
"PS_AI_ConvAgent",
});
}
}


@@ -0,0 +1,32 @@
+// Copyright ASTERION. All Rights Reserved.
+
+#include "AnimGraphNode_PS_AI_Agent_FacialExpression.h"
+
+#define LOCTEXT_NAMESPACE "AnimNode_PS_AI_Agent_FacialExpression"
+
+FText UAnimGraphNode_PS_AI_Agent_FacialExpression::GetNodeTitle(ENodeTitleType::Type TitleType) const
+{
+	return LOCTEXT("NodeTitle", "PS AI Agent Facial Expression");
+}
+
+FText UAnimGraphNode_PS_AI_Agent_FacialExpression::GetTooltipText() const
+{
+	return LOCTEXT("Tooltip",
+		"Injects emotion expression curves from the PS AI Agent Facial Expression component.\n\n"
+		"Place this node BEFORE the PS AI Agent Lip Sync node in the MetaHuman Face AnimBP.\n"
+		"It outputs CTRL_expressions_* curves for eyes, eyebrows, cheeks, and mouth mood.\n"
+		"The Lip Sync node placed after will override mouth-area curves during speech.");
+}
+
+FString UAnimGraphNode_PS_AI_Agent_FacialExpression::GetNodeCategory() const
+{
+	return TEXT("PS AI Agent");
+}
+
+FLinearColor UAnimGraphNode_PS_AI_Agent_FacialExpression::GetNodeTitleColor() const
+{
+	// Warm amber to distinguish from Lip Sync (teal)
+	return FLinearColor(0.8f, 0.5f, 0.1f, 1.0f);
+}
+
+#undef LOCTEXT_NAMESPACE


@@ -0,0 +1,32 @@
+// Copyright ASTERION. All Rights Reserved.
+
+#include "AnimGraphNode_PS_AI_Agent_LipSync.h"
+
+#define LOCTEXT_NAMESPACE "AnimNode_PS_AI_Agent_LipSync"
+
+FText UAnimGraphNode_PS_AI_Agent_LipSync::GetNodeTitle(ENodeTitleType::Type TitleType) const
+{
+	return LOCTEXT("NodeTitle", "PS AI Agent Lip Sync");
+}
+
+FText UAnimGraphNode_PS_AI_Agent_LipSync::GetTooltipText() const
+{
+	return LOCTEXT("Tooltip",
+		"Injects lip sync animation curves from the PS AI Agent Lip Sync component.\n\n"
+		"Place this node BEFORE the Control Rig in the MetaHuman Face AnimBP.\n"
+		"It auto-discovers the component and outputs CTRL_expressions_* curves\n"
+		"that the MetaHuman Control Rig uses to drive facial bones.");
+}
+
+FString UAnimGraphNode_PS_AI_Agent_LipSync::GetNodeCategory() const
+{
+	return TEXT("PS AI Agent");
+}
+
+FLinearColor UAnimGraphNode_PS_AI_Agent_LipSync::GetNodeTitleColor() const
+{
+	// PS AI Agent teal color
+	return FLinearColor(0.1f, 0.7f, 0.6f, 1.0f);
+}
+
+#undef LOCTEXT_NAMESPACE


@@ -0,0 +1,33 @@
+// Copyright ASTERION. All Rights Reserved.
+
+#include "AnimGraphNode_PS_AI_Agent_Posture.h"
+
+#define LOCTEXT_NAMESPACE "AnimNode_PS_AI_Agent_Posture"
+
+FText UAnimGraphNode_PS_AI_Agent_Posture::GetNodeTitle(ENodeTitleType::Type TitleType) const
+{
+	return LOCTEXT("NodeTitle", "PS AI Agent Posture");
+}
+
+FText UAnimGraphNode_PS_AI_Agent_Posture::GetTooltipText() const
+{
+	return LOCTEXT("Tooltip",
+		"Injects head rotation and eye gaze curves from the PS AI Agent Posture component.\n\n"
+		"Place this node AFTER the PS AI Agent Facial Expression node and\n"
+		"BEFORE the PS AI Agent Lip Sync node in the MetaHuman Face AnimBP.\n\n"
+		"The component distributes look-at rotation across body (actor yaw),\n"
+		"head (bone rotation), and eyes (ARKit curves) for a natural look-at effect.");
+}
+
+FString UAnimGraphNode_PS_AI_Agent_Posture::GetNodeCategory() const
+{
+	return TEXT("PS AI Agent");
+}
+
+FLinearColor UAnimGraphNode_PS_AI_Agent_Posture::GetNodeTitleColor() const
+{
+	// Cool blue to distinguish from Facial Expression (amber) and Lip Sync (teal)
+	return FLinearColor(0.2f, 0.4f, 0.9f, 1.0f);
+}
+
+#undef LOCTEXT_NAMESPACE


@@ -0,0 +1,29 @@
+// Copyright ASTERION. All Rights Reserved.
+
+#include "PS_AI_Agent_EmotionPoseMapFactory.h"
+#include "PS_AI_Agent_EmotionPoseMap.h"
+#include "AssetTypeCategories.h"
+
+UPS_AI_Agent_EmotionPoseMapFactory::UPS_AI_Agent_EmotionPoseMapFactory()
+{
+	SupportedClass = UPS_AI_Agent_EmotionPoseMap::StaticClass();
+	bCreateNew = true;
+	bEditAfterNew = true;
+}
+
+UObject* UPS_AI_Agent_EmotionPoseMapFactory::FactoryCreateNew(
+	UClass* Class, UObject* InParent, FName Name, EObjectFlags Flags,
+	UObject* Context, FFeedbackContext* Warn)
+{
+	return NewObject<UPS_AI_Agent_EmotionPoseMap>(InParent, Class, Name, Flags);
+}
+
+FText UPS_AI_Agent_EmotionPoseMapFactory::GetDisplayName() const
+{
+	return FText::FromString(TEXT("PS AI Agent Emotion Pose Map"));
+}
+
+uint32 UPS_AI_Agent_EmotionPoseMapFactory::GetMenuCategories() const
+{
+	return EAssetTypeCategories::Misc;
+}


@@ -4,19 +4,19 @@
 #include "CoreMinimal.h"
 #include "Factories/Factory.h"
-#include "ElevenLabsLipSyncPoseMapFactory.generated.h"
+#include "PS_AI_Agent_EmotionPoseMapFactory.generated.h"
 
 /**
- * Factory that lets users create ElevenLabsLipSyncPoseMap assets
+ * Factory that lets users create PS_AI_Agent_EmotionPoseMap assets
 * directly from the Content Browser (right-click Miscellaneous).
 */
 UCLASS()
-class UElevenLabsLipSyncPoseMapFactory : public UFactory
+class UPS_AI_Agent_EmotionPoseMapFactory : public UFactory
 {
 	GENERATED_BODY()
 
 public:
-	UElevenLabsLipSyncPoseMapFactory();
+	UPS_AI_Agent_EmotionPoseMapFactory();
 
 	virtual UObject* FactoryCreateNew(UClass* Class, UObject* InParent,
 		FName Name, EObjectFlags Flags, UObject* Context,


@@ -0,0 +1,29 @@
+// Copyright ASTERION. All Rights Reserved.
+
+#include "PS_AI_Agent_LipSyncPoseMapFactory.h"
+#include "PS_AI_Agent_LipSyncPoseMap.h"
+#include "AssetTypeCategories.h"
+
+UPS_AI_Agent_LipSyncPoseMapFactory::UPS_AI_Agent_LipSyncPoseMapFactory()
+{
+	SupportedClass = UPS_AI_Agent_LipSyncPoseMap::StaticClass();
+	bCreateNew = true;
+	bEditAfterNew = true;
+}
+
+UObject* UPS_AI_Agent_LipSyncPoseMapFactory::FactoryCreateNew(
+	UClass* Class, UObject* InParent, FName Name, EObjectFlags Flags,
+	UObject* Context, FFeedbackContext* Warn)
+{
+	return NewObject<UPS_AI_Agent_LipSyncPoseMap>(InParent, Class, Name, Flags);
+}
+
+FText UPS_AI_Agent_LipSyncPoseMapFactory::GetDisplayName() const
+{
+	return FText::FromString(TEXT("PS AI Agent Lip Sync Pose Map"));
+}
+
+uint32 UPS_AI_Agent_LipSyncPoseMapFactory::GetMenuCategories() const
+{
+	return EAssetTypeCategories::Misc;
+}


@@ -4,19 +4,19 @@
 #include "CoreMinimal.h"
 #include "Factories/Factory.h"
-#include "ElevenLabsEmotionPoseMapFactory.generated.h"
+#include "PS_AI_Agent_LipSyncPoseMapFactory.generated.h"
 
 /**
- * Factory that lets users create ElevenLabsEmotionPoseMap assets
+ * Factory that lets users create PS_AI_Agent_LipSyncPoseMap assets
 * directly from the Content Browser (right-click Miscellaneous).
 */
 UCLASS()
-class UElevenLabsEmotionPoseMapFactory : public UFactory
+class UPS_AI_Agent_LipSyncPoseMapFactory : public UFactory
 {
 	GENERATED_BODY()
 
 public:
-	UElevenLabsEmotionPoseMapFactory();
+	UPS_AI_Agent_LipSyncPoseMapFactory();
 
 	virtual UObject* FactoryCreateNew(UClass* Class, UObject* InParent,
 		FName Name, EObjectFlags Flags, UObject* Context,


@@ -0,0 +1,16 @@
+// Copyright ASTERION. All Rights Reserved.
+
+#include "Modules/ModuleManager.h"
+
+/**
+ * Editor module for PS_AI_ConvAgent plugin.
+ * Provides AnimGraph node(s) for the PS AI Agent Lip Sync system.
+ */
+class FPS_AI_ConvAgentEditorModule : public IModuleInterface
+{
+public:
+	virtual void StartupModule() override {}
+	virtual void ShutdownModule() override {}
+};
+
+IMPLEMENT_MODULE(FPS_AI_ConvAgentEditorModule, PS_AI_ConvAgentEditor)


@@ -0,0 +1,31 @@
+// Copyright ASTERION. All Rights Reserved.
+
+#pragma once
+
+#include "CoreMinimal.h"
+#include "AnimGraphNode_Base.h"
+#include "AnimNode_PS_AI_Agent_FacialExpression.h"
+#include "AnimGraphNode_PS_AI_Agent_FacialExpression.generated.h"
+
+/**
+ * AnimGraph editor node for the PS AI Agent Facial Expression AnimNode.
+ *
+ * This node appears in the AnimBP graph editor under the "PS AI Agent" category.
+ * Place it BEFORE the PS AI Agent Lip Sync node in the MetaHuman Face AnimBP.
+ * It auto-discovers the PS_AI_Agent_FacialExpressionComponent on the owning Actor
+ * and injects CTRL_expressions_* curves for emotion-driven facial expressions.
+ */
+UCLASS()
+class UAnimGraphNode_PS_AI_Agent_FacialExpression : public UAnimGraphNode_Base
+{
+	GENERATED_BODY()
+
+	UPROPERTY(EditAnywhere, Category = "Settings")
+	FAnimNode_PS_AI_Agent_FacialExpression Node;
+
+	// UAnimGraphNode_Base interface
+	virtual FText GetNodeTitle(ENodeTitleType::Type TitleType) const override;
+	virtual FText GetTooltipText() const override;
+	virtual FString GetNodeCategory() const override;
+	virtual FLinearColor GetNodeTitleColor() const override;
+};


@@ -4,24 +4,24 @@
 #include "CoreMinimal.h"
 #include "AnimGraphNode_Base.h"
-#include "AnimNode_ElevenLabsLipSync.h"
-#include "AnimGraphNode_ElevenLabsLipSync.generated.h"
+#include "AnimNode_PS_AI_Agent_LipSync.h"
+#include "AnimGraphNode_PS_AI_Agent_LipSync.generated.h"
 
 /**
- * AnimGraph editor node for the ElevenLabs Lip Sync AnimNode.
+ * AnimGraph editor node for the PS AI Agent Lip Sync AnimNode.
 *
- * This node appears in the AnimBP graph editor under the "ElevenLabs" category.
+ * This node appears in the AnimBP graph editor under the "PS AI Agent" category.
 * Drop it into your MetaHuman Face AnimBP between the pose source and the
- * Control Rig node. It auto-discovers the ElevenLabsLipSyncComponent on the
+ * Control Rig node. It auto-discovers the PS_AI_Agent_LipSyncComponent on the
 * owning Actor and injects CTRL_expressions_* curves for lip sync.
 */
 UCLASS()
-class UAnimGraphNode_ElevenLabsLipSync : public UAnimGraphNode_Base
+class UAnimGraphNode_PS_AI_Agent_LipSync : public UAnimGraphNode_Base
 {
 	GENERATED_BODY()
 
 	UPROPERTY(EditAnywhere, Category = "Settings")
-	FAnimNode_ElevenLabsLipSync Node;
+	FAnimNode_PS_AI_Agent_LipSync Node;
 
 	// UAnimGraphNode_Base interface
 	virtual FText GetNodeTitle(ENodeTitleType::Type TitleType) const override;


@@ -4,26 +4,26 @@
 #include "CoreMinimal.h"
 #include "AnimGraphNode_Base.h"
-#include "AnimNode_ElevenLabsPosture.h"
-#include "AnimGraphNode_ElevenLabsPosture.generated.h"
+#include "AnimNode_PS_AI_Agent_Posture.h"
+#include "AnimGraphNode_PS_AI_Agent_Posture.generated.h"
 
 /**
- * AnimGraph editor node for the ElevenLabs Posture AnimNode.
+ * AnimGraph editor node for the PS AI Agent Posture AnimNode.
 *
- * This node appears in the AnimBP graph editor under the "ElevenLabs" category.
- * Place it AFTER the ElevenLabs Facial Expression node and BEFORE the
- * ElevenLabs Lip Sync node in the MetaHuman Face AnimBP.
+ * This node appears in the AnimBP graph editor under the "PS AI Agent" category.
+ * Place it AFTER the PS AI Agent Facial Expression node and BEFORE the
+ * PS AI Agent Lip Sync node in the MetaHuman Face AnimBP.
 *
- * It auto-discovers the ElevenLabsPostureComponent on the owning Actor
+ * It auto-discovers the PS_AI_Agent_PostureComponent on the owning Actor
 * and injects head bone rotation + ARKit eye gaze curves for look-at tracking.
 */
 UCLASS()
-class UAnimGraphNode_ElevenLabsPosture : public UAnimGraphNode_Base
+class UAnimGraphNode_PS_AI_Agent_Posture : public UAnimGraphNode_Base
 {
 	GENERATED_BODY()
 
 	UPROPERTY(EditAnywhere, Category = "Settings")
-	FAnimNode_ElevenLabsPosture Node;
+	FAnimNode_PS_AI_Agent_Posture Node;
 
 	// UAnimGraphNode_Base interface
 	virtual FText GetNodeTitle(ENodeTitleType::Type TitleType) const override;