Rename plugin ElevenLabs → PS_AI_ConvAgent for multi-backend support
Separates generic systems (lip sync, posture, facial expression, mic) from ElevenLabs-specific ones (conv agent, websocket, settings). Generic classes now use the PS_AI_Agent_ prefix; ElevenLabs-specific ones keep the _ElevenLabs suffix. CoreRedirects in DefaultEngine.ini ensure existing .uasset references load correctly. Binaries/Intermediate deleted for a full rebuild.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
commit eeb0b1172b
parent 8a4622f348
@@ -1,5 +1,11 @@
# PS_AI_Agent — Notes for Claude

+## Plugin Rename (completed)
+- Plugin: `PS_AI_Agent_ElevenLabs` → `PS_AI_ConvAgent`
+- Generic classes: `ElevenLabs*` → `PS_AI_Agent_*` (e.g. `UPS_AI_Agent_PostureComponent`)
+- ElevenLabs-specific classes keep `_ElevenLabs` suffix (e.g. `UPS_AI_Agent_Conv_ElevenLabsComponent`)
+- CoreRedirects in DefaultEngine.ini handle .uasset references

## Posture System — Diagonal Tilt Bug (to fix)

### Problem

@@ -12,15 +18,17 @@ The swing-twist is done only once on the tip bone (head), then the CleanOffs
Do the swing-twist **PER-BONE**: for each bone in the chain, compose the FractionalRot with the bone's own rotation, swing-twist around the bone's own tilt axis, then apply only the swing.

### Affected files

-- `AnimNode_ElevenLabsPosture.cpp` — Evaluate_AnyThread, multi-bone chain section
+- `AnimNode_PS_AI_Agent_Posture.cpp` — Evaluate_AnyThread, multi-bone chain section
- Current commit: `8df6967` on main

### Current neck chain config (BP_Taro)

neck_01=0.25, neck_02=0.35, head=0.40

## Posture Architecture

-- **ElevenLabsPostureComponent** (game thread): Eyes→Head→Body cascade, produces FQuat + eye curves
-- **AnimNode_ElevenLabsPosture** (anim thread): applies rotation to bone(s) + injects curves
+- **PS_AI_Agent_PostureComponent** (game thread): Eyes→Head→Body cascade, produces FQuat + eye curves
+- **AnimNode_PS_AI_Agent_Posture** (anim thread): applies rotation to bone(s) + injects curves
- Pipeline 100% FQuat (no FRotator round-trip)
- Thread safety via FCriticalSection
-- ARKit eye curves normalized by fixed range (30°/20°), not by MaxEye threshold
+- ARKit eye curves normalized by fixed range (40°/35°), not by MaxEye threshold
- Body drift compensation: applied ONLY on first chain bone, AFTER swing-twist
- Debug gaze: per-eye lines from Face mesh using Z-axis (GetAxisZ())

@@ -80,6 +80,51 @@ FontDPI=72
+ActiveGameNameRedirects=(OldGameName="TP_Blank",NewGameName="/Script/PS_AI_Agent")
+ActiveGameNameRedirects=(OldGameName="/Script/TP_Blank",NewGameName="/Script/PS_AI_Agent")
+
+[CoreRedirects]
+; ── Plugin module rename ──
+PackageRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs", NewName="/Script/PS_AI_ConvAgent")
+PackageRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabsEditor", NewName="/Script/PS_AI_ConvAgentEditor")
+
+; ── Generic classes (ElevenLabs → PS_AI_Agent) ──
+ClassRedirects=(OldName="ElevenLabsLipSyncComponent", NewName="PS_AI_Agent_LipSyncComponent")
+ClassRedirects=(OldName="ElevenLabsFacialExpressionComponent", NewName="PS_AI_Agent_FacialExpressionComponent")
+ClassRedirects=(OldName="ElevenLabsPostureComponent", NewName="PS_AI_Agent_PostureComponent")
+ClassRedirects=(OldName="ElevenLabsMicrophoneCaptureComponent", NewName="PS_AI_Agent_MicrophoneCaptureComponent")
+ClassRedirects=(OldName="ElevenLabsLipSyncPoseMap", NewName="PS_AI_Agent_LipSyncPoseMap")
+ClassRedirects=(OldName="ElevenLabsEmotionPoseMap", NewName="PS_AI_Agent_EmotionPoseMap")
+
+; ── ElevenLabs-specific classes ──
+ClassRedirects=(OldName="ElevenLabsConversationalAgentComponent", NewName="PS_AI_Agent_Conv_ElevenLabsComponent")
+ClassRedirects=(OldName="ElevenLabsWebSocketProxy", NewName="PS_AI_Agent_WebSocket_ElevenLabsProxy")
+ClassRedirects=(OldName="ElevenLabsSettings", NewName="PS_AI_Agent_Settings_ElevenLabs")
+
+; ── AnimNode structs ──
+StructRedirects=(OldName="AnimNode_ElevenLabsLipSync", NewName="AnimNode_PS_AI_Agent_LipSync")
+StructRedirects=(OldName="AnimNode_ElevenLabsFacialExpression", NewName="AnimNode_PS_AI_Agent_FacialExpression")
+StructRedirects=(OldName="AnimNode_ElevenLabsPosture", NewName="AnimNode_PS_AI_Agent_Posture")
+
+; ── AnimGraphNode classes ──
+ClassRedirects=(OldName="AnimGraphNode_ElevenLabsLipSync", NewName="AnimGraphNode_PS_AI_Agent_LipSync")
+ClassRedirects=(OldName="AnimGraphNode_ElevenLabsFacialExpression", NewName="AnimGraphNode_PS_AI_Agent_FacialExpression")
+ClassRedirects=(OldName="AnimGraphNode_ElevenLabsPosture", NewName="AnimGraphNode_PS_AI_Agent_Posture")
+
+; ── Factory classes ──
+ClassRedirects=(OldName="ElevenLabsLipSyncPoseMapFactory", NewName="PS_AI_Agent_LipSyncPoseMapFactory")
+ClassRedirects=(OldName="ElevenLabsEmotionPoseMapFactory", NewName="PS_AI_Agent_EmotionPoseMapFactory")
+
+; ── Structs ──
+StructRedirects=(OldName="ElevenLabsEmotionPoseSet", NewName="PS_AI_Agent_EmotionPoseSet")
+StructRedirects=(OldName="ElevenLabsNeckBoneEntry", NewName="PS_AI_Agent_NeckBoneEntry")
+StructRedirects=(OldName="ElevenLabsConversationInfo", NewName="PS_AI_Agent_ConversationInfo_ElevenLabs")
+StructRedirects=(OldName="ElevenLabsTranscriptSegment", NewName="PS_AI_Agent_TranscriptSegment_ElevenLabs")
+StructRedirects=(OldName="ElevenLabsClientToolCall", NewName="PS_AI_Agent_ClientToolCall_ElevenLabs")
+
+; ── Enums ──
+EnumRedirects=(OldName="EElevenLabsConnectionState", NewName="EPS_AI_Agent_ConnectionState_ElevenLabs")
+EnumRedirects=(OldName="EElevenLabsTurnMode", NewName="EPS_AI_Agent_TurnMode_ElevenLabs")
+EnumRedirects=(OldName="EElevenLabsEmotion", NewName="EPS_AI_Agent_Emotion")
+EnumRedirects=(OldName="EElevenLabsEmotionIntensity", NewName="EPS_AI_Agent_EmotionIntensity")
+
[/Script/AndroidFileServerEditor.AndroidFileServerRuntimeSettings]
bEnablePlugin=True
bAllowNetworkConnection=True

@@ -19,7 +19,7 @@
        ]
    },
    {
-       "Name": "PS_AI_Agent_ElevenLabs",
+       "Name": "PS_AI_ConvAgent",
        "Enabled": true
    },
    {

@@ -1,50 +0,0 @@
// Copyright ASTERION. All Rights Reserved.

#include "PS_AI_Agent_ElevenLabs.h"
#include "Developer/Settings/Public/ISettingsModule.h"
#include "UObject/UObjectGlobals.h"
#include "UObject/Package.h"

IMPLEMENT_MODULE(FPS_AI_Agent_ElevenLabsModule, PS_AI_Agent_ElevenLabs)

#define LOCTEXT_NAMESPACE "PS_AI_Agent_ElevenLabs"

void FPS_AI_Agent_ElevenLabsModule::StartupModule()
{
    Settings = NewObject<UElevenLabsSettings>(GetTransientPackage(), "ElevenLabsSettings", RF_Standalone);
    Settings->AddToRoot();

    if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
    {
        SettingsModule->RegisterSettings(
            "Project", "Plugins", "ElevenLabsAIAgent",
            LOCTEXT("SettingsName", "ElevenLabs AI Agent"),
            LOCTEXT("SettingsDescription", "Configure the ElevenLabs Conversational AI Agent plugin"),
            Settings);
    }
}

void FPS_AI_Agent_ElevenLabsModule::ShutdownModule()
{
    if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
    {
        SettingsModule->UnregisterSettings("Project", "Plugins", "ElevenLabsAIAgent");
    }

    if (!GExitPurge)
    {
        Settings->RemoveFromRoot();
    }
    else
    {
        Settings = nullptr;
    }
}

UElevenLabsSettings* FPS_AI_Agent_ElevenLabsModule::GetSettings() const
{
    check(Settings);
    return Settings;
}

#undef LOCTEXT_NAMESPACE
@@ -1,32 +0,0 @@
// Copyright ASTERION. All Rights Reserved.

#include "AnimGraphNode_ElevenLabsFacialExpression.h"

#define LOCTEXT_NAMESPACE "AnimNode_ElevenLabsFacialExpression"

FText UAnimGraphNode_ElevenLabsFacialExpression::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
    return LOCTEXT("NodeTitle", "ElevenLabs Facial Expression");
}

FText UAnimGraphNode_ElevenLabsFacialExpression::GetTooltipText() const
{
    return LOCTEXT("Tooltip",
        "Injects emotion expression curves from the ElevenLabs Facial Expression component.\n\n"
        "Place this node BEFORE the ElevenLabs Lip Sync node in the MetaHuman Face AnimBP.\n"
        "It outputs CTRL_expressions_* curves for eyes, eyebrows, cheeks, and mouth mood.\n"
        "The Lip Sync node placed after will override mouth-area curves during speech.");
}

FString UAnimGraphNode_ElevenLabsFacialExpression::GetNodeCategory() const
{
    return TEXT("ElevenLabs");
}

FLinearColor UAnimGraphNode_ElevenLabsFacialExpression::GetNodeTitleColor() const
{
    // Warm amber to distinguish from Lip Sync (teal)
    return FLinearColor(0.8f, 0.5f, 0.1f, 1.0f);
}

#undef LOCTEXT_NAMESPACE
@@ -1,32 +0,0 @@
// Copyright ASTERION. All Rights Reserved.

#include "AnimGraphNode_ElevenLabsLipSync.h"

#define LOCTEXT_NAMESPACE "AnimNode_ElevenLabsLipSync"

FText UAnimGraphNode_ElevenLabsLipSync::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
    return LOCTEXT("NodeTitle", "ElevenLabs Lip Sync");
}

FText UAnimGraphNode_ElevenLabsLipSync::GetTooltipText() const
{
    return LOCTEXT("Tooltip",
        "Injects lip sync animation curves from the ElevenLabs Lip Sync component.\n\n"
        "Place this node BEFORE the Control Rig in the MetaHuman Face AnimBP.\n"
        "It auto-discovers the component and outputs CTRL_expressions_* curves\n"
        "that the MetaHuman Control Rig uses to drive facial bones.");
}

FString UAnimGraphNode_ElevenLabsLipSync::GetNodeCategory() const
{
    return TEXT("ElevenLabs");
}

FLinearColor UAnimGraphNode_ElevenLabsLipSync::GetNodeTitleColor() const
{
    // ElevenLabs brand-ish teal color
    return FLinearColor(0.1f, 0.7f, 0.6f, 1.0f);
}

#undef LOCTEXT_NAMESPACE
@@ -1,33 +0,0 @@
// Copyright ASTERION. All Rights Reserved.

#include "AnimGraphNode_ElevenLabsPosture.h"

#define LOCTEXT_NAMESPACE "AnimNode_ElevenLabsPosture"

FText UAnimGraphNode_ElevenLabsPosture::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
    return LOCTEXT("NodeTitle", "ElevenLabs Posture");
}

FText UAnimGraphNode_ElevenLabsPosture::GetTooltipText() const
{
    return LOCTEXT("Tooltip",
        "Injects head rotation and eye gaze curves from the ElevenLabs Posture component.\n\n"
        "Place this node AFTER the ElevenLabs Facial Expression node and\n"
        "BEFORE the ElevenLabs Lip Sync node in the MetaHuman Face AnimBP.\n\n"
        "The component distributes look-at rotation across body (actor yaw),\n"
        "head (bone rotation), and eyes (ARKit curves) for a natural look-at effect.");
}

FString UAnimGraphNode_ElevenLabsPosture::GetNodeCategory() const
{
    return TEXT("ElevenLabs");
}

FLinearColor UAnimGraphNode_ElevenLabsPosture::GetNodeTitleColor() const
{
    // Cool blue to distinguish from Facial Expression (amber) and Lip Sync (teal)
    return FLinearColor(0.2f, 0.4f, 0.9f, 1.0f);
}

#undef LOCTEXT_NAMESPACE
@@ -1,29 +0,0 @@
// Copyright ASTERION. All Rights Reserved.

#include "ElevenLabsEmotionPoseMapFactory.h"
#include "ElevenLabsEmotionPoseMap.h"
#include "AssetTypeCategories.h"

UElevenLabsEmotionPoseMapFactory::UElevenLabsEmotionPoseMapFactory()
{
    SupportedClass = UElevenLabsEmotionPoseMap::StaticClass();
    bCreateNew = true;
    bEditAfterNew = true;
}

UObject* UElevenLabsEmotionPoseMapFactory::FactoryCreateNew(
    UClass* Class, UObject* InParent, FName Name, EObjectFlags Flags,
    UObject* Context, FFeedbackContext* Warn)
{
    return NewObject<UElevenLabsEmotionPoseMap>(InParent, Class, Name, Flags);
}

FText UElevenLabsEmotionPoseMapFactory::GetDisplayName() const
{
    return FText::FromString(TEXT("ElevenLabs Emotion Pose Map"));
}

uint32 UElevenLabsEmotionPoseMapFactory::GetMenuCategories() const
{
    return EAssetTypeCategories::Misc;
}
@@ -1,29 +0,0 @@
// Copyright ASTERION. All Rights Reserved.

#include "ElevenLabsLipSyncPoseMapFactory.h"
#include "ElevenLabsLipSyncPoseMap.h"
#include "AssetTypeCategories.h"

UElevenLabsLipSyncPoseMapFactory::UElevenLabsLipSyncPoseMapFactory()
{
    SupportedClass = UElevenLabsLipSyncPoseMap::StaticClass();
    bCreateNew = true;
    bEditAfterNew = true;
}

UObject* UElevenLabsLipSyncPoseMapFactory::FactoryCreateNew(
    UClass* Class, UObject* InParent, FName Name, EObjectFlags Flags,
    UObject* Context, FFeedbackContext* Warn)
{
    return NewObject<UElevenLabsLipSyncPoseMap>(InParent, Class, Name, Flags);
}

FText UElevenLabsLipSyncPoseMapFactory::GetDisplayName() const
{
    return FText::FromString(TEXT("ElevenLabs Lip Sync Pose Map"));
}

uint32 UElevenLabsLipSyncPoseMapFactory::GetMenuCategories() const
{
    return EAssetTypeCategories::Misc;
}
@@ -1,16 +0,0 @@
// Copyright ASTERION. All Rights Reserved.

#include "Modules/ModuleManager.h"

/**
 * Editor module for PS_AI_Agent_ElevenLabs plugin.
 * Provides AnimGraph node(s) for the ElevenLabs Lip Sync system.
 */
class FPS_AI_Agent_ElevenLabsEditorModule : public IModuleInterface
{
public:
    virtual void StartupModule() override {}
    virtual void ShutdownModule() override {}
};

IMPLEMENT_MODULE(FPS_AI_Agent_ElevenLabsEditorModule, PS_AI_Agent_ElevenLabsEditor)
@@ -1,31 +0,0 @@
// Copyright ASTERION. All Rights Reserved.

#pragma once

#include "CoreMinimal.h"
#include "AnimGraphNode_Base.h"
#include "AnimNode_ElevenLabsFacialExpression.h"
#include "AnimGraphNode_ElevenLabsFacialExpression.generated.h"

/**
 * AnimGraph editor node for the ElevenLabs Facial Expression AnimNode.
 *
 * This node appears in the AnimBP graph editor under the "ElevenLabs" category.
 * Place it BEFORE the ElevenLabs Lip Sync node in the MetaHuman Face AnimBP.
 * It auto-discovers the ElevenLabsFacialExpressionComponent on the owning Actor
 * and injects CTRL_expressions_* curves for emotion-driven facial expressions.
 */
UCLASS()
class UAnimGraphNode_ElevenLabsFacialExpression : public UAnimGraphNode_Base
{
    GENERATED_BODY()

    UPROPERTY(EditAnywhere, Category = "Settings")
    FAnimNode_ElevenLabsFacialExpression Node;

    // UAnimGraphNode_Base interface
    virtual FText GetNodeTitle(ENodeTitleType::Type TitleType) const override;
    virtual FText GetTooltipText() const override;
    virtual FString GetNodeCategory() const override;
    virtual FLinearColor GetNodeTitleColor() const override;
};
@@ -2,12 +2,12 @@
    "FileVersion": 3,
    "Version": 1,
    "VersionName": "1.0.0",
-   "FriendlyName": "PS AI Agent - ElevenLabs",
-   "Description": "Integrates ElevenLabs Conversational AI Agent into Unreal Engine 5.5. Supports real-time voice conversation via WebSocket, microphone capture, and audio playback.",
+   "FriendlyName": "PS AI Conversational Agent",
+   "Description": "Conversational AI Agent framework for Unreal Engine 5.5. Supports lip sync, facial expressions, posture/look-at, and pluggable backends (ElevenLabs, etc.).",
    "Category": "AI",
    "CreatedBy": "ASTERION",
    "CreatedByURL": "",
-   "DocsURL": "https://elevenlabs.io/docs/conversational-ai",
+   "DocsURL": "",
    "MarketplaceURL": "",
    "SupportURL": "",
    "CanContainContent": false,
@@ -16,7 +16,7 @@
    "Installed": false,
    "Modules": [
        {
-           "Name": "PS_AI_Agent_ElevenLabs",
+           "Name": "PS_AI_ConvAgent",
            "Type": "Runtime",
            "LoadingPhase": "PreDefault",
            "PlatformAllowList": [
@@ -26,7 +26,7 @@
        ]
    },
    {
-       "Name": "PS_AI_Agent_ElevenLabsEditor",
+       "Name": "PS_AI_ConvAgentEditor",
        "Type": "UncookedOnly",
        "LoadingPhase": "Default",
        "PlatformAllowList": [

@@ -2,9 +2,9 @@

using UnrealBuildTool;

-public class PS_AI_Agent_ElevenLabs : ModuleRules
+public class PS_AI_ConvAgent : ModuleRules
{
-   public PS_AI_Agent_ElevenLabs(ReadOnlyTargetRules Target) : base(Target)
+   public PS_AI_ConvAgent(ReadOnlyTargetRules Target) : base(Target)
    {
        DefaultBuildSettings = BuildSettingsVersion.Latest;
        PCHUsage = PCHUsageMode.UseExplicitOrSharedPCHs;
@@ -18,7 +18,7 @@ public class PS_AI_Agent_ElevenLabs : ModuleRules
        // JSON serialization for WebSocket message payloads
        "Json",
        "JsonUtilities",
-       // WebSocket for ElevenLabs Conversational AI real-time API
+       // WebSocket for conversational AI real-time API
        "WebSockets",
        // HTTP for REST calls (agent metadata, auth, etc.)
        "HTTP",

@@ -1,22 +1,22 @@
// Copyright ASTERION. All Rights Reserved.

-#include "AnimNode_ElevenLabsFacialExpression.h"
-#include "ElevenLabsFacialExpressionComponent.h"
+#include "AnimNode_PS_AI_Agent_FacialExpression.h"
+#include "PS_AI_Agent_FacialExpressionComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "Animation/AnimInstanceProxy.h"
#include "GameFramework/Actor.h"

-DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsFacialExprAnimNode, Log, All);
+DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_FacialExprAnimNode, Log, All);

// ─────────────────────────────────────────────────────────────────────────────
// FAnimNode_Base interface
// ─────────────────────────────────────────────────────────────────────────────

-void FAnimNode_ElevenLabsFacialExpression::Initialize_AnyThread(const FAnimationInitializeContext& Context)
+void FAnimNode_PS_AI_Agent_FacialExpression::Initialize_AnyThread(const FAnimationInitializeContext& Context)
{
    BasePose.Initialize(Context);

-   // Find the ElevenLabsFacialExpressionComponent on the owning actor.
+   // Find the PS_AI_Agent_FacialExpressionComponent on the owning actor.
    // This runs during initialization (game thread) so actor access is safe.
    FacialExpressionComponent.Reset();
    CachedEmotionCurves.Reset();
@@ -27,19 +27,19 @@ void FAnimNode_ElevenLabsFacialExpression::Initialize_AnyThread(const FAnimation
    {
        if (AActor* Owner = SkelMesh->GetOwner())
        {
-           UElevenLabsFacialExpressionComponent* Comp =
-               Owner->FindComponentByClass<UElevenLabsFacialExpressionComponent>();
+           UPS_AI_Agent_FacialExpressionComponent* Comp =
+               Owner->FindComponentByClass<UPS_AI_Agent_FacialExpressionComponent>();
            if (Comp)
            {
                FacialExpressionComponent = Comp;
-               UE_LOG(LogElevenLabsFacialExprAnimNode, Log,
-                   TEXT("ElevenLabs Facial Expression AnimNode bound to component on %s."),
+               UE_LOG(LogPS_AI_Agent_FacialExprAnimNode, Log,
+                   TEXT("PS AI Agent Facial Expression AnimNode bound to component on %s."),
                    *Owner->GetName());
            }
            else
            {
-               UE_LOG(LogElevenLabsFacialExprAnimNode, Warning,
-                   TEXT("No ElevenLabsFacialExpressionComponent found on %s. "
+               UE_LOG(LogPS_AI_Agent_FacialExprAnimNode, Warning,
+                   TEXT("No PS_AI_Agent_FacialExpressionComponent found on %s. "
                        "Add the component alongside the Conversational Agent."),
                    *Owner->GetName());
            }
@@ -48,12 +48,12 @@ void FAnimNode_ElevenLabsFacialExpression::Initialize_AnyThread(const FAnimation
    }
}

-void FAnimNode_ElevenLabsFacialExpression::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
+void FAnimNode_PS_AI_Agent_FacialExpression::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
{
    BasePose.CacheBones(Context);
}

-void FAnimNode_ElevenLabsFacialExpression::Update_AnyThread(const FAnimationUpdateContext& Context)
+void FAnimNode_PS_AI_Agent_FacialExpression::Update_AnyThread(const FAnimationUpdateContext& Context)
{
    BasePose.Update(Context);

@@ -68,7 +68,7 @@ void FAnimNode_ElevenLabsFacialExpression::Update_AnyThread(const FAnimationUpda
    }
}

-void FAnimNode_ElevenLabsFacialExpression::Evaluate_AnyThread(FPoseContext& Output)
+void FAnimNode_PS_AI_Agent_FacialExpression::Evaluate_AnyThread(FPoseContext& Output)
{
    // Evaluate the upstream pose (pass-through)
    BasePose.Evaluate(Output);
@@ -84,9 +84,9 @@ void FAnimNode_ElevenLabsFacialExpression::Evaluate_AnyThread(FPoseContext& Outp
    }
}

-void FAnimNode_ElevenLabsFacialExpression::GatherDebugData(FNodeDebugData& DebugData)
+void FAnimNode_PS_AI_Agent_FacialExpression::GatherDebugData(FNodeDebugData& DebugData)
{
-   FString DebugLine = FString::Printf(TEXT("ElevenLabs Facial Expression (%d curves)"), CachedEmotionCurves.Num());
+   FString DebugLine = FString::Printf(TEXT("PS AI Agent Facial Expression (%d curves)"), CachedEmotionCurves.Num());
    DebugData.AddDebugItem(DebugLine);
    BasePose.GatherDebugData(DebugData);
}
@@ -1,22 +1,22 @@
// Copyright ASTERION. All Rights Reserved.

-#include "AnimNode_ElevenLabsLipSync.h"
-#include "ElevenLabsLipSyncComponent.h"
+#include "AnimNode_PS_AI_Agent_LipSync.h"
+#include "PS_AI_Agent_LipSyncComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "Animation/AnimInstanceProxy.h"
#include "GameFramework/Actor.h"

-DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsAnimNode, Log, All);
+DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_LipSyncAnimNode, Log, All);

// ─────────────────────────────────────────────────────────────────────────────
// FAnimNode_Base interface
// ─────────────────────────────────────────────────────────────────────────────

-void FAnimNode_ElevenLabsLipSync::Initialize_AnyThread(const FAnimationInitializeContext& Context)
+void FAnimNode_PS_AI_Agent_LipSync::Initialize_AnyThread(const FAnimationInitializeContext& Context)
{
    BasePose.Initialize(Context);

-   // Find the ElevenLabsLipSyncComponent on the owning actor.
+   // Find the PS_AI_Agent_LipSyncComponent on the owning actor.
    // This runs during initialization (game thread) so actor access is safe.
    LipSyncComponent.Reset();
    CachedCurves.Reset();
@@ -27,19 +27,19 @@ void FAnimNode_ElevenLabsLipSync::Initialize_AnyThread(const FAnimationInitializ
    {
        if (AActor* Owner = SkelMesh->GetOwner())
        {
-           UElevenLabsLipSyncComponent* Comp =
-               Owner->FindComponentByClass<UElevenLabsLipSyncComponent>();
+           UPS_AI_Agent_LipSyncComponent* Comp =
+               Owner->FindComponentByClass<UPS_AI_Agent_LipSyncComponent>();
            if (Comp)
            {
                LipSyncComponent = Comp;
-               UE_LOG(LogElevenLabsAnimNode, Log,
-                   TEXT("ElevenLabs Lip Sync AnimNode bound to component on %s."),
+               UE_LOG(LogPS_AI_Agent_LipSyncAnimNode, Log,
+                   TEXT("PS AI Agent Lip Sync AnimNode bound to component on %s."),
                    *Owner->GetName());
            }
            else
            {
-               UE_LOG(LogElevenLabsAnimNode, Warning,
-                   TEXT("No ElevenLabsLipSyncComponent found on %s. "
+               UE_LOG(LogPS_AI_Agent_LipSyncAnimNode, Warning,
+                   TEXT("No PS_AI_Agent_LipSyncComponent found on %s. "
                        "Add the component alongside the Conversational Agent."),
                    *Owner->GetName());
            }
@@ -48,12 +48,12 @@ void FAnimNode_ElevenLabsLipSync::Initialize_AnyThread(const FAnimationInitializ
    }
}

-void FAnimNode_ElevenLabsLipSync::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
+void FAnimNode_PS_AI_Agent_LipSync::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
{
    BasePose.CacheBones(Context);
}

-void FAnimNode_ElevenLabsLipSync::Update_AnyThread(const FAnimationUpdateContext& Context)
+void FAnimNode_PS_AI_Agent_LipSync::Update_AnyThread(const FAnimationUpdateContext& Context)
{
    BasePose.Update(Context);

@@ -69,7 +69,7 @@ void FAnimNode_ElevenLabsLipSync::Update_AnyThread(const FAnimationUpdateContext
    }
}

-void FAnimNode_ElevenLabsLipSync::Evaluate_AnyThread(FPoseContext& Output)
+void FAnimNode_PS_AI_Agent_LipSync::Evaluate_AnyThread(FPoseContext& Output)
{
    // Evaluate the upstream pose (pass-through)
    BasePose.Evaluate(Output);
@@ -87,9 +87,9 @@ void FAnimNode_ElevenLabsLipSync::Evaluate_AnyThread(FPoseContext& Output)
    }
}

-void FAnimNode_ElevenLabsLipSync::GatherDebugData(FNodeDebugData& DebugData)
+void FAnimNode_PS_AI_Agent_LipSync::GatherDebugData(FNodeDebugData& DebugData)
{
-   FString DebugLine = FString::Printf(TEXT("ElevenLabs Lip Sync (%d curves)"), CachedCurves.Num());
+   FString DebugLine = FString::Printf(TEXT("PS AI Agent Lip Sync (%d curves)"), CachedCurves.Num());
    DebugData.AddDebugItem(DebugLine);
    BasePose.GatherDebugData(DebugData);
}
@@ -1,12 +1,12 @@
// Copyright ASTERION. All Rights Reserved.

-#include "AnimNode_ElevenLabsPosture.h"
-#include "ElevenLabsPostureComponent.h"
+#include "AnimNode_PS_AI_Agent_Posture.h"
+#include "PS_AI_Agent_PostureComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "Animation/AnimInstanceProxy.h"
#include "GameFramework/Actor.h"

-DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsPostureAnimNode, Log, All);
+DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_PostureAnimNode, Log, All);

// ─────────────────────────────────────────────────────────────────────────────
// ARKit → MetaHuman CTRL eye curve mapping.
@@ -77,7 +77,7 @@ static const TMap<FName, FName>& GetARKitToCTRLEyeMap()
// FAnimNode_Base interface
// ─────────────────────────────────────────────────────────────────────────────

-void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializeContext& Context)
+void FAnimNode_PS_AI_Agent_Posture::Initialize_AnyThread(const FAnimationInitializeContext& Context)
{
    BasePose.Initialize(Context);

@@ -103,20 +103,20 @@ void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializ
    {
        if (AActor* Owner = SkelMesh->GetOwner())
        {
-           UElevenLabsPostureComponent* Comp =
-               Owner->FindComponentByClass<UElevenLabsPostureComponent>();
+           UPS_AI_Agent_PostureComponent* Comp =
+               Owner->FindComponentByClass<UPS_AI_Agent_PostureComponent>();
            if (Comp)
            {
                PostureComponent = Comp;
                HeadBoneName = Comp->GetHeadBoneName();

                // Cache neck bone chain configuration
-               const TArray<FElevenLabsNeckBoneEntry>& Chain = Comp->GetNeckBoneChain();
+               const TArray<FPS_AI_Agent_NeckBoneEntry>& Chain = Comp->GetNeckBoneChain();
                if (Chain.Num() > 0)
                {
                    ChainBoneNames.Reserve(Chain.Num());
                    ChainBoneWeights.Reserve(Chain.Num());
-                   for (const FElevenLabsNeckBoneEntry& Entry : Chain)
+                   for (const FPS_AI_Agent_NeckBoneEntry& Entry : Chain)
                    {
                        if (!Entry.BoneName.IsNone() && Entry.Weight > 0.0f)
                        {
@@ -126,8 +126,8 @@ void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializ
                    }
                }

-               UE_LOG(LogElevenLabsPostureAnimNode, Log,
-                   TEXT("ElevenLabs Posture AnimNode: Owner=%s Mesh=%s Head=%s Eyes=%s Chain=%d HeadComp=%.1f EyeComp=%.1f DriftComp=%.1f"),
+               UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
+                   TEXT("PS AI Agent Posture AnimNode: Owner=%s Mesh=%s Head=%s Eyes=%s Chain=%d HeadComp=%.1f EyeComp=%.1f DriftComp=%.1f"),
                    *Owner->GetName(), *SkelMesh->GetName(),
                    bApplyHeadRotation ? TEXT("ON") : TEXT("OFF"),
                    bApplyEyeCurves ? TEXT("ON") : TEXT("OFF"),
@@ -138,8 +138,8 @@ void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializ
            }
            else
            {
-               UE_LOG(LogElevenLabsPostureAnimNode, Warning,
-                   TEXT("No ElevenLabsPostureComponent found on %s."),
+               UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
+                   TEXT("No PS_AI_Agent_PostureComponent found on %s."),
                    *Owner->GetName());
            }
        }
@@ -147,7 +147,7 @@ void FAnimNode_ElevenLabsPosture::Initialize_AnyThread(const FAnimationInitializ
    }
}

-void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
+void FAnimNode_PS_AI_Agent_Posture::CacheBones_AnyThread(const FAnimationCacheBonesContext& Context)
{
    BasePose.CacheBones(Context);

@@ -187,7 +187,7 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
                : FQuat::Identity;
            ChainRefPoseRotations.Add(RefRot);

-           UE_LOG(LogElevenLabsPostureAnimNode, Log,
+           UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
                TEXT(" Chain bone [%d] '%s' → index %d (weight=%.2f)"),
                i, *ChainBoneNames[i].ToString(), CompactIdx.GetInt(),
                ChainBoneWeights[i]);
@@ -196,7 +196,7 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
        {
            ChainBoneIndices.Add(FCompactPoseBoneIndex(INDEX_NONE));
            ChainRefPoseRotations.Add(FQuat::Identity);
-           UE_LOG(LogElevenLabsPostureAnimNode, Warning,
+           UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
                TEXT(" Chain bone [%d] '%s' NOT FOUND in skeleton!"),
                i, *ChainBoneNames[i].ToString());
        }
@@ -217,20 +217,20 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
            ? RefPose[MeshIndex].GetRotation()
            : FQuat::Identity;

-       UE_LOG(LogElevenLabsPostureAnimNode, Log,
+       UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
            TEXT("Head bone '%s' resolved to index %d."),
            *HeadBoneName.ToString(), HeadBoneIndex.GetInt());
    }
    else
    {
-       UE_LOG(LogElevenLabsPostureAnimNode, Warning,
+       UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
            TEXT("Head bone '%s' NOT FOUND in skeleton. Available bones:"),
            *HeadBoneName.ToString());

        const int32 NumBones = FMath::Min(RefSkeleton.GetNum(), 10);
        for (int32 i = 0; i < NumBones; ++i)
        {
-           UE_LOG(LogElevenLabsPostureAnimNode, Warning,
+           UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
                TEXT(" [%d] %s"), i, *RefSkeleton.GetBoneName(i).ToString());
        }
    }
@@ -258,13 +258,13 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
            ? RefPose[MeshIdx].GetRotation()
            : FQuat::Identity;

-       UE_LOG(LogElevenLabsPostureAnimNode, Log,
+       UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
            TEXT("Eye bone '%s' resolved to index %d."),
            *BoneName.ToString(), OutIndex.GetInt());
    }
    else
    {
-       UE_LOG(LogElevenLabsPostureAnimNode, Log,
+       UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
            TEXT("Eye bone '%s' not found in skeleton (OK for Body AnimBP)."),
            *BoneName.ToString());
    }
@@ -311,7 +311,7 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
        RefAccumAboveChain = RefAccumAboveChain * RefRot;
    }

-   UE_LOG(LogElevenLabsPostureAnimNode, Log,
+   UE_LOG(LogPS_AI_Agent_PostureAnimNode, Log,
        TEXT("Body drift: %d ancestor bones above chain. RefAccum=(%s)"),
        AncestorBoneIndices.Num(),
        *RefAccumAboveChain.Rotator().ToCompactString());
@@ -320,7 +320,7 @@ void FAnimNode_ElevenLabsPosture::CacheBones_AnyThread(const FAnimationCacheBone
    }
}

-void FAnimNode_ElevenLabsPosture::Update_AnyThread(const FAnimationUpdateContext& Context)
+void FAnimNode_PS_AI_Agent_Posture::Update_AnyThread(const FAnimationUpdateContext& Context)
{
    BasePose.Update(Context);

@@ -383,7 +383,7 @@ static FQuat ComputeCompensatedBoneRot(
    return (CompensatedContrib * RefPoseRot).GetNormalized();
}

-void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
+void FAnimNode_PS_AI_Agent_Posture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
{
|
||||
// Evaluate the upstream pose (pass-through)
|
||||
BasePose.Evaluate(Output);
|
||||
@ -398,7 +398,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
|
||||
const bool bHasEyeBones = (LeftEyeBoneIndex.GetInt() != INDEX_NONE);
|
||||
|
||||
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
|
||||
TEXT("[%s] Posture Evaluate: HeadComp=%.2f EyeComp=%.2f DriftComp=%.2f Valid=%s HeadRot=(%s) Eyes=%d Chain=%d Ancestors=%d"),
|
||||
NodeRole,
|
||||
CachedHeadCompensation,
|
||||
@ -420,7 +420,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
const FQuat Delta = AnimRot * RefRot.Inverse();
|
||||
const FRotator DeltaRot = Delta.Rotator();
|
||||
|
||||
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
|
||||
TEXT(" Chain[0] '%s' AnimDelta from RefPose: Y=%.2f P=%.2f R=%.2f (this gets removed at Comp=1)"),
|
||||
*ChainBoneNames[0].ToString(),
|
||||
DeltaRot.Yaw, DeltaRot.Pitch, DeltaRot.Roll);
|
||||
@ -454,7 +454,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
}
|
||||
if (++EyeDiagLogCounter % 300 == 1)
|
||||
{
|
||||
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
|
||||
TEXT("[EYE DIAG MODE 1] Forcing CTRL_expressions_eyeLookUpL=1.0 | Left eye should look UP if Control Rig reads CTRL curves"));
|
||||
}
|
||||
|
||||
@ -473,7 +473,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
}
|
||||
if (++EyeDiagLogCounter % 300 == 1)
|
||||
{
|
||||
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
|
||||
TEXT("[EYE DIAG MODE 2] Forcing ARKit eyeLookUpLeft=1.0 | Left eye should look UP if mh_arkit_mapping_pose drives eyes"));
|
||||
}
|
||||
|
||||
@ -494,7 +494,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
Output.Curve.Set(ZeroCTRL, 0.0f);
|
||||
if (++EyeDiagLogCounter % 300 == 1)
|
||||
{
|
||||
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
|
||||
TEXT("[EYE DIAG MODE 3] Forcing FACIAL_L_Eye bone -25° pitch | Left eye should look UP if bone rotation drives eyes"));
|
||||
}
|
||||
#endif
|
||||
@ -576,7 +576,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
#if !UE_BUILD_SHIPPING
|
||||
if (EvalDebugFrameCounter % 300 == 1 && CachedEyeCurves.Num() > 0)
|
||||
{
|
||||
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
|
||||
TEXT(" Eyes: Comp=%.2f → %s (anim weight=%.0f%%, posture weight=%.0f%%)"),
|
||||
Comp,
|
||||
Comp > 0.001f ? TEXT("BLEND") : TEXT("PASSTHROUGH"),
|
||||
@ -633,7 +633,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
|
||||
if (++DiagLogCounter % 90 == 0)
|
||||
{
|
||||
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
|
||||
TEXT("DIAG Phase %d: %s | Timer=%.1f"), Phase, PhaseName, DiagTimer);
|
||||
}
|
||||
}
|
||||
@ -695,7 +695,7 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
if (EvalDebugFrameCounter % 300 == 1)
|
||||
{
|
||||
const FRotator CorrRot = FullCorrection.Rotator();
|
||||
UE_LOG(LogElevenLabsPostureAnimNode, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_PostureAnimNode, Warning,
|
||||
TEXT(" DriftCorrection: Y=%.1f P=%.1f R=%.1f | Comp=%.2f"),
|
||||
CorrRot.Yaw, CorrRot.Pitch, CorrRot.Roll,
|
||||
CachedBodyDriftCompensation);
|
||||
@ -784,11 +784,11 @@ void FAnimNode_ElevenLabsPosture::Evaluate_AnyThread(FPoseContext& Output)
|
||||
#endif
|
||||
}
|
||||
|
||||
void FAnimNode_ElevenLabsPosture::GatherDebugData(FNodeDebugData& DebugData)
|
||||
void FAnimNode_PS_AI_Agent_Posture::GatherDebugData(FNodeDebugData& DebugData)
|
||||
{
|
||||
const FRotator DebugRot = CachedHeadRotation.Rotator();
|
||||
FString DebugLine = FString::Printf(
|
||||
TEXT("ElevenLabs Posture (eyes: %d, head: Y=%.1f P=%.1f, chain: %d, headComp: %.1f, eyeComp: %.1f, driftComp: %.1f)"),
|
||||
TEXT("PS AI Agent Posture (eyes: %d, head: Y=%.1f P=%.1f, chain: %d, headComp: %.1f, eyeComp: %.1f, driftComp: %.1f)"),
|
||||
CachedEyeCurves.Num(),
|
||||
DebugRot.Yaw, DebugRot.Pitch,
|
||||
ChainBoneIndices.Num(),
|
||||
@@ -1,19 +1,19 @@
// Copyright ASTERION. All Rights Reserved.

#include "ElevenLabsConversationalAgentComponent.h"
#include "ElevenLabsMicrophoneCaptureComponent.h"
#include "PS_AI_Agent_ElevenLabs.h"
#include "PS_AI_Agent_Conv_ElevenLabsComponent.h"
#include "PS_AI_Agent_MicrophoneCaptureComponent.h"
#include "PS_AI_ConvAgent.h"

#include "Components/AudioComponent.h"
#include "Sound/SoundWaveProcedural.h"
#include "GameFramework/Actor.h"

DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsAgent, Log, All);
DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_Conv_ElevenLabs, Log, All);

// ─────────────────────────────────────────────────────────────────────────────
// Constructor
// ─────────────────────────────────────────────────────────────────────────────
UElevenLabsConversationalAgentComponent::UElevenLabsConversationalAgentComponent()
UPS_AI_Agent_Conv_ElevenLabsComponent::UPS_AI_Agent_Conv_ElevenLabsComponent()
{
PrimaryComponentTick.bCanEverTick = true;
// Tick is used only to detect silence (agent stopped speaking).
@@ -24,19 +24,19 @@ UElevenLabsConversationalAgentComponent::UElevenLabsConversationalAgentComponent
// ─────────────────────────────────────────────────────────────────────────────
// Lifecycle
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::BeginPlay()
void UPS_AI_Agent_Conv_ElevenLabsComponent::BeginPlay()
{
Super::BeginPlay();
InitAudioPlayback();
}

void UElevenLabsConversationalAgentComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
void UPS_AI_Agent_Conv_ElevenLabsComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
EndConversation();
Super::EndPlay(EndPlayReason);
}

void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELevelTick TickType,
void UPS_AI_Agent_Conv_ElevenLabsComponent::TickComponent(float DeltaTime, ELevelTick TickType,
FActorComponentTickFunction* ThisTickFunction)
{
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
@@ -50,7 +50,7 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
{
bWaitingForAgentResponse = false;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Warning,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
TEXT("[T+%.2fs] [Turn %d] Response timeout — server did not start generating after %.1fs. Firing OnAgentResponseTimeout."),
T, LastClosedTurnIndex, WaitTime);
OnAgentResponseTimeout.Broadcast();
@@ -69,7 +69,7 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
bAgentGenerating = false;
GeneratingTickCount = 0;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Warning,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
TEXT("[T+%.2fs] [Turn %d] Generating timeout (10s) — server generated but no audio arrived. Clearing bAgentGenerating."),
T, LastClosedTurnIndex);
}
@@ -89,7 +89,7 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
{
bPreBuffering = false;
const double Tpb = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Pre-buffer timeout (%dms). Starting playback."),
Tpb, LastClosedTurnIndex, AudioPreBufferMs);
if (AudioPlaybackComponent && !AudioPlaybackComponent->IsPlaying())
@@ -146,7 +146,7 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
if (bHardTimeoutFired)
{
const double Tht = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Warning,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
TEXT("[T+%.2fs] [Turn %d] Agent silence hard-timeout (10s) without agent_response — declaring agent stopped."),
Tht, LastClosedTurnIndex);
}
@@ -157,31 +157,31 @@ void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELe
// ─────────────────────────────────────────────────────────────────────────────
// Control
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::StartConversation()
void UPS_AI_Agent_Conv_ElevenLabsComponent::StartConversation()
{
if (!WebSocketProxy)
{
WebSocketProxy = NewObject<UElevenLabsWebSocketProxy>(this);
WebSocketProxy = NewObject<UPS_AI_Agent_WebSocket_ElevenLabsProxy>(this);
WebSocketProxy->OnConnected.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleConnected);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleConnected);
WebSocketProxy->OnDisconnected.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleDisconnected);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleDisconnected);
WebSocketProxy->OnError.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleError);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleError);
WebSocketProxy->OnAudioReceived.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAudioReceived);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAudioReceived);
WebSocketProxy->OnTranscript.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleTranscript);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleTranscript);
WebSocketProxy->OnAgentResponse.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAgentResponse);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponse);
WebSocketProxy->OnInterrupted.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleInterrupted);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleInterrupted);
WebSocketProxy->OnAgentResponseStarted.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAgentResponseStarted);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponseStarted);
WebSocketProxy->OnAgentResponsePart.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAgentResponsePart);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponsePart);
WebSocketProxy->OnClientToolCall.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleClientToolCall);
&UPS_AI_Agent_Conv_ElevenLabsComponent::HandleClientToolCall);
}

// Pass configuration to the proxy before connecting.
@@ -191,7 +191,7 @@ void UElevenLabsConversationalAgentComponent::StartConversation()
WebSocketProxy->Connect(AgentID);
}

void UElevenLabsConversationalAgentComponent::EndConversation()
void UPS_AI_Agent_Conv_ElevenLabsComponent::EndConversation()
{
StopListening();
// ISSUE-4: StopListening() may set bWaitingForAgentResponse=true (normal turn end path).
@@ -207,11 +207,11 @@ void UElevenLabsConversationalAgentComponent::EndConversation()
}
}

void UElevenLabsConversationalAgentComponent::StartListening()
void UPS_AI_Agent_Conv_ElevenLabsComponent::StartListening()
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsAgent, Warning, TEXT("StartListening: not connected."));
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning, TEXT("StartListening: not connected."));
return;
}

@@ -230,14 +230,14 @@ void UElevenLabsConversationalAgentComponent::StartListening()
{
if (bAgentSpeaking && bAllowInterruption)
{
UE_LOG(LogElevenLabsAgent, Log, TEXT("StartListening: interrupting agent (speaking) to allow user to speak."));
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("StartListening: interrupting agent (speaking) to allow user to speak."));
InterruptAgent();
// InterruptAgent → StopAgentAudio clears bAgentSpeaking / bAgentGenerating,
// so we fall through and open the microphone immediately.
}
else
{
UE_LOG(LogElevenLabsAgent, Log, TEXT("StartListening ignored: agent is %s%s — will listen after agent finishes."),
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("StartListening ignored: agent is %s%s — will listen after agent finishes."),
bAgentGenerating ? TEXT("generating") : TEXT("speaking"),
(bAgentSpeaking && !bAllowInterruption) ? TEXT(" (interruption disabled)") : TEXT(""));
return;
@@ -248,19 +248,19 @@ void UElevenLabsConversationalAgentComponent::StartListening()
bIsListening = true;
TurnStartTime = FPlatformTime::Seconds();

if (TurnMode == EElevenLabsTurnMode::Client)
if (TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Client)
{
WebSocketProxy->SendUserTurnStart();
}

// Find the microphone component on our owner actor, or create one.
UElevenLabsMicrophoneCaptureComponent* Mic =
GetOwner()->FindComponentByClass<UElevenLabsMicrophoneCaptureComponent>();
UPS_AI_Agent_MicrophoneCaptureComponent* Mic =
GetOwner()->FindComponentByClass<UPS_AI_Agent_MicrophoneCaptureComponent>();

if (!Mic)
{
Mic = NewObject<UElevenLabsMicrophoneCaptureComponent>(GetOwner(),
TEXT("ElevenLabsMicrophone"));
Mic = NewObject<UPS_AI_Agent_MicrophoneCaptureComponent>(GetOwner(),
TEXT("PS_AI_Agent_Microphone"));
Mic->RegisterComponent();
}

@@ -268,13 +268,13 @@ void UElevenLabsConversationalAgentComponent::StartListening()
// up if StartListening is called more than once without a matching StopListening.
Mic->OnAudioCaptured.RemoveAll(this);
Mic->OnAudioCaptured.AddUObject(this,
&UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured);
&UPS_AI_Agent_Conv_ElevenLabsComponent::OnMicrophoneDataCaptured);
// Echo suppression: point the mic at our atomic bAgentSpeaking flag so it skips
// capture entirely (before resampling) while the agent is speaking.
// In Server VAD + interruption mode, disable echo suppression so the server
// receives the user's voice even during agent playback — the server's own VAD
// handles echo filtering and interruption detection.
if (TurnMode == EElevenLabsTurnMode::Server && bAllowInterruption)
if (TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Server && bAllowInterruption)
{
Mic->EchoSuppressFlag = nullptr;
}
@@ -285,16 +285,16 @@ void UElevenLabsConversationalAgentComponent::StartListening()
Mic->StartCapture();

const double T = TurnStartTime - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log, TEXT("[T+%.2fs] [Turn %d] Mic opened — user speaking."), T, TurnIndex);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("[T+%.2fs] [Turn %d] Mic opened — user speaking."), T, TurnIndex);
}

void UElevenLabsConversationalAgentComponent::StopListening()
void UPS_AI_Agent_Conv_ElevenLabsComponent::StopListening()
{
if (!bIsListening) return;
bIsListening = false;

if (UElevenLabsMicrophoneCaptureComponent* Mic =
GetOwner() ? GetOwner()->FindComponentByClass<UElevenLabsMicrophoneCaptureComponent>() : nullptr)
if (UPS_AI_Agent_MicrophoneCaptureComponent* Mic =
GetOwner() ? GetOwner()->FindComponentByClass<UPS_AI_Agent_MicrophoneCaptureComponent>() : nullptr)
{
Mic->StopCapture();
Mic->OnAudioCaptured.RemoveAll(this);
@@ -316,7 +316,7 @@ void UElevenLabsConversationalAgentComponent::StopListening()
{
if (MicAccumulationBuffer.Num() > 0)
{
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("StopListening: discarding %d bytes of accumulated mic audio (collision — server is mid-generation)."),
MicAccumulationBuffer.Num());
}
@@ -328,7 +328,7 @@ void UElevenLabsConversationalAgentComponent::StopListening()
MicAccumulationBuffer.Reset();
}

if (WebSocketProxy && TurnMode == EElevenLabsTurnMode::Client)
if (WebSocketProxy && TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Client)
{
WebSocketProxy->SendUserTurnEnd();
}
@@ -349,7 +349,7 @@ void UElevenLabsConversationalAgentComponent::StopListening()
const double TurnDuration = TurnStartTime > 0.0 ? TurnEndTime - TurnStartTime : 0.0;
if (bWaitingForAgentResponse)
{
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Mic closed — user spoke %.2fs. Waiting for server response (timeout %.0fs)..."),
T, TurnIndex, TurnDuration, ResponseTimeoutSeconds);
}
@@ -357,23 +357,23 @@ void UElevenLabsConversationalAgentComponent::StopListening()
{
// Collision avoidance: StopListening was called from HandleAgentResponseStarted
// while server was already generating — no need to wait or time out.
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Mic closed (collision avoidance) — user spoke %.2fs. Server is already generating Turn %d response."),
T, TurnIndex, TurnDuration, LastClosedTurnIndex);
}
}

void UElevenLabsConversationalAgentComponent::SendTextMessage(const FString& Text)
void UPS_AI_Agent_Conv_ElevenLabsComponent::SendTextMessage(const FString& Text)
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsAgent, Warning, TEXT("SendTextMessage: not connected. Call StartConversation() first."));
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning, TEXT("SendTextMessage: not connected. Call StartConversation() first."));
return;
}
WebSocketProxy->SendTextMessage(Text);
}

void UElevenLabsConversationalAgentComponent::InterruptAgent()
void UPS_AI_Agent_Conv_ElevenLabsComponent::InterruptAgent()
{
bWaitingForAgentResponse = false; // Interrupting — no response expected from previous turn.
if (WebSocketProxy) WebSocketProxy->SendInterrupt();
@@ -383,26 +383,26 @@ void UElevenLabsConversationalAgentComponent::InterruptAgent()
// ─────────────────────────────────────────────────────────────────────────────
// State queries
// ─────────────────────────────────────────────────────────────────────────────
bool UElevenLabsConversationalAgentComponent::IsConnected() const
bool UPS_AI_Agent_Conv_ElevenLabsComponent::IsConnected() const
{
return WebSocketProxy && WebSocketProxy->IsConnected();
}

const FElevenLabsConversationInfo& UElevenLabsConversationalAgentComponent::GetConversationInfo() const
const FPS_AI_Agent_ConversationInfo_ElevenLabs& UPS_AI_Agent_Conv_ElevenLabsComponent::GetConversationInfo() const
{
static FElevenLabsConversationInfo Empty;
static FPS_AI_Agent_ConversationInfo_ElevenLabs Empty;
return WebSocketProxy ? WebSocketProxy->GetConversationInfo() : Empty;
}

// ─────────────────────────────────────────────────────────────────────────────
// WebSocket event handlers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::HandleConnected(const FElevenLabsConversationInfo& Info)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleConnected(const FPS_AI_Agent_ConversationInfo_ElevenLabs& Info)
{
SessionStartTime = FPlatformTime::Seconds();
TurnIndex = 0;
LastClosedTurnIndex = 0;
UE_LOG(LogElevenLabsAgent, Log, TEXT("[T+0.00s] Agent connected. ConversationID=%s"), *Info.ConversationID);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("[T+0.00s] Agent connected. ConversationID=%s"), *Info.ConversationID);
OnAgentConnected.Broadcast(Info);

// In Client turn mode (push-to-talk), the user controls listening manually via
@@ -410,15 +410,15 @@ void UElevenLabsConversationalAgentComponent::HandleConnected(const FElevenLabsC
// permanently and interfere with push-to-talk — the T-release StopListening()
// would close the mic that auto-start opened, leaving the user unable to speak.
// Only auto-start in Server VAD mode where the mic stays open the whole session.
if (bAutoStartListening && TurnMode == EElevenLabsTurnMode::Server)
if (bAutoStartListening && TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Server)
{
StartListening();
}
}

void UElevenLabsConversationalAgentComponent::HandleDisconnected(int32 StatusCode, const FString& Reason)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleDisconnected(int32 StatusCode, const FString& Reason)
{
UE_LOG(LogElevenLabsAgent, Log, TEXT("Agent disconnected. Code=%d Reason=%s"), StatusCode, *Reason);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("Agent disconnected. Code=%d Reason=%s"), StatusCode, *Reason);

// ISSUE-13: stop audio playback and clear the queue if the WebSocket drops while the
// agent is speaking. Without this the audio component kept playing buffered PCM after
@@ -432,8 +432,8 @@ void UElevenLabsConversationalAgentComponent::HandleDisconnected(int32 StatusCod
GeneratingTickCount = 0;
TurnIndex = 0;
LastClosedTurnIndex = 0;
CurrentEmotion = EElevenLabsEmotion::Neutral;
CurrentEmotionIntensity = EElevenLabsEmotionIntensity::Medium;
CurrentEmotion = EPS_AI_Agent_Emotion::Neutral;
CurrentEmotionIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
{
FScopeLock Lock(&MicSendLock);
MicAccumulationBuffer.Reset();
@@ -441,13 +441,13 @@ void UElevenLabsConversationalAgentComponent::HandleDisconnected(int32 StatusCod
OnAgentDisconnected.Broadcast(StatusCode, Reason);
}

void UElevenLabsConversationalAgentComponent::HandleError(const FString& ErrorMessage)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleError(const FString& ErrorMessage)
{
UE_LOG(LogElevenLabsAgent, Error, TEXT("Agent error: %s"), *ErrorMessage);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Error, TEXT("Agent error: %s"), *ErrorMessage);
OnAgentError.Broadcast(ErrorMessage);
}

void UElevenLabsConversationalAgentComponent::HandleAudioReceived(const TArray<uint8>& PCMData)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAudioReceived(const TArray<uint8>& PCMData)
{
const double T = FPlatformTime::Seconds() - SessionStartTime;
const int32 NumSamples = PCMData.Num() / sizeof(int16);
@@ -457,7 +457,7 @@ void UElevenLabsConversationalAgentComponent::HandleAudioReceived(const TArray<u
FScopeLock Lock(&AudioQueueLock);
QueueBefore = AudioQueue.Num() / sizeof(int16);
}
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Audio chunk received: %d samples (%.0fms) | AudioQueue before: %d samples (%.0fms)"),
T, LastClosedTurnIndex, NumSamples, DurationMs,
QueueBefore, (static_cast<float>(QueueBefore) / 16000.0f) * 1000.0f);
@@ -467,7 +467,7 @@ void UElevenLabsConversationalAgentComponent::HandleAudioReceived(const TArray<u
OnAgentAudioData.Broadcast(PCMData);
}

void UElevenLabsConversationalAgentComponent::HandleTranscript(const FElevenLabsTranscriptSegment& Segment)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleTranscript(const FPS_AI_Agent_TranscriptSegment_ElevenLabs& Segment)
{
if (bEnableUserTranscript)
{
@@ -475,7 +475,7 @@ void UElevenLabsConversationalAgentComponent::HandleTranscript(const FElevenLabs
}
}

void UElevenLabsConversationalAgentComponent::HandleAgentResponse(const FString& ResponseText)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponse(const FString& ResponseText)
{
// The server sends agent_response when the full text response is complete.
// This is our reliable signal that no more TTS audio chunks will follow.
@@ -488,14 +488,14 @@ void UElevenLabsConversationalAgentComponent::HandleAgentResponse(const FString&
}
}

void UElevenLabsConversationalAgentComponent::HandleInterrupted()
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleInterrupted()
{
bWaitingForAgentResponse = false; // Interrupted — no response expected from previous turn.
StopAgentAudio();
OnAgentInterrupted.Broadcast();
}

void UElevenLabsConversationalAgentComponent::HandleAgentResponseStarted()
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponseStarted()
{
// The server has started generating a response (first agent_chat_response_part).
// Set bAgentGenerating BEFORE StopListening so that any StartListening call
@@ -512,29 +512,29 @@ void UElevenLabsConversationalAgentComponent::HandleAgentResponseStarted()
// In Server VAD + interruption mode, keep the mic open so the server can
// detect if the user speaks over the agent and send an interruption event.
// The server handles echo filtering and VAD — we just keep streaming audio.
if (TurnMode == EElevenLabsTurnMode::Server && bAllowInterruption)
if (TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Server && bAllowInterruption)
{
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Agent generating — mic stays open (Server VAD + interruption). (%.2fs after turn end)"),
T, LastClosedTurnIndex, LatencyFromTurnEnd);
}
else
{
// Collision: server generating while mic was open — stop mic without flushing.
UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Collision — mic was open, stopping. (%.2fs after turn end)"),
T, LastClosedTurnIndex, LatencyFromTurnEnd);
StopListening();
}
}

UE_LOG(LogElevenLabsAgent, Log,
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
TEXT("[T+%.2fs] [Turn %d] Agent generating. (%.2fs after turn end)"),
T, LastClosedTurnIndex, LatencyFromTurnEnd);
OnAgentStartedGenerating.Broadcast();
}

void UElevenLabsConversationalAgentComponent::HandleAgentResponsePart(const FString& PartialText)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleAgentResponsePart(const FString& PartialText)
{
if (bEnableAgentPartialResponse)
{
@@ -542,55 +542,55 @@ void UElevenLabsConversationalAgentComponent::HandleAgentResponsePart(const FStr
}
}

void UElevenLabsConversationalAgentComponent::HandleClientToolCall(const FElevenLabsClientToolCall& ToolCall)
void UPS_AI_Agent_Conv_ElevenLabsComponent::HandleClientToolCall(const FPS_AI_Agent_ClientToolCall_ElevenLabs& ToolCall)
{
// Built-in handler for the "set_emotion" tool: parse emotion + intensity, auto-respond, broadcast.
if (ToolCall.ToolName == TEXT("set_emotion"))
{
// Parse emotion
EElevenLabsEmotion NewEmotion = EElevenLabsEmotion::Neutral;
EPS_AI_Agent_Emotion NewEmotion = EPS_AI_Agent_Emotion::Neutral;
const FString* EmotionStr = ToolCall.Parameters.Find(TEXT("emotion"));
if (EmotionStr)
{
const FString Lower = EmotionStr->ToLower();
if (Lower == TEXT("joy") || Lower == TEXT("happy") || Lower == TEXT("happiness"))
NewEmotion = EElevenLabsEmotion::Joy;
NewEmotion = EPS_AI_Agent_Emotion::Joy;
else if (Lower == TEXT("sadness") || Lower == TEXT("sad"))
NewEmotion = EElevenLabsEmotion::Sadness;
NewEmotion = EPS_AI_Agent_Emotion::Sadness;
else if (Lower == TEXT("anger") || Lower == TEXT("angry"))
NewEmotion = EElevenLabsEmotion::Anger;
NewEmotion = EPS_AI_Agent_Emotion::Anger;
else if (Lower == TEXT("surprise") || Lower == TEXT("surprised"))
NewEmotion = EElevenLabsEmotion::Surprise;
NewEmotion = EPS_AI_Agent_Emotion::Surprise;
else if (Lower == TEXT("fear") || Lower == TEXT("afraid") || Lower == TEXT("scared"))
NewEmotion = EElevenLabsEmotion::Fear;
NewEmotion = EPS_AI_Agent_Emotion::Fear;
else if (Lower == TEXT("disgust") || Lower == TEXT("disgusted"))
NewEmotion = EElevenLabsEmotion::Disgust;
NewEmotion = EPS_AI_Agent_Emotion::Disgust;
else if (Lower == TEXT("neutral"))
NewEmotion = EElevenLabsEmotion::Neutral;
NewEmotion = EPS_AI_Agent_Emotion::Neutral;
else
UE_LOG(LogElevenLabsAgent, Warning, TEXT("Unknown emotion '%s', defaulting to Neutral."), **EmotionStr);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning, TEXT("Unknown emotion '%s', defaulting to Neutral."), **EmotionStr);
}

// Parse intensity (default: medium)
EElevenLabsEmotionIntensity NewIntensity = EElevenLabsEmotionIntensity::Medium;
EPS_AI_Agent_EmotionIntensity NewIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
const FString* IntensityStr = ToolCall.Parameters.Find(TEXT("intensity"));
if (IntensityStr)
{
const FString Lower = IntensityStr->ToLower();
if (Lower == TEXT("low") || Lower == TEXT("subtle") || Lower == TEXT("light"))
NewIntensity = EElevenLabsEmotionIntensity::Low;
NewIntensity = EPS_AI_Agent_EmotionIntensity::Low;
else if (Lower == TEXT("medium") || Lower == TEXT("moderate") || Lower == TEXT("normal"))
NewIntensity = EElevenLabsEmotionIntensity::Medium;
NewIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
else if (Lower == TEXT("high") || Lower == TEXT("strong") || Lower == TEXT("extreme") || Lower == TEXT("intense"))
NewIntensity = EElevenLabsEmotionIntensity::High;
NewIntensity = EPS_AI_Agent_EmotionIntensity::High;
else
UE_LOG(LogElevenLabsAgent, Warning, TEXT("Unknown intensity '%s', defaulting to Medium."), **IntensityStr);
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning, TEXT("Unknown intensity '%s', defaulting to Medium."), **IntensityStr);
}

CurrentEmotion = NewEmotion;
CurrentEmotionIntensity = NewIntensity;
const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsAgent, Log, TEXT("[T+%.2fs] Agent emotion changed to: %s (%s)"),
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log, TEXT("[T+%.2fs] Agent emotion changed to: %s (%s)"),
T, *UEnum::GetValueAsString(NewEmotion), *UEnum::GetValueAsString(NewIntensity));
|
||||
|
||||
OnAgentEmotionChanged.Broadcast(NewEmotion, NewIntensity);
|
||||
@ -616,21 +616,21 @@ void UElevenLabsConversationalAgentComponent::HandleClientToolCall(const FEleven
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
// Audio playback
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
void UElevenLabsConversationalAgentComponent::InitAudioPlayback()
|
||||
void UPS_AI_Agent_Conv_ElevenLabsComponent::InitAudioPlayback()
|
||||
{
|
||||
AActor* Owner = GetOwner();
|
||||
if (!Owner) return;
|
||||
|
||||
// USoundWaveProcedural lets us push raw PCM data at runtime.
|
||||
ProceduralSoundWave = NewObject<USoundWaveProcedural>(this);
|
||||
ProceduralSoundWave->SetSampleRate(ElevenLabsAudio::SampleRate);
|
||||
ProceduralSoundWave->NumChannels = ElevenLabsAudio::Channels;
|
||||
ProceduralSoundWave->SetSampleRate(PS_AI_Agent_Audio_ElevenLabs::SampleRate);
|
||||
ProceduralSoundWave->NumChannels = PS_AI_Agent_Audio_ElevenLabs::Channels;
|
||||
ProceduralSoundWave->Duration = INDEFINITELY_LOOPING_DURATION;
|
||||
ProceduralSoundWave->SoundGroup = SOUNDGROUP_Voice;
|
||||
ProceduralSoundWave->bLooping = false;
|
||||
|
||||
// Create the audio component attached to the owner.
|
||||
AudioPlaybackComponent = NewObject<UAudioComponent>(Owner, TEXT("ElevenLabsAudioPlayback"));
|
||||
AudioPlaybackComponent = NewObject<UAudioComponent>(Owner, TEXT("PS_AI_Agent_Audio_ElevenLabsPlayback"));
|
||||
AudioPlaybackComponent->RegisterComponent();
|
||||
AudioPlaybackComponent->bAutoActivate = false;
|
||||
AudioPlaybackComponent->SetSound(ProceduralSoundWave);
|
||||
@ -638,10 +638,10 @@ void UElevenLabsConversationalAgentComponent::InitAudioPlayback()
|
||||
// When the procedural sound wave needs more audio data, pull from our queue.
|
||||
ProceduralSoundWave->OnSoundWaveProceduralUnderflow =
|
||||
FOnSoundWaveProceduralUnderflow::CreateUObject(
|
||||
this, &UElevenLabsConversationalAgentComponent::OnProceduralUnderflow);
|
||||
this, &UPS_AI_Agent_Conv_ElevenLabsComponent::OnProceduralUnderflow);
|
||||
}
|
||||
|
||||
void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
|
||||
void UPS_AI_Agent_Conv_ElevenLabsComponent::OnProceduralUnderflow(
|
||||
USoundWaveProcedural* InProceduralWave, const int32 SamplesRequired)
|
||||
{
|
||||
FScopeLock Lock(&AudioQueueLock);
|
||||
@ -672,7 +672,7 @@ void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
|
||||
{
|
||||
bQueueWasDry = false;
|
||||
const double T = FPlatformTime::Seconds() - SessionStartTime;
|
||||
UE_LOG(LogElevenLabsAgent, Log,
|
||||
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
|
||||
TEXT("[T+%.2fs] [Turn %d] AudioQueue recovered — feeding real data again (%d bytes remaining)."),
|
||||
T, LastClosedTurnIndex, AudioQueue.Num());
|
||||
}
|
||||
@ -684,7 +684,7 @@ void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
|
||||
{
|
||||
bQueueWasDry = true;
|
||||
const double T = FPlatformTime::Seconds() - SessionStartTime;
|
||||
UE_LOG(LogElevenLabsAgent, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
|
||||
TEXT("[T+%.2fs] [Turn %d] AudioQueue DRY — waiting for next TTS chunk (requested %d samples)."),
|
||||
T, LastClosedTurnIndex, SamplesRequired);
|
||||
}
|
||||
@ -702,7 +702,7 @@ void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
|
||||
}
|
||||
}
|
||||
|
||||
void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uint8>& PCMData)
|
||||
void UPS_AI_Agent_Conv_ElevenLabsComponent::EnqueueAgentAudio(const TArray<uint8>& PCMData)
|
||||
{
|
||||
{
|
||||
FScopeLock Lock(&AudioQueueLock);
|
||||
@ -721,7 +721,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
|
||||
|
||||
const double T = AgentSpeakStart - SessionStartTime;
|
||||
const double LatencyFromTurnEnd = TurnEndTime > 0.0 ? AgentSpeakStart - TurnEndTime : 0.0;
|
||||
UE_LOG(LogElevenLabsAgent, Log,
|
||||
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
|
||||
TEXT("[T+%.2fs] [Turn %d] Agent speaking — first audio chunk. (%.2fs after turn end)"),
|
||||
T, LastClosedTurnIndex, LatencyFromTurnEnd);
|
||||
|
||||
@ -735,7 +735,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
|
||||
bPreBuffering = true;
|
||||
PreBufferStartTime = FPlatformTime::Seconds();
|
||||
const double Tpb2 = FPlatformTime::Seconds() - SessionStartTime;
|
||||
UE_LOG(LogElevenLabsAgent, Log,
|
||||
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
|
||||
TEXT("[T+%.2fs] [Turn %d] Pre-buffering %dms before starting playback."),
|
||||
Tpb2, LastClosedTurnIndex, AudioPreBufferMs);
|
||||
}
|
||||
@ -752,7 +752,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
|
||||
const double NowPb = FPlatformTime::Seconds();
|
||||
const double BufferedMs = (NowPb - PreBufferStartTime) * 1000.0;
|
||||
const double Tpb3 = NowPb - SessionStartTime;
|
||||
UE_LOG(LogElevenLabsAgent, Log,
|
||||
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
|
||||
TEXT("[T+%.2fs] [Turn %d] Pre-buffer: second chunk arrived (%.0fms buffered). Starting playback."),
|
||||
Tpb3, LastClosedTurnIndex, BufferedMs);
|
||||
if (AudioPlaybackComponent && !AudioPlaybackComponent->IsPlaying())
|
||||
@ -768,7 +768,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
|
||||
if (AudioPlaybackComponent && !AudioPlaybackComponent->IsPlaying())
|
||||
{
|
||||
const double Tbr = FPlatformTime::Seconds() - SessionStartTime;
|
||||
UE_LOG(LogElevenLabsAgent, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Warning,
|
||||
TEXT("[T+%.2fs] [Turn %d] Audio component stopped during speech (buffer underrun). Restarting playback."),
|
||||
Tbr, LastClosedTurnIndex);
|
||||
AudioPlaybackComponent->Play();
|
||||
@ -778,7 +778,7 @@ void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uin
|
||||
}
|
||||
}
|
||||
|
||||
void UElevenLabsConversationalAgentComponent::StopAgentAudio()
|
||||
void UPS_AI_Agent_Conv_ElevenLabsComponent::StopAgentAudio()
|
||||
{
|
||||
// Flush the ProceduralSoundWave's internal buffer BEFORE stopping.
|
||||
// QueueAudio() pushes data into the wave's internal ring buffer during
|
||||
@ -824,7 +824,7 @@ void UElevenLabsConversationalAgentComponent::StopAgentAudio()
|
||||
const double T = Now - SessionStartTime;
|
||||
const double AgentSpokeDuration = AgentSpeakStart > 0.0 ? Now - AgentSpeakStart : 0.0;
|
||||
const double TotalTurnDuration = TurnEndTime > 0.0 ? Now - TurnEndTime : 0.0;
|
||||
UE_LOG(LogElevenLabsAgent, Log,
|
||||
UE_LOG(LogPS_AI_Agent_Conv_ElevenLabs, Log,
|
||||
TEXT("[T+%.2fs] [Turn %d] Agent stopped speaking (spoke %.2fs, full turn round-trip %.2fs)."),
|
||||
T, LastClosedTurnIndex, AgentSpokeDuration, TotalTurnDuration);
|
||||
|
||||
@ -835,7 +835,7 @@ void UElevenLabsConversationalAgentComponent::StopAgentAudio()
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
// Microphone → WebSocket
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
void UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured(const TArray<float>& FloatPCM)
|
||||
void UPS_AI_Agent_Conv_ElevenLabsComponent::OnMicrophoneDataCaptured(const TArray<float>& FloatPCM)
|
||||
{
|
||||
if (!IsConnected() || !bIsListening) return;
|
||||
|
||||
@ -844,7 +844,7 @@ void UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured(const TAr
|
||||
// which would confuse the server's VAD and STT.
|
||||
// In Server VAD + interruption mode, keep sending audio so the server can
|
||||
// detect the user speaking over the agent and trigger an interruption.
|
||||
if (bAgentSpeaking && !(TurnMode == EElevenLabsTurnMode::Server && bAllowInterruption))
|
||||
if (bAgentSpeaking && !(TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Server && bAllowInterruption))
|
||||
{
|
||||
return;
|
||||
}
|
||||
@ -867,7 +867,7 @@ void UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured(const TAr
|
||||
}
|
||||
}
|
||||
|
||||
TArray<uint8> UElevenLabsConversationalAgentComponent::FloatPCMToInt16Bytes(const TArray<float>& FloatPCM)
|
||||
TArray<uint8> UPS_AI_Agent_Conv_ElevenLabsComponent::FloatPCMToInt16Bytes(const TArray<float>& FloatPCM)
|
||||
{
|
||||
TArray<uint8> Out;
|
||||
Out.Reserve(FloatPCM.Num() * 2);
|
||||
@@ -1,18 +1,18 @@
 // Copyright ASTERION. All Rights Reserved.

-#include "ElevenLabsFacialExpressionComponent.h"
-#include "ElevenLabsConversationalAgentComponent.h"
-#include "ElevenLabsEmotionPoseMap.h"
+#include "PS_AI_Agent_FacialExpressionComponent.h"
+#include "PS_AI_Agent_Conv_ElevenLabsComponent.h"
+#include "PS_AI_Agent_EmotionPoseMap.h"
 #include "Animation/AnimSequence.h"
 #include "Animation/AnimData/IAnimationDataModel.h"

-DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsFacialExpr, Log, All);
+DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_FacialExpr, Log, All);

 // ─────────────────────────────────────────────────────────────────────────────
 // Construction
 // ─────────────────────────────────────────────────────────────────────────────

-UElevenLabsFacialExpressionComponent::UElevenLabsFacialExpressionComponent()
+UPS_AI_Agent_FacialExpressionComponent::UPS_AI_Agent_FacialExpressionComponent()
 {
 PrimaryComponentTick.bCanEverTick = true;
 PrimaryComponentTick.TickGroup = TG_PrePhysics;
@@ -22,32 +22,32 @@ UElevenLabsFacialExpressionComponent::UElevenLabsFacialExpressionComponent()
 // BeginPlay / EndPlay
 // ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsFacialExpressionComponent::BeginPlay()
+void UPS_AI_Agent_FacialExpressionComponent::BeginPlay()
 {
 Super::BeginPlay();

 AActor* Owner = GetOwner();
 if (!Owner)
 {
-UE_LOG(LogElevenLabsFacialExpr, Warning, TEXT("No owner actor — facial expressions disabled."));
+UE_LOG(LogPS_AI_Agent_FacialExpr, Warning, TEXT("No owner actor — facial expressions disabled."));
 return;
 }

 // Find and bind to agent component
-auto* Agent = Owner->FindComponentByClass<UElevenLabsConversationalAgentComponent>();
+auto* Agent = Owner->FindComponentByClass<UPS_AI_Agent_Conv_ElevenLabsComponent>();
 if (Agent)
 {
 AgentComponent = Agent;
 Agent->OnAgentEmotionChanged.AddDynamic(
-this, &UElevenLabsFacialExpressionComponent::OnEmotionChanged);
+this, &UPS_AI_Agent_FacialExpressionComponent::OnEmotionChanged);

-UE_LOG(LogElevenLabsFacialExpr, Log,
+UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
 TEXT("Facial expression bound to agent component on %s."), *Owner->GetName());
 }
 else
 {
-UE_LOG(LogElevenLabsFacialExpr, Warning,
-TEXT("No ElevenLabsConversationalAgentComponent found on %s — "
+UE_LOG(LogPS_AI_Agent_FacialExpr, Warning,
+TEXT("No PS_AI_Agent_Conv_ElevenLabsComponent found on %s — "
 "facial expression will not respond to emotion changes."),
 *Owner->GetName());
 }
@@ -63,17 +63,17 @@ void UElevenLabsFacialExpressionComponent::BeginPlay()
 {
 ActivePlaybackTime = 0.0f;
 CrossfadeAlpha = 1.0f; // No crossfade needed on startup
-UE_LOG(LogElevenLabsFacialExpr, Log,
+UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
 TEXT("Auto-started default emotion anim: %s"), *ActiveAnim->GetName());
 }
 }

-void UElevenLabsFacialExpressionComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
+void UPS_AI_Agent_FacialExpressionComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
 {
 if (AgentComponent.IsValid())
 {
 AgentComponent->OnAgentEmotionChanged.RemoveDynamic(
-this, &UElevenLabsFacialExpressionComponent::OnEmotionChanged);
+this, &UPS_AI_Agent_FacialExpressionComponent::OnEmotionChanged);
 }

 Super::EndPlay(EndPlayReason);
@@ -83,11 +83,11 @@ void UElevenLabsFacialExpressionComponent::EndPlay(const EEndPlayReason::Type En
 // Validation
 // ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsFacialExpressionComponent::ValidateEmotionPoses()
+void UPS_AI_Agent_FacialExpressionComponent::ValidateEmotionPoses()
 {
 if (!EmotionPoseMap || EmotionPoseMap->EmotionPoses.Num() == 0)
 {
-UE_LOG(LogElevenLabsFacialExpr, Log,
+UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
 TEXT("No emotion poses assigned in EmotionPoseMap — facial expressions disabled."));
 return;
 }
@@ -95,13 +95,13 @@ void UElevenLabsFacialExpressionComponent::ValidateEmotionPoses()
 int32 AnimCount = 0;
 for (const auto& EmotionPair : EmotionPoseMap->EmotionPoses)
 {
-const FElevenLabsEmotionPoseSet& PoseSet = EmotionPair.Value;
+const FPS_AI_Agent_EmotionPoseSet& PoseSet = EmotionPair.Value;
 if (PoseSet.Normal) ++AnimCount;
 if (PoseSet.Medium) ++AnimCount;
 if (PoseSet.Extreme) ++AnimCount;
 }

-UE_LOG(LogElevenLabsFacialExpr, Log,
+UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
 TEXT("=== Emotion poses: %d emotions, %d anim slots available ==="),
 EmotionPoseMap->EmotionPoses.Num(), AnimCount);
 }
@@ -110,21 +110,21 @@ void UElevenLabsFacialExpressionComponent::ValidateEmotionPoses()
 // Find AnimSequence for emotion + intensity (with fallback)
 // ─────────────────────────────────────────────────────────────────────────────

-UAnimSequence* UElevenLabsFacialExpressionComponent::FindAnimForEmotion(
-EElevenLabsEmotion Emotion, EElevenLabsEmotionIntensity Intensity) const
+UAnimSequence* UPS_AI_Agent_FacialExpressionComponent::FindAnimForEmotion(
+EPS_AI_Agent_Emotion Emotion, EPS_AI_Agent_EmotionIntensity Intensity) const
 {
 if (!EmotionPoseMap) return nullptr;

-const FElevenLabsEmotionPoseSet* PoseSet = EmotionPoseMap->EmotionPoses.Find(Emotion);
+const FPS_AI_Agent_EmotionPoseSet* PoseSet = EmotionPoseMap->EmotionPoses.Find(Emotion);
 if (!PoseSet) return nullptr;

 // Direct match
 UAnimSequence* Anim = nullptr;
 switch (Intensity)
 {
-case EElevenLabsEmotionIntensity::Low: Anim = PoseSet->Normal; break;
-case EElevenLabsEmotionIntensity::Medium: Anim = PoseSet->Medium; break;
-case EElevenLabsEmotionIntensity::High: Anim = PoseSet->Extreme; break;
+case EPS_AI_Agent_EmotionIntensity::Low: Anim = PoseSet->Normal; break;
+case EPS_AI_Agent_EmotionIntensity::Medium: Anim = PoseSet->Medium; break;
+case EPS_AI_Agent_EmotionIntensity::High: Anim = PoseSet->Extreme; break;
 }

 if (Anim) return Anim;
@@ -141,7 +141,7 @@ UAnimSequence* UElevenLabsFacialExpressionComponent::FindAnimForEmotion(
 // Evaluate all FloatCurves from an AnimSequence at a given time
 // ─────────────────────────────────────────────────────────────────────────────

-TMap<FName, float> UElevenLabsFacialExpressionComponent::EvaluateAnimCurves(
+TMap<FName, float> UPS_AI_Agent_FacialExpressionComponent::EvaluateAnimCurves(
 UAnimSequence* AnimSeq, float Time) const
 {
 TMap<FName, float> CurveValues;
@@ -167,8 +167,8 @@ TMap<FName, float> UElevenLabsFacialExpressionComponent::EvaluateAnimCurves(
 // Emotion change handler
 // ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsFacialExpressionComponent::OnEmotionChanged(
-EElevenLabsEmotion Emotion, EElevenLabsEmotionIntensity Intensity)
+void UPS_AI_Agent_FacialExpressionComponent::OnEmotionChanged(
+EPS_AI_Agent_Emotion Emotion, EPS_AI_Agent_EmotionIntensity Intensity)
 {
 if (Emotion == ActiveEmotion && Intensity == ActiveEmotionIntensity)
 return; // No change
@@ -190,7 +190,7 @@ void UElevenLabsFacialExpressionComponent::OnEmotionChanged(
 // Begin crossfade
 CrossfadeAlpha = 0.0f;

-UE_LOG(LogElevenLabsFacialExpr, Log,
+UE_LOG(LogPS_AI_Agent_FacialExpr, Log,
 TEXT("Emotion changed: %s (%s) — anim: %s, crossfading over %.1fs..."),
 *UEnum::GetValueAsString(Emotion), *UEnum::GetValueAsString(Intensity),
 NewAnim ? *NewAnim->GetName() : TEXT("(none)"), EmotionBlendDuration);
@@ -200,7 +200,7 @@ void UElevenLabsFacialExpressionComponent::OnEmotionChanged(
 // Tick — play emotion animation and crossfade
 // ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsFacialExpressionComponent::TickComponent(
+void UPS_AI_Agent_FacialExpressionComponent::TickComponent(
 float DeltaTime, ELevelTick TickType, FActorComponentTickFunction* ThisTickFunction)
 {
 Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
@@ -286,7 +286,7 @@ void UElevenLabsFacialExpressionComponent::TickComponent(
 // Mouth curve classification
 // ─────────────────────────────────────────────────────────────────────────────

-bool UElevenLabsFacialExpressionComponent::IsMouthCurve(const FName& CurveName)
+bool UPS_AI_Agent_FacialExpressionComponent::IsMouthCurve(const FName& CurveName)
 {
 const FString Name = CurveName.ToString().ToLower();
 return Name.Contains(TEXT("jaw"))
@@ -1,9 +1,9 @@
 // Copyright ASTERION. All Rights Reserved.

-#include "ElevenLabsLipSyncComponent.h"
-#include "ElevenLabsFacialExpressionComponent.h"
-#include "ElevenLabsLipSyncPoseMap.h"
-#include "ElevenLabsConversationalAgentComponent.h"
+#include "PS_AI_Agent_LipSyncComponent.h"
+#include "PS_AI_Agent_FacialExpressionComponent.h"
+#include "PS_AI_Agent_LipSyncPoseMap.h"
+#include "PS_AI_Agent_Conv_ElevenLabsComponent.h"
 #include "Components/SkeletalMeshComponent.h"
 #include "Engine/SkeletalMesh.h"
 #include "Animation/MorphTarget.h"
@@ -11,13 +11,13 @@
 #include "Animation/AnimData/IAnimationDataModel.h"
 #include "GameFramework/Actor.h"

-DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsLipSync, Log, All);
+DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_LipSync, Log, All);

 // ─────────────────────────────────────────────────────────────────────────────
 // Static data
 // ─────────────────────────────────────────────────────────────────────────────

-const TArray<FName> UElevenLabsLipSyncComponent::VisemeNames = {
+const TArray<FName> UPS_AI_Agent_LipSyncComponent::VisemeNames = {
 FName("sil"), FName("PP"), FName("FF"), FName("TH"), FName("DD"),
 FName("kk"), FName("CH"), FName("SS"), FName("nn"), FName("RR"),
 FName("aa"), FName("E"), FName("ih"), FName("oh"), FName("ou")
@@ -43,7 +43,7 @@ static const TSet<FName> ExpressionCurveNames = {
 // OVR Viseme → ARKit blendshape mapping.
 // Each viseme activates a combination of ARKit morph targets with specific weights.
 // These values are tuned for MetaHuman faces and can be adjusted per project.
-TMap<FName, TMap<FName, float>> UElevenLabsLipSyncComponent::CreateVisemeToBlendshapeMap()
+TMap<FName, TMap<FName, float>> UPS_AI_Agent_LipSyncComponent::CreateVisemeToBlendshapeMap()
 {
 TMap<FName, TMap<FName, float>> Map;

@@ -189,14 +189,14 @@ TMap<FName, TMap<FName, float>> UElevenLabsLipSyncComponent::CreateVisemeToBlend
 return Map;
 }

-const TMap<FName, TMap<FName, float>> UElevenLabsLipSyncComponent::VisemeToBlendshapeMap =
-UElevenLabsLipSyncComponent::CreateVisemeToBlendshapeMap();
+const TMap<FName, TMap<FName, float>> UPS_AI_Agent_LipSyncComponent::VisemeToBlendshapeMap =
+UPS_AI_Agent_LipSyncComponent::CreateVisemeToBlendshapeMap();

 // ─────────────────────────────────────────────────────────────────────────────
 // Constructor / Destructor
 // ─────────────────────────────────────────────────────────────────────────────

-UElevenLabsLipSyncComponent::UElevenLabsLipSyncComponent()
+UPS_AI_Agent_LipSyncComponent::UPS_AI_Agent_LipSyncComponent()
 {
 PrimaryComponentTick.bCanEverTick = true;
 PrimaryComponentTick.TickInterval = 1.0f / 60.0f; // 60 fps for smooth animation
@@ -211,13 +211,13 @@ UElevenLabsLipSyncComponent::UElevenLabsLipSyncComponent()
 SmoothedVisemes.FindOrAdd(FName("sil")) = 1.0f;
 }

-UElevenLabsLipSyncComponent::~UElevenLabsLipSyncComponent() = default;
+UPS_AI_Agent_LipSyncComponent::~UPS_AI_Agent_LipSyncComponent() = default;

 // ─────────────────────────────────────────────────────────────────────────────
 // Lifecycle
 // ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsLipSyncComponent::BeginPlay()
+void UPS_AI_Agent_LipSyncComponent::BeginPlay()
 {
 Super::BeginPlay();

@@ -226,54 +226,54 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 Settings.FFTSize = Audio::FSpectrumAnalyzerSettings::EFFTSize::Medium_512;
 Settings.WindowType = Audio::EWindowType::Hann;
 SpectrumAnalyzer = MakeUnique<Audio::FSpectrumAnalyzer>(
-Settings, static_cast<float>(ElevenLabsAudio::SampleRate));
+Settings, static_cast<float>(PS_AI_Agent_Audio_ElevenLabs::SampleRate));

 // Auto-discover the agent component on the same actor
 AActor* Owner = GetOwner();
 if (!Owner) return;

-UElevenLabsConversationalAgentComponent* Agent =
-Owner->FindComponentByClass<UElevenLabsConversationalAgentComponent>();
+UPS_AI_Agent_Conv_ElevenLabsComponent* Agent =
+Owner->FindComponentByClass<UPS_AI_Agent_Conv_ElevenLabsComponent>();

 if (Agent)
 {
 AgentComponent = Agent;
 AudioDataHandle = Agent->OnAgentAudioData.AddUObject(
-this, &UElevenLabsLipSyncComponent::OnAudioChunkReceived);
+this, &UPS_AI_Agent_LipSyncComponent::OnAudioChunkReceived);

 // Bind to text response delegates for text-driven lip sync.
 // Partial text (streaming) provides text BEFORE audio arrives.
 // Full text provides the complete sentence (arrives just after audio).
 Agent->OnAgentPartialResponse.AddDynamic(
-this, &UElevenLabsLipSyncComponent::OnPartialTextReceived);
+this, &UPS_AI_Agent_LipSyncComponent::OnPartialTextReceived);
 Agent->OnAgentTextResponse.AddDynamic(
-this, &UElevenLabsLipSyncComponent::OnTextResponseReceived);
+this, &UPS_AI_Agent_LipSyncComponent::OnTextResponseReceived);

 // Bind to interruption/stop events so lip sync resets immediately
 // when the agent is cut off or finishes speaking.
 Agent->OnAgentInterrupted.AddDynamic(
-this, &UElevenLabsLipSyncComponent::OnAgentInterrupted);
+this, &UPS_AI_Agent_LipSyncComponent::OnAgentInterrupted);
 Agent->OnAgentStoppedSpeaking.AddDynamic(
-this, &UElevenLabsLipSyncComponent::OnAgentStopped);
+this, &UPS_AI_Agent_LipSyncComponent::OnAgentStopped);

 // Enable partial response streaming if not already enabled
 Agent->bEnableAgentPartialResponse = true;

-UE_LOG(LogElevenLabsLipSync, Log,
+UE_LOG(LogPS_AI_Agent_LipSync, Log,
 TEXT("Lip sync bound to agent component on %s (audio + text + interruption)."), *Owner->GetName());
 }
 else
 {
-UE_LOG(LogElevenLabsLipSync, Warning,
-TEXT("No ElevenLabsConversationalAgentComponent found on %s. Lip sync will not work."),
+UE_LOG(LogPS_AI_Agent_LipSync, Warning,
+TEXT("No PS_AI_Agent_Conv_ElevenLabsComponent found on %s. Lip sync will not work."),
 *Owner->GetName());
 }

 // Cache the facial expression component for emotion-aware blending.
-CachedFacialExprComp = Owner->FindComponentByClass<UElevenLabsFacialExpressionComponent>();
+CachedFacialExprComp = Owner->FindComponentByClass<UPS_AI_Agent_FacialExpressionComponent>();
 if (CachedFacialExprComp.IsValid())
 {
-UE_LOG(LogElevenLabsLipSync, Log,
+UE_LOG(LogPS_AI_Agent_LipSync, Log,
 TEXT("Facial expression component found — emotion blending enabled (blend=%.2f)."),
 EmotionExpressionBlend);
 }
@@ -286,7 +286,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 Owner->GetComponents<USkeletalMeshComponent>(SkeletalMeshes);

 // Log all found skeletal mesh components for debugging
-UE_LOG(LogElevenLabsLipSync, Log, TEXT("Found %d SkeletalMeshComponent(s) on %s:"),
+UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Found %d SkeletalMeshComponent(s) on %s:"),
 SkeletalMeshes.Num(), *Owner->GetName());
 for (const USkeletalMeshComponent* Mesh : SkeletalMeshes)
 {
@@ -297,7 +297,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 {
 MorphCount = Mesh->GetSkeletalMeshAsset()->GetMorphTargets().Num();
 }
-UE_LOG(LogElevenLabsLipSync, Log, TEXT(" - '%s' (%d morph targets)"),
+UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT(" - '%s' (%d morph targets)"),
 *Mesh->GetName(), MorphCount);
 }
 }
@@ -308,7 +308,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 if (Mesh && Mesh->GetFName().ToString().Contains(TEXT("Face")))
 {
 TargetMesh = Mesh;
-UE_LOG(LogElevenLabsLipSync, Log, TEXT("Auto-detected face mesh by name: %s"), *Mesh->GetName());
+UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Auto-detected face mesh by name: %s"), *Mesh->GetName());
 break;
 }
 }
@@ -322,7 +322,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 && Mesh->GetSkeletalMeshAsset()->GetMorphTargets().Num() > 0)
 {
 TargetMesh = Mesh;
-UE_LOG(LogElevenLabsLipSync, Log, TEXT("Auto-detected face mesh by morph targets: %s (%d morphs)"),
+UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Auto-detected face mesh by morph targets: %s (%d morphs)"),
 *Mesh->GetName(), Mesh->GetSkeletalMeshAsset()->GetMorphTargets().Num());
 break;
 }
@@ -337,7 +337,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 if (Mesh)
 {
 TargetMesh = Mesh;
-UE_LOG(LogElevenLabsLipSync, Warning,
+UE_LOG(LogPS_AI_Agent_LipSync, Warning,
 TEXT("No mesh with morph targets found. Using fallback: %s (lip sync may not work)"),
 *Mesh->GetName());
 break;
@@ -347,7 +347,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()

 if (!TargetMesh)
 {
-UE_LOG(LogElevenLabsLipSync, Warning,
+UE_LOG(LogPS_AI_Agent_LipSync, Warning,
 TEXT("No SkeletalMeshComponent found on %s. Set TargetMesh manually or use GetCurrentBlendshapes() in Blueprint."),
 *Owner->GetName());
 }
@@ -357,7 +357,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 if (TargetMesh && TargetMesh->GetSkeletalMeshAsset())
 {
 const TArray<UMorphTarget*>& MorphTargets = TargetMesh->GetSkeletalMeshAsset()->GetMorphTargets();
-UE_LOG(LogElevenLabsLipSync, Log, TEXT("TargetMesh '%s' has %d morph targets."),
+UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("TargetMesh '%s' has %d morph targets."),
 *TargetMesh->GetName(), MorphTargets.Num());

 // Log first 20 morph target names to verify ARKit naming
@@ -374,7 +374,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 }
 if (Count > 0)
 {
-UE_LOG(LogElevenLabsLipSync, Log, TEXT("Morph target sample: %s"), *Names);
+UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Morph target sample: %s"), *Names);
 }

 // Verify our blendshape names exist as morph targets on this mesh
@@ -390,7 +390,7 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 break;
 }
 }
-UE_LOG(LogElevenLabsLipSync, Log, TEXT(" Morph target '%s': %s"),
+UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT(" Morph target '%s': %s"),
 *TestName.ToString(), bFound ? TEXT("FOUND") : TEXT("NOT FOUND"));
 }
 }
@@ -403,12 +403,12 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 if (MorphCount == 0)
 {
 bUseCurveMode = true;
-UE_LOG(LogElevenLabsLipSync, Log,
+UE_LOG(LogPS_AI_Agent_LipSync, Log,
 TEXT("No morph targets found — switching to MetaHuman curve mode (CTRL_expressions_*)."));
 }
 else
 {
-UE_LOG(LogElevenLabsLipSync, Log,
+UE_LOG(LogPS_AI_Agent_LipSync, Log,
 TEXT("Found %d morph targets — using standard SetMorphTarget mode."), MorphCount);
 }
 }
@@ -422,14 +422,14 @@ void UElevenLabsLipSyncComponent::BeginPlay()
 // Phoneme pose curve extraction
 // ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsLipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAnimSequence* AnimSeq)
+void UPS_AI_Agent_LipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAnimSequence* AnimSeq)
 {
 if (!AnimSeq) return;

 const IAnimationDataModel* DataModel = AnimSeq->GetDataModel();
 if (!DataModel)
 {
-UE_LOG(LogElevenLabsLipSync, Warning,
+UE_LOG(LogPS_AI_Agent_LipSync, Warning,
 TEXT("Pose '%s' (%s): No DataModel available — skipping."),
 *VisemeName.ToString(), *AnimSeq->GetName());
 return;
@@ -452,7 +452,7 @@ void UElevenLabsLipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAn
 if (!bPosesUseCTRLNaming && PoseExtractedCurveMap.Num() == 0 && CurveValues.Num() == 1)
 {
 bPosesUseCTRLNaming = CurveName.ToString().StartsWith(TEXT("CTRL_"));
-UE_LOG(LogElevenLabsLipSync, Log,
+UE_LOG(LogPS_AI_Agent_LipSync, Log,
 TEXT("Pose curve naming detected: %s (from curve '%s')"),
 bPosesUseCTRLNaming ? TEXT("CTRL_expressions_*") : TEXT("ARKit / other"),
 *CurveName.ToString());
@@ -462,7 +462,7 @@ void UElevenLabsLipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAn
 if (CurveValues.Num() > 0)
 {
 PoseExtractedCurveMap.Add(VisemeName, MoveTemp(CurveValues));
-UE_LOG(LogElevenLabsLipSync, Log,
+UE_LOG(LogPS_AI_Agent_LipSync, Log,
 TEXT("Pose '%s' (%s): Extracted %d non-zero curves."),
 *VisemeName.ToString(), *AnimSeq->GetName(),
 PoseExtractedCurveMap[VisemeName].Num());
@@ -471,25 +471,25 @@ void UElevenLabsLipSyncComponent::ExtractPoseCurves(const FName& VisemeName, UAn
 {
 // Still add an empty map so we know this viseme was assigned (silence pose)
 PoseExtractedCurveMap.Add(VisemeName, TMap<FName, float>());
-UE_LOG(LogElevenLabsLipSync, Log,
+UE_LOG(LogPS_AI_Agent_LipSync, Log,
 TEXT("Pose '%s' (%s): All curves are zero — neutral/silence pose."),
 *VisemeName.ToString(), *AnimSeq->GetName());
 }
 }

-void UElevenLabsLipSyncComponent::InitializePoseMappings()
+void UPS_AI_Agent_LipSyncComponent::InitializePoseMappings()
 {
 PoseExtractedCurveMap.Reset();
 bPosesUseCTRLNaming = false;

 if (!PoseMap)
 {
-UE_LOG(LogElevenLabsLipSync, Log,
+UE_LOG(LogPS_AI_Agent_LipSync, Log,
 TEXT("No PoseMap assigned — using hardcoded ARKit mapping."));
||||
return;
|
||||
}
|
||||
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Loading pose map: %s"), *PoseMap->GetName());
|
||||
|
||||
// Map each PoseMap property to its corresponding OVR viseme name
|
||||
@ -524,18 +524,18 @@ void UElevenLabsLipSyncComponent::InitializePoseMappings()
|
||||
|
||||
if (AssignedCount > 0)
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("=== Pose-based lip sync: %d/%d poses assigned, %d curve mappings extracted ==="),
|
||||
AssignedCount, Mappings.Num(), PoseExtractedCurveMap.Num());
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Curve naming: %s"),
|
||||
bPosesUseCTRLNaming ? TEXT("CTRL_expressions_* (MetaHuman native)") : TEXT("ARKit / standard"));
|
||||
|
||||
if (bPosesUseCTRLNaming)
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
|
||||
TEXT("IMPORTANT: Poses use CTRL_expressions_* curves. "
|
||||
"Move the ElevenLabs Lip Sync AnimNode AFTER mh_arkit_mapping_pose "
|
||||
"Move the PS AI Agent Lip Sync AnimNode AFTER mh_arkit_mapping_pose "
|
||||
"in the Face AnimBP for correct results."));
|
||||
}
|
||||
|
||||
@ -553,7 +553,7 @@ void UElevenLabsLipSyncComponent::InitializePoseMappings()
|
||||
*Pair.Key.ToString(), Pair.Value);
|
||||
if (++Count >= 8) { CurveList += TEXT(" ..."); break; }
|
||||
}
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT(" Sample '%s': [%s]"),
|
||||
*PoseEntry.Key.ToString(), *CurveList);
|
||||
break; // Only log one sample
|
||||
@ -562,13 +562,13 @@ void UElevenLabsLipSyncComponent::InitializePoseMappings()
|
||||
}
|
||||
else
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("No phoneme pose AnimSequences assigned — using hardcoded ARKit mapping."));
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
void UElevenLabsLipSyncComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
|
||||
void UPS_AI_Agent_LipSyncComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
|
||||
{
|
||||
// Unbind from agent component
|
||||
if (AgentComponent.IsValid())
|
||||
@ -579,13 +579,13 @@ void UElevenLabsLipSyncComponent::EndPlay(const EEndPlayReason::Type EndPlayReas
|
||||
AudioDataHandle.Reset();
|
||||
}
|
||||
AgentComponent->OnAgentPartialResponse.RemoveDynamic(
|
||||
this, &UElevenLabsLipSyncComponent::OnPartialTextReceived);
|
||||
this, &UPS_AI_Agent_LipSyncComponent::OnPartialTextReceived);
|
||||
AgentComponent->OnAgentTextResponse.RemoveDynamic(
|
||||
this, &UElevenLabsLipSyncComponent::OnTextResponseReceived);
|
||||
this, &UPS_AI_Agent_LipSyncComponent::OnTextResponseReceived);
|
||||
AgentComponent->OnAgentInterrupted.RemoveDynamic(
|
||||
this, &UElevenLabsLipSyncComponent::OnAgentInterrupted);
|
||||
this, &UPS_AI_Agent_LipSyncComponent::OnAgentInterrupted);
|
||||
AgentComponent->OnAgentStoppedSpeaking.RemoveDynamic(
|
||||
this, &UElevenLabsLipSyncComponent::OnAgentStopped);
|
||||
this, &UPS_AI_Agent_LipSyncComponent::OnAgentStopped);
|
||||
}
|
||||
AgentComponent.Reset();
|
||||
SpectrumAnalyzer.Reset();
|
||||
@ -597,7 +597,7 @@ void UElevenLabsLipSyncComponent::EndPlay(const EEndPlayReason::Type EndPlayReas
|
||||
// Tick — smooth visemes and apply morph targets
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
void UElevenLabsLipSyncComponent::TickComponent(float DeltaTime, ELevelTick TickType,
|
||||
void UPS_AI_Agent_LipSyncComponent::TickComponent(float DeltaTime, ELevelTick TickType,
|
||||
FActorComponentTickFunction* ThisTickFunction)
|
||||
{
|
||||
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
|
||||
@ -626,7 +626,7 @@ void UElevenLabsLipSyncComponent::TickComponent(float DeltaTime, ELevelTick Tick
|
||||
// Timeout — start playback with spectral shapes as fallback
|
||||
bWaitingForText = false;
|
||||
PlaybackTimer = 0.0f;
|
||||
UE_LOG(LogElevenLabsLipSync, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
|
||||
TEXT("Text wait timeout (%.0fms). Starting lip sync with spectral shapes (Queue=%d)."),
|
||||
WaitElapsed * 1000.0, VisemeQueue.Num());
|
||||
}
|
||||
@ -913,18 +913,18 @@ void UElevenLabsLipSyncComponent::TickComponent(float DeltaTime, ELevelTick Tick
|
||||
// Interruption / stop handlers
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
void UElevenLabsLipSyncComponent::OnAgentInterrupted()
|
||||
void UPS_AI_Agent_LipSyncComponent::OnAgentInterrupted()
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Agent interrupted — resetting lip sync to neutral."));
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Agent interrupted — resetting lip sync to neutral."));
|
||||
ResetToNeutral();
|
||||
}
|
||||
|
||||
void UElevenLabsLipSyncComponent::OnAgentStopped()
|
||||
void UPS_AI_Agent_LipSyncComponent::OnAgentStopped()
|
||||
{
|
||||
// Don't clear text state here — it's already handled by TickComponent's
|
||||
// "queue runs dry" logic which checks bFullTextReceived.
|
||||
// Just clear the queues so the mouth returns to neutral immediately.
|
||||
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Agent stopped speaking — clearing lip sync queues."));
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Agent stopped speaking — clearing lip sync queues."));
|
||||
VisemeQueue.Reset();
|
||||
AmplitudeQueue.Reset();
|
||||
PlaybackTimer = 0.0f;
|
||||
@ -938,7 +938,7 @@ void UElevenLabsLipSyncComponent::OnAgentStopped()
|
||||
TotalActiveFramesSeen = 0;
|
||||
}
|
||||
|
||||
void UElevenLabsLipSyncComponent::ResetToNeutral()
|
||||
void UPS_AI_Agent_LipSyncComponent::ResetToNeutral()
|
||||
{
|
||||
// Clear all queued viseme and amplitude data
|
||||
VisemeQueue.Reset();
|
||||
@ -978,7 +978,7 @@ void UElevenLabsLipSyncComponent::ResetToNeutral()
|
||||
// Audio analysis
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMData)
|
||||
void UPS_AI_Agent_LipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMData)
|
||||
{
|
||||
if (!SpectrumAnalyzer) return;
|
||||
|
||||
@ -989,7 +989,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
|
||||
static bool bFirstChunkLogged = false;
|
||||
if (!bFirstChunkLogged)
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose, TEXT("First audio chunk received: %d bytes (%d samples)"), PCMData.Num(), NumSamples);
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose, TEXT("First audio chunk received: %d bytes (%d samples)"), PCMData.Num(), NumSamples);
|
||||
bFirstChunkLogged = true;
|
||||
}
|
||||
|
||||
@ -1275,7 +1275,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
|
||||
|
||||
if (PseudoCount > 0)
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
|
||||
TEXT("Pseudo-speech: %d active frames (%d syllables)"),
|
||||
PseudoCount, (PseudoCount + SyllableFrames - 1) / SyllableFrames);
|
||||
}
|
||||
@ -1309,7 +1309,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
|
||||
|
||||
if (FixedCount > 0)
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
|
||||
TEXT("Late start fix: overrode %d leading silent frames with min amplitude %.2f"),
|
||||
FixedCount, MinStartAmplitude);
|
||||
}
|
||||
@ -1325,7 +1325,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
|
||||
else
|
||||
ApplyTextVisemesToQueue();
|
||||
PlaybackTimer = 0.0f;
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
|
||||
TEXT("Text already available (%d visemes). Starting lip sync immediately."),
|
||||
TextVisemeSequence.Num());
|
||||
}
|
||||
@ -1335,7 +1335,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
|
||||
bWaitingForText = true;
|
||||
WaitingForTextStartTime = FPlatformTime::Seconds();
|
||||
PlaybackTimer = 0.0f;
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Waiting for text before starting lip sync (%d frames queued)."),
|
||||
WindowsQueued);
|
||||
}
|
||||
@ -1349,7 +1349,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
|
||||
ApplyTextVisemesToQueue();
|
||||
}
|
||||
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Audio chunk: %d samples → %d windows | Amp=[%.2f-%.2f] | Queue=%d (%.1fs) | TextVisemes=%d"),
|
||||
NumSamples, WindowsQueued,
|
||||
MinAmp, MaxAmp, VisemeQueue.Num(),
|
||||
@ -1360,7 +1360,7 @@ void UElevenLabsLipSyncComponent::OnAudioChunkReceived(const TArray<uint8>& PCMD
|
||||
// Text-driven lip sync
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
void UElevenLabsLipSyncComponent::OnPartialTextReceived(const FString& PartialText)
|
||||
void UPS_AI_Agent_LipSyncComponent::OnPartialTextReceived(const FString& PartialText)
|
||||
{
|
||||
// If the previous utterance's full text was already received,
|
||||
// this partial text belongs to a NEW utterance — start fresh.
|
||||
@ -1378,7 +1378,7 @@ void UElevenLabsLipSyncComponent::OnPartialTextReceived(const FString& PartialTe
|
||||
// Convert accumulated text to viseme sequence progressively
|
||||
ConvertTextToVisemes(AccumulatedText);
|
||||
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Partial text: \"%s\" → %d visemes (accumulated: \"%s\")"),
|
||||
*PartialText, TextVisemeSequence.Num(), *AccumulatedText);
|
||||
|
||||
@ -1397,20 +1397,20 @@ void UElevenLabsLipSyncComponent::OnPartialTextReceived(const FString& PartialTe
|
||||
bWaitingForText = false;
|
||||
PlaybackTimer = 0.0f; // Start consuming now
|
||||
const double WaitElapsed = FPlatformTime::Seconds() - WaitingForTextStartTime;
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
|
||||
TEXT("Text arrived after %.0fms wait. Starting lip sync playback."),
|
||||
WaitElapsed * 1000.0);
|
||||
}
|
||||
}
|
||||
|
||||
void UElevenLabsLipSyncComponent::OnTextResponseReceived(const FString& ResponseText)
|
||||
void UPS_AI_Agent_LipSyncComponent::OnTextResponseReceived(const FString& ResponseText)
|
||||
{
|
||||
// Full text arrived — use it as the definitive source
|
||||
bFullTextReceived = true;
|
||||
AccumulatedText = ResponseText;
|
||||
ConvertTextToVisemes(ResponseText);
|
||||
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Full text: \"%s\" → %d visemes"), *ResponseText, TextVisemeSequence.Num());
|
||||
|
||||
// Apply to any remaining queued frames (or extend timeline in pose mode)
|
||||
@ -1428,7 +1428,7 @@ void UElevenLabsLipSyncComponent::OnTextResponseReceived(const FString& Response
|
||||
bWaitingForText = false;
|
||||
PlaybackTimer = 0.0f;
|
||||
const double WaitElapsed = FPlatformTime::Seconds() - WaitingForTextStartTime;
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
|
||||
TEXT("Full text arrived after %.0fms wait. Starting lip sync playback."),
|
||||
WaitElapsed * 1000.0);
|
||||
}
|
||||
@ -1443,7 +1443,7 @@ void UElevenLabsLipSyncComponent::OnTextResponseReceived(const FString& Response
|
||||
VisSeq += V.ToString();
|
||||
if (++Count >= 50) { VisSeq += TEXT(" ..."); break; }
|
||||
}
|
||||
UE_LOG(LogElevenLabsLipSync, Log, TEXT("Viseme sequence (%d): [%s]"),
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log, TEXT("Viseme sequence (%d): [%s]"),
|
||||
TextVisemeSequence.Num(), *VisSeq);
|
||||
}
|
||||
|
||||
@ -1453,7 +1453,7 @@ void UElevenLabsLipSyncComponent::OnTextResponseReceived(const FString& Response
|
||||
// persists and corrupts the next utterance's partial text accumulation.
|
||||
}
|
||||
|
||||
void UElevenLabsLipSyncComponent::ConvertTextToVisemes(const FString& Text)
|
||||
void UPS_AI_Agent_LipSyncComponent::ConvertTextToVisemes(const FString& Text)
|
||||
{
|
||||
TextVisemeSequence.Reset();
|
||||
|
||||
@ -1835,7 +1835,7 @@ void UElevenLabsLipSyncComponent::ConvertTextToVisemes(const FString& Text)
|
||||
}
|
||||
}
|
||||
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Syllable filter: %d raw → %d visible (removed %d invisible consonants/sil)"),
|
||||
RawCount, Filtered.Num(), RawCount - Filtered.Num());
|
||||
|
||||
@ -1866,7 +1866,7 @@ static float GetVisemeDurationWeight(const FName& Viseme)
|
||||
return 1.0f;
|
||||
}
|
||||
|
||||
void UElevenLabsLipSyncComponent::ApplyTextVisemesToQueue()
|
||||
void UPS_AI_Agent_LipSyncComponent::ApplyTextVisemesToQueue()
|
||||
{
|
||||
if (TextVisemeSequence.Num() == 0 || VisemeQueue.Num() == 0) return;
|
||||
|
||||
@ -1980,7 +1980,7 @@ void UElevenLabsLipSyncComponent::ApplyTextVisemesToQueue()
|
||||
|
||||
const float FinalRatio = static_cast<float>(ActiveFrames) / static_cast<float>(SeqNum);
|
||||
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Applied %d syllable-visemes to %d active frames (of %d total) — %.1f frames/viseme (%.0fms/viseme)"),
|
||||
SeqNum, ActiveFrames, VisemeQueue.Num(),
|
||||
FinalRatio, FinalRatio * 32.0f);
|
||||
@ -1990,7 +1990,7 @@ void UElevenLabsLipSyncComponent::ApplyTextVisemesToQueue()
|
||||
// Decoupled viseme timeline (pose mode)
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
void UElevenLabsLipSyncComponent::BuildVisemeTimeline()
|
||||
void UPS_AI_Agent_LipSyncComponent::BuildVisemeTimeline()
|
||||
{
|
||||
if (TextVisemeSequence.Num() == 0) return;
|
||||
|
||||
@ -2063,14 +2063,14 @@ void UElevenLabsLipSyncComponent::BuildVisemeTimeline()
|
||||
bVisemeTimelineActive = true;
|
||||
bTextVisemesApplied = true;
|
||||
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("Built viseme timeline: %d entries (from %d, max %d), audio=%.0fms, scale=%.2f → %.0fms/viseme avg"),
|
||||
FinalSequence.Num(), TextVisemeSequence.Num(), MaxVisemes,
|
||||
AudioDurationSec * 1000.0f, Scale,
|
||||
(FinalSequence.Num() > 0) ? (CursorSec * 1000.0f / FinalSequence.Num()) : 0.0f);
|
||||
}
|
||||
|
||||
void UElevenLabsLipSyncComponent::AnalyzeSpectrum()
|
||||
void UPS_AI_Agent_LipSyncComponent::AnalyzeSpectrum()
|
||||
{
|
||||
if (!SpectrumAnalyzer) return;
|
||||
|
||||
@ -2086,14 +2086,14 @@ void UElevenLabsLipSyncComponent::AnalyzeSpectrum()
|
||||
|
||||
const float TotalEnergy = VoiceEnergy + F1Energy + F2Energy + F3Energy + SibilantEnergy;
|
||||
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose,
|
||||
TEXT("Spectrum: Total=%.4f F1=%.4f F2=%.4f F3=%.4f Sibilant=%.4f"),
|
||||
TotalEnergy, F1Energy, F2Energy, F3Energy, SibilantEnergy);
|
||||
|
||||
EstimateVisemes(TotalEnergy, F1Energy, F2Energy, F3Energy, SibilantEnergy);
|
||||
}
|
||||
|
||||
float UElevenLabsLipSyncComponent::GetBandEnergy(float LowFreq, float HighFreq, int32 NumSamples) const
|
||||
float UPS_AI_Agent_LipSyncComponent::GetBandEnergy(float LowFreq, float HighFreq, int32 NumSamples) const
|
||||
{
|
||||
if (!SpectrumAnalyzer || NumSamples <= 0) return 0.0f;
|
||||
|
||||
@ -2113,7 +2113,7 @@ float UElevenLabsLipSyncComponent::GetBandEnergy(float LowFreq, float HighFreq,
|
||||
// Viseme estimation from spectral analysis
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
void UElevenLabsLipSyncComponent::EstimateVisemes(float TotalEnergy,
|
||||
void UPS_AI_Agent_LipSyncComponent::EstimateVisemes(float TotalEnergy,
|
||||
float F1Energy, float F2Energy, float F3Energy, float SibilantEnergy)
|
||||
{
|
||||
// Reset all visemes to zero
|
||||
@ -2226,7 +2226,7 @@ void UElevenLabsLipSyncComponent::EstimateVisemes(float TotalEnergy,
|
||||
// Viseme → ARKit blendshape mapping
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
void UElevenLabsLipSyncComponent::MapVisemesToBlendshapes()
|
||||
void UPS_AI_Agent_LipSyncComponent::MapVisemesToBlendshapes()
|
||||
{
|
||||
CurrentBlendshapes.Reset();
|
||||
|
||||
@ -2261,7 +2261,7 @@ void UElevenLabsLipSyncComponent::MapVisemesToBlendshapes()
|
||||
if (!WarnedVisemes.Contains(VisemeName))
|
||||
{
|
||||
WarnedVisemes.Add(VisemeName);
|
||||
UE_LOG(LogElevenLabsLipSync, Warning,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Warning,
|
||||
TEXT("No pose assigned for viseme '%s' — falling back to hardcoded ARKit mapping. "
|
||||
"Mixing pose and hardcoded visemes may produce inconsistent results."),
|
||||
*VisemeName.ToString());
|
||||
@ -2325,7 +2325,7 @@ void UElevenLabsLipSyncComponent::MapVisemesToBlendshapes()
|
||||
// Morph target application
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
void UElevenLabsLipSyncComponent::ApplyMorphTargets()
|
||||
void UPS_AI_Agent_LipSyncComponent::ApplyMorphTargets()
|
||||
{
|
||||
if (!TargetMesh) return;
|
||||
|
||||
@ -2350,14 +2350,14 @@ void UElevenLabsLipSyncComponent::ApplyMorphTargets()
|
||||
}
|
||||
if (DebugStr.Len() > 0)
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose, TEXT("%s: %s"),
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose, TEXT("%s: %s"),
|
||||
bUseCurveMode ? TEXT("Curves") : TEXT("Blendshapes"), *DebugStr);
|
||||
}
|
||||
}
|
||||
|
||||
if (bUseCurveMode)
|
||||
{
|
||||
// MetaHuman mode: curves are injected by the ElevenLabs Lip Sync AnimNode
|
||||
// MetaHuman mode: curves are injected by the PS AI Agent Lip Sync AnimNode
|
||||
// placed in the Face AnimBP (AnimGraph evaluation context).
|
||||
// OverrideCurveValue() does NOT work because the AnimGraph resets curves
|
||||
// before the Control Rig reads them. The custom AnimNode injects curves
|
||||
@ -2365,9 +2365,9 @@ void UElevenLabsLipSyncComponent::ApplyMorphTargets()
|
||||
static bool bLoggedOnce = false;
|
||||
if (!bLoggedOnce)
|
||||
{
|
||||
UE_LOG(LogElevenLabsLipSync, Log,
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Log,
|
||||
TEXT("MetaHuman curve mode: lip sync curves will be injected by the "
|
||||
"ElevenLabs Lip Sync AnimNode in the Face AnimBP. "
|
||||
"PS AI Agent Lip Sync AnimNode in the Face AnimBP. "
|
||||
"Ensure the node is placed before the Control Rig in the AnimGraph."));
|
||||
bLoggedOnce = true;
|
||||
}
|
||||
@ -2387,7 +2387,7 @@ void UElevenLabsLipSyncComponent::ApplyMorphTargets()
|
||||
// ARKit → MetaHuman curve name conversion
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
|
||||
FName UElevenLabsLipSyncComponent::ARKitToMetaHumanCurveName(const FName& ARKitName) const
|
||||
FName UPS_AI_Agent_LipSyncComponent::ARKitToMetaHumanCurveName(const FName& ARKitName) const
|
||||
{
|
||||
// Check cache first to avoid per-frame string allocations
|
||||
if (const FName* Cached = CurveNameCache.Find(ARKitName))
|
||||
@ -2411,10 +2411,10 @@ FName UElevenLabsLipSyncComponent::ARKitToMetaHumanCurveName(const FName& ARKitN
|
||||
FName Result = FName(*(TEXT("CTRL_expressions_") + Name));
|
||||
|
||||
// Cache for next frame
|
||||
const_cast<UElevenLabsLipSyncComponent*>(this)->CurveNameCache.Add(ARKitName, Result);
|
||||
const_cast<UPS_AI_Agent_LipSyncComponent*>(this)->CurveNameCache.Add(ARKitName, Result);
|
||||
|
||||
// Log once per curve name for debugging
|
||||
UE_LOG(LogElevenLabsLipSync, Verbose, TEXT("Curve mapping: %s → %s"), *ARKitName.ToString(), *Result.ToString());
|
||||
UE_LOG(LogPS_AI_Agent_LipSync, Verbose, TEXT("Curve mapping: %s → %s"), *ARKitName.ToString(), *Result.ToString());
|
||||
|
||||
return Result;
|
||||
}
|
||||
@@ -1,17 +1,17 @@
// Copyright ASTERION. All Rights Reserved.

-#include "ElevenLabsMicrophoneCaptureComponent.h"
-#include "ElevenLabsDefinitions.h"
+#include "PS_AI_Agent_MicrophoneCaptureComponent.h"
+#include "PS_AI_Agent_Definitions.h"

#include "AudioCaptureCore.h"
#include "Async/Async.h"

-DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsMic, Log, All);
+DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_Mic, Log, All);

// ─────────────────────────────────────────────────────────────────────────────
// Constructor
// ─────────────────────────────────────────────────────────────────────────────
-UElevenLabsMicrophoneCaptureComponent::UElevenLabsMicrophoneCaptureComponent()
+UPS_AI_Agent_MicrophoneCaptureComponent::UPS_AI_Agent_MicrophoneCaptureComponent()
{
PrimaryComponentTick.bCanEverTick = false;
}
@@ -19,7 +19,7 @@ UElevenLabsMicrophoneCaptureComponent::UElevenLabsMicrophoneCaptureComponent()
// ─────────────────────────────────────────────────────────────────────────────
// Lifecycle
// ─────────────────────────────────────────────────────────────────────────────
-void UElevenLabsMicrophoneCaptureComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
+void UPS_AI_Agent_MicrophoneCaptureComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
StopCapture();
Super::EndPlay(EndPlayReason);
@@ -28,11 +28,11 @@ void UElevenLabsMicrophoneCaptureComponent::EndPlay(const EEndPlayReason::Type E
// ─────────────────────────────────────────────────────────────────────────────
// Capture control
// ─────────────────────────────────────────────────────────────────────────────
-void UElevenLabsMicrophoneCaptureComponent::StartCapture()
+void UPS_AI_Agent_MicrophoneCaptureComponent::StartCapture()
{
if (bCapturing)
{
-UE_LOG(LogElevenLabsMic, Warning, TEXT("StartCapture called while already capturing."));
+UE_LOG(LogPS_AI_Agent_Mic, Warning, TEXT("StartCapture called while already capturing."));
return;
}

@@ -47,7 +47,7 @@ void UElevenLabsMicrophoneCaptureComponent::StartCapture()

if (!AudioCapture.OpenAudioCaptureStream(DeviceParams, MoveTemp(CaptureCallback), 1024))
{
-UE_LOG(LogElevenLabsMic, Error, TEXT("Failed to open default audio capture stream."));
+UE_LOG(LogPS_AI_Agent_Mic, Error, TEXT("Failed to open default audio capture stream."));
return;
}

@@ -57,36 +57,36 @@ void UElevenLabsMicrophoneCaptureComponent::StartCapture()
{
DeviceSampleRate = DeviceInfo.PreferredSampleRate;
DeviceChannels = DeviceInfo.InputChannels;
-UE_LOG(LogElevenLabsMic, Log, TEXT("Capture device: %s | Rate=%d | Channels=%d"),
+UE_LOG(LogPS_AI_Agent_Mic, Log, TEXT("Capture device: %s | Rate=%d | Channels=%d"),
*DeviceInfo.DeviceName, DeviceSampleRate, DeviceChannels);
}

AudioCapture.StartStream();
bCapturing = true;
-UE_LOG(LogElevenLabsMic, Log, TEXT("Audio capture started."));
+UE_LOG(LogPS_AI_Agent_Mic, Log, TEXT("Audio capture started."));
}

-void UElevenLabsMicrophoneCaptureComponent::StopCapture()
+void UPS_AI_Agent_MicrophoneCaptureComponent::StopCapture()
{
if (!bCapturing) return;

AudioCapture.StopStream();
AudioCapture.CloseStream();
bCapturing = false;
-UE_LOG(LogElevenLabsMic, Log, TEXT("Audio capture stopped."));
+UE_LOG(LogPS_AI_Agent_Mic, Log, TEXT("Audio capture stopped."));
}

// ─────────────────────────────────────────────────────────────────────────────
// Audio callback (background thread)
// ─────────────────────────────────────────────────────────────────────────────
-void UElevenLabsMicrophoneCaptureComponent::OnAudioGenerate(
+void UPS_AI_Agent_MicrophoneCaptureComponent::OnAudioGenerate(
const void* InAudio, int32 NumFrames,
int32 InNumChannels, int32 InSampleRate,
double StreamTime, bool bOverflow)
{
if (bOverflow)
{
-UE_LOG(LogElevenLabsMic, Verbose, TEXT("Audio capture buffer overflow."));
+UE_LOG(LogPS_AI_Agent_Mic, Verbose, TEXT("Audio capture buffer overflow."));
}

// Echo suppression: skip resampling + broadcasting entirely when agent is speaking.
@@ -129,11 +129,11 @@ void UElevenLabsMicrophoneCaptureComponent::OnAudioGenerate(
// ─────────────────────────────────────────────────────────────────────────────
// Resampling
// ─────────────────────────────────────────────────────────────────────────────
-TArray<float> UElevenLabsMicrophoneCaptureComponent::ResampleTo16000(
+TArray<float> UPS_AI_Agent_MicrophoneCaptureComponent::ResampleTo16000(
const float* InAudio, int32 NumFrames,
int32 InChannels, int32 InSampleRate)
{
-const int32 TargetRate = ElevenLabsAudio::SampleRate; // 16000
+const int32 TargetRate = PS_AI_Agent_Audio_ElevenLabs::SampleRate; // 16000

// --- Step 1: Downmix to mono ---
// NOTE: NumFrames is the number of audio frames (not total samples).
@@ -1,12 +1,12 @@
// Copyright ASTERION. All Rights Reserved.

-#include "ElevenLabsPostureComponent.h"
+#include "PS_AI_Agent_PostureComponent.h"
#include "Components/SkeletalMeshComponent.h"
#include "GameFramework/Actor.h"
#include "Math/UnrealMathUtility.h"
#include "DrawDebugHelpers.h"

-DEFINE_LOG_CATEGORY(LogElevenLabsPosture);
+DEFINE_LOG_CATEGORY(LogPS_AI_Agent_Posture);

// ── ARKit eye curve names ────────────────────────────────────────────────────
static const FName EyeLookUpLeft(TEXT("eyeLookUpLeft"));
@@ -30,7 +30,7 @@ static constexpr float ARKitEyeRangeVertical = 35.0f;
// Construction
// ─────────────────────────────────────────────────────────────────────────────

-UElevenLabsPostureComponent::UElevenLabsPostureComponent()
+UPS_AI_Agent_PostureComponent::UPS_AI_Agent_PostureComponent()
{
PrimaryComponentTick.bCanEverTick = true;
PrimaryComponentTick.TickGroup = TG_PrePhysics;
@@ -46,14 +46,14 @@ UElevenLabsPostureComponent::UElevenLabsPostureComponent()
// BeginPlay
// ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsPostureComponent::BeginPlay()
+void UPS_AI_Agent_PostureComponent::BeginPlay()
{
Super::BeginPlay();

AActor* Owner = GetOwner();
if (!Owner)
{
-UE_LOG(LogElevenLabsPosture, Warning, TEXT("No owner actor — posture disabled."));
+UE_LOG(LogPS_AI_Agent_Posture, Warning, TEXT("No owner actor — posture disabled."));
return;
}

@@ -82,11 +82,11 @@ void UElevenLabsPostureComponent::BeginPlay()
}
if (!CachedMesh.IsValid())
{
-UE_LOG(LogElevenLabsPosture, Warning,
+UE_LOG(LogPS_AI_Agent_Posture, Warning,
TEXT("No SkeletalMeshComponent found on %s — head bone lookup will be unavailable."),
*Owner->GetName());
}
-UE_LOG(LogElevenLabsPosture, Log,
+UE_LOG(LogPS_AI_Agent_Posture, Log,
TEXT("Mesh cache: Body=%s Face=%s"),
CachedMesh.IsValid() ? *CachedMesh->GetName() : TEXT("NONE"),
CachedFaceMesh.IsValid() ? *CachedFaceMesh->GetName() : TEXT("NONE"));
@@ -96,7 +96,7 @@ void UElevenLabsPostureComponent::BeginPlay()
OriginalActorYaw = Owner->GetActorRotation().Yaw + MeshForwardYawOffset;
TargetBodyWorldYaw = Owner->GetActorRotation().Yaw;

-UE_LOG(LogElevenLabsPosture, Log,
+UE_LOG(LogPS_AI_Agent_Posture, Log,
TEXT("Posture initialized on %s. MeshOffset=%.0f OriginalYaw=%.0f MaxEye=%.0f/%.0f MaxHead=%.0f/%.0f"),
*Owner->GetName(), MeshForwardYawOffset, OriginalActorYaw,
MaxEyeHorizontal, MaxEyeVertical, MaxHeadYaw, MaxHeadPitch);
@@ -106,7 +106,7 @@ void UElevenLabsPostureComponent::BeginPlay()
// Map eye angles to 8 ARKit eye curves
// ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsPostureComponent::UpdateEyeCurves(float EyeYaw, float EyePitch)
+void UPS_AI_Agent_PostureComponent::UpdateEyeCurves(float EyeYaw, float EyePitch)
{
CurrentEyeCurves.Reset();

@@ -176,7 +176,7 @@ void UElevenLabsPostureComponent::UpdateEyeCurves(float EyeYaw, float EyePitch)
// point shifts so small movements don't re-trigger higher layers.
// ─────────────────────────────────────────────────────────────────────────────

-void UElevenLabsPostureComponent::TickComponent(
+void UPS_AI_Agent_PostureComponent::TickComponent(
float DeltaTime, ELevelTick TickType, FActorComponentTickFunction* ThisTickFunction)
{
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
@@ -446,7 +446,7 @@ void UElevenLabsPostureComponent::TickComponent(
const float TgtYaw = FVector(Dir.X, Dir.Y, 0.0f).Rotation().Yaw;
const float Delta = FMath::FindDeltaAngleDegrees(FacingYaw, TgtYaw);

-UE_LOG(LogElevenLabsPosture, Log,
+UE_LOG(LogPS_AI_Agent_Posture, Log,
TEXT("Posture [%s -> %s]: Delta=%.1f | Head=%.1f/%.1f | Eyes=%.1f/%.1f | EyeGap=%.1f"),
*Owner->GetName(), *TargetActor->GetName(),
Delta,
@@ -1,7 +1,7 @@
// Copyright ASTERION. All Rights Reserved.

-#include "ElevenLabsWebSocketProxy.h"
-#include "PS_AI_Agent_ElevenLabs.h"
+#include "PS_AI_Agent_WebSocket_ElevenLabsProxy.h"
+#include "PS_AI_ConvAgent.h"

#include "WebSocketsModule.h"
#include "IWebSocket.h"
@@ -10,7 +10,7 @@
#include "JsonUtilities.h"
#include "Misc/Base64.h"

-DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsWS, Log, All);
+DEFINE_LOG_CATEGORY_STATIC(LogPS_AI_Agent_WS_ElevenLabs, Log, All);

// ─────────────────────────────────────────────────────────────────────────────
// Helpers
@@ -24,18 +24,18 @@ static void EL_LOG(bool bVerbose, const TCHAR* Format, ...)
TCHAR Buffer[2048];
FCString::GetVarArgs(Buffer, UE_ARRAY_COUNT(Buffer), Format, Args);
va_end(Args);
-UE_LOG(LogElevenLabsWS, Verbose, TEXT("%s"), Buffer);
+UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("%s"), Buffer);
}

// ─────────────────────────────────────────────────────────────────────────────
// Connect / Disconnect
// ─────────────────────────────────────────────────────────────────────────────
-void UElevenLabsWebSocketProxy::Connect(const FString& AgentIDOverride, const FString& APIKeyOverride)
+void UPS_AI_Agent_WebSocket_ElevenLabsProxy::Connect(const FString& AgentIDOverride, const FString& APIKeyOverride)
{
-if (ConnectionState == EElevenLabsConnectionState::Connected ||
-ConnectionState == EElevenLabsConnectionState::Connecting)
+if (ConnectionState == EPS_AI_Agent_ConnectionState_ElevenLabs::Connected ||
+ConnectionState == EPS_AI_Agent_ConnectionState_ElevenLabs::Connecting)
{
-UE_LOG(LogElevenLabsWS, Warning, TEXT("Connect called but already connecting/connected. Ignoring."));
+UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("Connect called but already connecting/connected. Ignoring."));
return;
}

@@ -48,19 +48,19 @@ void UElevenLabsWebSocketProxy::Connect(const FString& AgentIDOverride, const FS
if (URL.IsEmpty())
{
const FString Msg = TEXT("Cannot connect: no Agent ID configured. Set it in Project Settings or pass it to Connect().");
|
||||
UE_LOG(LogElevenLabsWS, Error, TEXT("%s"), *Msg);
|
||||
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Error, TEXT("%s"), *Msg);
|
||||
OnError.Broadcast(Msg);
|
||||
ConnectionState = EElevenLabsConnectionState::Error;
|
||||
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Error;
|
||||
return;
|
||||
}
|
||||
|
||||
UE_LOG(LogElevenLabsWS, Log, TEXT("Connecting to ElevenLabs: %s"), *URL);
|
||||
ConnectionState = EElevenLabsConnectionState::Connecting;
|
||||
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Connecting to ElevenLabs: %s"), *URL);
|
||||
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Connecting;
|
||||
|
||||
// Headers: the ElevenLabs Conversational AI WS endpoint accepts the
|
||||
// xi-api-key header on the initial HTTP upgrade request.
|
||||
TMap<FString, FString> UpgradeHeaders;
|
||||
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
|
||||
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
|
||||
const FString ResolvedKey = APIKeyOverride.IsEmpty() ? Settings->API_Key : APIKeyOverride;
|
||||
if (!ResolvedKey.IsEmpty())
|
||||
{
|
||||
@ -69,36 +69,36 @@ void UElevenLabsWebSocketProxy::Connect(const FString& AgentIDOverride, const FS
|
||||
|
||||
WebSocket = FWebSocketsModule::Get().CreateWebSocket(URL, TEXT(""), UpgradeHeaders);
|
||||
|
||||
WebSocket->OnConnected().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsConnected);
|
||||
WebSocket->OnConnectionError().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsConnectionError);
|
||||
WebSocket->OnClosed().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsClosed);
|
||||
WebSocket->OnConnected().AddUObject(this, &UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsConnected);
|
||||
WebSocket->OnConnectionError().AddUObject(this, &UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsConnectionError);
|
||||
WebSocket->OnClosed().AddUObject(this, &UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsClosed);
|
||||
// NOTE: We bind ONLY OnRawMessage (binary frames), NOT OnMessage (text frames).
|
||||
// UE's WebSocket implementation fires BOTH callbacks for the same frame when using
|
||||
// the libwebsockets backend — binding both causes every audio packet to be decoded
|
||||
// and played twice. OnRawMessage handles all frame types: raw binary audio AND
|
||||
// text-framed JSON (detected by peeking first byte for '{').
|
||||
WebSocket->OnRawMessage().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsBinaryMessage);
|
||||
WebSocket->OnRawMessage().AddUObject(this, &UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsBinaryMessage);
|
||||
|
||||
WebSocket->Connect();
|
||||
}
|
||||
|
||||
void UElevenLabsWebSocketProxy::Disconnect()
|
||||
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::Disconnect()
|
||||
{
|
||||
if (WebSocket.IsValid() && WebSocket->IsConnected())
|
||||
{
|
||||
WebSocket->Close(1000, TEXT("Client disconnected"));
|
||||
}
|
||||
ConnectionState = EElevenLabsConnectionState::Disconnected;
|
||||
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Disconnected;
|
||||
}
|
||||
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
// Audio & turn control
|
||||
// ─────────────────────────────────────────────────────────────────────────────
|
||||
void UElevenLabsWebSocketProxy::SendAudioChunk(const TArray<uint8>& PCMData)
|
||||
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendAudioChunk(const TArray<uint8>& PCMData)
|
||||
{
|
||||
if (!IsConnected())
|
||||
{
|
||||
UE_LOG(LogElevenLabsWS, Warning, TEXT("SendAudioChunk: not connected (state=%d). Audio dropped."),
|
||||
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("SendAudioChunk: not connected (state=%d). Audio dropped."),
|
||||
(int32)ConnectionState);
|
||||
return;
|
||||
}
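The NOTE in the hunk above mentions that OnRawMessage carries both raw PCM frames and text-framed JSON, distinguished by peeking the first byte for `{`. A minimal standalone sketch of that classification (not the plugin's actual code — the function name is illustrative):

```cpp
#include <cstddef>
#include <cstdint>

// Sketch: classify an incoming WebSocket frame the way the NOTE describes.
// A frame whose first byte is '{' is treated as text-framed JSON; anything
// else (including an empty frame) is treated as raw PCM audio.
inline bool LooksLikeJsonFrame(const uint8_t* Data, size_t Size)
{
    return Size > 0 && Data[0] == static_cast<uint8_t>('{');
}
```

This peek is cheap because it runs once per completed frame, before any Base64 or JSON work; a misclassified PCM frame that happens to start with `0x7B` would fail JSON parsing and be logged rather than played.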
@ -118,7 +118,7 @@ void UElevenLabsWebSocketProxy::SendAudioChunk(const TArray<uint8>& PCMData)
const FString AudioJson = FString::Printf(TEXT("{\"user_audio_chunk\":\"%s\"}"), *Base64Audio);

// Per-chunk log at Verbose only — Log level is too spammy (10+ lines per second).
UE_LOG(LogElevenLabsWS, Verbose, TEXT("SendAudioChunk: %d bytes"), PCMData.Num());
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("SendAudioChunk: %d bytes"), PCMData.Num());

{
FScopeLock Lock(&WebSocketSendLock);
@ -129,9 +129,9 @@ void UElevenLabsWebSocketProxy::SendAudioChunk(const TArray<uint8>& PCMData)
}
}

void UElevenLabsWebSocketProxy::SendUserTurnStart()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendUserTurnStart()
{
// No-op: the ElevenLabs API does not require a "start speaking" signal.
// No-op: the PS_AI_Agent|ElevenLabs API does not require a "start speaking" signal.
// The server's VAD detects speech from the audio chunks we send.
// user_activity is a keep-alive/timeout-reset message and should NOT be
// sent here — it would delay the agent's turn after the user stops.
@ -145,12 +145,12 @@ void UElevenLabsWebSocketProxy::SendUserTurnStart()
bAgentResponseStartedFired = false;

const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+%.2fs] User turn started — mic open, audio chunks will follow."), T);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+%.2fs] User turn started — mic open, audio chunks will follow."), T);
}

void UElevenLabsWebSocketProxy::SendUserTurnEnd()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendUserTurnEnd()
{
// No explicit "end turn" message exists in the ElevenLabs API.
// No explicit "end turn" message exists in the PS_AI_Agent|ElevenLabs API.
// The server detects end-of-speech via VAD when we stop sending audio chunks.
UserTurnEndTime = FPlatformTime::Seconds();
bWaitingForResponse = true;
@ -162,42 +162,42 @@ void UElevenLabsWebSocketProxy::SendUserTurnEnd()
// The flag is only reset in SendUserTurnStart() at the beginning of a new user turn.

const double T = UserTurnEndTime - SessionStartTime;
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+%.2fs] User turn ended — server VAD silence detection started (turn_timeout=1s)."), T);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+%.2fs] User turn ended — server VAD silence detection started (turn_timeout=1s)."), T);
}

void UElevenLabsWebSocketProxy::SendTextMessage(const FString& Text)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendTextMessage(const FString& Text)
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("SendTextMessage: not connected."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("SendTextMessage: not connected."));
return;
}
if (Text.IsEmpty()) return;

// API: { "type": "user_message", "text": "Hello agent" }
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::UserMessage);
Msg->SetStringField(TEXT("type"), PS_AI_Agent_MessageType_ElevenLabs::UserMessage);
Msg->SetStringField(TEXT("text"), Text);
SendJsonMessage(Msg);
}

void UElevenLabsWebSocketProxy::SendInterrupt()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendInterrupt()
{
if (!IsConnected()) return;

UE_LOG(LogElevenLabsWS, Log, TEXT("Sending interrupt."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Sending interrupt."));

TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::Interrupt);
Msg->SetStringField(TEXT("type"), PS_AI_Agent_MessageType_ElevenLabs::Interrupt);
SendJsonMessage(Msg);
}

// ─────────────────────────────────────────────────────────────────────────────
// WebSocket callbacks
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::OnWsConnected()
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsConnected()
{
UE_LOG(LogElevenLabsWS, Log, TEXT("WebSocket connected. Sending conversation_initiation_client_data..."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("WebSocket connected. Sending conversation_initiation_client_data..."));
// State stays Connecting until we receive conversation_initiation_metadata from the server.

// ElevenLabs requires this message immediately after the WebSocket handshake to
@ -231,7 +231,7 @@ void UElevenLabsWebSocketProxy::OnWsConnected()
// In Server VAD mode, the config override is empty (matches C++ sample exactly).
TSharedPtr<FJsonObject> ConversationConfigOverride = MakeShareable(new FJsonObject());

if (TurnMode == EElevenLabsTurnMode::Client)
if (TurnMode == EPS_AI_Agent_TurnMode_ElevenLabs::Client)
{
// turn_timeout: how long the server waits after VAD detects silence before
// processing the user's turn. Default is ~3s. In push-to-talk mode this
@ -267,7 +267,7 @@ void UElevenLabsWebSocketProxy::OnWsConnected()
// agent's configured custom_llm_extra_body with nothing. Omit entirely.

TSharedPtr<FJsonObject> InitMsg = MakeShareable(new FJsonObject());
InitMsg->SetStringField(TEXT("type"), ElevenLabsMessageType::ConversationClientData);
InitMsg->SetStringField(TEXT("type"), PS_AI_Agent_MessageType_ElevenLabs::ConversationClientData);
InitMsg->SetObjectField(TEXT("conversation_config_override"), ConversationConfigOverride);

// NOTE: We bypass SendJsonMessage() here intentionally.
@ -277,38 +277,38 @@ void UElevenLabsWebSocketProxy::OnWsConnected()
FString InitJson;
TSharedRef<TJsonWriter<>> InitWriter = TJsonWriterFactory<>::Create(&InitJson);
FJsonSerializer::Serialize(InitMsg.ToSharedRef(), InitWriter);
UE_LOG(LogElevenLabsWS, Log, TEXT("Sending initiation: %s"), *InitJson);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Sending initiation: %s"), *InitJson);
WebSocket->Send(InitJson);
}

void UElevenLabsWebSocketProxy::OnWsConnectionError(const FString& Error)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsConnectionError(const FString& Error)
{
UE_LOG(LogElevenLabsWS, Error, TEXT("WebSocket connection error: %s"), *Error);
ConnectionState = EElevenLabsConnectionState::Error;
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Error, TEXT("WebSocket connection error: %s"), *Error);
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Error;
OnError.Broadcast(Error);
}

void UElevenLabsWebSocketProxy::OnWsClosed(int32 StatusCode, const FString& Reason, bool bWasClean)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsClosed(int32 StatusCode, const FString& Reason, bool bWasClean)
{
UE_LOG(LogElevenLabsWS, Log, TEXT("WebSocket closed. Code=%d Reason=%s Clean=%d"), StatusCode, *Reason, bWasClean);
ConnectionState = EElevenLabsConnectionState::Disconnected;
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("WebSocket closed. Code=%d Reason=%s Clean=%d"), StatusCode, *Reason, bWasClean);
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Disconnected;
WebSocket.Reset();
OnDisconnected.Broadcast(StatusCode, Reason);
}

void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsMessage(const FString& Message)
{
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT(">> %s"), *Message);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT(">> %s"), *Message);
}

TSharedPtr<FJsonObject> Root;
TSharedRef<TJsonReader<>> Reader = TJsonReaderFactory<>::Create(Message);
if (!FJsonSerializer::Deserialize(Reader, Root) || !Root.IsValid())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("Failed to parse WebSocket message as JSON (first 80 chars): %.80s"), *Message);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("Failed to parse WebSocket message as JSON (first 80 chars): %.80s"), *Message);
return;
}

@ -318,26 +318,26 @@ void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
{
// Fallback: some messages use the top-level key as the type
// e.g. { "user_audio_chunk": "..." } from ourselves (shouldn't arrive)
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Message has no 'type' field, ignoring."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Message has no 'type' field, ignoring."));
return;
}

// Suppress ping from the visible log — they arrive every ~2s and flood the output.
// Handle ping early before the generic type log.
if (MsgType == ElevenLabsMessageType::PingEvent)
if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::PingEvent)
{
HandlePing(Root);
return;
}

// Log every non-ping message type received from the server.
UE_LOG(LogElevenLabsWS, Log, TEXT("Received message type: %s"), *MsgType);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Received message type: %s"), *MsgType);

if (MsgType == ElevenLabsMessageType::ConversationInitiation)
if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::ConversationInitiation)
{
HandleConversationInitiation(Root);
}
else if (MsgType == ElevenLabsMessageType::AudioResponse)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::AudioResponse)
{
// Log time-to-first-audio: latency between end of user turn and first agent audio.
if (bWaitingForResponse && !bFirstAudioResponseLogged)
@ -346,14 +346,14 @@ void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
const double T = Now - SessionStartTime;
const double LatencyFromTurnEnd = (Now - UserTurnEndTime) * 1000.0;
const double LatencyFromLastChunk = (Now - LastAudioChunkSentTime) * 1000.0;
UE_LOG(LogElevenLabsWS, Warning,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning,
TEXT("[T+%.2fs] [LATENCY] First audio: %.0f ms after turn end (%.0f ms after last chunk)"),
T, LatencyFromTurnEnd, LatencyFromLastChunk);
bFirstAudioResponseLogged = true;
}
HandleAudioResponse(Root);
}
else if (MsgType == ElevenLabsMessageType::UserTranscript)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::UserTranscript)
{
// Log transcription latency.
if (bWaitingForResponse)
@ -361,14 +361,14 @@ void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
const double Now = FPlatformTime::Seconds();
const double T = Now - SessionStartTime;
const double LatencyFromTurnEnd = (Now - UserTurnEndTime) * 1000.0;
UE_LOG(LogElevenLabsWS, Warning,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning,
TEXT("[T+%.2fs] [LATENCY] User transcript: %.0f ms after turn end"),
T, LatencyFromTurnEnd);
bWaitingForResponse = false;
}
HandleTranscript(Root);
}
else if (MsgType == ElevenLabsMessageType::AgentResponse)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::AgentResponse)
{
// Log agent text response latency.
if (UserTurnEndTime > 0.0)
@ -376,36 +376,36 @@ void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
const double Now = FPlatformTime::Seconds();
const double T = Now - SessionStartTime;
const double LatencyFromTurnEnd = (Now - UserTurnEndTime) * 1000.0;
UE_LOG(LogElevenLabsWS, Warning,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning,
TEXT("[T+%.2fs] [LATENCY] Agent text response: %.0f ms after turn end"),
T, LatencyFromTurnEnd);
}
HandleAgentResponse(Root);
}
else if (MsgType == ElevenLabsMessageType::AgentChatResponsePart)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::AgentChatResponsePart)
{
HandleAgentChatResponsePart(Root);
}
else if (MsgType == ElevenLabsMessageType::AgentResponseCorrection)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::AgentResponseCorrection)
{
// Silently ignore — corrected text after interruption.
UE_LOG(LogElevenLabsWS, Verbose, TEXT("agent_response_correction received (ignored)."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("agent_response_correction received (ignored)."));
}
else if (MsgType == ElevenLabsMessageType::ClientToolCall)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::ClientToolCall)
{
HandleClientToolCall(Root);
}
else if (MsgType == ElevenLabsMessageType::InterruptionEvent)
else if (MsgType == PS_AI_Agent_MessageType_ElevenLabs::InterruptionEvent)
{
HandleInterruption(Root);
}
else
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Unhandled message type: %s"), *MsgType);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Unhandled message type: %s"), *MsgType);
}
}

void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size, SIZE_T BytesRemaining)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size, SIZE_T BytesRemaining)
{
// Accumulate fragments until BytesRemaining == 0.
const uint8* Bytes = static_cast<const uint8*>(Data);
@ -430,10 +430,10 @@ void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size,
reinterpret_cast<const char*>(BinaryFrameBuffer.GetData())));
BinaryFrameBuffer.Reset();

const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Binary JSON frame (%d bytes): %.120s"), TotalSize, *JsonString);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Binary JSON frame (%d bytes): %.120s"), TotalSize, *JsonString);
}

OnWsMessage(JsonString);
@ -442,7 +442,7 @@ void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size,
{
// Raw binary audio frame — PCM bytes sent directly without Base64/JSON wrapper.
// Log first few bytes as hex to help diagnose the format.
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
FString HexPreview;
@ -451,7 +451,7 @@ void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size,
{
HexPreview += FString::Printf(TEXT("%02X "), BinaryFrameBuffer[i]);
}
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Binary audio frame: %d bytes | first bytes: %s"), TotalSize, *HexPreview);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Binary audio frame: %d bytes | first bytes: %s"), TotalSize, *HexPreview);
}

// Broadcast raw PCM bytes directly to the audio queue.
@ -464,7 +464,7 @@ void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size,
// ─────────────────────────────────────────────────────────────────────────────
// Message handlers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::HandleConversationInitiation(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleConversationInitiation(const TSharedPtr<FJsonObject>& Root)
{
// Expected structure:
// { "type": "conversation_initiation_metadata",
@ -480,12 +480,12 @@ void UElevenLabsWebSocketProxy::HandleConversationInitiation(const TSharedPtr<FJ
}

SessionStartTime = FPlatformTime::Seconds();
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+0.00s] Conversation initiated. ID=%s"), *ConversationInfo.ConversationID);
ConnectionState = EElevenLabsConnectionState::Connected;
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+0.00s] Conversation initiated. ID=%s"), *ConversationInfo.ConversationID);
ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Connected;
OnConnected.Broadcast(ConversationInfo);
}

void UElevenLabsWebSocketProxy::HandleAudioResponse(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleAudioResponse(const TSharedPtr<FJsonObject>& Root)
{
// Expected structure:
// { "type": "audio",
@ -494,7 +494,7 @@ void UElevenLabsWebSocketProxy::HandleAudioResponse(const TSharedPtr<FJsonObject
const TSharedPtr<FJsonObject>* AudioEvent = nullptr;
if (!Root->TryGetObjectField(TEXT("audio_event"), AudioEvent) || !AudioEvent)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("audio message missing 'audio_event' field."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("audio message missing 'audio_event' field."));
return;
}

@ -505,28 +505,28 @@ void UElevenLabsWebSocketProxy::HandleAudioResponse(const TSharedPtr<FJsonObject
(*AudioEvent)->TryGetNumberField(TEXT("event_id"), EventId);
if (EventId > 0 && EventId <= LastInterruptEventId)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Discarding audio event_id=%d (interrupted at %d)."), EventId, LastInterruptEventId);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("Discarding audio event_id=%d (interrupted at %d)."), EventId, LastInterruptEventId);
return;
}

FString Base64Audio;
if (!(*AudioEvent)->TryGetStringField(TEXT("audio_base_64"), Base64Audio))
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("audio_event missing 'audio_base_64' field."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("audio_event missing 'audio_base_64' field."));
return;
}

TArray<uint8> PCMData;
if (!FBase64::Decode(Base64Audio, PCMData))
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("Failed to Base64-decode audio data."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("Failed to Base64-decode audio data."));
return;
}

OnAudioReceived.Broadcast(PCMData);
}

void UElevenLabsWebSocketProxy::HandleTranscript(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleTranscript(const TSharedPtr<FJsonObject>& Root)
{
// API structure:
// { "type": "user_transcript",
@ -536,11 +536,11 @@ void UElevenLabsWebSocketProxy::HandleTranscript(const TSharedPtr<FJsonObject>&
const TSharedPtr<FJsonObject>* TranscriptEvent = nullptr;
if (!Root->TryGetObjectField(TEXT("user_transcription_event"), TranscriptEvent) || !TranscriptEvent)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("user_transcript message missing 'user_transcription_event' field."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("user_transcript message missing 'user_transcription_event' field."));
return;
}

FElevenLabsTranscriptSegment Segment;
FPS_AI_Agent_TranscriptSegment_ElevenLabs Segment;
Segment.Speaker = TEXT("user");
(*TranscriptEvent)->TryGetStringField(TEXT("user_transcript"), Segment.Text);
// user_transcript messages are always final (interim results are not sent for user speech)
@ -549,7 +549,7 @@ void UElevenLabsWebSocketProxy::HandleTranscript(const TSharedPtr<FJsonObject>&
OnTranscript.Broadcast(Segment);
}

void UElevenLabsWebSocketProxy::HandleAgentResponse(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleAgentResponse(const TSharedPtr<FJsonObject>& Root)
{
// ISSUE-22: reset bAgentResponseStartedFired so OnAgentResponseStarted fires again on
// the next turn. In Server VAD mode SendUserTurnStart() is never called — it is the only
@ -574,7 +574,7 @@ void UElevenLabsWebSocketProxy::HandleAgentResponse(const TSharedPtr<FJsonObject
OnAgentResponse.Broadcast(ResponseText);
}

void UElevenLabsWebSocketProxy::HandleAgentChatResponsePart(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleAgentChatResponsePart(const TSharedPtr<FJsonObject>& Root)
{
// agent_chat_response_part = the server is actively generating a response (LLM token stream).
// Fire OnAgentResponseStarted once per turn so the component can auto-stop the microphone
@ -591,7 +591,7 @@ void UElevenLabsWebSocketProxy::HandleAgentChatResponsePart(const TSharedPtr<FJs
// of the previous response).
if (LastInterruptEventId > 0)
{
UE_LOG(LogElevenLabsWS, Log,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log,
TEXT("New generation started — resetting LastInterruptEventId (was %d)."),
LastInterruptEventId);
LastInterruptEventId = 0;
@ -600,7 +600,7 @@ void UElevenLabsWebSocketProxy::HandleAgentChatResponsePart(const TSharedPtr<FJs
const double Now = FPlatformTime::Seconds();
const double T = Now - SessionStartTime;
const double LatencyFromTurnEnd = UserTurnEndTime > 0.0 ? (Now - UserTurnEndTime) * 1000.0 : 0.0;
UE_LOG(LogElevenLabsWS, Log,
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log,
TEXT("[T+%.2fs] Agent started generating (%.0f ms after turn end — includes VAD silence timeout + LLM start)."),
T, LatencyFromTurnEnd);
OnAgentResponseStarted.Broadcast();
@ -643,7 +643,7 @@ void UElevenLabsWebSocketProxy::HandleAgentChatResponsePart(const TSharedPtr<FJs
}
}

void UElevenLabsWebSocketProxy::HandleInterruption(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleInterruption(const TSharedPtr<FJsonObject>& Root)
{
// Extract the interrupt event_id so we can filter stale audio frames.
// { "type": "interruption", "interruption_event": { "event_id": 42 } }
@ -658,11 +658,11 @@ void UElevenLabsWebSocketProxy::HandleInterruption(const TSharedPtr<FJsonObject>
}
}

UE_LOG(LogElevenLabsWS, Log, TEXT("Agent interrupted (server ack, LastInterruptEventId=%d)."), LastInterruptEventId);
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("Agent interrupted (server ack, LastInterruptEventId=%d)."), LastInterruptEventId);
OnInterrupted.Broadcast();
}
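The interplay between HandleInterruption, HandleAudioResponse, and HandleAgentChatResponsePart above amounts to a small piece of state: interrupted audio events are identified by `event_id <= LastInterruptEventId`, and the filter resets when a new generation starts. A standalone sketch of that bookkeeping (struct and method names are illustrative, not the plugin's):

```cpp
#include <cstdint>

// Sketch of the stale-audio filter used around interruptions. Audio events
// carry a monotonically increasing event_id; an interruption ack records the
// id at which playback was cut, and any audio event at or below that id
// belongs to the interrupted response and should be dropped.
struct FInterruptFilter
{
    int32_t LastInterruptEventId = 0;

    // Called from the interruption handler with the server-acked event_id.
    void OnInterrupted(int32_t EventId) { LastInterruptEventId = EventId; }

    // Called when a new generation starts (agent_chat_response_part),
    // so fresh audio from the next response is not discarded.
    void OnNewGeneration() { LastInterruptEventId = 0; }

    // True if an incoming audio event is stale and should be discarded.
    bool ShouldDiscard(int32_t EventId) const
    {
        return EventId > 0 && EventId <= LastInterruptEventId;
    }
};
```

The `EventId > 0` guard mirrors the diff's check: frames without a parsed event_id default to 0 and are always played.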

void UElevenLabsWebSocketProxy::HandleClientToolCall(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandleClientToolCall(const TSharedPtr<FJsonObject>& Root)
{
// Incoming: { "type": "client_tool_call", "client_tool_call": {
// "tool_name": "set_emotion", "tool_call_id": "abc123",
@ -670,11 +670,11 @@ void UElevenLabsWebSocketProxy::HandleClientToolCall(const TSharedPtr<FJsonObjec
const TSharedPtr<FJsonObject>* ToolCallObj = nullptr;
if (!Root->TryGetObjectField(TEXT("client_tool_call"), ToolCallObj) || !ToolCallObj)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("client_tool_call: missing client_tool_call object."));
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("client_tool_call: missing client_tool_call object."));
return;
}

FElevenLabsClientToolCall ToolCall;
FPS_AI_Agent_ClientToolCall_ElevenLabs ToolCall;
(*ToolCallObj)->TryGetStringField(TEXT("tool_name"), ToolCall.ToolName);
(*ToolCallObj)->TryGetStringField(TEXT("tool_call_id"), ToolCall.ToolCallId);

@ -698,29 +698,29 @@ void UElevenLabsWebSocketProxy::HandleClientToolCall(const TSharedPtr<FJsonObjec
}

const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+%.2fs] Client tool call: %s (id=%s, %d params)"),
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+%.2fs] Client tool call: %s (id=%s, %d params)"),
T, *ToolCall.ToolName, *ToolCall.ToolCallId, ToolCall.Parameters.Num());

OnClientToolCall.Broadcast(ToolCall);
}

void UElevenLabsWebSocketProxy::SendClientToolResult(const FString& ToolCallId, const FString& Result, bool bIsError)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendClientToolResult(const FString& ToolCallId, const FString& Result, bool bIsError)
{
// Outgoing: { "type": "client_tool_result", "tool_call_id": "abc123",
// "result": "emotion set to surprise", "is_error": false }
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::ClientToolResult);
Msg->SetStringField(TEXT("type"), PS_AI_Agent_MessageType_ElevenLabs::ClientToolResult);
Msg->SetStringField(TEXT("tool_call_id"), ToolCallId);
Msg->SetStringField(TEXT("result"), Result);
Msg->SetBoolField(TEXT("is_error"), bIsError);
SendJsonMessage(Msg);

const double T = FPlatformTime::Seconds() - SessionStartTime;
UE_LOG(LogElevenLabsWS, Log, TEXT("[T+%.2fs] Sent client_tool_result for %s: %s (error=%s)"),
UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Log, TEXT("[T+%.2fs] Sent client_tool_result for %s: %s (error=%s)"),
T, *ToolCallId, *Result, bIsError ? TEXT("true") : TEXT("false"));
}

void UElevenLabsWebSocketProxy::HandlePing(const TSharedPtr<FJsonObject>& Root)
void UPS_AI_Agent_WebSocket_ElevenLabsProxy::HandlePing(const TSharedPtr<FJsonObject>& Root)
{
// Reply with a pong to keep the connection alive.
// Incoming: { "type": "ping", "ping_event": { "event_id": 1, "ping_ms": 150 } }
|
||||
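For reference, the tool-call round trip that `HandleClientToolCall` and `SendClientToolResult` implement looks roughly like this on the wire. The top-level field names (`type`, `client_tool_call`, `tool_name`, `tool_call_id`) match the code above and the outgoing message matches the comment in `SendClientToolResult`; the `parameters` nesting and the sample values are illustrative assumptions:

```
Incoming (server → client):
{ "type": "client_tool_call",
  "client_tool_call": { "tool_name": "set_emotion",
                        "tool_call_id": "abc123",
                        "parameters": { "emotion": "surprise" } } }

Outgoing (client → server):
{ "type": "client_tool_result", "tool_call_id": "abc123",
  "result": "emotion set to surprise", "is_error": false }
```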
@@ -741,11 +741,11 @@ void UElevenLabsWebSocketProxy::HandlePing(const TSharedPtr<FJsonObject>& Root)
 // ─────────────────────────────────────────────────────────────────────────────
 // Helpers
 // ─────────────────────────────────────────────────────────────────────────────
-void UElevenLabsWebSocketProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& JsonObj)
+void UPS_AI_Agent_WebSocket_ElevenLabsProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& JsonObj)
 {
     if (!WebSocket.IsValid() || !WebSocket->IsConnected())
     {
-        UE_LOG(LogElevenLabsWS, Warning, TEXT("SendJsonMessage: WebSocket not connected."));
+        UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Warning, TEXT("SendJsonMessage: WebSocket not connected."));
         return;
     }
 
@@ -753,10 +753,10 @@ void UElevenLabsWebSocketProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& J
     TSharedRef<TJsonWriter<>> Writer = TJsonWriterFactory<>::Create(&Out);
     FJsonSerializer::Serialize(JsonObj.ToSharedRef(), Writer);
 
-    const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
+    const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
     if (Settings->bVerboseLogging)
     {
-        UE_LOG(LogElevenLabsWS, Verbose, TEXT("<< %s"), *Out);
+        UE_LOG(LogPS_AI_Agent_WS_ElevenLabs, Verbose, TEXT("<< %s"), *Out);
     }
 
     {
@@ -765,9 +765,9 @@ void UElevenLabsWebSocketProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& J
     }
 }
 
-FString UElevenLabsWebSocketProxy::BuildWebSocketURL(const FString& AgentIDOverride, const FString& APIKeyOverride) const
+FString UPS_AI_Agent_WebSocket_ElevenLabsProxy::BuildWebSocketURL(const FString& AgentIDOverride, const FString& APIKeyOverride) const
 {
-    const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
+    const UPS_AI_Agent_Settings_ElevenLabs* Settings = FPS_AI_ConvAgentModule::Get().GetSettings();
 
     // Custom URL override takes full precedence
     if (!Settings->CustomWebSocketURL.IsEmpty())
@@ -0,0 +1,50 @@
+// Copyright ASTERION. All Rights Reserved.
+
+#include "PS_AI_ConvAgent.h"
+#include "Developer/Settings/Public/ISettingsModule.h"
+#include "UObject/UObjectGlobals.h"
+#include "UObject/Package.h"
+
+IMPLEMENT_MODULE(FPS_AI_ConvAgentModule, PS_AI_ConvAgent)
+
+#define LOCTEXT_NAMESPACE "PS_AI_ConvAgent"
+
+void FPS_AI_ConvAgentModule::StartupModule()
+{
+    Settings = NewObject<UPS_AI_Agent_Settings_ElevenLabs>(GetTransientPackage(), "PS_AI_Agent_Settings_ElevenLabs", RF_Standalone);
+    Settings->AddToRoot();
+
+    if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
+    {
+        SettingsModule->RegisterSettings(
+            "Project", "Plugins", "PS_AI_ConvAgent_ElevenLabs",
+            LOCTEXT("SettingsName", "PS AI Agent - ElevenLabs"),
+            LOCTEXT("SettingsDescription", "Configure the PS AI Agent - ElevenLabs plugin"),
+            Settings);
+    }
+}
+
+void FPS_AI_ConvAgentModule::ShutdownModule()
+{
+    if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
+    {
+        SettingsModule->UnregisterSettings("Project", "Plugins", "PS_AI_ConvAgent_ElevenLabs");
+    }
+
+    if (!GExitPurge)
+    {
+        Settings->RemoveFromRoot();
+    }
+    else
+    {
+        Settings = nullptr;
+    }
+}
+
+UPS_AI_Agent_Settings_ElevenLabs* FPS_AI_ConvAgentModule::GetSettings() const
+{
+    check(Settings);
+    return Settings;
+}
+
+#undef LOCTEXT_NAMESPACE
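The commit message notes that CoreRedirects in DefaultEngine.ini keep existing .uasset references loading after the rename. A sketch of what such entries could look like, built from the module and class names visible in this diff; the actual entries in the project's DefaultEngine.ini may differ in count and exact form:

```ini
[CoreRedirects]
; Hypothetical examples — old module PS_AI_Agent_ElevenLabs, new module PS_AI_ConvAgent
+ClassRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsPostureComponent",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_PostureComponent")
+ClassRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsConversationalAgentComponent",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_Conv_ElevenLabsComponent")
+StructRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsClientToolCall",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_ClientToolCall_ElevenLabs")
```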
@@ -4,28 +4,28 @@
 
 #include "CoreMinimal.h"
 #include "Animation/AnimNodeBase.h"
-#include "AnimNode_ElevenLabsFacialExpression.generated.h"
+#include "AnimNode_PS_AI_Agent_FacialExpression.generated.h"
 
-class UElevenLabsFacialExpressionComponent;
+class UPS_AI_Agent_FacialExpressionComponent;
 
 /**
- * Animation node that injects ElevenLabs facial expression curves into the AnimGraph.
+ * Animation node that injects facial expression curves into the AnimGraph.
  *
- * Place this node in the MetaHuman Face AnimBP BEFORE the ElevenLabs Lip Sync node.
+ * Place this node in the MetaHuman Face AnimBP BEFORE the PS AI Agent Lip Sync node.
  * It reads emotion curves (eyes, eyebrows, cheeks, mouth mood) from the
- * ElevenLabsFacialExpressionComponent on the same Actor and outputs them as
+ * PS_AI_Agent_FacialExpressionComponent on the same Actor and outputs them as
  * CTRL_expressions_* animation curves.
  *
 * The Lip Sync node placed AFTER this one will override mouth-area curves
 * during speech, while non-mouth emotion curves (eyes, brows) pass through.
 *
 * Graph layout:
- * [Live Link Pose] → [ElevenLabs Facial Expression] → [ElevenLabs Lip Sync] → [mh_arkit_mapping_pose] → ...
+ * [Live Link Pose] → [PS AI Agent Facial Expression] → [PS AI Agent Lip Sync] → [mh_arkit_mapping_pose] → ...
 *
 * The node auto-discovers the FacialExpressionComponent — no manual wiring needed.
 */
 USTRUCT(BlueprintInternalUseOnly)
-struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsFacialExpression : public FAnimNode_Base
+struct PS_AI_CONVAGENT_API FAnimNode_PS_AI_Agent_FacialExpression : public FAnimNode_Base
 {
     GENERATED_USTRUCT_BODY()
 
@@ -43,7 +43,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsFacialExpression : public
 
 private:
     /** Cached reference to the facial expression component on the owning actor. */
-    TWeakObjectPtr<UElevenLabsFacialExpressionComponent> FacialExpressionComponent;
+    TWeakObjectPtr<UPS_AI_Agent_FacialExpressionComponent> FacialExpressionComponent;
 
     /** Emotion expression curves to inject (CTRL_expressions_* format).
      * Copied from the component during Update (game thread safe). */
@@ -4,26 +4,26 @@
 
 #include "CoreMinimal.h"
 #include "Animation/AnimNodeBase.h"
-#include "AnimNode_ElevenLabsLipSync.generated.h"
+#include "AnimNode_PS_AI_Agent_LipSync.generated.h"
 
-class UElevenLabsLipSyncComponent;
+class UPS_AI_Agent_LipSyncComponent;
 
 /**
- * Animation node that injects ElevenLabs lip sync curves into the AnimGraph.
+ * Animation node that injects lip sync curves into the AnimGraph.
 *
 * Place this node in the MetaHuman Face AnimBP BEFORE the mh_arkit_mapping_pose
- * node. It reads ARKit blendshape weights from the ElevenLabsLipSyncComponent
+ * node. It reads ARKit blendshape weights from the PS_AI_Agent_LipSyncComponent
 * on the same Actor and outputs them as animation curves (jawOpen, mouthFunnel,
 * etc.). The downstream mh_arkit_mapping_pose node then converts these ARKit
 * names to CTRL_expressions_* curves that drive the MetaHuman facial bones.
 *
 * Graph layout:
- * [Live Link Pose] → [ElevenLabs Lip Sync] → [mh_arkit_mapping_pose] → ...
+ * [Live Link Pose] → [PS AI Agent Lip Sync] → [mh_arkit_mapping_pose] → ...
 *
 * The node auto-discovers the LipSyncComponent — no manual wiring needed.
 */
 USTRUCT(BlueprintInternalUseOnly)
-struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsLipSync : public FAnimNode_Base
+struct PS_AI_CONVAGENT_API FAnimNode_PS_AI_Agent_LipSync : public FAnimNode_Base
 {
     GENERATED_USTRUCT_BODY()
 
@@ -41,7 +41,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsLipSync : public FAnimNode
 
 private:
     /** Cached reference to the lip sync component on the owning actor. */
-    TWeakObjectPtr<UElevenLabsLipSyncComponent> LipSyncComponent;
+    TWeakObjectPtr<UPS_AI_Agent_LipSyncComponent> LipSyncComponent;
 
     /** ARKit blendshape curves to inject (jawOpen, mouthFunnel, etc.).
      * Copied from the component during Update (game thread safe). */
@@ -5,12 +5,12 @@
 #include "CoreMinimal.h"
 #include "Animation/AnimNodeBase.h"
 #include "BoneContainer.h"
-#include "AnimNode_ElevenLabsPosture.generated.h"
+#include "AnimNode_PS_AI_Agent_Posture.generated.h"
 
-class UElevenLabsPostureComponent;
+class UPS_AI_Agent_PostureComponent;
 
 /**
- * Animation node that injects ElevenLabs posture data into the AnimGraph.
+ * Animation node that injects posture data into the AnimGraph.
 *
 * Handles two types of output (each can be toggled independently):
 * 1. Head bone rotation (yaw + pitch) — applied directly to the bone transform
@@ -25,10 +25,10 @@ class UElevenLabsPostureComponent;
 *    → Injects ARKit eye curves before mh_arkit_mapping_pose.
 *    → Graph: [Source] → [Facial Expression] → [Posture] → [Lip Sync] → [mh_arkit] → ...
 *
- * The node auto-discovers the ElevenLabsPostureComponent — no manual wiring needed.
+ * The node auto-discovers the PS_AI_Agent_PostureComponent — no manual wiring needed.
 */
 USTRUCT(BlueprintInternalUseOnly)
-struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsPosture : public FAnimNode_Base
+struct PS_AI_CONVAGENT_API FAnimNode_PS_AI_Agent_Posture : public FAnimNode_Base
 {
     GENERATED_USTRUCT_BODY()
 
@@ -56,7 +56,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FAnimNode_ElevenLabsPosture : public FAnimNode
 
 private:
     /** Cached reference to the posture component on the owning actor. */
-    TWeakObjectPtr<UElevenLabsPostureComponent> PostureComponent;
+    TWeakObjectPtr<UPS_AI_Agent_PostureComponent> PostureComponent;
 
     /** Eye gaze curves to inject (8 ARKit eye look curves).
      * Copied from the component during Update (game thread safe). */
@@ -4,20 +4,20 @@
 
 #include "CoreMinimal.h"
 #include "Components/ActorComponent.h"
-#include "ElevenLabsDefinitions.h"
-#include "ElevenLabsWebSocketProxy.h"
+#include "PS_AI_Agent_Definitions.h"
+#include "PS_AI_Agent_WebSocket_ElevenLabsProxy.h"
 #include "Sound/SoundWaveProcedural.h"
 #include <atomic>
-#include "ElevenLabsConversationalAgentComponent.generated.h"
+#include "PS_AI_Agent_Conv_ElevenLabsComponent.generated.h"
 
 class UAudioComponent;
-class UElevenLabsMicrophoneCaptureComponent;
+class UPS_AI_Agent_MicrophoneCaptureComponent;
 
 // ─────────────────────────────────────────────────────────────────────────────
 // Delegates exposed to Blueprint
 // ─────────────────────────────────────────────────────────────────────────────
 DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentConnected,
-    const FElevenLabsConversationInfo&, ConversationInfo);
+    const FPS_AI_Agent_ConversationInfo_ElevenLabs&, ConversationInfo);
 
 DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnAgentDisconnected,
     int32, StatusCode, const FString&, Reason);
@@ -26,7 +26,7 @@ DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentError,
     const FString&, ErrorMessage);
 
 DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentTranscript,
-    const FElevenLabsTranscriptSegment&, Segment);
+    const FPS_AI_Agent_TranscriptSegment_ElevenLabs&, Segment);
 
 DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentTextResponse,
     const FString&, ResponseText);
@@ -68,8 +68,8 @@ DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnAgentResponseTimeout);
 * The emotion changes BEFORE the corresponding audio arrives, giving time to blend.
 */
 DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnAgentEmotionChanged,
-    EElevenLabsEmotion, Emotion,
-    EElevenLabsEmotionIntensity, Intensity);
+    EPS_AI_Agent_Emotion, Emotion,
+    EPS_AI_Agent_EmotionIntensity, Intensity);
 
 /**
 * Fired for any client tool call that is NOT automatically handled (i.e. not "set_emotion").
@@ -77,14 +77,14 @@ DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnAgentEmotionChanged,
 * You MUST call SendClientToolResult on the WebSocketProxy to acknowledge the call.
 */
 DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentClientToolCall,
-    const FElevenLabsClientToolCall&, ToolCall);
+    const FPS_AI_Agent_ClientToolCall_ElevenLabs&, ToolCall);
 
 // Non-dynamic delegate for raw agent audio (high-frequency, C++ consumers only).
 // Delivers PCM chunks as int16, 16kHz mono, little-endian.
 DECLARE_MULTICAST_DELEGATE_OneParam(FOnAgentAudioData, const TArray<uint8>& /*PCMData*/);
 
 // ─────────────────────────────────────────────────────────────────────────────
-// UElevenLabsConversationalAgentComponent
+// UPS_AI_Agent_Conv_ElevenLabsComponent
 //
 // Attach this to any Actor (e.g. a character NPC) to give it a voice powered by
 // the ElevenLabs Conversational AI API.
@@ -96,60 +96,60 @@ DECLARE_MULTICAST_DELEGATE_OneParam(FOnAgentAudioData, const TArray<uint8>& /*PC
 // 4. React to events (OnAgentTranscript, OnAgentTextResponse, etc.) in Blueprint.
 // 5. Call EndConversation() when done.
 // ─────────────────────────────────────────────────────────────────────────────
-UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
-    DisplayName = "ElevenLabs Conversational Agent")
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsConversationalAgentComponent : public UActorComponent
+UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
+    DisplayName = "PS AI Agent - ElevenLabs Conv")
+class PS_AI_CONVAGENT_API UPS_AI_Agent_Conv_ElevenLabsComponent : public UActorComponent
 {
     GENERATED_BODY()
 
 public:
-    UElevenLabsConversationalAgentComponent();
+    UPS_AI_Agent_Conv_ElevenLabsComponent();
 
     // ── Configuration ─────────────────────────────────────────────────────────
 
-    /** ElevenLabs Agent ID used for this conversation. Leave empty to use the default from Project Settings > ElevenLabs. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
+    /** ElevenLabs Agent ID used for this conversation. Leave empty to use the default from Project Settings > PS AI Agent - ElevenLabs. */
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
        meta = (ToolTip = "ElevenLabs Agent ID. Leave empty to use the project default from Project Settings."))
    FString AgentID;
 
    /** How turn-taking is managed between the user and the agent.\n- Server VAD (recommended): ElevenLabs automatically detects when the user stops speaking.\n- Client Controlled: You manually call StartListening/StopListening (push-to-talk with a key). */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
        meta = (ToolTip = "Turn-taking mode.\n- Server VAD: ElevenLabs detects end-of-speech automatically (hands-free).\n- Client Controlled: You call StartListening/StopListening manually (push-to-talk)."))
-    EElevenLabsTurnMode TurnMode = EElevenLabsTurnMode::Server;
+    EPS_AI_Agent_TurnMode_ElevenLabs TurnMode = EPS_AI_Agent_TurnMode_ElevenLabs::Server;
 
    /** Automatically open the microphone as soon as the WebSocket connection is established. Only applies in Server VAD mode. In Client (push-to-talk) mode, you must call StartListening manually. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
        meta = (ToolTip = "Auto-open the microphone when the conversation starts.\nOnly applies in Server VAD mode. In push-to-talk mode, call StartListening() manually."))
    bool bAutoStartListening = true;
 
    /** Let the LLM start generating a response during silence, before the VAD is fully confident the user has finished speaking. Saves 200-500ms of latency but may be unstable in long multi-turn sessions. Disabled by default. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Latency",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Latency",
        meta = (ToolTip = "Speculative turn: the LLM begins generating during silence before full turn-end confidence.\nReduces latency by 200-500ms. May be unstable in long sessions — test before enabling in production."))
    bool bSpeculativeTurn = false;
 
    /** How many milliseconds of microphone audio to accumulate before sending a chunk to ElevenLabs. Lower values reduce latency but may degrade voice detection accuracy. Higher values are more reliable but add delay. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Latency",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Latency",
        meta = (ClampMin = "20", ClampMax = "500",
           ToolTip = "Mic audio chunk duration sent to ElevenLabs.\n- 50-80ms: lower latency, less reliable voice detection.\n- 100ms (default): good balance.\n- 150-250ms: more reliable, higher latency."))
    int32 MicChunkDurationMs = 100;
 
    /** Allow the user to interrupt the agent while it is speaking.\n- In Server VAD mode: the microphone stays open during agent speech and the server detects interruptions automatically.\n- In Client (push-to-talk) mode: pressing the talk key while the agent speaks sends an interrupt signal.\n- When disabled: the user must wait for the agent to finish speaking before talking. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
       meta = (ToolTip = "Allow the user to interrupt the agent while it speaks.\n- Server VAD: mic stays open, server detects user voice automatically.\n- Push-to-talk: pressing the talk key interrupts the agent.\n- Disabled: user must wait for the agent to finish."))
    bool bAllowInterruption = true;
 
    /** Enable the OnAgentTranscript event, which provides real-time speech-to-text of what the user is saying. Disable if you don't need to display user speech to reduce processing overhead. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Events",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fire OnAgentTranscript with real-time speech-to-text of user speech.\nDisable if you don't need to display what the user said."))
    bool bEnableUserTranscript = true;
 
    /** Enable the OnAgentTextResponse event, which provides the agent's complete text response once fully generated. Disable if you only need the audio output. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Events",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fire OnAgentTextResponse with the agent's complete text once fully generated.\nDisable if you only need the audio output."))
    bool bEnableAgentTextResponse = true;
 
    /** Enable the OnAgentPartialResponse event, which streams the agent's text word-by-word as the LLM generates it. Use this for real-time subtitles that appear while the agent speaks, rather than waiting for the full response. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Events",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fire OnAgentPartialResponse with streaming text fragments as the LLM generates them.\nIdeal for real-time subtitles. Each event gives one text chunk, not the accumulated text."))
    bool bEnableAgentPartialResponse = false;
 
@@ -157,13 +157,13 @@ public:
     * Delays playback start so early TTS chunks can accumulate, preventing
     * mid-sentence pauses when the second chunk hasn't arrived yet.
     * Set to 0 for immediate playback. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Latency",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs|Latency",
       meta = (ClampMin = "0", ClampMax = "4000",
          ToolTip = "Pre-buffer delay in ms before starting audio playback.\nHigher values reduce mid-sentence pauses but add initial latency.\n0 = immediate playback."))
    int32 AudioPreBufferMs = 2000;
 
    /** Safety timeout: if the server does not start generating a response within this many seconds after the user stops speaking, fire OnAgentResponseTimeout. Set to 0 to disable. A normal response starts within 0.1-0.8s. */
-    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs",
+    UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS AI Agent|ElevenLabs",
       meta = (ClampMin = "0.0",
         ToolTip = "Seconds to wait for a server response after the user stops speaking.\nFires OnAgentResponseTimeout if exceeded. Normal latency is 0.1-0.8s.\nSet to 0 to disable. Default: 10s."))
    float ResponseTimeoutSeconds = 10.0f;
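The AudioPreBufferMs behaviour described above can be sketched outside of Unreal as a small gate: playback is held back until enough audio has accumulated, then stays latched on. This is a hypothetical standalone helper for illustration, not the plugin's actual implementation:

```cpp
#include <cassert>
#include <cstdint>

// Minimal sketch (assumption, not the plugin's code) of pre-buffer gating:
// playback only begins once BufferedMs crosses PreBufferMs, preventing
// mid-sentence starvation when the next TTS chunk hasn't arrived yet.
struct PreBufferGate
{
    int32_t PreBufferMs = 2000;   // threshold before playback may start
    int32_t BufferedMs  = 0;      // audio accumulated so far
    bool    bPlaying    = false;  // latched once the threshold is crossed

    // Feed one chunk of ChunkMs milliseconds; returns true once playback
    // has started (and keeps returning true afterwards).
    bool OnAudioChunk(int32_t ChunkMs)
    {
        BufferedMs += ChunkMs;
        if (!bPlaying && BufferedMs >= PreBufferMs)
        {
            bPlaying = true;
        }
        return bPlaying;
    }
};
```

With the default of 2000 ms, four 500 ms chunks must arrive before playback starts; with a threshold of 0, the first chunk starts playback immediately, matching the "0 = immediate playback" note in the tooltip.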
@@ -171,81 +171,81 @@ public:
    // ── Events ────────────────────────────────────────────────────────────────
 
    /** Fired when the WebSocket connection is established and the conversation session is ready. Provides the ConversationID and AgentID. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires when the connection to ElevenLabs is established and the conversation is ready to begin."))
    FOnAgentConnected OnAgentConnected;
 
    /** Fired when the WebSocket connection is closed (gracefully or due to an error). Provides the status code and reason. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires when the connection to ElevenLabs is closed. Check StatusCode and Reason for details."))
    FOnAgentDisconnected OnAgentDisconnected;
 
    /** Fired on any connection or protocol error. The error message describes what went wrong. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires on connection or protocol errors. The ErrorMessage describes the issue."))
    FOnAgentError OnAgentError;
 
    /** Fired with real-time speech-to-text of the user's voice. Includes both tentative (in-progress) and final transcripts. Requires bEnableUserTranscript to be true. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Real-time speech-to-text of the user's voice.\nIncludes tentative and final transcripts. Enable with bEnableUserTranscript."))
    FOnAgentTranscript OnAgentTranscript;
 
    /** Fired once when the agent's complete text response is available. This is the full text that corresponds to the audio the agent speaks. Requires bEnableAgentTextResponse to be true. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "The agent's complete text response (matches the spoken audio).\nFires once when the full text is ready. Enable with bEnableAgentTextResponse."))
    FOnAgentTextResponse OnAgentTextResponse;
 
    /** Fired repeatedly as the LLM generates text, providing one word/fragment at a time. Use for real-time subtitles. Each call gives a new fragment, NOT the accumulated text. Requires bEnableAgentPartialResponse to be true. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Streaming text fragments as the LLM generates them (word by word).\nIdeal for real-time subtitles. Enable with bEnableAgentPartialResponse."))
    FOnAgentPartialResponse OnAgentPartialResponse;
 
    /** Fired when the agent begins playing audio (first audio chunk received). Use this to trigger speech animations or UI indicators. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires when the agent starts speaking (first audio chunk). Use for lip-sync or UI feedback."))
    FOnAgentStartedSpeaking OnAgentStartedSpeaking;
 
    /** Fired when the agent finishes playing all audio. Use this to re-open the microphone (in Server VAD mode without interruption) or update UI. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires when the agent finishes speaking. Use to re-open the mic or update UI."))
    FOnAgentStoppedSpeaking OnAgentStoppedSpeaking;
 
    /** Fired when the agent's speech is interrupted (either by the user speaking over it, or by a manual InterruptAgent call). The audio playback is automatically stopped. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires when the agent is interrupted mid-speech. Audio is automatically stopped."))
    FOnAgentInterrupted OnAgentInterrupted;
 
    /** Fired when the server starts generating a response (before any audio arrives). Use this for "thinking..." UI feedback. In push-to-talk mode, the microphone is automatically closed when this fires. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires when the server starts generating (before audio arrives).\nUse for 'thinking...' UI. Mic is auto-closed in push-to-talk mode."))
    FOnAgentStartedGenerating OnAgentStartedGenerating;
 
    /** Fired if the server does not start generating a response within ResponseTimeoutSeconds after the user stops speaking. Use this to show a "try again" message or automatically re-open the microphone. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires if the server doesn't respond within ResponseTimeoutSeconds.\nUse to show 'try again' or re-open the mic automatically."))
    FOnAgentResponseTimeout OnAgentResponseTimeout;
 
    /** Fired when the agent changes emotion via the "set_emotion" client tool. The emotion is set BEFORE the corresponding audio arrives, giving you time to smoothly blend facial expressions. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires when the agent sets an emotion (joy, sadness, surprise, fear, anger, disgust).\nDriven by the 'set_emotion' client tool. Arrives before the audio."))
    FOnAgentEmotionChanged OnAgentEmotionChanged;
 
    /** Fired for client tool calls that are NOT automatically handled (i.e. not "set_emotion"). You must call GetWebSocketProxy()->SendClientToolResult() to respond. */
-    UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events",
+    UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events",
       meta = (ToolTip = "Fires for custom client tool calls (not set_emotion).\nYou must respond via GetWebSocketProxy()->SendClientToolResult()."))
    FOnAgentClientToolCall OnAgentClientToolCall;
 
    /** The current emotion of the agent, as set by the "set_emotion" client tool. Defaults to Neutral. */
-    UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
-    EElevenLabsEmotion CurrentEmotion = EElevenLabsEmotion::Neutral;
+    UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
+    EPS_AI_Agent_Emotion CurrentEmotion = EPS_AI_Agent_Emotion::Neutral;
 
    /** The current emotion intensity. Defaults to Medium. */
-    UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
-    EElevenLabsEmotionIntensity CurrentEmotionIntensity = EElevenLabsEmotionIntensity::Medium;
+    UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
+    EPS_AI_Agent_EmotionIntensity CurrentEmotionIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
 
    // ── Raw audio data (C++ only, used by LipSync component) ────────────────
    /** Raw PCM audio from the agent (int16, 16kHz mono). Fires for each WebSocket audio chunk.
-     * Used internally by UElevenLabsLipSyncComponent for spectral analysis. */
+     * Used internally by UPS_AI_Agent_LipSyncComponent for spectral analysis. */
    FOnAgentAudioData OnAgentAudioData;
 
    // ── Control ───────────────────────────────────────────────────────────────
@@ -254,25 +254,25 @@ public:
     * Open the WebSocket connection and start the conversation.
     * If bAutoStartListening is true, microphone capture also starts once connected.
     */
-    UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+    UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
    void StartConversation();
 
    /** Close the WebSocket and stop all audio. */
-    UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+    UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
    void EndConversation();
 
    /**
     * Start capturing microphone audio and streaming it to ElevenLabs.
     * In Client turn mode, also sends a UserTurnStart signal.
     */
-    UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+    UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
    void StartListening();
 
    /**
     * Stop capturing microphone audio.
     * In Client turn mode, also sends a UserTurnEnd signal.
     */
-    UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+    UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
    void StopListening();
 
    /**
@@ -280,35 +280,35 @@ public:
     * The agent will respond with audio and text just as if it heard you speak.
     * Useful for testing in the Editor or for text-based interaction.
     */
-    UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+    UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
    void SendTextMessage(const FString& Text);
 
    /** Interrupt the agent's current utterance. */
-    UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+    UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
    void InterruptAgent();
 
    // ── State queries ─────────────────────────────────────────────────────────
 
-    UFUNCTION(BlueprintPure, Category = "ElevenLabs")
+    UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
    bool IsConnected() const;
 
-    UFUNCTION(BlueprintPure, Category = "ElevenLabs")
+    UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
    bool IsListening() const { return bIsListening; }
 
-    UFUNCTION(BlueprintPure, Category = "ElevenLabs")
+    UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
    bool IsAgentSpeaking() const { return bAgentSpeaking; }
 
-    UFUNCTION(BlueprintPure, Category = "ElevenLabs")
-    const FElevenLabsConversationInfo& GetConversationInfo() const;
+    UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
+    const FPS_AI_Agent_ConversationInfo_ElevenLabs& GetConversationInfo() const;
 
    /** True while audio is being pre-buffered (playback hasn't started yet).
     * Used by the LipSync component to pause viseme queue consumption. */
-    UFUNCTION(BlueprintPure, Category = "ElevenLabs")
+    UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
|
||||
bool IsPreBuffering() const { return bPreBuffering; }
|
||||
|
||||
/** Access the underlying WebSocket proxy (advanced use). */
|
||||
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
|
||||
UElevenLabsWebSocketProxy* GetWebSocketProxy() const { return WebSocketProxy; }
|
||||
UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
|
||||
UPS_AI_Agent_WebSocket_ElevenLabsProxy* GetWebSocketProxy() const { return WebSocketProxy; }
|
||||
|
||||
// ─────────────────────────────────────────────────────────────────────────
|
||||
// UActorComponent overrides
|
||||
@ -321,7 +321,7 @@ public:
|
||||
private:
|
||||
// ── Internal event handlers ───────────────────────────────────────────────
|
||||
UFUNCTION()
|
||||
void HandleConnected(const FElevenLabsConversationInfo& Info);
|
||||
void HandleConnected(const FPS_AI_Agent_ConversationInfo_ElevenLabs& Info);
|
||||
|
||||
UFUNCTION()
|
||||
void HandleDisconnected(int32 StatusCode, const FString& Reason);
|
||||
@ -333,7 +333,7 @@ private:
|
||||
void HandleAudioReceived(const TArray<uint8>& PCMData);
|
||||
|
||||
UFUNCTION()
|
||||
void HandleTranscript(const FElevenLabsTranscriptSegment& Segment);
|
||||
void HandleTranscript(const FPS_AI_Agent_TranscriptSegment_ElevenLabs& Segment);
|
||||
|
||||
UFUNCTION()
|
||||
void HandleAgentResponse(const FString& ResponseText);
|
||||
@ -348,7 +348,7 @@ private:
|
||||
void HandleAgentResponsePart(const FString& PartialText);
|
||||
|
||||
UFUNCTION()
|
||||
void HandleClientToolCall(const FElevenLabsClientToolCall& ToolCall);
|
||||
void HandleClientToolCall(const FPS_AI_Agent_ClientToolCall_ElevenLabs& ToolCall);
|
||||
|
||||
// ── Audio playback ────────────────────────────────────────────────────────
|
||||
void InitAudioPlayback();
|
||||
@ -364,7 +364,7 @@ private:
|
||||
|
||||
// ── Sub-objects ───────────────────────────────────────────────────────────
|
||||
UPROPERTY()
|
||||
UElevenLabsWebSocketProxy* WebSocketProxy = nullptr;
|
||||
UPS_AI_Agent_WebSocket_ElevenLabsProxy* WebSocketProxy = nullptr;
|
||||
|
||||
UPROPERTY()
|
||||
UAudioComponent* AudioPlaybackComponent = nullptr;
|
||||
@@ -3,13 +3,13 @@
 #pragma once
 
 #include "CoreMinimal.h"
-#include "ElevenLabsDefinitions.generated.h"
+#include "PS_AI_Agent_Definitions.generated.h"
 
 // ─────────────────────────────────────────────────────────────────────────────
 // Connection state
 // ─────────────────────────────────────────────────────────────────────────────
 UENUM(BlueprintType)
-enum class EElevenLabsConnectionState : uint8
+enum class EPS_AI_Agent_ConnectionState_ElevenLabs : uint8
 {
 Disconnected UMETA(DisplayName = "Disconnected"),
 Connecting UMETA(DisplayName = "Connecting"),
@@ -21,7 +21,7 @@ enum class EElevenLabsConnectionState : uint8
 // Agent turn mode
 // ─────────────────────────────────────────────────────────────────────────────
 UENUM(BlueprintType)
-enum class EElevenLabsTurnMode : uint8
+enum class EPS_AI_Agent_TurnMode_ElevenLabs : uint8
 {
 /** ElevenLabs server decides when the user has finished speaking (default). */
 Server UMETA(DisplayName = "Server VAD"),
@@ -32,7 +32,7 @@ enum class EElevenLabsTurnMode : uint8
 // ─────────────────────────────────────────────────────────────────────────────
 // WebSocket message type helpers (internal, not exposed to Blueprint)
 // ─────────────────────────────────────────────────────────────────────────────
-namespace ElevenLabsMessageType
+namespace PS_AI_Agent_MessageType_ElevenLabs
 {
 // Client → Server
 static const FString AudioChunk = TEXT("user_audio_chunk");
@@ -62,7 +62,7 @@ namespace ElevenLabsMessageType
 // Audio format exchanged with ElevenLabs
 // PCM 16-bit signed, 16000 Hz, mono, little-endian.
 // ─────────────────────────────────────────────────────────────────────────────
-namespace ElevenLabsAudio
+namespace PS_AI_Agent_Audio_ElevenLabs
 {
 static constexpr int32 SampleRate = 16000;
 static constexpr int32 Channels = 1;
@@ -75,16 +75,16 @@ namespace ElevenLabsAudio
 // Conversation metadata received on successful connection
 // ─────────────────────────────────────────────────────────────────────────────
 USTRUCT(BlueprintType)
-struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsConversationInfo
+struct PS_AI_CONVAGENT_API FPS_AI_Agent_ConversationInfo_ElevenLabs
 {
 GENERATED_BODY()
 
 /** Unique ID of this conversation session assigned by ElevenLabs. */
-UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
+UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
 FString ConversationID;
 
 /** Agent ID that is responding. */
-UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
+UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
 FString AgentID;
 };
 
@@ -92,20 +92,20 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsConversationInfo
 // Transcript segment
 // ─────────────────────────────────────────────────────────────────────────────
 USTRUCT(BlueprintType)
-struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsTranscriptSegment
+struct PS_AI_CONVAGENT_API FPS_AI_Agent_TranscriptSegment_ElevenLabs
 {
 GENERATED_BODY()
 
 /** Transcribed text. */
-UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
+UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
 FString Text;
 
 /** "user" or "agent". */
-UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
+UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
 FString Speaker;
 
 /** Whether this is a final transcript or a tentative (in-progress) one. */
-UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
+UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
 bool bIsFinal = false;
 };
 
@@ -113,7 +113,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsTranscriptSegment
 // Agent emotion (driven by client tool "set_emotion" from the LLM)
 // ─────────────────────────────────────────────────────────────────────────────
 UENUM(BlueprintType)
-enum class EElevenLabsEmotion : uint8
+enum class EPS_AI_Agent_Emotion : uint8
 {
 Neutral UMETA(DisplayName = "Neutral"),
 Joy UMETA(DisplayName = "Joy"),
@@ -128,7 +128,7 @@ enum class EElevenLabsEmotion : uint8
 // Emotion intensity (maps to Normal/Medium/Extreme pose variants)
 // ─────────────────────────────────────────────────────────────────────────────
 UENUM(BlueprintType)
-enum class EElevenLabsEmotionIntensity : uint8
+enum class EPS_AI_Agent_EmotionIntensity : uint8
 {
 Low UMETA(DisplayName = "Low (Normal)"),
 Medium UMETA(DisplayName = "Medium"),
@@ -139,19 +139,19 @@ enum class EElevenLabsEmotionIntensity : uint8
 // Client tool call received from ElevenLabs server
 // ─────────────────────────────────────────────────────────────────────────────
 USTRUCT(BlueprintType)
-struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsClientToolCall
+struct PS_AI_CONVAGENT_API FPS_AI_Agent_ClientToolCall_ElevenLabs
 {
 GENERATED_BODY()
 
 /** Name of the tool the agent wants to invoke (e.g. "set_emotion"). */
-UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
+UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
 FString ToolName;
 
 /** Unique ID for this tool invocation — must be echoed back in client_tool_result. */
-UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
+UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
 FString ToolCallId;
 
 /** Raw JSON parameters as key-value string pairs. */
-UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
+UPROPERTY(BlueprintReadOnly, Category = "PS AI Agent|ElevenLabs")
 TMap<FString, FString> Parameters;
 };
@@ -5,8 +5,8 @@
 #include "CoreMinimal.h"
 #include "Engine/DataAsset.h"
 #include "Engine/AssetManager.h"
-#include "ElevenLabsDefinitions.h"
-#include "ElevenLabsEmotionPoseMap.generated.h"
+#include "PS_AI_Agent_Definitions.h"
+#include "PS_AI_Agent_EmotionPoseMap.generated.h"
 
 class UAnimSequence;
 
@@ -14,7 +14,7 @@ class UAnimSequence;
 // Emotion pose set: 3 intensity levels (Normal / Medium / Extreme)
 // ─────────────────────────────────────────────────────────────────────────────
 USTRUCT(BlueprintType)
-struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsEmotionPoseSet
+struct PS_AI_CONVAGENT_API FPS_AI_Agent_EmotionPoseSet
 {
 GENERATED_BODY()
 
@@ -38,16 +38,16 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsEmotionPoseSet
 * Reusable data asset that maps emotions to facial expression AnimSequences.
 *
 * Create ONE instance of this asset in the Content Browser
-* (right-click → Miscellaneous → Data Asset → ElevenLabsEmotionPoseMap),
+* (right-click → Miscellaneous → Data Asset → PS_AI_Agent_EmotionPoseMap),
 * assign your emotion AnimSequences, then reference this asset
-* on the ElevenLabs Facial Expression component.
+* on the PS AI Agent Facial Expression component.
 *
 * The component plays the AnimSequence in real-time (looping) to drive
 * emotion-based facial expressions (eyes, eyebrows, cheeks, mouth mood).
 * Lip sync overrides the mouth-area curves on top.
 */
-UCLASS(BlueprintType, Blueprintable, DisplayName = "ElevenLabs Emotion Pose Map")
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsEmotionPoseMap : public UPrimaryDataAsset
+UCLASS(BlueprintType, Blueprintable, DisplayName = "PS AI Agent Emotion Pose Map")
+class PS_AI_CONVAGENT_API UPS_AI_Agent_EmotionPoseMap : public UPrimaryDataAsset
 {
 GENERATED_BODY()
 
@@ -57,5 +57,5 @@ public:
 * Neutral is recommended — it plays by default at startup (blinking, breathing). */
 UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "Emotion Poses",
 meta = (ToolTip = "Emotion → AnimSequence mapping with 3 intensity levels.\nThese drive the base facial expression (eyes, brows, cheeks).\nLip sync overrides the mouth area on top."))
-TMap<EElevenLabsEmotion, FElevenLabsEmotionPoseSet> EmotionPoses;
+TMap<EPS_AI_Agent_Emotion, FPS_AI_Agent_EmotionPoseSet> EmotionPoses;
 };
@@ -4,18 +4,18 @@
 
 #include "CoreMinimal.h"
 #include "Components/ActorComponent.h"
-#include "ElevenLabsDefinitions.h"
-#include "ElevenLabsFacialExpressionComponent.generated.h"
+#include "PS_AI_Agent_Definitions.h"
+#include "PS_AI_Agent_FacialExpressionComponent.generated.h"
 
-class UElevenLabsConversationalAgentComponent;
-class UElevenLabsEmotionPoseMap;
+class UPS_AI_Agent_Conv_ElevenLabsComponent;
+class UPS_AI_Agent_EmotionPoseMap;
 class UAnimSequence;
 
 // ─────────────────────────────────────────────────────────────────────────────
-// UElevenLabsFacialExpressionComponent
+// UPS_AI_Agent_FacialExpressionComponent
 //
 // Drives emotion-based facial expressions on a MetaHuman (or any skeletal mesh)
-// as a BASE layer. Lip sync (from ElevenLabsLipSyncComponent) modulates on top,
+// as a BASE layer. Lip sync (from PS_AI_Agent_LipSyncComponent) modulates on top,
 // overriding only mouth-area curves during speech.
 //
 // Emotion AnimSequences are played back in real-time (looping), not sampled
@@ -24,31 +24,31 @@ class UAnimSequence;
 //
 // Workflow:
 // 1. Assign a PoseMap data asset with Emotion Poses filled in.
-// 2. Add the AnimNode "ElevenLabs Facial Expression" in the Face AnimBP
-// BEFORE the "ElevenLabs Lip Sync" node.
+// 2. Add the AnimNode "PS AI Agent Facial Expression" in the Face AnimBP
+// BEFORE the "PS AI Agent Lip Sync" node.
 // 3. The component listens to OnAgentEmotionChanged from the agent component.
 // 4. Emotion animations crossfade smoothly (configurable duration).
 // 5. The AnimNode reads GetCurrentEmotionCurves() and injects them into the pose.
 // ─────────────────────────────────────────────────────────────────────────────
-UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
-DisplayName = "ElevenLabs Facial Expression")
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsFacialExpressionComponent : public UActorComponent
+UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
+DisplayName = "PS AI Agent Facial Expression")
+class PS_AI_CONVAGENT_API UPS_AI_Agent_FacialExpressionComponent : public UActorComponent
 {
 GENERATED_BODY()
 
 public:
-UElevenLabsFacialExpressionComponent();
+UPS_AI_Agent_FacialExpressionComponent();
 
 // ── Configuration ─────────────────────────────────────────────────────────
 
 /** Emotion pose map asset containing emotion AnimSequences (Normal / Medium / Extreme per emotion).
-* Create a dedicated ElevenLabsEmotionPoseMap asset in the Content Browser. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|FacialExpression",
-meta = (ToolTip = "Dedicated Emotion Pose Map asset.\nRight-click Content Browser → Miscellaneous → ElevenLabs Emotion Pose Map."))
-TObjectPtr<UElevenLabsEmotionPoseMap> EmotionPoseMap;
+* Create a dedicated PS_AI_Agent_EmotionPoseMap asset in the Content Browser. */
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|FacialExpression",
+meta = (ToolTip = "Dedicated Emotion Pose Map asset.\nRight-click Content Browser → Miscellaneous → PS AI Agent Emotion Pose Map."))
+TObjectPtr<UPS_AI_Agent_EmotionPoseMap> EmotionPoseMap;
 
 /** Emotion crossfade duration in seconds. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|FacialExpression",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|FacialExpression",
 meta = (ClampMin = "0.1", ClampMax = "3.0",
 ToolTip = "How long (seconds) to crossfade between emotions.\n0.5 = snappy, 1.5 = smooth."))
 float EmotionBlendDuration = 0.5f;
@@ -56,19 +56,19 @@ public:
 // ── Getters ───────────────────────────────────────────────────────────────
 
 /** Get the current emotion curves evaluated from the playing AnimSequence. */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs|FacialExpression")
+UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|FacialExpression")
 const TMap<FName, float>& GetCurrentEmotionCurves() const { return CurrentEmotionCurves; }
 
 /** Get the active emotion. */
-UFUNCTION(BlueprintPure, Category = "ElevenLabs|FacialExpression")
-EElevenLabsEmotion GetActiveEmotion() const { return ActiveEmotion; }
+UFUNCTION(BlueprintPure, Category = "PS_AI_Agent|FacialExpression")
+EPS_AI_Agent_Emotion GetActiveEmotion() const { return ActiveEmotion; }
 
 /** Get the active emotion intensity. */
-UFUNCTION(BlueprintPure, Category = "ElevenLabs|FacialExpression")
-EElevenLabsEmotionIntensity GetActiveIntensity() const { return ActiveEmotionIntensity; }
+UFUNCTION(BlueprintPure, Category = "PS_AI_Agent|FacialExpression")
+EPS_AI_Agent_EmotionIntensity GetActiveIntensity() const { return ActiveEmotionIntensity; }
 
 /** Check if a curve name belongs to the mouth area (overridden by lip sync). */
-UFUNCTION(BlueprintPure, Category = "ElevenLabs|FacialExpression")
+UFUNCTION(BlueprintPure, Category = "PS_AI_Agent|FacialExpression")
 static bool IsMouthCurve(const FName& CurveName);
 
 // ── UActorComponent overrides ─────────────────────────────────────────────
@@ -82,7 +82,7 @@ private:
 
 /** Called when the agent changes emotion via client tool. */
 UFUNCTION()
-void OnEmotionChanged(EElevenLabsEmotion Emotion, EElevenLabsEmotionIntensity Intensity);
+void OnEmotionChanged(EPS_AI_Agent_Emotion Emotion, EPS_AI_Agent_EmotionIntensity Intensity);
 
 // ── Helpers ───────────────────────────────────────────────────────────────
 
@@ -90,7 +90,7 @@ private:
 void ValidateEmotionPoses();
 
 /** Find the best AnimSequence for a given emotion + intensity (with fallback). */
-UAnimSequence* FindAnimForEmotion(EElevenLabsEmotion Emotion, EElevenLabsEmotionIntensity Intensity) const;
+UAnimSequence* FindAnimForEmotion(EPS_AI_Agent_Emotion Emotion, EPS_AI_Agent_EmotionIntensity Intensity) const;
 
 /** Evaluate all FloatCurves from an AnimSequence at a given time. */
 TMap<FName, float> EvaluateAnimCurves(UAnimSequence* AnimSeq, float Time) const;
@@ -118,9 +118,9 @@ private:
 TMap<FName, float> CurrentEmotionCurves;
 
 /** Active emotion (for change detection). */
-EElevenLabsEmotion ActiveEmotion = EElevenLabsEmotion::Neutral;
-EElevenLabsEmotionIntensity ActiveEmotionIntensity = EElevenLabsEmotionIntensity::Medium;
+EPS_AI_Agent_Emotion ActiveEmotion = EPS_AI_Agent_Emotion::Neutral;
+EPS_AI_Agent_EmotionIntensity ActiveEmotionIntensity = EPS_AI_Agent_EmotionIntensity::Medium;
 
 /** Cached reference to the agent component on the same Actor. */
-TWeakObjectPtr<UElevenLabsConversationalAgentComponent> AgentComponent;
+TWeakObjectPtr<UPS_AI_Agent_Conv_ElevenLabsComponent> AgentComponent;
 };
@@ -5,11 +5,11 @@
 #include "CoreMinimal.h"
 #include "Components/ActorComponent.h"
 #include "DSP/SpectrumAnalyzer.h"
-#include "ElevenLabsLipSyncComponent.generated.h"
+#include "PS_AI_Agent_LipSyncComponent.generated.h"
 
-class UElevenLabsConversationalAgentComponent;
-class UElevenLabsFacialExpressionComponent;
-class UElevenLabsLipSyncPoseMap;
+class UPS_AI_Agent_Conv_ElevenLabsComponent;
+class UPS_AI_Agent_FacialExpressionComponent;
+class UPS_AI_Agent_LipSyncPoseMap;
 class USkeletalMeshComponent;
 
 /** A single entry in the decoupled viseme timeline.
@@ -22,10 +22,10 @@ struct FVisemeTimelineEntry
 };
 
 // Fired every tick when viseme/blendshape data has been updated.
-DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsVisemesReady);
+DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnPS_AI_Agent_VisemesReady);
 
 /**
-* Real-time lip sync component for ElevenLabs Conversational AI.
+* Real-time lip sync component for conversational AI agents.
 *
 * Attaches to the same Actor as the Conversational Agent component.
 * Receives the agent's audio stream, performs spectral analysis,
@@ -38,26 +38,26 @@ DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsVisemesReady);
 * 3. Conversation starts → lip sync works automatically.
 * 4. (Optional) Bind OnVisemesReady for custom Blueprint handling.
 */
-UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
-DisplayName = "ElevenLabs Lip Sync")
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsLipSyncComponent : public UActorComponent
+UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
+DisplayName = "PS AI Agent Lip Sync")
+class PS_AI_CONVAGENT_API UPS_AI_Agent_LipSyncComponent : public UActorComponent
 {
 GENERATED_BODY()
 
 public:
-UElevenLabsLipSyncComponent();
-~UElevenLabsLipSyncComponent();
+UPS_AI_Agent_LipSyncComponent();
+~UPS_AI_Agent_LipSyncComponent();
 
 // ── Configuration ─────────────────────────────────────────────────────────
 
 /** Target skeletal mesh to auto-apply morph targets. Leave empty to handle
 * visemes manually via OnVisemesReady + GetCurrentBlendshapes(). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
 meta = (ToolTip = "Skeletal mesh to drive morph targets on.\nLeave empty to read values manually via GetCurrentBlendshapes()."))
 TObjectPtr<USkeletalMeshComponent> TargetMesh;
 
 /** Overall mouth movement intensity multiplier. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
 meta = (ClampMin = "0.0", ClampMax = "3.0",
 ToolTip = "Lip sync intensity.\n1.0 = normal, higher = more expressive, lower = subtler."))
 float LipSyncStrength = 1.0f;
@@ -65,26 +65,26 @@ public:
 /** Scales the audio amplitude driving mouth movement.
 * Lower values produce subtler animation, higher values are more pronounced.
 * Use this to tone down overly strong lip movement without changing the shape. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
 meta = (ClampMin = "0.5", ClampMax = "1.0",
 ToolTip = "Audio amplitude scale.\n0.5 = subtle, 0.75 = balanced, 1.0 = full.\nReduces overall mouth movement without affecting viseme shape."))
 float AmplitudeScale = 1.0f;
 
 /** How quickly viseme weights interpolate towards new values each frame. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
 meta = (ClampMin = "35.0", ClampMax = "65.0",
 ToolTip = "Smoothing speed for viseme transitions.\n35 = smooth and soft, 50 = balanced, 65 = sharp and responsive."))
 float SmoothingSpeed = 50.0f;
 
 // ── Emotion Expression Blend ─────────────────────────────────────────────
 
-/** How much facial emotion (from ElevenLabsFacialExpressionComponent) bleeds through
+/** How much facial emotion (from PS_AI_Agent_FacialExpressionComponent) bleeds through
 * during speech on expression curves (smile, frown, sneer, dimple, etc.).
 * Shape curves (jaw, tongue, lip shape) are always 100% lip sync.
 * 0.0 = lip sync completely overrides expression curves (old behavior).
 * 1.0 = emotion fully additive on expression curves during speech.
 * 0.5 = balanced blend (recommended). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
 meta = (ClampMin = "0.0", ClampMax = "1.0",
 ToolTip = "Emotion bleed-through during speech.\n0 = pure lip sync, 0.5 = balanced, 1 = full emotion on expression curves.\nOnly affects expression curves (smile, frown, etc.), not lip shape."))
 float EmotionExpressionBlend = 0.5f;
@@ -94,7 +94,7 @@ public:
 /** Envelope attack time in milliseconds.
 * Controls how fast the mouth opens when speech starts or gets louder.
 * Lower = snappier onset, higher = gentler opening. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
 meta = (ClampMin = "1.0", ClampMax = "50.0",
 ToolTip = "Envelope attack (ms).\n5 = snappy, 10 = balanced, 25 = gentle.\nHow fast the mouth opens on speech onset."))
 float EnvelopeAttackMs = 10.0f;
@@ -102,7 +102,7 @@ public:
 /** Envelope release time in milliseconds.
 * Controls how slowly the mouth closes when speech gets quieter.
 * Higher = smoother, more natural decay. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
 meta = (ClampMin = "10.0", ClampMax = "200.0",
 ToolTip = "Envelope release (ms).\n25 = responsive, 50 = balanced, 120 = smooth/cinematic.\nHow slowly the mouth closes between syllables."))
 float EnvelopeReleaseMs = 50.0f;
@@ -110,31 +110,31 @@ public:
 // ── Phoneme Pose Map ─────────────────────────────────────────────────────
 
 /** Optional pose map asset mapping OVR visemes to phoneme AnimSequences.
-* Create once (right-click → Miscellaneous → Data Asset → ElevenLabsLipSyncPoseMap),
+* Create once (right-click → Miscellaneous → Data Asset → PS_AI_Agent_LipSyncPoseMap),
 * assign your MHF_* poses, then reference it on every MetaHuman.
 * When assigned, curve data is extracted at BeginPlay and replaces the
 * hardcoded ARKit blendshape mapping with artist-crafted poses.
 * Leave empty to use the built-in ARKit mapping (backward compatible). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|LipSync",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|LipSync",
 meta = (ToolTip = "Reusable pose map asset.\nAssign your MHF_* AnimSequences once, reuse on all MetaHumans.\nLeave empty for built-in ARKit mapping."))
-TObjectPtr<UElevenLabsLipSyncPoseMap> PoseMap;
+TObjectPtr<UPS_AI_Agent_LipSyncPoseMap> PoseMap;
 
 // ── Events ────────────────────────────────────────────────────────────────
 
 /** Fires every tick when viseme data has been updated.
 * Use GetCurrentVisemes() or GetCurrentBlendshapes() to read values. */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|LipSync",
+UPROPERTY(BlueprintAssignable, Category = "PS_AI_Agent|LipSync",
 meta = (ToolTip = "Fires each frame with updated viseme data.\nCall GetCurrentVisemes() or GetCurrentBlendshapes() to read values."))
-FOnElevenLabsVisemesReady OnVisemesReady;
+FOnPS_AI_Agent_VisemesReady OnVisemesReady;
 
 // ── Getters ───────────────────────────────────────────────────────────────
 
 /** Get current OVR viseme weights (15 values: sil, PP, FF, TH, DD, kk, CH, SS, nn, RR, aa, E, ih, oh, ou). */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs|LipSync")
+UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|LipSync")
 TMap<FName, float> GetCurrentVisemes() const { return SmoothedVisemes; }
 
 /** Get current ARKit blendshape weights (MetaHuman compatible: jawOpen, mouthFunnel, mouthClose, etc.). */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs|LipSync")
+UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|LipSync")
 TMap<FName, float> GetCurrentBlendshapes() const { return CurrentBlendshapes; }
 
 // ── UActorComponent overrides ─────────────────────────────────────────────
@@ -285,11 +285,11 @@ private:
 double WaitingForTextStartTime = 0.0;
 
 // Cached reference to the agent component on the same Actor
-TWeakObjectPtr<UElevenLabsConversationalAgentComponent> AgentComponent;
+TWeakObjectPtr<UPS_AI_Agent_Conv_ElevenLabsComponent> AgentComponent;
 FDelegateHandle AudioDataHandle;
 
 // Cached reference to the facial expression component for emotion blending
-TWeakObjectPtr<UElevenLabsFacialExpressionComponent> CachedFacialExprComp;
+TWeakObjectPtr<UPS_AI_Agent_FacialExpressionComponent> CachedFacialExprComp;
 
 // ── Pose-extracted curve data ────────────────────────────────────────────
 
@@ -5,7 +5,7 @@
 #include "CoreMinimal.h"
 #include "Engine/DataAsset.h"
 #include "Engine/AssetManager.h"
-#include "ElevenLabsLipSyncPoseMap.generated.h"
+#include "PS_AI_Agent_LipSyncPoseMap.generated.h"
 
 class UAnimSequence;
 
@@ -13,16 +13,16 @@ class UAnimSequence;
 * Reusable data asset that maps OVR visemes to phoneme pose AnimSequences.
 *
 * Create ONE instance of this asset in the Content Browser
-* (right-click → Miscellaneous → Data Asset → ElevenLabsLipSyncPoseMap),
+* (right-click → Miscellaneous → Data Asset → PS_AI_Agent_LipSyncPoseMap),
 * assign your MHF_* AnimSequences once, then reference this single asset
-* on every MetaHuman's ElevenLabs Lip Sync component.
+* on every MetaHuman's PS AI Agent Lip Sync component.
 *
 * The component extracts curve data from each pose at BeginPlay and uses it
 * to drive lip sync — replacing the hardcoded ARKit blendshape mapping with
 * artist-crafted poses that coordinate dozens of facial curves.
 */
-UCLASS(BlueprintType, Blueprintable, DisplayName = "ElevenLabs Lip Sync Pose Map")
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsLipSyncPoseMap : public UPrimaryDataAsset
+UCLASS(BlueprintType, Blueprintable, DisplayName = "PS AI Agent Lip Sync Pose Map")
+class PS_AI_CONVAGENT_API UPS_AI_Agent_LipSyncPoseMap : public UPrimaryDataAsset
 {
 GENERATED_BODY()
 
@@ -6,30 +6,30 @@
 #include "Components/ActorComponent.h"
 #include "AudioCapture.h"
 #include <atomic>
-#include "ElevenLabsMicrophoneCaptureComponent.generated.h"
+#include "PS_AI_Agent_MicrophoneCaptureComponent.generated.h"

 // Delivers captured float PCM samples (16000 Hz mono, resampled from device rate).
-DECLARE_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAudioCaptured, const TArray<float>& /*FloatPCM*/);
+DECLARE_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_AudioCaptured, const TArray<float>& /*FloatPCM*/);

 /**
 * Lightweight microphone capture component.
 * Captures from the default audio input device, resamples to 16000 Hz mono,
-* and delivers chunks via FOnElevenLabsAudioCaptured.
+* and delivers chunks via FOnPS_AI_Agent_AudioCaptured.
 *
 * Modelled after Convai's ConvaiAudioCaptureComponent but stripped to the
-* minimal functionality needed for the ElevenLabs Conversational AI API.
+* minimal functionality needed for conversational AI agents.
 */
-UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
-DisplayName = "ElevenLabs Microphone Capture")
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsMicrophoneCaptureComponent : public UActorComponent
+UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
+DisplayName = "PS AI Agent Microphone Capture")
+class PS_AI_CONVAGENT_API UPS_AI_Agent_MicrophoneCaptureComponent : public UActorComponent
 {
 GENERATED_BODY()

 public:
-UElevenLabsMicrophoneCaptureComponent();
+UPS_AI_Agent_MicrophoneCaptureComponent();

-/** Multiplier applied to the microphone input volume before sending to ElevenLabs. Increase if the agent has trouble hearing you, decrease if your audio is clipping. Default: 1.0 (no change). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Microphone",
+/** Multiplier applied to the microphone input volume before sending to the agent. Increase if the agent has trouble hearing you, decrease if your audio is clipping. Default: 1.0 (no change). */
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|MicCapture",
 meta = (ClampMin = "0.0", ClampMax = "4.0",
 ToolTip = "Microphone volume multiplier.\n1.0 = no change. Increase if the agent can't hear you, decrease if audio clips."))
 float VolumeMultiplier = 1.0f;
@@ -40,21 +40,21 @@ public:
 * Audio is captured on a WASAPI background thread, resampled there (with
 * echo suppression), then dispatched to the game thread for this broadcast.
 */
-FOnElevenLabsAudioCaptured OnAudioCaptured;
+FOnPS_AI_Agent_AudioCaptured OnAudioCaptured;

 /** Optional pointer to an atomic bool that suppresses capture when true.
 * Set by the agent component for echo suppression (skip mic while agent speaks). */
 std::atomic<bool>* EchoSuppressFlag = nullptr;

 /** Open the default capture device and begin streaming audio. */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|MicCapture")
 void StartCapture();

 /** Stop streaming and close the capture device. */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|MicCapture")
 void StopCapture();

-UFUNCTION(BlueprintPure, Category = "ElevenLabs")
+UFUNCTION(BlueprintPure, Category = "PS_AI_Agent|MicCapture")
 bool IsCapturing() const { return bCapturing; }

 // ─────────────────────────────────────────────────────────────────────────
|
||||
@@ -5,18 +5,18 @@
 #include "CoreMinimal.h"
 #include "Components/ActorComponent.h"
 #include "HAL/CriticalSection.h"
-#include "ElevenLabsPostureComponent.generated.h"
+#include "PS_AI_Agent_PostureComponent.generated.h"

 class USkeletalMeshComponent;

-DECLARE_LOG_CATEGORY_EXTERN(LogElevenLabsPosture, Log, All);
+DECLARE_LOG_CATEGORY_EXTERN(LogPS_AI_Agent_Posture, Log, All);

 // ─────────────────────────────────────────────────────────────────────────────
 // Neck bone chain entry for distributing head rotation across multiple bones
 // ─────────────────────────────────────────────────────────────────────────────

 USTRUCT(BlueprintType)
-struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsNeckBoneEntry
+struct PS_AI_CONVAGENT_API FPS_AI_Agent_NeckBoneEntry
 {
 GENERATED_BODY()

@@ -31,7 +31,7 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsNeckBoneEntry
 };

 // ─────────────────────────────────────────────────────────────────────────────
-// UElevenLabsPostureComponent
+// UPS_AI_Agent_PostureComponent
 //
 // Chase-based multi-layer look-at system for MetaHuman characters.
 // Smoothly orients the character's body, head, and eyes toward a TargetActor.
@@ -47,35 +47,35 @@ struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsNeckBoneEntry
 //
 // Workflow:
 // 1. Add this component to the character Blueprint.
-// 2. Add the AnimNode "ElevenLabs Posture" in the Body AnimBP
+// 2. Add the AnimNode "PS AI Agent Posture" in the Body AnimBP
 // with bApplyHeadRotation = true, bApplyEyeCurves = false.
-// 3. Add the AnimNode "ElevenLabs Posture" in the Face AnimBP
+// 3. Add the AnimNode "PS AI Agent Posture" in the Face AnimBP
 // with bApplyHeadRotation = false, bApplyEyeCurves = true
 // (between "Facial Expression" and "Lip Sync" nodes).
 // 4. Set TargetActor to any actor (player pawn, a prop, etc.).
 // 5. Set TargetOffset for actors without a skeleton (e.g. (0,0,160) for
 // eye-level on a simple actor).
 // ─────────────────────────────────────────────────────────────────────────────
-UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
-DisplayName = "ElevenLabs Posture")
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsPostureComponent : public UActorComponent
+UCLASS(ClassGroup = "PS AI Agent", meta = (BlueprintSpawnableComponent),
+DisplayName = "PS AI Agent Posture")
+class PS_AI_CONVAGENT_API UPS_AI_Agent_PostureComponent : public UActorComponent
 {
 GENERATED_BODY()

 public:
-UElevenLabsPostureComponent();
+UPS_AI_Agent_PostureComponent();

 // ── Target ───────────────────────────────────────────────────────────────

 /** The actor to look at. Can be any actor (player, prop, etc.).
 * Set to null to smoothly return to neutral. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ToolTip = "Target actor to look at.\nSet to null to return to neutral."))
 TObjectPtr<AActor> TargetActor;

 /** Offset from the target actor's origin to aim at.
 * Useful for actors without a skeleton (e.g. (0,0,160) for eye-level). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ToolTip = "Offset from target actor origin.\nE.g. (0,0,160) for eye-level."))
 FVector TargetOffset = FVector(0.0f, 0.0f, 0.0f);

@@ -89,44 +89,44 @@ public:
 // so small movements around the target don't re-trigger higher layers.

 /** Maximum head yaw rotation in degrees. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0", ClampMax = "90"))
 float MaxHeadYaw = 40.0f;

 /** Maximum head pitch rotation in degrees. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0", ClampMax = "90"))
 float MaxHeadPitch = 30.0f;

 /** Maximum horizontal eye angle in degrees. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0", ClampMax = "90"))
 float MaxEyeHorizontal = 15.0f;

 /** Maximum vertical eye angle in degrees. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0", ClampMax = "90"))
 float MaxEyeVertical = 10.0f;

 // ── Smoothing speeds ─────────────────────────────────────────────────────

 /** Body rotation interpolation speed (lower = slower, more natural). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0.1", ClampMax = "20"))
 float BodyInterpSpeed = 4.0f;

 /** Head rotation interpolation speed. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0.1", ClampMax = "20"))
 float HeadInterpSpeed = 4.0f;

 /** Eye movement interpolation speed (higher = snappier). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0.1", ClampMax = "20"))
 float EyeInterpSpeed = 5.0f;

 /** Interpolation speed when returning to neutral (TargetActor is null). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0.1", ClampMax = "20"))
 float ReturnToNeutralSpeed = 3.0f;

@@ -140,7 +140,7 @@ public:
 * 1.0 = head always points at target regardless of animation.
 * 0.0 = posture is additive on top of animation (old behavior).
 * Default: 1.0 for conversational AI (always look at who you talk to). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0", ClampMax = "1"))
 float HeadAnimationCompensation = 0.9f;

@@ -148,7 +148,7 @@ public:
 * 1.0 = eyes frozen on posture target, animation's eye movement removed.
 * 0.0 = animation's eyes play through, posture is additive.
 * Intermediate (e.g. 0.5) = smooth 50/50 blend. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0", ClampMax = "1"))
 float EyeAnimationCompensation = 0.6f;

@@ -157,7 +157,7 @@ public:
 * counter-rotates the posture to keep the head pointing at the target.
 * 1.0 = full compensation — head stays locked on target even during bows.
 * 0.0 = no compensation — head follows body movement naturally. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "0", ClampMax = "1"))
 float BodyDriftCompensation = 0.8f;

@@ -167,7 +167,7 @@ public:
 * Green = head bone → target (desired).
 * Cyan = computed gaze direction (body yaw + head + eyes).
 * Useful for verifying that eye contact is accurate. */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture")
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture")
 bool bDrawDebugGaze = false;

 // ── Forward offset ──────────────────────────────────────────────────────
@@ -177,14 +177,14 @@ public:
 * 0 = mesh faces +X (default UE convention)
 * 90 = mesh faces +Y
 * -90 = mesh faces -Y */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture",
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture",
 meta = (ClampMin = "-180", ClampMax = "180"))
 float MeshForwardYawOffset = 90.0f;

 // ── Head bone ────────────────────────────────────────────────────────────

 /** Name of the head bone on the skeletal mesh (used for eye origin calculation). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture")
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture")
 FName HeadBoneName = FName(TEXT("head"));

 // ── Neck bone chain ─────────────────────────────────────────────────────
@@ -193,14 +193,14 @@ public:
 * Order: root-to-tip (e.g. neck_01 → neck_02 → head).
 * Weights should sum to ~1.0.
 * If empty, falls back to single-bone behavior (HeadBoneName, weight 1.0). */
-UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Posture")
-TArray<FElevenLabsNeckBoneEntry> NeckBoneChain;
+UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "PS_AI_Agent|Posture")
+TArray<FPS_AI_Agent_NeckBoneEntry> NeckBoneChain;

 // ── Getters (read by AnimNode) ───────────────────────────────────────────

 /** Get current eye gaze curves (8 ARKit eye look curves).
 * Returns a COPY — safe to call from any thread. */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs|Posture")
+UFUNCTION(BlueprintCallable, Category = "PS_AI_Agent|Posture")
 TMap<FName, float> GetCurrentEyeCurves() const
 {
 FScopeLock Lock(&PostureDataLock);
@@ -220,7 +220,7 @@ public:
 FName GetHeadBoneName() const { return HeadBoneName; }

 /** Get the neck bone chain (used by AnimNode to resolve bone indices). */
-const TArray<FElevenLabsNeckBoneEntry>& GetNeckBoneChain() const { return NeckBoneChain; }
+const TArray<FPS_AI_Agent_NeckBoneEntry>& GetNeckBoneChain() const { return NeckBoneChain; }

 /** Get head animation compensation factor (0 = additive, 1 = full override). */
 float GetHeadAnimationCompensation() const { return HeadAnimationCompensation; }
|
||||
@@ -4,63 +4,63 @@

 #include "CoreMinimal.h"
 #include "UObject/NoExportTypes.h"
-#include "ElevenLabsDefinitions.h"
+#include "PS_AI_Agent_Definitions.h"
 #include "IWebSocket.h"
-#include "ElevenLabsWebSocketProxy.generated.h"
+#include "PS_AI_Agent_WebSocket_ElevenLabsProxy.generated.h"

 // ─────────────────────────────────────────────────────────────────────────────
 // Delegates (all Blueprint-assignable)
 // ─────────────────────────────────────────────────────────────────────────────

-DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsConnected,
-const FElevenLabsConversationInfo&, ConversationInfo);
+DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_Connected,
+const FPS_AI_Agent_ConversationInfo_ElevenLabs&, ConversationInfo);

-DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnElevenLabsDisconnected,
+DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnPS_AI_Agent_ElevenLabs_Disconnected,
 int32, StatusCode, const FString&, Reason);

-DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsError,
+DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_Error,
 const FString&, ErrorMessage);

 /** Fired when a PCM audio chunk arrives from the agent. Raw bytes, 16-bit signed 16kHz mono. */
-DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAudioReceived,
+DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_AudioReceived,
 const TArray<uint8>&, PCMData);

 /** Fired for user or agent transcript segments. */
-DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsTranscript,
-const FElevenLabsTranscriptSegment&, Segment);
+DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_Transcript,
+const FPS_AI_Agent_TranscriptSegment_ElevenLabs&, Segment);

 /** Fired with the final text response from the agent. */
-DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAgentResponse,
+DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_AgentResponse,
 const FString&, ResponseText);

 /** Fired when the agent interrupts the user. */
-DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsInterrupted);
+DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnPS_AI_Agent_ElevenLabs_Interrupted);

 /**
 * Fired when the server starts generating a response (first agent_chat_response_part received).
 * This fires BEFORE audio arrives — useful to detect that the server is processing
 * the previous turn while the client may have restarted listening (auto-restart scenario).
 */
-DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsAgentResponseStarted);
+DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnPS_AI_Agent_ElevenLabs_ResponseStarted);

 /** Fired for every agent_chat_response_part — streams the LLM text as it is generated.
 * PartialText is the text fragment from this individual part (NOT accumulated). */
-DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAgentResponsePart,
+DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_ResponsePart,
 const FString&, PartialText);

 /** Fired when the server sends a client_tool_call — the agent wants the client to execute a tool. */
-DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsClientToolCall,
-const FElevenLabsClientToolCall&, ToolCall);
+DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnPS_AI_Agent_ElevenLabs_ClientToolCall,
+const FPS_AI_Agent_ClientToolCall_ElevenLabs&, ToolCall);


 // ─────────────────────────────────────────────────────────────────────────────
 // WebSocket Proxy
 // Manages the lifecycle of a single ElevenLabs Conversational AI WebSocket session.
-// Instantiate via UElevenLabsConversationalAgentComponent (the component manages
+// Instantiate via UPS_AI_Agent_Conv_ElevenLabsComponent (the component manages
 // one proxy at a time), or create manually through Blueprints.
 // ─────────────────────────────────────────────────────────────────────────────
 UCLASS(BlueprintType, Blueprintable)
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsWebSocketProxy : public UObject
+class PS_AI_CONVAGENT_API UPS_AI_Agent_WebSocket_ElevenLabsProxy : public UObject
 {
 GENERATED_BODY()

@@ -68,48 +68,48 @@ public:
 // ── Events ────────────────────────────────────────────────────────────────

 /** Called once the WebSocket handshake succeeds and the agent sends its initiation metadata. */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsConnected OnConnected;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_Connected OnConnected;

 /** Called when the WebSocket closes (graceful or remote). */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsDisconnected OnDisconnected;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_Disconnected OnDisconnected;

 /** Called on any connection or protocol error. */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsError OnError;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_Error OnError;

 /** Raw PCM audio coming from the agent — feed this into your audio component. */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsAudioReceived OnAudioReceived;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_AudioReceived OnAudioReceived;

 /** User or agent transcript (may be tentative while the conversation is ongoing). */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsTranscript OnTranscript;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_Transcript OnTranscript;

 /** Final text response from the agent (complements audio). */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsAgentResponse OnAgentResponse;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_AgentResponse OnAgentResponse;

 /** The agent was interrupted by new user speech. */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsInterrupted OnInterrupted;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_Interrupted OnInterrupted;

 /**
 * Fired on the first agent_chat_response_part per turn — i.e. the moment the server
 * starts generating. Fires well before audio. The component uses this to stop the
 * microphone if it was restarted before the server finished processing the previous turn.
 */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsAgentResponseStarted OnAgentResponseStarted;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_ResponseStarted OnAgentResponseStarted;

 /** Fired for every agent_chat_response_part with the streaming text fragment. */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsAgentResponsePart OnAgentResponsePart;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_ResponsePart OnAgentResponsePart;

 /** Fired when the agent invokes a client tool. Handle the call and reply with SendClientToolResult. */
-UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
-FOnElevenLabsClientToolCall OnClientToolCall;
+UPROPERTY(BlueprintAssignable, Category = "PS AI Agent|ElevenLabs|Events")
+FOnPS_AI_Agent_ElevenLabs_ClientToolCall OnClientToolCall;

 // ── Lifecycle ─────────────────────────────────────────────────────────────

@@ -120,22 +120,22 @@ public:
 * @param AgentID ElevenLabs agent ID. Overrides the project-level default when non-empty.
 * @param APIKey API key. Overrides the project-level default when non-empty.
 */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
 void Connect(const FString& AgentID = TEXT(""), const FString& APIKey = TEXT(""));

 /**
 * Gracefully close the WebSocket connection.
 * OnDisconnected will fire after the server acknowledges.
 */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
 void Disconnect();

 /** Current connection state. */
-UFUNCTION(BlueprintPure, Category = "ElevenLabs")
-EElevenLabsConnectionState GetConnectionState() const { return ConnectionState; }
+UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
+EPS_AI_Agent_ConnectionState_ElevenLabs GetConnectionState() const { return ConnectionState; }

-UFUNCTION(BlueprintPure, Category = "ElevenLabs")
-bool IsConnected() const { return ConnectionState == EElevenLabsConnectionState::Connected; }
+UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
+bool IsConnected() const { return ConnectionState == EPS_AI_Agent_ConnectionState_ElevenLabs::Connected; }

 // ── Audio sending ─────────────────────────────────────────────────────────

@@ -147,7 +147,7 @@ public:
 *
 * @param PCMData Raw PCM bytes (16-bit LE, 16kHz, mono).
 */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
 void SendAudioChunk(const TArray<uint8>& PCMData);

 // ── Turn control (only relevant in Client turn mode) ──────────────────────
@@ -157,7 +157,7 @@ public:
 * Sends a { "type": "user_activity" } message to the server.
 * Call this periodically while the user is speaking (e.g. every audio chunk).
 */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
 void SendUserTurnStart();

 /**
@@ -165,7 +165,7 @@ public:
 * No explicit API message — simply stop sending user_activity.
 * The server detects silence and hands the turn to the agent.
 */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
 void SendUserTurnEnd();

 /**
@@ -173,11 +173,11 @@ public:
 * Useful for testing or text-only interaction.
 * Sends: { "type": "user_message", "text": "..." }
 */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
 void SendTextMessage(const FString& Text);

 /** Ask the agent to stop the current utterance. */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
 void SendInterrupt();

 /**
@@ -188,13 +188,13 @@ public:
 * @param Result A string result to return to the agent.
 * @param bIsError True if the tool execution failed.
 */
-UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
+UFUNCTION(BlueprintCallable, Category = "PS AI Agent|ElevenLabs")
 void SendClientToolResult(const FString& ToolCallId, const FString& Result, bool bIsError = false);

 // ── Info ──────────────────────────────────────────────────────────────────

-UFUNCTION(BlueprintPure, Category = "ElevenLabs")
-const FElevenLabsConversationInfo& GetConversationInfo() const { return ConversationInfo; }
+UFUNCTION(BlueprintPure, Category = "PS AI Agent|ElevenLabs")
+const FPS_AI_Agent_ConversationInfo_ElevenLabs& GetConversationInfo() const { return ConversationInfo; }

 // ─────────────────────────────────────────────────────────────────────────
 // Internal
@@ -222,8 +222,8 @@ private:
 FString BuildWebSocketURL(const FString& AgentID, const FString& APIKey) const;

 TSharedPtr<IWebSocket> WebSocket;
-EElevenLabsConnectionState ConnectionState = EElevenLabsConnectionState::Disconnected;
-FElevenLabsConversationInfo ConversationInfo;
+EPS_AI_Agent_ConnectionState_ElevenLabs ConnectionState = EPS_AI_Agent_ConnectionState_ElevenLabs::Disconnected;
+FPS_AI_Agent_ConversationInfo_ElevenLabs ConversationInfo;

 // Serializes WebSocket->Send() calls — needed because SendAudioChunk can now be
 // called from the WASAPI background thread while SendJsonMessage runs on game thread.
@@ -258,9 +258,9 @@ private:
 int32 LastInterruptEventId = 0;

 public:
-// Set by UElevenLabsConversationalAgentComponent before calling Connect().
+// Set by UPS_AI_Agent_Conv_ElevenLabsComponent before calling Connect().
 // Controls turn_timeout in conversation_initiation_client_data.
-EElevenLabsTurnMode TurnMode = EElevenLabsTurnMode::Server;
+EPS_AI_Agent_TurnMode_ElevenLabs TurnMode = EPS_AI_Agent_TurnMode_ElevenLabs::Server;

 // Speculative turn: start LLM generation during silence before full turn confidence.
 bool bSpeculativeTurn = true;
|
||||
@@ -4,18 +4,18 @@

 #include "CoreMinimal.h"
 #include "Modules/ModuleManager.h"
-#include "PS_AI_Agent_ElevenLabs.generated.h"
+#include "PS_AI_ConvAgent.generated.h"

 // ─────────────────────────────────────────────────────────────────────────────
-// Settings object – exposed in Project Settings → Plugins → ElevenLabs AI Agent
+// Settings object – exposed in Project Settings → Plugins → PS AI Agent - ElevenLabs
 // ─────────────────────────────────────────────────────────────────────────────
 UCLASS(config = Engine, defaultconfig)
-class PS_AI_AGENT_ELEVENLABS_API UElevenLabsSettings : public UObject
+class PS_AI_CONVAGENT_API UPS_AI_Agent_Settings_ElevenLabs : public UObject
 {
 GENERATED_BODY()

 public:
-UElevenLabsSettings(const FObjectInitializer& ObjectInitializer)
+UPS_AI_Agent_Settings_ElevenLabs(const FObjectInitializer& ObjectInitializer)
 : Super(ObjectInitializer)
 {
 API_Key = TEXT("");
@@ -28,14 +28,14 @@ public:
 * Obtain from https://elevenlabs.io – used to authenticate WebSocket connections.
 * Keep this secret; do not ship with the key hard-coded in a shipping build.
 */
-UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API")
+UPROPERTY(Config, EditAnywhere, Category = "PS_AI_Agent|ElevenLabs API")
 FString API_Key;

 /**
-* The default ElevenLabs Conversational Agent ID to use when none is specified
+* The default ElevenLabs Agent ID to use when none is specified
 * on the component. Create agents at https://elevenlabs.io/app/conversational-ai
 */
-UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API")
+UPROPERTY(Config, EditAnywhere, Category = "PS_AI_Agent|ElevenLabs API")
 FString AgentID;

 /**
@@ -43,7 +43,7 @@ public:
 * before connecting, so the API key is never exposed in the client.
 * Set SignedURLEndpoint to point to your server that returns the signed URL.
 */
-UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API | Security")
+UPROPERTY(Config, EditAnywhere, Category = "PS_AI_Agent|ElevenLabs API|Security")
 bool bSignedURLMode;

 /**
@@ -51,7 +51,7 @@ public:
 * Only used when bSignedURLMode = true.
 * Expected response body: { "signed_url": "wss://..." }
 */
-UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API | Security",
+UPROPERTY(Config, EditAnywhere, Category = "PS_AI_Agent|ElevenLabs API|Security",
 meta = (EditCondition = "bSignedURLMode"))
 FString SignedURLEndpoint;

@@ -59,11 +59,11 @@ public:
 * Override the ElevenLabs WebSocket base URL. Leave empty to use the default:
 * wss://api.elevenlabs.io/v1/convai/conversation
 */
-UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "ElevenLabs API")
+UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "PS_AI_Agent|ElevenLabs API")
 FString CustomWebSocketURL;

 /** Log verbose WebSocket messages to the Output Log (useful during development). */
-UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "ElevenLabs API")
+UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "PS_AI_Agent|ElevenLabs API")
 bool bVerboseLogging = false;
 };

@@ -71,7 +71,7 @@ public:
 // ─────────────────────────────────────────────────────────────────────────────
 // Module
 // ─────────────────────────────────────────────────────────────────────────────
-class PS_AI_AGENT_ELEVENLABS_API FPS_AI_Agent_ElevenLabsModule : public IModuleInterface
+class PS_AI_CONVAGENT_API FPS_AI_ConvAgentModule : public IModuleInterface
 {
 public:
 /** IModuleInterface implementation */
@@ -81,19 +81,19 @@ public:
 virtual bool IsGameModule() const override { return true; }

 /** Singleton access */
-static inline FPS_AI_Agent_ElevenLabsModule& Get()
+static inline FPS_AI_ConvAgentModule& Get()
 {
-return FModuleManager::LoadModuleChecked<FPS_AI_Agent_ElevenLabsModule>("PS_AI_Agent_ElevenLabs");
+return FModuleManager::LoadModuleChecked<FPS_AI_ConvAgentModule>("PS_AI_ConvAgent");
 }

 static inline bool IsAvailable()
 {
-return FModuleManager::Get().IsModuleLoaded("PS_AI_Agent_ElevenLabs");
+return FModuleManager::Get().IsModuleLoaded("PS_AI_ConvAgent");
 }

 /** Access the settings object at runtime */
-UElevenLabsSettings* GetSettings() const;
+UPS_AI_Agent_Settings_ElevenLabs* GetSettings() const;

 private:
-UElevenLabsSettings* Settings = nullptr;
+UPS_AI_Agent_Settings_ElevenLabs* Settings = nullptr;
 };
|
||||
@ -2,9 +2,9 @@

using UnrealBuildTool;

public class PS_AI_Agent_ElevenLabsEditor : ModuleRules
public class PS_AI_ConvAgentEditor : ModuleRules
{
    public PS_AI_Agent_ElevenLabsEditor(ReadOnlyTargetRules Target) : base(Target)
    public PS_AI_ConvAgentEditor(ReadOnlyTargetRules Target) : base(Target)
    {
        DefaultBuildSettings = BuildSettingsVersion.Latest;
        PCHUsage = PCHUsageMode.UseExplicitOrSharedPCHs;
@ -19,8 +19,8 @@ public class PS_AI_Agent_ElevenLabsEditor : ModuleRules
            // AnimGraph editor node base class
            "AnimGraph",
            "BlueprintGraph",
            // Runtime module containing FAnimNode_ElevenLabsLipSync
            "PS_AI_Agent_ElevenLabs",
            // Runtime module containing FAnimNode_PS_AI_Agent_LipSync
            "PS_AI_ConvAgent",
        });
    }
}
@ -0,0 +1,32 @@
// Copyright ASTERION. All Rights Reserved.

#include "AnimGraphNode_PS_AI_Agent_FacialExpression.h"

#define LOCTEXT_NAMESPACE "AnimNode_PS_AI_Agent_FacialExpression"

FText UAnimGraphNode_PS_AI_Agent_FacialExpression::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
    return LOCTEXT("NodeTitle", "PS AI Agent Facial Expression");
}

FText UAnimGraphNode_PS_AI_Agent_FacialExpression::GetTooltipText() const
{
    return LOCTEXT("Tooltip",
        "Injects emotion expression curves from the PS AI Agent Facial Expression component.\n\n"
        "Place this node BEFORE the PS AI Agent Lip Sync node in the MetaHuman Face AnimBP.\n"
        "It outputs CTRL_expressions_* curves for eyes, eyebrows, cheeks, and mouth mood.\n"
        "The Lip Sync node placed after will override mouth-area curves during speech.");
}

FString UAnimGraphNode_PS_AI_Agent_FacialExpression::GetNodeCategory() const
{
    return TEXT("PS AI Agent");
}

FLinearColor UAnimGraphNode_PS_AI_Agent_FacialExpression::GetNodeTitleColor() const
{
    // Warm amber to distinguish from Lip Sync (teal)
    return FLinearColor(0.8f, 0.5f, 0.1f, 1.0f);
}

#undef LOCTEXT_NAMESPACE
@ -0,0 +1,32 @@
// Copyright ASTERION. All Rights Reserved.

#include "AnimGraphNode_PS_AI_Agent_LipSync.h"

#define LOCTEXT_NAMESPACE "AnimNode_PS_AI_Agent_LipSync"

FText UAnimGraphNode_PS_AI_Agent_LipSync::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
    return LOCTEXT("NodeTitle", "PS AI Agent Lip Sync");
}

FText UAnimGraphNode_PS_AI_Agent_LipSync::GetTooltipText() const
{
    return LOCTEXT("Tooltip",
        "Injects lip sync animation curves from the PS AI Agent Lip Sync component.\n\n"
        "Place this node BEFORE the Control Rig in the MetaHuman Face AnimBP.\n"
        "It auto-discovers the component and outputs CTRL_expressions_* curves\n"
        "that the MetaHuman Control Rig uses to drive facial bones.");
}

FString UAnimGraphNode_PS_AI_Agent_LipSync::GetNodeCategory() const
{
    return TEXT("PS AI Agent");
}

FLinearColor UAnimGraphNode_PS_AI_Agent_LipSync::GetNodeTitleColor() const
{
    // PS AI Agent teal color
    return FLinearColor(0.1f, 0.7f, 0.6f, 1.0f);
}

#undef LOCTEXT_NAMESPACE
@ -0,0 +1,33 @@
// Copyright ASTERION. All Rights Reserved.

#include "AnimGraphNode_PS_AI_Agent_Posture.h"

#define LOCTEXT_NAMESPACE "AnimNode_PS_AI_Agent_Posture"

FText UAnimGraphNode_PS_AI_Agent_Posture::GetNodeTitle(ENodeTitleType::Type TitleType) const
{
    return LOCTEXT("NodeTitle", "PS AI Agent Posture");
}

FText UAnimGraphNode_PS_AI_Agent_Posture::GetTooltipText() const
{
    return LOCTEXT("Tooltip",
        "Injects head rotation and eye gaze curves from the PS AI Agent Posture component.\n\n"
        "Place this node AFTER the PS AI Agent Facial Expression node and\n"
        "BEFORE the PS AI Agent Lip Sync node in the MetaHuman Face AnimBP.\n\n"
        "The component distributes look-at rotation across body (actor yaw),\n"
        "head (bone rotation), and eyes (ARKit curves) for a natural look-at effect.");
}

FString UAnimGraphNode_PS_AI_Agent_Posture::GetNodeCategory() const
{
    return TEXT("PS AI Agent");
}

FLinearColor UAnimGraphNode_PS_AI_Agent_Posture::GetNodeTitleColor() const
{
    // Cool blue to distinguish from Facial Expression (amber) and Lip Sync (teal)
    return FLinearColor(0.2f, 0.4f, 0.9f, 1.0f);
}

#undef LOCTEXT_NAMESPACE
@ -0,0 +1,29 @@
// Copyright ASTERION. All Rights Reserved.

#include "PS_AI_Agent_EmotionPoseMapFactory.h"
#include "PS_AI_Agent_EmotionPoseMap.h"
#include "AssetTypeCategories.h"

UPS_AI_Agent_EmotionPoseMapFactory::UPS_AI_Agent_EmotionPoseMapFactory()
{
    SupportedClass = UPS_AI_Agent_EmotionPoseMap::StaticClass();
    bCreateNew = true;
    bEditAfterNew = true;
}

UObject* UPS_AI_Agent_EmotionPoseMapFactory::FactoryCreateNew(
    UClass* Class, UObject* InParent, FName Name, EObjectFlags Flags,
    UObject* Context, FFeedbackContext* Warn)
{
    return NewObject<UPS_AI_Agent_EmotionPoseMap>(InParent, Class, Name, Flags);
}

FText UPS_AI_Agent_EmotionPoseMapFactory::GetDisplayName() const
{
    return FText::FromString(TEXT("PS AI Agent Emotion Pose Map"));
}

uint32 UPS_AI_Agent_EmotionPoseMapFactory::GetMenuCategories() const
{
    return EAssetTypeCategories::Misc;
}
@ -4,19 +4,19 @@

#include "CoreMinimal.h"
#include "Factories/Factory.h"
#include "ElevenLabsLipSyncPoseMapFactory.generated.h"
#include "PS_AI_Agent_EmotionPoseMapFactory.generated.h"

/**
 * Factory that lets users create ElevenLabsLipSyncPoseMap assets
 * Factory that lets users create PS_AI_Agent_EmotionPoseMap assets
 * directly from the Content Browser (right-click → Miscellaneous).
 */
UCLASS()
class UElevenLabsLipSyncPoseMapFactory : public UFactory
class UPS_AI_Agent_EmotionPoseMapFactory : public UFactory
{
    GENERATED_BODY()

public:
    UElevenLabsLipSyncPoseMapFactory();
    UPS_AI_Agent_EmotionPoseMapFactory();

    virtual UObject* FactoryCreateNew(UClass* Class, UObject* InParent,
        FName Name, EObjectFlags Flags, UObject* Context,
@ -0,0 +1,29 @@
// Copyright ASTERION. All Rights Reserved.

#include "PS_AI_Agent_LipSyncPoseMapFactory.h"
#include "PS_AI_Agent_LipSyncPoseMap.h"
#include "AssetTypeCategories.h"

UPS_AI_Agent_LipSyncPoseMapFactory::UPS_AI_Agent_LipSyncPoseMapFactory()
{
    SupportedClass = UPS_AI_Agent_LipSyncPoseMap::StaticClass();
    bCreateNew = true;
    bEditAfterNew = true;
}

UObject* UPS_AI_Agent_LipSyncPoseMapFactory::FactoryCreateNew(
    UClass* Class, UObject* InParent, FName Name, EObjectFlags Flags,
    UObject* Context, FFeedbackContext* Warn)
{
    return NewObject<UPS_AI_Agent_LipSyncPoseMap>(InParent, Class, Name, Flags);
}

FText UPS_AI_Agent_LipSyncPoseMapFactory::GetDisplayName() const
{
    return FText::FromString(TEXT("PS AI Agent Lip Sync Pose Map"));
}

uint32 UPS_AI_Agent_LipSyncPoseMapFactory::GetMenuCategories() const
{
    return EAssetTypeCategories::Misc;
}
@ -4,19 +4,19 @@

#include "CoreMinimal.h"
#include "Factories/Factory.h"
#include "ElevenLabsEmotionPoseMapFactory.generated.h"
#include "PS_AI_Agent_LipSyncPoseMapFactory.generated.h"

/**
 * Factory that lets users create ElevenLabsEmotionPoseMap assets
 * Factory that lets users create PS_AI_Agent_LipSyncPoseMap assets
 * directly from the Content Browser (right-click → Miscellaneous).
 */
UCLASS()
class UElevenLabsEmotionPoseMapFactory : public UFactory
class UPS_AI_Agent_LipSyncPoseMapFactory : public UFactory
{
    GENERATED_BODY()

public:
    UElevenLabsEmotionPoseMapFactory();
    UPS_AI_Agent_LipSyncPoseMapFactory();

    virtual UObject* FactoryCreateNew(UClass* Class, UObject* InParent,
        FName Name, EObjectFlags Flags, UObject* Context,
@ -0,0 +1,16 @@
// Copyright ASTERION. All Rights Reserved.

#include "Modules/ModuleManager.h"

/**
 * Editor module for PS_AI_ConvAgent plugin.
 * Provides AnimGraph node(s) for the PS AI Agent Lip Sync system.
 */
class FPS_AI_ConvAgentEditorModule : public IModuleInterface
{
public:
    virtual void StartupModule() override {}
    virtual void ShutdownModule() override {}
};

IMPLEMENT_MODULE(FPS_AI_ConvAgentEditorModule, PS_AI_ConvAgentEditor)
@ -0,0 +1,31 @@
// Copyright ASTERION. All Rights Reserved.

#pragma once

#include "CoreMinimal.h"
#include "AnimGraphNode_Base.h"
#include "AnimNode_PS_AI_Agent_FacialExpression.h"
#include "AnimGraphNode_PS_AI_Agent_FacialExpression.generated.h"

/**
 * AnimGraph editor node for the PS AI Agent Facial Expression AnimNode.
 *
 * This node appears in the AnimBP graph editor under the "PS AI Agent" category.
 * Place it BEFORE the PS AI Agent Lip Sync node in the MetaHuman Face AnimBP.
 * It auto-discovers the PS_AI_Agent_FacialExpressionComponent on the owning Actor
 * and injects CTRL_expressions_* curves for emotion-driven facial expressions.
 */
UCLASS()
class UAnimGraphNode_PS_AI_Agent_FacialExpression : public UAnimGraphNode_Base
{
    GENERATED_BODY()

    UPROPERTY(EditAnywhere, Category = "Settings")
    FAnimNode_PS_AI_Agent_FacialExpression Node;

    // UAnimGraphNode_Base interface
    virtual FText GetNodeTitle(ENodeTitleType::Type TitleType) const override;
    virtual FText GetTooltipText() const override;
    virtual FString GetNodeCategory() const override;
    virtual FLinearColor GetNodeTitleColor() const override;
};
@ -4,24 +4,24 @@

#include "CoreMinimal.h"
#include "AnimGraphNode_Base.h"
#include "AnimNode_ElevenLabsLipSync.h"
#include "AnimGraphNode_ElevenLabsLipSync.generated.h"
#include "AnimNode_PS_AI_Agent_LipSync.h"
#include "AnimGraphNode_PS_AI_Agent_LipSync.generated.h"

/**
 * AnimGraph editor node for the ElevenLabs Lip Sync AnimNode.
 * AnimGraph editor node for the PS AI Agent Lip Sync AnimNode.
 *
 * This node appears in the AnimBP graph editor under the "ElevenLabs" category.
 * This node appears in the AnimBP graph editor under the "PS AI Agent" category.
 * Drop it into your MetaHuman Face AnimBP between the pose source and the
 * Control Rig node. It auto-discovers the ElevenLabsLipSyncComponent on the
 * Control Rig node. It auto-discovers the PS_AI_Agent_LipSyncComponent on the
 * owning Actor and injects CTRL_expressions_* curves for lip sync.
 */
UCLASS()
class UAnimGraphNode_ElevenLabsLipSync : public UAnimGraphNode_Base
class UAnimGraphNode_PS_AI_Agent_LipSync : public UAnimGraphNode_Base
{
    GENERATED_BODY()

    UPROPERTY(EditAnywhere, Category = "Settings")
    FAnimNode_ElevenLabsLipSync Node;
    FAnimNode_PS_AI_Agent_LipSync Node;

    // UAnimGraphNode_Base interface
    virtual FText GetNodeTitle(ENodeTitleType::Type TitleType) const override;
@ -4,26 +4,26 @@

#include "CoreMinimal.h"
#include "AnimGraphNode_Base.h"
#include "AnimNode_ElevenLabsPosture.h"
#include "AnimGraphNode_ElevenLabsPosture.generated.h"
#include "AnimNode_PS_AI_Agent_Posture.h"
#include "AnimGraphNode_PS_AI_Agent_Posture.generated.h"

/**
 * AnimGraph editor node for the ElevenLabs Posture AnimNode.
 * AnimGraph editor node for the PS AI Agent Posture AnimNode.
 *
 * This node appears in the AnimBP graph editor under the "ElevenLabs" category.
 * Place it AFTER the ElevenLabs Facial Expression node and BEFORE the
 * ElevenLabs Lip Sync node in the MetaHuman Face AnimBP.
 * This node appears in the AnimBP graph editor under the "PS AI Agent" category.
 * Place it AFTER the PS AI Agent Facial Expression node and BEFORE the
 * PS AI Agent Lip Sync node in the MetaHuman Face AnimBP.
 *
 * It auto-discovers the ElevenLabsPostureComponent on the owning Actor
 * It auto-discovers the PS_AI_Agent_PostureComponent on the owning Actor
 * and injects head bone rotation + ARKit eye gaze curves for look-at tracking.
 */
UCLASS()
class UAnimGraphNode_ElevenLabsPosture : public UAnimGraphNode_Base
class UAnimGraphNode_PS_AI_Agent_Posture : public UAnimGraphNode_Base
{
    GENERATED_BODY()

    UPROPERTY(EditAnywhere, Category = "Settings")
    FAnimNode_PS_AI_Agent_Posture Node;

    // UAnimGraphNode_Base interface
    virtual FText GetNodeTitle(ENodeTitleType::Type TitleType) const override;
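Because the diff only introduces the new class names, existing .uasset references depend on the CoreRedirects mentioned in the commit message. The commit's actual DefaultEngine.ini entries are not shown here; as a rough sketch, they would take this shape (every OldName/NewName pair below is an illustrative reconstruction from the renames in this diff, not copied from the real config):

```ini
[CoreRedirects]
; Package-level redirect: old runtime module name → new one
+PackageRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs",NewName="/Script/PS_AI_ConvAgent")
; Class-level redirects for renamed classes
+ClassRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabs.ElevenLabsSettings",NewName="/Script/PS_AI_ConvAgent.PS_AI_Agent_Settings_ElevenLabs")
+ClassRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabsEditor.AnimGraphNode_ElevenLabsLipSync",NewName="/Script/PS_AI_ConvAgentEditor.AnimGraphNode_PS_AI_Agent_LipSync")
+ClassRedirects=(OldName="/Script/PS_AI_Agent_ElevenLabsEditor.AnimGraphNode_ElevenLabsPosture",NewName="/Script/PS_AI_ConvAgentEditor.AnimGraphNode_PS_AI_Agent_Posture")
```

Struct redirects (e.g. for the FAnimNode_* types embedded in saved AnimBPs) would use `+StructRedirects=` entries of the same form.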