Add PS_AI_Agent_ElevenLabs plugin (initial implementation)

Adds a new UE5.5 plugin integrating the ElevenLabs Conversational AI Agent
via WebSocket. No gRPC or third-party libs required.

Plugin components:
- UElevenLabsSettings: API key + Agent ID in Project Settings
- UElevenLabsWebSocketProxy: full WS session lifecycle, JSON message handling,
  ping/pong keepalive, Base64 PCM audio send/receive
- UElevenLabsConversationalAgentComponent: ActorComponent for NPC voice
  conversation, orchestrates mic capture -> WS -> procedural audio playback
- UElevenLabsMicrophoneCaptureComponent: wraps Audio::FAudioCapture,
  resamples to 16kHz mono, dispatches on game thread

Also adds .claude/ memory files (project context, plugin notes, patterns)
so Claude Code can restore full context on any machine after a git pull.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
j.foucher 2026-02-19 12:57:48 +01:00
parent 61710c9fde
commit f0055e85ed
16 changed files with 1867 additions and 0 deletions

.claude/MEMORY.md Normal file
@ -0,0 +1,35 @@
# Project Memory: PS_AI_Agent
> This file is committed to the repository so it is available on any machine.
> Claude Code reads it automatically at session start (via the auto-memory system)
> when the working directory is inside this repo.
> **Keep it under ~180 lines**; lines beyond 200 are truncated by the system.
---
## Project Location
- Repo root: `<repo_root>/` (wherever this is cloned)
- UE5 project: `<repo_root>/Unreal/PS_AI_Agent/`
- `.uproject`: `<repo_root>/Unreal/PS_AI_Agent/PS_AI_Agent.uproject`
- Engine: **Unreal Engine 5.5** — Win64 primary target
## Plugins
| Plugin | Path | Purpose |
|--------|------|---------|
| Convai (reference) | `<repo_root>/ConvAI/Convai/` | gRPC + protobuf streaming to Convai API. Has ElevenLabs voice type enum in `ConvaiDefinitions.h`. Used as architectural reference. |
| **PS_AI_Agent_ElevenLabs** | `<repo_root>/Unreal/PS_AI_Agent/Plugins/PS_AI_Agent_ElevenLabs/` | Our ElevenLabs Conversational AI integration. See `.claude/elevenlabs_plugin.md` for full details. |
## User Preferences
- Plugin naming: `PS_AI_Agent_<Service>` (e.g. `PS_AI_Agent_ElevenLabs`)
- Save memory frequently during long sessions
- Goal: ElevenLabs Conversational AI integration — simpler than Convai, no gRPC
- Full original ask + intent: see `.claude/project_context.md`
## Key UE5 Plugin Patterns
- Settings object: `UCLASS(config=Engine, defaultconfig)` inheriting `UObject`, registered via `ISettingsModule`
- Module startup: `NewObject<USettings>(..., RF_Standalone)` + `AddToRoot()`
- WebSocket: `FWebSocketsModule::Get().CreateWebSocket(URL, TEXT(""), Headers)`
- Audio capture: `Audio::FAudioCapture` from the `AudioCapture` module
- Procedural audio playback: `USoundWaveProcedural` + `OnSoundWaveProceduralUnderflow` delegate
- Audio capture callbacks arrive on a **background thread** — always marshal to game thread with `AsyncTask(ENamedThreads::GameThread, ...)`
- Resample mic audio to **16000 Hz mono** before sending to ElevenLabs

@ -0,0 +1,61 @@
# PS_AI_Agent_ElevenLabs Plugin
## Location
`Unreal/PS_AI_Agent/Plugins/PS_AI_Agent_ElevenLabs/`
## File Map
```
PS_AI_Agent_ElevenLabs.uplugin
Source/PS_AI_Agent_ElevenLabs/
PS_AI_Agent_ElevenLabs.Build.cs
Public/
PS_AI_Agent_ElevenLabs.h FPS_AI_Agent_ElevenLabsModule + UElevenLabsSettings
ElevenLabsDefinitions.h Enums, structs, ElevenLabsMessageType/Audio constants
ElevenLabsWebSocketProxy.h/.cpp UObject managing one WS session
ElevenLabsConversationalAgentComponent.h/.cpp Main ActorComponent (attach to NPC)
ElevenLabsMicrophoneCaptureComponent.h/.cpp Mic capture, resample, dispatch to game thread
Private/
(implementations of the above)
```
## ElevenLabs Conversational AI Protocol
- **WebSocket URL**: `wss://api.elevenlabs.io/v1/convai/conversation?agent_id=<ID>`
- **Auth**: HTTP upgrade header `xi-api-key: <key>` (set in Project Settings)
- **All frames**: JSON text (no binary frames used by the API)
- **Audio format**: PCM 16-bit signed, 16000 Hz, mono, little-endian — Base64-encoded in JSON
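
The float-to-PCM packing this format implies can be sketched as a standalone function. This is a minimal illustration in plain C++ (`std::vector` in place of `TArray`, so it compiles outside the engine); `FloatToInt16LE` is a hypothetical name, mirroring the plugin's `FloatPCMToInt16Bytes`:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Clamp float samples to [-1, 1], scale to int16 range, and emit the
// little-endian byte order the API expects.
std::vector<uint8_t> FloatToInt16LE(const std::vector<float>& samples)
{
    std::vector<uint8_t> out;
    out.reserve(samples.size() * 2);
    for (float s : samples)
    {
        const float clamped = std::clamp(s, -1.0f, 1.0f);
        const int16_t v = static_cast<int16_t>(clamped * 32767.0f);
        out.push_back(static_cast<uint8_t>(v & 0xFF));        // low byte first
        out.push_back(static_cast<uint8_t>((v >> 8) & 0xFF)); // then high byte
    }
    return out;
}
```

The resulting byte array is what gets Base64-encoded into `user_audio_chunk`.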
### Client → Server messages
| Type field value | Payload |
|---|---|
| *(no `type` field; the top-level key is the type)* `user_audio_chunk` | `{ "user_audio_chunk": "<base64 PCM>" }` |
| `user_turn_start` | `{ "type": "user_turn_start" }` |
| `user_turn_end` | `{ "type": "user_turn_end" }` |
| `interrupt` | `{ "type": "interrupt" }` |
| `pong` | `{ "type": "pong", "pong_event": { "event_id": N } }` |
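
In `Client Controlled` turn mode, a single user turn might look like this on the wire (illustrative frames; payloads elided):

```json
{ "type": "user_turn_start" }
{ "user_audio_chunk": "<base64 PCM>" }
{ "user_audio_chunk": "<base64 PCM>" }
{ "type": "user_turn_end" }
```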
### Server → Client messages (field: `type`)
| type value | Key nested object | Notes |
|---|---|---|
| `conversation_initiation_metadata` | `conversation_initiation_metadata_event.conversation_id` | Marks WS ready |
| `audio` | `audio_event.audio_base_64` | Base64 PCM from agent |
| `transcript` | `transcript_event.{speaker, message, is_final}` | User or agent speech |
| `agent_response` | `agent_response_event.agent_response` | Final agent text |
| `interruption` | — | Agent stopped mid-sentence |
| `ping` | `ping_event.event_id` | Must reply with pong |
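
For example, the keepalive echoes the server's `event_id` back to it (illustrative exchange):

```json
{ "type": "ping", "ping_event": { "event_id": 3 } }
{ "type": "pong", "pong_event": { "event_id": 3 } }
```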
## Key Design Decisions
- **No gRPC / no ThirdParty libs** — pure UE WebSockets + HTTP, builds out of the box
- Audio resampled in-plugin: device rate → 16000 Hz mono (linear interpolation)
- `USoundWaveProcedural` for real-time agent audio playback (queue-driven)
- Silence heuristic: 30 game-thread ticks (~0.5 s at 60 fps) with no new audio → agent done speaking
- `bSignedURLMode` setting: fetch a signed WS URL from your own backend (keeps API key off client)
- Two turn modes: `Server VAD` (ElevenLabs detects speech end) and `Client Controlled` (push-to-talk)
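
The downmix + linear-interpolation resample decision above can be sketched as a standalone function. This is an engine-free illustration (plain C++ with `std::vector` in place of `TArray`; `ResampleToMono16k` is a hypothetical name mirroring the plugin's `ResampleTo16000`):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Downmix interleaved frames to mono, then resample to the target rate
// (16000 Hz by default) via linear interpolation between the two nearest
// source samples.
std::vector<float> ResampleToMono16k(const std::vector<float>& interleaved,
                                     int channels, int inRate,
                                     int targetRate = 16000)
{
    // Step 1: average the channels of each frame into one mono sample.
    const size_t frames = interleaved.size() / channels;
    std::vector<float> mono(frames);
    for (size_t i = 0; i < frames; ++i)
    {
        float sum = 0.0f;
        for (int c = 0; c < channels; ++c)
            sum += interleaved[i * channels + c];
        mono[i] = sum / channels;
    }
    if (inRate == targetRate)
        return mono;

    // Step 2: linear interpolation at the fractional source index.
    const float ratio = static_cast<float>(inRate) / targetRate;
    const size_t outCount = static_cast<size_t>(std::floor(mono.size() / ratio));
    std::vector<float> out;
    out.reserve(outCount);
    for (size_t i = 0; i < outCount; ++i)
    {
        const float src = i * ratio;
        const size_t lo = static_cast<size_t>(src);
        const size_t hi = std::min(lo + 1, mono.size() - 1);
        const float alpha = src - lo;
        out.push_back(mono[lo] + alpha * (mono[hi] - mono[lo]));
    }
    return out;
}
```

Linear interpolation is adequate for 16 kHz speech; a windowed-sinc resampler would reduce aliasing if quality issues show up in testing.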
## Build Dependencies (Build.cs)
Core, CoreUObject, Engine, InputCore, Json, JsonUtilities, WebSockets, HTTP,
AudioMixer, AudioCaptureCore, AudioCapture, Voice, SignalProcessing
## Status
- **Session 1** (2026-02-19): All source files written, registered in .uproject. Not yet compiled.
- **TODO**: Open in UE 5.5 Editor → compile → test basic WS connection with a test agent ID.
- **Watch out**: Verify `USoundWaveProcedural::OnSoundWaveProceduralUnderflow` delegate signature vs UE 5.5 API.

@ -0,0 +1,79 @@
# Project Context & Original Ask
## What the user wants to build
A **UE5 plugin** that integrates the **ElevenLabs Conversational AI Agent** API into Unreal Engine 5.5,
allowing an in-game NPC (or any Actor) to hold a real-time voice conversation with a player.
### The original request (paraphrased)
> "I want to create a plugin to use ElevenLabs Conversational Agent in Unreal Engine 5.5.
> I previously used the Convai plugin which does what I want, but I prefer ElevenLabs quality.
> The goal is to create a plugin in the existing Unreal Project to make a first step for integration.
> Convai AI plugin may be too big in terms of functionality for the new project, but it is the final goal.
> You can use the Convai source code to find the right way to make the ElevenLabs version —
> it should be very similar."
### Plugin name
`PS_AI_Agent_ElevenLabs`
---
## User's mental model / intent
1. **Short-term**: A working first-step plugin — minimal but functional — that can:
- Connect to ElevenLabs Conversational AI via WebSocket
- Capture microphone audio from the player
- Stream it to ElevenLabs in real time
- Play back the agent's voice response
- Surface key events (transcript, agent text, speaking state) to Blueprint
2. **Long-term**: Match the full feature set of Convai — character IDs, session memory,
actions/environment context, lip-sync, etc. — but powered by ElevenLabs instead.
3. **Key preference**: Simpler than Convai. No gRPC, no protobuf, no ThirdParty precompiled
libraries. ElevenLabs' Conversational AI API uses plain WebSocket + JSON, which maps
naturally to UE's built-in `WebSockets` module.
---
## How we used Convai as a reference
We studied the Convai plugin source (`ConvAI/Convai/`) to understand:
- **Module structure**: `UConvaiSettings` + `IModuleInterface` + `ISettingsModule` registration
- **Audio capture pattern**: `Audio::FAudioCapture`, ring buffers, thread-safe dispatch to game thread
- **Audio playback pattern**: `USoundWaveProcedural` fed from a queue
- **Component architecture**: `UConvaiChatbotComponent` (NPC side) + `UConvaiPlayerComponent` (player side)
- **HTTP proxy pattern**: `UConvaiAPIBaseProxy` base class for async REST calls
- **Voice type enum**: Convai already had `EVoiceType::ElevenLabsVoices` — confirming ElevenLabs
is a natural fit
We then replaced gRPC/protobuf with **WebSocket + JSON** to match the ElevenLabs API, and
simplified the architecture to the minimum needed for a first working version.
---
## What was built (Session 1 — 2026-02-19)
All source files created and registered. See `.claude/elevenlabs_plugin.md` for full file map and protocol details.
### Components created
| Class | Role |
|---|---|
| `UElevenLabsSettings` | Project Settings UI — API key, Agent ID, security options |
| `UElevenLabsWebSocketProxy` | Manages one WS session: connect, send audio, handle all server message types |
| `UElevenLabsConversationalAgentComponent` | ActorComponent to attach to any NPC — orchestrates mic + WS + playback |
| `UElevenLabsMicrophoneCaptureComponent` | Wraps `Audio::FAudioCapture`, resamples to 16 kHz mono |
### Not yet done (next sessions)
- Compile & test in UE 5.5 Editor
- Verify `USoundWaveProcedural::OnSoundWaveProceduralUnderflow` delegate signature for UE 5.5
- Add lip-sync support (future)
- Add session memory / conversation history (future)
- Add environment/action context support (future, matching Convai's full feature set)
---
## Notes on the ElevenLabs API
- Docs: https://elevenlabs.io/docs/conversational-ai
- Create agents at: https://elevenlabs.io/app/conversational-ai
- API keys at: https://elevenlabs.io (dashboard)

@ -0,0 +1,7 @@
{
"permissions": {
"allow": [
"Bash(dir /s \"E:\\\\ASTERION\\\\GIT\\\\PS_AI_Agent\")"
]
}
}

@ -17,6 +17,14 @@
"TargetAllowList": [
"Editor"
]
},
{
"Name": "PS_AI_Agent_ElevenLabs",
"Enabled": true
},
{
"Name": "WebSockets",
"Enabled": true
}
]
}

@ -0,0 +1,35 @@
{
"FileVersion": 3,
"Version": 1,
"VersionName": "1.0.0",
"FriendlyName": "PS AI Agent - ElevenLabs",
"Description": "Integrates ElevenLabs Conversational AI Agent into Unreal Engine 5.5. Supports real-time voice conversation via WebSocket, microphone capture, and audio playback.",
"Category": "AI",
"CreatedBy": "ASTERION",
"CreatedByURL": "",
"DocsURL": "https://elevenlabs.io/docs/conversational-ai",
"MarketplaceURL": "",
"SupportURL": "",
"CanContainContent": false,
"IsBetaVersion": true,
"IsExperimentalVersion": false,
"Installed": false,
"Modules": [
{
"Name": "PS_AI_Agent_ElevenLabs",
"Type": "Runtime",
"LoadingPhase": "PreDefault",
"PlatformAllowList": [
"Win64",
"Mac",
"Linux"
]
}
],
"Plugins": [
{
"Name": "WebSockets",
"Enabled": true
}
]
}

@ -0,0 +1,40 @@
// Copyright ASTERION. All Rights Reserved.
using UnrealBuildTool;
public class PS_AI_Agent_ElevenLabs : ModuleRules
{
public PS_AI_Agent_ElevenLabs(ReadOnlyTargetRules Target) : base(Target)
{
DefaultBuildSettings = BuildSettingsVersion.Latest;
PCHUsage = PCHUsageMode.UseExplicitOrSharedPCHs;
PublicDependencyModuleNames.AddRange(new string[]
{
"Core",
"CoreUObject",
"Engine",
"InputCore",
// JSON serialization for WebSocket message payloads
"Json",
"JsonUtilities",
// WebSocket for ElevenLabs Conversational AI real-time API
"WebSockets",
// HTTP for REST calls (agent metadata, auth, etc.)
"HTTP",
// Audio capture (microphone input)
"AudioMixer",
"AudioCaptureCore",
"AudioCapture",
"Voice",
"SignalProcessing",
});
PrivateDependencyModuleNames.AddRange(new string[]
{
"Projects",
// For ISettingsModule (Project Settings integration)
"Settings",
});
}
}

@ -0,0 +1,335 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsConversationalAgentComponent.h"
#include "ElevenLabsMicrophoneCaptureComponent.h"
#include "PS_AI_Agent_ElevenLabs.h"
#include "Components/AudioComponent.h"
#include "Sound/SoundWaveProcedural.h"
#include "GameFramework/Actor.h"
#include "Engine/World.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsAgent, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Constructor
// ─────────────────────────────────────────────────────────────────────────────
UElevenLabsConversationalAgentComponent::UElevenLabsConversationalAgentComponent()
{
PrimaryComponentTick.bCanEverTick = true;
// Tick is used only to detect silence (agent stopped speaking).
// Disable if not needed for perf.
PrimaryComponentTick.TickInterval = 1.0f / 60.0f;
}
// ─────────────────────────────────────────────────────────────────────────────
// Lifecycle
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::BeginPlay()
{
Super::BeginPlay();
InitAudioPlayback();
}
void UElevenLabsConversationalAgentComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
EndConversation();
Super::EndPlay(EndPlayReason);
}
void UElevenLabsConversationalAgentComponent::TickComponent(float DeltaTime, ELevelTick TickType,
FActorComponentTickFunction* ThisTickFunction)
{
Super::TickComponent(DeltaTime, TickType, ThisTickFunction);
if (bAgentSpeaking)
{
FScopeLock Lock(&AudioQueueLock);
if (AudioQueue.Num() == 0)
{
SilentTickCount++;
if (SilentTickCount >= SilenceThresholdTicks)
{
bAgentSpeaking = false;
SilentTickCount = 0;
OnAgentStoppedSpeaking.Broadcast();
}
}
else
{
SilentTickCount = 0;
}
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Control
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::StartConversation()
{
if (!WebSocketProxy)
{
WebSocketProxy = NewObject<UElevenLabsWebSocketProxy>(this);
WebSocketProxy->OnConnected.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleConnected);
WebSocketProxy->OnDisconnected.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleDisconnected);
WebSocketProxy->OnError.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleError);
WebSocketProxy->OnAudioReceived.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAudioReceived);
WebSocketProxy->OnTranscript.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleTranscript);
WebSocketProxy->OnAgentResponse.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleAgentResponse);
WebSocketProxy->OnInterrupted.AddDynamic(this,
&UElevenLabsConversationalAgentComponent::HandleInterrupted);
}
WebSocketProxy->Connect(AgentID);
}
void UElevenLabsConversationalAgentComponent::EndConversation()
{
StopListening();
StopAgentAudio();
if (WebSocketProxy)
{
WebSocketProxy->Disconnect();
WebSocketProxy = nullptr;
}
}
void UElevenLabsConversationalAgentComponent::StartListening()
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsAgent, Warning, TEXT("StartListening: not connected."));
return;
}
if (bIsListening) return;
bIsListening = true;
if (TurnMode == EElevenLabsTurnMode::Client)
{
WebSocketProxy->SendUserTurnStart();
}
// Find the microphone component on our owner actor, or create one.
UElevenLabsMicrophoneCaptureComponent* Mic =
GetOwner()->FindComponentByClass<UElevenLabsMicrophoneCaptureComponent>();
if (!Mic)
{
Mic = NewObject<UElevenLabsMicrophoneCaptureComponent>(GetOwner(),
TEXT("ElevenLabsMicrophone"));
Mic->RegisterComponent();
}
Mic->OnAudioCaptured.AddUObject(this,
&UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured);
Mic->StartCapture();
UE_LOG(LogElevenLabsAgent, Log, TEXT("Microphone capture started."));
}
void UElevenLabsConversationalAgentComponent::StopListening()
{
if (!bIsListening) return;
bIsListening = false;
if (UElevenLabsMicrophoneCaptureComponent* Mic =
GetOwner() ? GetOwner()->FindComponentByClass<UElevenLabsMicrophoneCaptureComponent>() : nullptr)
{
Mic->StopCapture();
Mic->OnAudioCaptured.RemoveAll(this);
}
if (WebSocketProxy && TurnMode == EElevenLabsTurnMode::Client)
{
WebSocketProxy->SendUserTurnEnd();
}
UE_LOG(LogElevenLabsAgent, Log, TEXT("Microphone capture stopped."));
}
void UElevenLabsConversationalAgentComponent::InterruptAgent()
{
if (WebSocketProxy) WebSocketProxy->SendInterrupt();
StopAgentAudio();
}
// ─────────────────────────────────────────────────────────────────────────────
// State queries
// ─────────────────────────────────────────────────────────────────────────────
bool UElevenLabsConversationalAgentComponent::IsConnected() const
{
return WebSocketProxy && WebSocketProxy->IsConnected();
}
const FElevenLabsConversationInfo& UElevenLabsConversationalAgentComponent::GetConversationInfo() const
{
static FElevenLabsConversationInfo Empty;
return WebSocketProxy ? WebSocketProxy->GetConversationInfo() : Empty;
}
// ─────────────────────────────────────────────────────────────────────────────
// WebSocket event handlers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::HandleConnected(const FElevenLabsConversationInfo& Info)
{
UE_LOG(LogElevenLabsAgent, Log, TEXT("Agent connected. ConversationID=%s"), *Info.ConversationID);
OnAgentConnected.Broadcast(Info);
if (bAutoStartListening)
{
StartListening();
}
}
void UElevenLabsConversationalAgentComponent::HandleDisconnected(int32 StatusCode, const FString& Reason)
{
UE_LOG(LogElevenLabsAgent, Log, TEXT("Agent disconnected. Code=%d Reason=%s"), StatusCode, *Reason);
bIsListening = false;
bAgentSpeaking = false;
OnAgentDisconnected.Broadcast(StatusCode, Reason);
}
void UElevenLabsConversationalAgentComponent::HandleError(const FString& ErrorMessage)
{
UE_LOG(LogElevenLabsAgent, Error, TEXT("Agent error: %s"), *ErrorMessage);
OnAgentError.Broadcast(ErrorMessage);
}
void UElevenLabsConversationalAgentComponent::HandleAudioReceived(const TArray<uint8>& PCMData)
{
EnqueueAgentAudio(PCMData);
}
void UElevenLabsConversationalAgentComponent::HandleTranscript(const FElevenLabsTranscriptSegment& Segment)
{
OnAgentTranscript.Broadcast(Segment);
}
void UElevenLabsConversationalAgentComponent::HandleAgentResponse(const FString& ResponseText)
{
OnAgentTextResponse.Broadcast(ResponseText);
}
void UElevenLabsConversationalAgentComponent::HandleInterrupted()
{
StopAgentAudio();
OnAgentInterrupted.Broadcast();
}
// ─────────────────────────────────────────────────────────────────────────────
// Audio playback
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::InitAudioPlayback()
{
AActor* Owner = GetOwner();
if (!Owner) return;
// USoundWaveProcedural lets us push raw PCM data at runtime.
ProceduralSoundWave = NewObject<USoundWaveProcedural>(this);
ProceduralSoundWave->SetSampleRate(ElevenLabsAudio::SampleRate);
ProceduralSoundWave->NumChannels = ElevenLabsAudio::Channels;
ProceduralSoundWave->Duration = INDEFINITELY_LOOPING_DURATION;
ProceduralSoundWave->SoundGroup = SOUNDGROUP_Voice;
ProceduralSoundWave->bLooping = false;
// Create the audio component attached to the owner.
AudioPlaybackComponent = NewObject<UAudioComponent>(Owner, TEXT("ElevenLabsAudioPlayback"));
AudioPlaybackComponent->RegisterComponent();
AudioPlaybackComponent->bAutoActivate = false;
AudioPlaybackComponent->SetSound(ProceduralSoundWave);
// When the procedural sound wave needs more audio data, pull from our queue.
ProceduralSoundWave->OnSoundWaveProceduralUnderflow =
FOnSoundWaveProceduralUnderflow::CreateUObject(
this, &UElevenLabsConversationalAgentComponent::OnProceduralUnderflow);
}
void UElevenLabsConversationalAgentComponent::OnProceduralUnderflow(
USoundWaveProcedural* InProceduralWave, const int32 SamplesRequired)
{
FScopeLock Lock(&AudioQueueLock);
if (AudioQueue.Num() == 0) return;
const int32 BytesRequired = SamplesRequired * sizeof(int16);
const int32 BytesToPush = FMath::Min(AudioQueue.Num(), BytesRequired);
InProceduralWave->QueueAudio(AudioQueue.GetData(), BytesToPush);
// Keep the buffer allocated between pulls (the bool shrink flag is deprecated in UE 5.4+).
AudioQueue.RemoveAt(0, BytesToPush, EAllowShrinking::No);
}
void UElevenLabsConversationalAgentComponent::EnqueueAgentAudio(const TArray<uint8>& PCMData)
{
{
FScopeLock Lock(&AudioQueueLock);
AudioQueue.Append(PCMData);
}
// Start playback if not already playing.
if (!bAgentSpeaking)
{
bAgentSpeaking = true;
SilentTickCount = 0;
OnAgentStartedSpeaking.Broadcast();
if (AudioPlaybackComponent && !AudioPlaybackComponent->IsPlaying())
{
AudioPlaybackComponent->Play();
}
}
}
void UElevenLabsConversationalAgentComponent::StopAgentAudio()
{
if (AudioPlaybackComponent && AudioPlaybackComponent->IsPlaying())
{
AudioPlaybackComponent->Stop();
}
FScopeLock Lock(&AudioQueueLock);
AudioQueue.Empty();
if (bAgentSpeaking)
{
bAgentSpeaking = false;
SilentTickCount = 0;
OnAgentStoppedSpeaking.Broadcast();
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Microphone → WebSocket
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsConversationalAgentComponent::OnMicrophoneDataCaptured(const TArray<float>& FloatPCM)
{
if (!IsConnected() || !bIsListening) return;
TArray<uint8> PCMBytes = FloatPCMToInt16Bytes(FloatPCM);
WebSocketProxy->SendAudioChunk(PCMBytes);
}
TArray<uint8> UElevenLabsConversationalAgentComponent::FloatPCMToInt16Bytes(const TArray<float>& FloatPCM)
{
TArray<uint8> Out;
Out.Reserve(FloatPCM.Num() * 2);
for (float Sample : FloatPCM)
{
// Clamp to [-1,1] then scale to int16 range
const float Clamped = FMath::Clamp(Sample, -1.0f, 1.0f);
const int16 Int16Sample = static_cast<int16>(Clamped * 32767.0f);
// Little-endian
Out.Add(static_cast<uint8>(Int16Sample & 0xFF));
Out.Add(static_cast<uint8>((Int16Sample >> 8) & 0xFF));
}
return Out;
}

@ -0,0 +1,168 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsMicrophoneCaptureComponent.h"
#include "ElevenLabsDefinitions.h"
#include "AudioCaptureCore.h"
#include "Async/Async.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsMic, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Constructor
// ─────────────────────────────────────────────────────────────────────────────
UElevenLabsMicrophoneCaptureComponent::UElevenLabsMicrophoneCaptureComponent()
{
PrimaryComponentTick.bCanEverTick = false;
}
// ─────────────────────────────────────────────────────────────────────────────
// Lifecycle
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsMicrophoneCaptureComponent::EndPlay(const EEndPlayReason::Type EndPlayReason)
{
StopCapture();
Super::EndPlay(EndPlayReason);
}
// ─────────────────────────────────────────────────────────────────────────────
// Capture control
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsMicrophoneCaptureComponent::StartCapture()
{
if (bCapturing)
{
UE_LOG(LogElevenLabsMic, Warning, TEXT("StartCapture called while already capturing."));
return;
}
// Open the default audio capture stream.
// FAudioCapture discovers the default device and its sample rate automatically.
Audio::FOnAudioCaptureFunction CaptureCallback =
[this](const float* InAudio, int32 NumSamples, int32 InNumChannels,
int32 InSampleRate, double StreamTime, bool bOverflow)
{
OnAudioGenerate(InAudio, NumSamples, InNumChannels, InSampleRate, StreamTime, bOverflow);
};
if (!AudioCapture.OpenDefaultCaptureStream(DeviceParams, MoveTemp(CaptureCallback), 1024))
{
UE_LOG(LogElevenLabsMic, Error, TEXT("Failed to open default audio capture stream."));
return;
}
// Retrieve the actual device parameters after opening the stream.
Audio::FCaptureDeviceInfo DeviceInfo;
if (AudioCapture.GetCaptureDeviceInfo(DeviceInfo))
{
DeviceSampleRate = DeviceInfo.PreferredSampleRate;
DeviceChannels = DeviceInfo.InputChannels;
UE_LOG(LogElevenLabsMic, Log, TEXT("Capture device: %s | Rate=%d | Channels=%d"),
*DeviceInfo.DeviceName, DeviceSampleRate, DeviceChannels);
}
AudioCapture.StartStream();
bCapturing = true;
UE_LOG(LogElevenLabsMic, Log, TEXT("Audio capture started."));
}
void UElevenLabsMicrophoneCaptureComponent::StopCapture()
{
if (!bCapturing) return;
AudioCapture.StopStream();
AudioCapture.CloseStream();
bCapturing = false;
UE_LOG(LogElevenLabsMic, Log, TEXT("Audio capture stopped."));
}
// ─────────────────────────────────────────────────────────────────────────────
// Audio callback (background thread)
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsMicrophoneCaptureComponent::OnAudioGenerate(
const float* InAudio, int32 NumSamples,
int32 InNumChannels, int32 InSampleRate,
double StreamTime, bool bOverflow)
{
if (bOverflow)
{
UE_LOG(LogElevenLabsMic, Verbose, TEXT("Audio capture buffer overflow."));
}
// Resample + downmix to 16000 Hz mono.
TArray<float> Resampled = ResampleTo16000(InAudio, NumSamples, InNumChannels, InSampleRate);
// Apply volume multiplier.
if (!FMath::IsNearlyEqual(VolumeMultiplier, 1.0f))
{
for (float& S : Resampled)
{
S *= VolumeMultiplier;
}
}
// Fire the delegate on the game thread so subscribers don't need to be
// thread-safe (WebSocket Send is not thread-safe in UE's implementation).
AsyncTask(ENamedThreads::GameThread, [this, Data = MoveTemp(Resampled)]()
{
if (bCapturing)
{
OnAudioCaptured.Broadcast(Data);
}
});
}
// ─────────────────────────────────────────────────────────────────────────────
// Resampling
// ─────────────────────────────────────────────────────────────────────────────
TArray<float> UElevenLabsMicrophoneCaptureComponent::ResampleTo16000(
const float* InAudio, int32 NumSamples,
int32 InChannels, int32 InSampleRate)
{
const int32 TargetRate = ElevenLabsAudio::SampleRate; // 16000
// --- Step 1: Downmix to mono ---
TArray<float> Mono;
if (InChannels == 1)
{
Mono = TArray<float>(InAudio, NumSamples);
}
else
{
const int32 NumFrames = NumSamples / InChannels;
Mono.Reserve(NumFrames);
for (int32 i = 0; i < NumFrames; i++)
{
float Sum = 0.0f;
for (int32 c = 0; c < InChannels; c++)
{
Sum += InAudio[i * InChannels + c];
}
Mono.Add(Sum / static_cast<float>(InChannels));
}
}
// --- Step 2: Resample via linear interpolation ---
if (InSampleRate == TargetRate)
{
return Mono;
}
const float Ratio = static_cast<float>(InSampleRate) / static_cast<float>(TargetRate);
const int32 OutSamples = FMath::FloorToInt(static_cast<float>(Mono.Num()) / Ratio);
TArray<float> Out;
Out.Reserve(OutSamples);
for (int32 i = 0; i < OutSamples; i++)
{
const float SrcIndex = static_cast<float>(i) * Ratio;
const int32 SrcLow = FMath::FloorToInt(SrcIndex);
const int32 SrcHigh = FMath::Min(SrcLow + 1, Mono.Num() - 1);
const float Alpha = SrcIndex - static_cast<float>(SrcLow);
Out.Add(FMath::Lerp(Mono[SrcLow], Mono[SrcHigh], Alpha));
}
return Out;
}

@ -0,0 +1,382 @@
// Copyright ASTERION. All Rights Reserved.
#include "ElevenLabsWebSocketProxy.h"
#include "PS_AI_Agent_ElevenLabs.h"
#include "WebSocketsModule.h"
#include "IWebSocket.h"
#include "Json.h"
#include "JsonUtilities.h"
#include "Misc/Base64.h"
DEFINE_LOG_CATEGORY_STATIC(LogElevenLabsWS, Log, All);
// ─────────────────────────────────────────────────────────────────────────────
// Helpers
// ─────────────────────────────────────────────────────────────────────────────
static void EL_LOG(bool bVerbose, const TCHAR* Format, ...)
{
if (!bVerbose) return;
va_list Args;
va_start(Args, Format);
// Forward to UE_LOG at Verbose level
TCHAR Buffer[2048];
FCString::GetVarArgs(Buffer, UE_ARRAY_COUNT(Buffer), Format, Args);
va_end(Args);
UE_LOG(LogElevenLabsWS, Verbose, TEXT("%s"), Buffer);
}
// ─────────────────────────────────────────────────────────────────────────────
// Connect / Disconnect
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::Connect(const FString& AgentIDOverride, const FString& APIKeyOverride)
{
if (ConnectionState == EElevenLabsConnectionState::Connected ||
ConnectionState == EElevenLabsConnectionState::Connecting)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("Connect called but already connecting/connected. Ignoring."));
return;
}
if (!FModuleManager::Get().IsModuleLoaded("WebSockets"))
{
FModuleManager::LoadModuleChecked<FWebSocketsModule>("WebSockets");
}
const FString URL = BuildWebSocketURL(AgentIDOverride, APIKeyOverride);
if (URL.IsEmpty())
{
const FString Msg = TEXT("Cannot connect: no Agent ID configured. Set it in Project Settings or pass it to Connect().");
UE_LOG(LogElevenLabsWS, Error, TEXT("%s"), *Msg);
OnError.Broadcast(Msg);
ConnectionState = EElevenLabsConnectionState::Error;
return;
}
UE_LOG(LogElevenLabsWS, Log, TEXT("Connecting to ElevenLabs: %s"), *URL);
ConnectionState = EElevenLabsConnectionState::Connecting;
// Headers: the ElevenLabs Conversational AI WS endpoint accepts the
// xi-api-key header on the initial HTTP upgrade request.
TMap<FString, FString> UpgradeHeaders;
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
const FString ResolvedKey = APIKeyOverride.IsEmpty() ? Settings->API_Key : APIKeyOverride;
if (!ResolvedKey.IsEmpty())
{
UpgradeHeaders.Add(TEXT("xi-api-key"), ResolvedKey);
}
WebSocket = FWebSocketsModule::Get().CreateWebSocket(URL, TEXT(""), UpgradeHeaders);
WebSocket->OnConnected().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsConnected);
WebSocket->OnConnectionError().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsConnectionError);
WebSocket->OnClosed().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsClosed);
WebSocket->OnMessage().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsMessage);
WebSocket->OnRawMessage().AddUObject(this, &UElevenLabsWebSocketProxy::OnWsBinaryMessage);
WebSocket->Connect();
}
void UElevenLabsWebSocketProxy::Disconnect()
{
if (WebSocket.IsValid() && WebSocket->IsConnected())
{
WebSocket->Close(1000, TEXT("Client disconnected"));
}
ConnectionState = EElevenLabsConnectionState::Disconnected;
}
// ─────────────────────────────────────────────────────────────────────────────
// Audio & turn control
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::SendAudioChunk(const TArray<uint8>& PCMData)
{
if (!IsConnected())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("SendAudioChunk: not connected."));
return;
}
if (PCMData.Num() == 0) return;
// ElevenLabs expects: { "user_audio_chunk": "<base64 PCM>" }
const FString Base64Audio = FBase64::Encode(PCMData.GetData(), PCMData.Num());
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(ElevenLabsMessageType::AudioChunk, Base64Audio);
SendJsonMessage(Msg);
}
void UElevenLabsWebSocketProxy::SendUserTurnStart()
{
if (!IsConnected()) return;
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::UserTurnStart);
SendJsonMessage(Msg);
}
void UElevenLabsWebSocketProxy::SendUserTurnEnd()
{
if (!IsConnected()) return;
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::UserTurnEnd);
SendJsonMessage(Msg);
}
void UElevenLabsWebSocketProxy::SendInterrupt()
{
if (!IsConnected()) return;
TSharedPtr<FJsonObject> Msg = MakeShareable(new FJsonObject());
Msg->SetStringField(TEXT("type"), ElevenLabsMessageType::Interrupt);
SendJsonMessage(Msg);
}
// ─────────────────────────────────────────────────────────────────────────────
// WebSocket callbacks
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::OnWsConnected()
{
UE_LOG(LogElevenLabsWS, Log, TEXT("WebSocket connected. Waiting for conversation_initiation_metadata..."));
// State stays Connecting until we receive the initiation metadata from the server.
}
void UElevenLabsWebSocketProxy::OnWsConnectionError(const FString& Error)
{
UE_LOG(LogElevenLabsWS, Error, TEXT("WebSocket connection error: %s"), *Error);
ConnectionState = EElevenLabsConnectionState::Error;
OnError.Broadcast(Error);
}
void UElevenLabsWebSocketProxy::OnWsClosed(int32 StatusCode, const FString& Reason, bool bWasClean)
{
UE_LOG(LogElevenLabsWS, Log, TEXT("WebSocket closed. Code=%d Reason=%s Clean=%d"), StatusCode, *Reason, bWasClean);
ConnectionState = EElevenLabsConnectionState::Disconnected;
WebSocket.Reset();
OnDisconnected.Broadcast(StatusCode, Reason);
}
void UElevenLabsWebSocketProxy::OnWsMessage(const FString& Message)
{
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT(">> %s"), *Message);
}
TSharedPtr<FJsonObject> Root;
TSharedRef<TJsonReader<>> Reader = TJsonReaderFactory<>::Create(Message);
if (!FJsonSerializer::Deserialize(Reader, Root) || !Root.IsValid())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("Failed to parse WebSocket message as JSON."));
return;
}
FString MsgType;
// ElevenLabs wraps the type in a "type" field
if (!Root->TryGetStringField(TEXT("type"), MsgType))
{
// Payloads without a "type" field (e.g. echoes of our own
// { "user_audio_chunk": ... } messages) carry nothing actionable; ignore them.
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Message has no 'type' field, ignoring."));
return;
}
if (MsgType == ElevenLabsMessageType::ConversationInitiation)
{
HandleConversationInitiation(Root);
}
else if (MsgType == ElevenLabsMessageType::AudioResponse)
{
HandleAudioResponse(Root);
}
else if (MsgType == ElevenLabsMessageType::Transcript)
{
HandleTranscript(Root);
}
else if (MsgType == ElevenLabsMessageType::AgentResponse)
{
HandleAgentResponse(Root);
}
else if (MsgType == ElevenLabsMessageType::InterruptionEvent)
{
HandleInterruption(Root);
}
else if (MsgType == ElevenLabsMessageType::PingEvent)
{
HandlePing(Root);
}
else
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("Unhandled message type: %s"), *MsgType);
}
}
void UElevenLabsWebSocketProxy::OnWsBinaryMessage(const void* Data, SIZE_T Size, SIZE_T BytesRemaining)
{
// ElevenLabs Conversational AI uses text (JSON) frames only.
// If binary frames arrive in future API versions, handle here.
UE_LOG(LogElevenLabsWS, Warning, TEXT("Received unexpected binary WebSocket frame (%llu bytes)."), (uint64)Size);
}
// ─────────────────────────────────────────────────────────────────────────────
// Message handlers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::HandleConversationInitiation(const TSharedPtr<FJsonObject>& Root)
{
// Expected structure:
// { "type": "conversation_initiation_metadata",
// "conversation_initiation_metadata_event": {
// "conversation_id": "...",
// "agent_output_audio_format": "pcm_16000"
// }
// }
const TSharedPtr<FJsonObject>* MetaObj = nullptr;
if (Root->TryGetObjectField(TEXT("conversation_initiation_metadata_event"), MetaObj) && MetaObj)
{
(*MetaObj)->TryGetStringField(TEXT("conversation_id"), ConversationInfo.ConversationID);
}
UE_LOG(LogElevenLabsWS, Log, TEXT("Conversation initiated. ID=%s"), *ConversationInfo.ConversationID);
ConnectionState = EElevenLabsConnectionState::Connected;
OnConnected.Broadcast(ConversationInfo);
}
void UElevenLabsWebSocketProxy::HandleAudioResponse(const TSharedPtr<FJsonObject>& Root)
{
// Expected structure:
// { "type": "audio",
// "audio_event": { "audio_base_64": "<base64 PCM>", "event_id": 1 }
// }
const TSharedPtr<FJsonObject>* AudioEvent = nullptr;
if (!Root->TryGetObjectField(TEXT("audio_event"), AudioEvent) || !AudioEvent)
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("audio message missing 'audio_event' field."));
return;
}
FString Base64Audio;
if (!(*AudioEvent)->TryGetStringField(TEXT("audio_base_64"), Base64Audio))
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("audio_event missing 'audio_base_64' field."));
return;
}
TArray<uint8> PCMData;
if (!FBase64::Decode(Base64Audio, PCMData))
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("Failed to Base64-decode audio data."));
return;
}
OnAudioReceived.Broadcast(PCMData);
}
void UElevenLabsWebSocketProxy::HandleTranscript(const TSharedPtr<FJsonObject>& Root)
{
// Expected structure:
// { "type": "transcript",
// "transcript_event": { "speaker": "user"|"agent", "message": "...", "event_id": 1 }
// }
const TSharedPtr<FJsonObject>* TranscriptEvent = nullptr;
if (!Root->TryGetObjectField(TEXT("transcript_event"), TranscriptEvent) || !TranscriptEvent)
{
return;
}
FElevenLabsTranscriptSegment Segment;
(*TranscriptEvent)->TryGetStringField(TEXT("speaker"), Segment.Speaker);
(*TranscriptEvent)->TryGetStringField(TEXT("message"), Segment.Text);
// ElevenLabs marks final vs. interim via "is_final"
(*TranscriptEvent)->TryGetBoolField(TEXT("is_final"), Segment.bIsFinal);
OnTranscript.Broadcast(Segment);
}
void UElevenLabsWebSocketProxy::HandleAgentResponse(const TSharedPtr<FJsonObject>& Root)
{
// { "type": "agent_response",
// "agent_response_event": { "agent_response": "..." }
// }
const TSharedPtr<FJsonObject>* ResponseEvent = nullptr;
if (!Root->TryGetObjectField(TEXT("agent_response_event"), ResponseEvent) || !ResponseEvent)
{
return;
}
FString ResponseText;
(*ResponseEvent)->TryGetStringField(TEXT("agent_response"), ResponseText);
OnAgentResponse.Broadcast(ResponseText);
}
void UElevenLabsWebSocketProxy::HandleInterruption(const TSharedPtr<FJsonObject>& Root)
{
UE_LOG(LogElevenLabsWS, Log, TEXT("Agent interrupted."));
OnInterrupted.Broadcast();
}
void UElevenLabsWebSocketProxy::HandlePing(const TSharedPtr<FJsonObject>& Root)
{
// Reply with a pong to keep the connection alive.
// { "type": "ping", "ping_event": { "event_id": 1 } }
int32 EventID = 0;
const TSharedPtr<FJsonObject>* PingEvent = nullptr;
if (Root->TryGetObjectField(TEXT("ping_event"), PingEvent) && PingEvent)
{
(*PingEvent)->TryGetNumberField(TEXT("event_id"), EventID);
}
TSharedPtr<FJsonObject> Pong = MakeShareable(new FJsonObject());
Pong->SetStringField(TEXT("type"), TEXT("pong"));
TSharedPtr<FJsonObject> PongEvent = MakeShareable(new FJsonObject());
PongEvent->SetNumberField(TEXT("event_id"), EventID);
Pong->SetObjectField(TEXT("pong_event"), PongEvent);
SendJsonMessage(Pong);
}
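The keepalive exchange HandlePing implements is simple enough to show as a standalone sketch (hypothetical helper, plain string formatting instead of FJsonSerializer): every server `{ "type": "ping", "ping_event": { "event_id": N } }` must be answered with a pong echoing the same event_id, or the session will eventually be dropped.

```cpp
#include <cstdio>
#include <string>

// Builds the pong reply for a given ping event_id.
std::string BuildPongMessage(int eventId)
{
    char buf[80];
    std::snprintf(buf, sizeof buf,
        "{\"type\":\"pong\",\"pong_event\":{\"event_id\":%d}}", eventId);
    return buf;
}
```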
// ─────────────────────────────────────────────────────────────────────────────
// Helpers
// ─────────────────────────────────────────────────────────────────────────────
void UElevenLabsWebSocketProxy::SendJsonMessage(const TSharedPtr<FJsonObject>& JsonObj)
{
if (!WebSocket.IsValid() || !WebSocket->IsConnected())
{
UE_LOG(LogElevenLabsWS, Warning, TEXT("SendJsonMessage: WebSocket not connected."));
return;
}
FString Out;
TSharedRef<TJsonWriter<>> Writer = TJsonWriterFactory<>::Create(&Out);
FJsonSerializer::Serialize(JsonObj.ToSharedRef(), Writer);
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
if (Settings->bVerboseLogging)
{
UE_LOG(LogElevenLabsWS, Verbose, TEXT("<< %s"), *Out);
}
WebSocket->Send(Out);
}
FString UElevenLabsWebSocketProxy::BuildWebSocketURL(const FString& AgentIDOverride, const FString& APIKeyOverride) const
{
const UElevenLabsSettings* Settings = FPS_AI_Agent_ElevenLabsModule::Get().GetSettings();
// Custom URL override takes full precedence
if (!Settings->CustomWebSocketURL.IsEmpty())
{
return Settings->CustomWebSocketURL;
}
const FString ResolvedAgentID = AgentIDOverride.IsEmpty() ? Settings->AgentID : AgentIDOverride;
if (ResolvedAgentID.IsEmpty())
{
return FString();
}
// Official ElevenLabs Conversational AI WebSocket endpoint
// wss://api.elevenlabs.io/v1/convai/conversation?agent_id=<ID>
return FString::Printf(
TEXT("wss://api.elevenlabs.io/v1/convai/conversation?agent_id=%s"),
*ResolvedAgentID);
}
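BuildWebSocketURL's resolution order (custom URL wins outright, then the per-call agent-ID override, then the project default, with empty meaning "nothing configured") can be sketched with plain std::string — helper name hypothetical:

```cpp
#include <string>

// Mirrors BuildWebSocketURL's precedence: custom > override ID > default ID.
std::string ResolveWsUrl(const std::string& customUrl,
                         const std::string& overrideId,
                         const std::string& defaultId)
{
    if (!customUrl.empty())
        return customUrl; // custom URL takes full precedence
    const std::string id = overrideId.empty() ? defaultId : overrideId;
    if (id.empty())
        return {}; // no agent configured anywhere
    return "wss://api.elevenlabs.io/v1/convai/conversation?agent_id=" + id;
}
```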

@@ -0,0 +1,50 @@
// Copyright ASTERION. All Rights Reserved.
#include "PS_AI_Agent_ElevenLabs.h"
#include "Developer/Settings/Public/ISettingsModule.h"
#include "UObject/UObjectGlobals.h"
#include "UObject/Package.h"
IMPLEMENT_MODULE(FPS_AI_Agent_ElevenLabsModule, PS_AI_Agent_ElevenLabs)
#define LOCTEXT_NAMESPACE "PS_AI_Agent_ElevenLabs"
void FPS_AI_Agent_ElevenLabsModule::StartupModule()
{
Settings = NewObject<UElevenLabsSettings>(GetTransientPackage(), "ElevenLabsSettings", RF_Standalone);
Settings->AddToRoot();
if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
{
SettingsModule->RegisterSettings(
"Project", "Plugins", "ElevenLabsAIAgent",
LOCTEXT("SettingsName", "ElevenLabs AI Agent"),
LOCTEXT("SettingsDescription", "Configure the ElevenLabs Conversational AI Agent plugin"),
Settings);
}
}
void FPS_AI_Agent_ElevenLabsModule::ShutdownModule()
{
if (ISettingsModule* SettingsModule = FModuleManager::GetModulePtr<ISettingsModule>("Settings"))
{
SettingsModule->UnregisterSettings("Project", "Plugins", "ElevenLabsAIAgent");
}
if (Settings && !GExitPurge)
{
Settings->RemoveFromRoot();
}
Settings = nullptr;
}
UElevenLabsSettings* FPS_AI_Agent_ElevenLabsModule::GetSettings() const
{
check(Settings);
return Settings;
}
#undef LOCTEXT_NAMESPACE

@@ -0,0 +1,225 @@
// Copyright ASTERION. All Rights Reserved.
#pragma once
#include "CoreMinimal.h"
#include "Components/ActorComponent.h"
#include "ElevenLabsDefinitions.h"
#include "ElevenLabsWebSocketProxy.h"
#include "Sound/SoundWaveProcedural.h"
#include "ElevenLabsConversationalAgentComponent.generated.h"
class UAudioComponent;
class UElevenLabsMicrophoneCaptureComponent;
// ─────────────────────────────────────────────────────────────────────────────
// Delegates exposed to Blueprint
// ─────────────────────────────────────────────────────────────────────────────
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentConnected,
const FElevenLabsConversationInfo&, ConversationInfo);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnAgentDisconnected,
int32, StatusCode, const FString&, Reason);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentError,
const FString&, ErrorMessage);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentTranscript,
const FElevenLabsTranscriptSegment&, Segment);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnAgentTextResponse,
const FString&, ResponseText);
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnAgentStartedSpeaking);
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnAgentStoppedSpeaking);
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnAgentInterrupted);
// ─────────────────────────────────────────────────────────────────────────────
// UElevenLabsConversationalAgentComponent
//
// Attach this to any Actor (e.g. a character NPC) to give it a voice powered by
// the ElevenLabs Conversational AI API.
//
// Workflow:
// 1. Set AgentID (or rely on project default).
// 2. Call StartConversation() to open the WebSocket.
// 3. Call StartListening() / StopListening() to control microphone capture.
// 4. React to events (OnAgentTranscript, OnAgentTextResponse, etc.) in Blueprint.
// 5. Call EndConversation() when done.
// ─────────────────────────────────────────────────────────────────────────────
UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
DisplayName = "ElevenLabs Conversational Agent")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsConversationalAgentComponent : public UActorComponent
{
GENERATED_BODY()
public:
UElevenLabsConversationalAgentComponent();
// ── Configuration ─────────────────────────────────────────────────────────
/**
* ElevenLabs Agent ID. Overrides the project-level default in Project Settings.
* Leave empty to use the project default.
*/
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs")
FString AgentID;
/**
* Turn mode:
* - Server VAD: ElevenLabs detects end-of-speech automatically (recommended).
* - Client Controlled: you call StartListening/StopListening manually (push-to-talk).
*/
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs")
EElevenLabsTurnMode TurnMode = EElevenLabsTurnMode::Server;
/**
* Automatically start listening (microphone capture) once the WebSocket is
* connected and the conversation is initiated.
*/
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs")
bool bAutoStartListening = true;
// ── Events ────────────────────────────────────────────────────────────────
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnAgentConnected OnAgentConnected;
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnAgentDisconnected OnAgentDisconnected;
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnAgentError OnAgentError;
/** Fired for every transcript segment (user speech or agent speech, tentative and final). */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnAgentTranscript OnAgentTranscript;
/** Final text response produced by the agent (mirrors the audio). */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnAgentTextResponse OnAgentTextResponse;
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnAgentStartedSpeaking OnAgentStartedSpeaking;
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnAgentStoppedSpeaking OnAgentStoppedSpeaking;
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnAgentInterrupted OnAgentInterrupted;
// ── Control ───────────────────────────────────────────────────────────────
/**
* Open the WebSocket connection and start the conversation.
* If bAutoStartListening is true, microphone capture also starts once connected.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void StartConversation();
/** Close the WebSocket and stop all audio. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void EndConversation();
/**
* Start capturing microphone audio and streaming it to ElevenLabs.
* In Client turn mode, also sends a UserTurnStart signal.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void StartListening();
/**
* Stop capturing microphone audio.
* In Client turn mode, also sends a UserTurnEnd signal.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void StopListening();
/** Interrupt the agent's current utterance. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void InterruptAgent();
// ── State queries ─────────────────────────────────────────────────────────
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
bool IsConnected() const;
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
bool IsListening() const { return bIsListening; }
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
bool IsAgentSpeaking() const { return bAgentSpeaking; }
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
const FElevenLabsConversationInfo& GetConversationInfo() const;
/** Access the underlying WebSocket proxy (advanced use). */
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
UElevenLabsWebSocketProxy* GetWebSocketProxy() const { return WebSocketProxy; }
// ─────────────────────────────────────────────────────────────────────────
// UActorComponent overrides
// ─────────────────────────────────────────────────────────────────────────
virtual void BeginPlay() override;
virtual void EndPlay(const EEndPlayReason::Type EndPlayReason) override;
virtual void TickComponent(float DeltaTime, ELevelTick TickType,
FActorComponentTickFunction* ThisTickFunction) override;
private:
// ── Internal event handlers ───────────────────────────────────────────────
UFUNCTION()
void HandleConnected(const FElevenLabsConversationInfo& Info);
UFUNCTION()
void HandleDisconnected(int32 StatusCode, const FString& Reason);
UFUNCTION()
void HandleError(const FString& ErrorMessage);
UFUNCTION()
void HandleAudioReceived(const TArray<uint8>& PCMData);
UFUNCTION()
void HandleTranscript(const FElevenLabsTranscriptSegment& Segment);
UFUNCTION()
void HandleAgentResponse(const FString& ResponseText);
UFUNCTION()
void HandleInterrupted();
// ── Audio playback ────────────────────────────────────────────────────────
void InitAudioPlayback();
void EnqueueAgentAudio(const TArray<uint8>& PCMData);
void StopAgentAudio();
/** Called by USoundWaveProcedural when it needs more PCM data. */
void OnProceduralUnderflow(USoundWaveProcedural* InProceduralWave, const int32 SamplesRequired);
// ── Microphone streaming ──────────────────────────────────────────────────
void OnMicrophoneDataCaptured(const TArray<float>& FloatPCM);
/** Convert float PCM to int16 little-endian bytes for ElevenLabs. */
static TArray<uint8> FloatPCMToInt16Bytes(const TArray<float>& FloatPCM);
// ── Sub-objects ───────────────────────────────────────────────────────────
UPROPERTY()
UElevenLabsWebSocketProxy* WebSocketProxy = nullptr;
UPROPERTY()
UAudioComponent* AudioPlaybackComponent = nullptr;
UPROPERTY()
USoundWaveProcedural* ProceduralSoundWave = nullptr;
// ── State ─────────────────────────────────────────────────────────────────
bool bIsListening = false;
bool bAgentSpeaking = false;
// Accumulates incoming PCM bytes until the audio component needs data.
TArray<uint8> AudioQueue;
FCriticalSection AudioQueueLock;
// Simple heuristic: if we haven't received audio data for this many ticks,
// consider the agent done speaking.
int32 SilentTickCount = 0;
static constexpr int32 SilenceThresholdTicks = 30; // ~0.5s at 60fps
};
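The AudioQueue/AudioQueueLock pair above follows a standard producer/consumer pattern: the WebSocket delegate appends decoded PCM bytes while the procedural-sound underflow callback drains only what it needs. A hedged standalone sketch, with std::mutex standing in for FCriticalSection and a hypothetical class name:

```cpp
#include <cstdint>
#include <mutex>
#include <vector>

// Thread-safe byte queue: network thread enqueues, audio callback dequeues.
class PcmQueue
{
public:
    void Enqueue(const std::vector<uint8_t>& pcm)
    {
        std::lock_guard<std::mutex> lock(m);
        buf.insert(buf.end(), pcm.begin(), pcm.end());
    }

    // Pops up to maxBytes; returns fewer on underflow (caller pads silence).
    std::vector<uint8_t> Dequeue(size_t maxBytes)
    {
        std::lock_guard<std::mutex> lock(m);
        const size_t n = maxBytes < buf.size() ? maxBytes : buf.size();
        std::vector<uint8_t> out(buf.begin(), buf.begin() + n);
        buf.erase(buf.begin(), buf.begin() + n);
        return out;
    }

private:
    std::mutex m;
    std::vector<uint8_t> buf;
};
```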

@@ -0,0 +1,104 @@
// Copyright ASTERION. All Rights Reserved.
#pragma once
#include "CoreMinimal.h"
#include "ElevenLabsDefinitions.generated.h"
// ─────────────────────────────────────────────────────────────────────────────
// Connection state
// ─────────────────────────────────────────────────────────────────────────────
UENUM(BlueprintType)
enum class EElevenLabsConnectionState : uint8
{
Disconnected UMETA(DisplayName = "Disconnected"),
Connecting UMETA(DisplayName = "Connecting"),
Connected UMETA(DisplayName = "Connected"),
Error UMETA(DisplayName = "Error"),
};
// ─────────────────────────────────────────────────────────────────────────────
// Agent turn mode
// ─────────────────────────────────────────────────────────────────────────────
UENUM(BlueprintType)
enum class EElevenLabsTurnMode : uint8
{
/** ElevenLabs server decides when the user has finished speaking (default). */
Server UMETA(DisplayName = "Server VAD"),
/** Client explicitly signals turn start/end (manual push-to-talk). */
Client UMETA(DisplayName = "Client Controlled"),
};
// ─────────────────────────────────────────────────────────────────────────────
// WebSocket message type helpers (internal, not exposed to Blueprint)
// ─────────────────────────────────────────────────────────────────────────────
namespace ElevenLabsMessageType
{
// Client → Server
static const FString AudioChunk = TEXT("user_audio_chunk");
static const FString UserTurnStart = TEXT("user_turn_start");
static const FString UserTurnEnd = TEXT("user_turn_end");
static const FString Interrupt = TEXT("interrupt");
static const FString ClientToolResult = TEXT("client_tool_result");
// Server → Client
static const FString ConversationInitiation = TEXT("conversation_initiation_metadata");
static const FString AudioResponse = TEXT("audio");
static const FString Transcript = TEXT("transcript");
static const FString AgentResponse = TEXT("agent_response");
static const FString InterruptionEvent = TEXT("interruption");
static const FString PingEvent = TEXT("ping");
static const FString ClientToolCall = TEXT("client_tool_call");
static const FString InternalTentativeAgent = TEXT("internal_tentative_agent_response");
}
// ─────────────────────────────────────────────────────────────────────────────
// Audio format exchanged with ElevenLabs
// PCM 16-bit signed, 16000 Hz, mono, little-endian.
// ─────────────────────────────────────────────────────────────────────────────
namespace ElevenLabsAudio
{
static constexpr int32 SampleRate = 16000;
static constexpr int32 Channels = 1;
static constexpr int32 BitsPerSample = 16;
// Chunk size sent per WebSocket frame: 100 ms of audio
static constexpr int32 ChunkSamples = SampleRate / 10; // 1600 samples = 3200 bytes
}
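Under the format above, the component's FloatPCMToInt16Bytes conversion amounts to clamp, scale, and little-endian byte split. A standalone sketch (hypothetical function name) — note a 100 ms chunk of 1600 samples yields exactly 3200 bytes:

```cpp
#include <cstdint>
#include <vector>

// Float [-1, 1] samples -> 16-bit signed little-endian PCM bytes.
std::vector<uint8_t> FloatToInt16LE(const std::vector<float>& in)
{
    std::vector<uint8_t> out;
    out.reserve(in.size() * 2);
    for (float s : in)
    {
        if (s > 1.0f)  s = 1.0f;   // clamp to avoid integer wraparound
        if (s < -1.0f) s = -1.0f;
        const int16_t v = static_cast<int16_t>(s * 32767.0f);
        out.push_back(static_cast<uint8_t>(v & 0xFF));        // low byte first
        out.push_back(static_cast<uint8_t>((v >> 8) & 0xFF)); // then high byte
    }
    return out;
}
```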
// ─────────────────────────────────────────────────────────────────────────────
// Conversation metadata received on successful connection
// ─────────────────────────────────────────────────────────────────────────────
USTRUCT(BlueprintType)
struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsConversationInfo
{
GENERATED_BODY()
/** Unique ID of this conversation session assigned by ElevenLabs. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
FString ConversationID;
/** Agent ID that is responding. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
FString AgentID;
};
// ─────────────────────────────────────────────────────────────────────────────
// Transcript segment
// ─────────────────────────────────────────────────────────────────────────────
USTRUCT(BlueprintType)
struct PS_AI_AGENT_ELEVENLABS_API FElevenLabsTranscriptSegment
{
GENERATED_BODY()
/** Transcribed text. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
FString Text;
/** "user" or "agent". */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
FString Speaker;
/** Whether this is a final transcript or a tentative (in-progress) one. */
UPROPERTY(BlueprintReadOnly, Category = "ElevenLabs")
bool bIsFinal = false;
};

@@ -0,0 +1,73 @@
// Copyright ASTERION. All Rights Reserved.
#pragma once
#include "CoreMinimal.h"
#include "Components/ActorComponent.h"
#include "AudioCapture.h"
#include "ElevenLabsMicrophoneCaptureComponent.generated.h"
// Delivers captured float PCM samples (16000 Hz mono, resampled from device rate).
DECLARE_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAudioCaptured, const TArray<float>& /*FloatPCM*/);
/**
* Lightweight microphone capture component.
* Captures from the default audio input device, resamples to 16000 Hz mono,
* and delivers chunks via FOnElevenLabsAudioCaptured.
*
* Modelled after Convai's ConvaiAudioCaptureComponent but stripped to the
* minimal functionality needed for the ElevenLabs Conversational AI API.
*/
UCLASS(ClassGroup = "ElevenLabs", meta = (BlueprintSpawnableComponent),
DisplayName = "ElevenLabs Microphone Capture")
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsMicrophoneCaptureComponent : public UActorComponent
{
GENERATED_BODY()
public:
UElevenLabsMicrophoneCaptureComponent();
/** Volume multiplier applied to captured samples before forwarding. */
UPROPERTY(EditAnywhere, BlueprintReadWrite, Category = "ElevenLabs|Microphone",
meta = (ClampMin = "0.0", ClampMax = "4.0"))
float VolumeMultiplier = 1.0f;
/**
* Delegate fired on the game thread each time a new chunk of PCM audio
* is captured. Samples are float32, resampled to 16000 Hz mono.
*/
FOnElevenLabsAudioCaptured OnAudioCaptured;
/** Open the default capture device and begin streaming audio. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void StartCapture();
/** Stop streaming and close the capture device. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void StopCapture();
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
bool IsCapturing() const { return bCapturing; }
// ─────────────────────────────────────────────────────────────────────────
// UActorComponent overrides
// ─────────────────────────────────────────────────────────────────────────
virtual void EndPlay(const EEndPlayReason::Type EndPlayReason) override;
private:
/** Called by the audio capture callback on a background thread. */
void OnAudioGenerate(const float* InAudio, int32 NumSamples,
int32 InNumChannels, int32 InSampleRate, double StreamTime, bool bOverflow);
/** Simple linear resample from InSampleRate to 16000 Hz. */
static TArray<float> ResampleTo16000(const float* InAudio, int32 NumSamples,
int32 InChannels, int32 InSampleRate);
Audio::FAudioCapture AudioCapture;
Audio::FAudioCaptureDeviceParams DeviceParams;
bool bCapturing = false;
// Device sample rate discovered on StartCapture
int32 DeviceSampleRate = 44100;
int32 DeviceChannels = 1;
};
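The ResampleTo16000 step above has two parts: downmix interleaved channels to mono by averaging, then linearly interpolate from the device rate down to 16 kHz. A standalone sketch under those assumptions (hypothetical name; the real implementation may differ):

```cpp
#include <vector>

// Interleaved float PCM at inRate -> mono float PCM at 16000 Hz.
std::vector<float> ResampleTo16k(const std::vector<float>& interleaved,
                                 int channels, int inRate)
{
    // 1) Downmix: average the channels of each frame.
    const int inFrames = static_cast<int>(interleaved.size()) / channels;
    std::vector<float> mono(inFrames);
    for (int f = 0; f < inFrames; ++f)
    {
        float sum = 0.0f;
        for (int c = 0; c < channels; ++c)
            sum += interleaved[f * channels + c];
        mono[f] = sum / channels;
    }

    // 2) Linear interpolation to the target rate.
    const int outFrames = static_cast<int>(
        static_cast<long long>(inFrames) * 16000 / inRate);
    std::vector<float> out(outFrames);
    const double step = static_cast<double>(inRate) / 16000.0;
    for (int i = 0; i < outFrames; ++i)
    {
        const double pos  = i * step;
        const int    i0   = static_cast<int>(pos);
        const int    i1   = (i0 + 1 < inFrames) ? i0 + 1 : i0;
        const double frac = pos - i0;
        out[i] = static_cast<float>(mono[i0] * (1.0 - frac) + mono[i1] * frac);
    }
    return out;
}
```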

@@ -0,0 +1,166 @@
// Copyright ASTERION. All Rights Reserved.
#pragma once
#include "CoreMinimal.h"
#include "UObject/NoExportTypes.h"
#include "ElevenLabsDefinitions.h"
#include "IWebSocket.h"
#include "ElevenLabsWebSocketProxy.generated.h"
// ─────────────────────────────────────────────────────────────────────────────
// Delegates (all Blueprint-assignable)
// ─────────────────────────────────────────────────────────────────────────────
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsConnected,
const FElevenLabsConversationInfo&, ConversationInfo);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_TwoParams(FOnElevenLabsDisconnected,
int32, StatusCode, const FString&, Reason);
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsError,
const FString&, ErrorMessage);
/** Fired when a PCM audio chunk arrives from the agent. Raw bytes, 16-bit signed 16kHz mono. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAudioReceived,
const TArray<uint8>&, PCMData);
/** Fired for user or agent transcript segments. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsTranscript,
const FElevenLabsTranscriptSegment&, Segment);
/** Fired with the final text response from the agent. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE_OneParam(FOnElevenLabsAgentResponse,
const FString&, ResponseText);
/** Fired when the agent interrupts the user. */
DECLARE_DYNAMIC_MULTICAST_DELEGATE(FOnElevenLabsInterrupted);
// ─────────────────────────────────────────────────────────────────────────────
// WebSocket Proxy
// Manages the lifecycle of a single ElevenLabs Conversational AI WebSocket session.
// Instantiate via UElevenLabsConversationalAgentComponent (the component manages
// one proxy at a time), or create manually through Blueprints.
// ─────────────────────────────────────────────────────────────────────────────
UCLASS(BlueprintType, Blueprintable)
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsWebSocketProxy : public UObject
{
GENERATED_BODY()
public:
// ── Events ────────────────────────────────────────────────────────────────
/** Called once the WebSocket handshake succeeds and the agent sends its initiation metadata. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsConnected OnConnected;
/** Called when the WebSocket closes (graceful or remote). */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsDisconnected OnDisconnected;
/** Called on any connection or protocol error. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsError OnError;
/** Raw PCM audio coming from the agent — feed this into your audio component. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsAudioReceived OnAudioReceived;
/** User or agent transcript (may be tentative while the conversation is ongoing). */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsTranscript OnTranscript;
/** Final text response from the agent (complements audio). */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsAgentResponse OnAgentResponse;
/** The agent was interrupted by new user speech. */
UPROPERTY(BlueprintAssignable, Category = "ElevenLabs|Events")
FOnElevenLabsInterrupted OnInterrupted;
// ── Lifecycle ─────────────────────────────────────────────────────────────
/**
* Open a WebSocket connection to ElevenLabs.
* Uses settings from Project Settings unless overridden by the parameters.
*
* @param AgentID ElevenLabs agent ID. Overrides the project-level default when non-empty.
* @param APIKey API key. Overrides the project-level default when non-empty.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void Connect(const FString& AgentID = TEXT(""), const FString& APIKey = TEXT(""));
/**
* Gracefully close the WebSocket connection.
* OnDisconnected will fire after the server acknowledges.
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void Disconnect();
/** Current connection state. */
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
EElevenLabsConnectionState GetConnectionState() const { return ConnectionState; }
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
bool IsConnected() const { return ConnectionState == EElevenLabsConnectionState::Connected; }
// ── Audio sending ─────────────────────────────────────────────────────────
/**
* Send a chunk of raw PCM audio to ElevenLabs.
* Audio must be 16-bit signed, 16000 Hz, mono, little-endian.
* The data is Base64-encoded and sent as a JSON message.
* Call this repeatedly while the microphone is capturing.
*
* @param PCMData Raw PCM bytes (16-bit LE, 16kHz, mono).
*/
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void SendAudioChunk(const TArray<uint8>& PCMData);
// ── Turn control (only relevant in Client turn mode) ──────────────────────
/** Signal that the user has started speaking (Client turn mode). */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void SendUserTurnStart();
/** Signal that the user has finished speaking (Client turn mode). */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void SendUserTurnEnd();
/** Ask the agent to stop the current utterance. */
UFUNCTION(BlueprintCallable, Category = "ElevenLabs")
void SendInterrupt();
// ── Info ──────────────────────────────────────────────────────────────────
UFUNCTION(BlueprintPure, Category = "ElevenLabs")
const FElevenLabsConversationInfo& GetConversationInfo() const { return ConversationInfo; }
// ─────────────────────────────────────────────────────────────────────────
// Internal
// ─────────────────────────────────────────────────────────────────────────
private:
void OnWsConnected();
void OnWsConnectionError(const FString& Error);
void OnWsClosed(int32 StatusCode, const FString& Reason, bool bWasClean);
void OnWsMessage(const FString& Message);
void OnWsBinaryMessage(const void* Data, SIZE_T Size, SIZE_T BytesRemaining);
void HandleConversationInitiation(const TSharedPtr<FJsonObject>& Payload);
void HandleAudioResponse(const TSharedPtr<FJsonObject>& Payload);
void HandleTranscript(const TSharedPtr<FJsonObject>& Payload);
void HandleAgentResponse(const TSharedPtr<FJsonObject>& Payload);
void HandleInterruption(const TSharedPtr<FJsonObject>& Payload);
void HandlePing(const TSharedPtr<FJsonObject>& Payload);
/** Build and send a JSON text frame to the server. */
void SendJsonMessage(const TSharedPtr<FJsonObject>& JsonObj);
/** Resolve the WebSocket URL from settings / parameters. */
FString BuildWebSocketURL(const FString& AgentID, const FString& APIKey) const;
TSharedPtr<IWebSocket> WebSocket;
EElevenLabsConnectionState ConnectionState = EElevenLabsConnectionState::Disconnected;
FElevenLabsConversationInfo ConversationInfo;
};

@@ -0,0 +1,99 @@
// Copyright ASTERION. All Rights Reserved.
#pragma once
#include "CoreMinimal.h"
#include "Modules/ModuleManager.h"
#include "PS_AI_Agent_ElevenLabs.generated.h"
// ─────────────────────────────────────────────────────────────────────────────
// Settings object exposed in Project Settings → Plugins → ElevenLabs AI Agent
// ─────────────────────────────────────────────────────────────────────────────
UCLASS(config = Engine, defaultconfig)
class PS_AI_AGENT_ELEVENLABS_API UElevenLabsSettings : public UObject
{
GENERATED_BODY()
public:
UElevenLabsSettings(const FObjectInitializer& ObjectInitializer)
: Super(ObjectInitializer)
{
API_Key = TEXT("");
AgentID = TEXT("");
bSignedURLMode = false;
}
/**
* ElevenLabs API key.
* Obtain it from https://elevenlabs.io; it is used to authenticate WebSocket connections.
* Keep this secret; do not ship with the key hard-coded in a shipping build.
*/
UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API")
FString API_Key;
/**
* The default ElevenLabs Conversational Agent ID to use when none is specified
* on the component. Create agents at https://elevenlabs.io/app/conversational-ai
*/
UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API")
FString AgentID;
/**
* When true, the plugin fetches a signed WebSocket URL from your own backend
* before connecting, so the API key is never exposed in the client.
* Set SignedURLEndpoint to point to your server that returns the signed URL.
*/
UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API|Security")
bool bSignedURLMode;
/**
* Your backend endpoint that returns a signed WebSocket URL for ElevenLabs.
* Only used when bSignedURLMode = true.
* Expected response body: { "signed_url": "wss://..." }
*/
UPROPERTY(Config, EditAnywhere, Category = "ElevenLabs API|Security",
meta = (EditCondition = "bSignedURLMode"))
FString SignedURLEndpoint;
/**
* Override the ElevenLabs WebSocket base URL. Leave empty to use the default:
* wss://api.elevenlabs.io/v1/convai/conversation
*/
UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "ElevenLabs API")
FString CustomWebSocketURL;
/** Log verbose WebSocket messages to the Output Log (useful during development). */
UPROPERTY(Config, EditAnywhere, AdvancedDisplay, Category = "ElevenLabs API")
bool bVerboseLogging = false;
};
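For bSignedURLMode, the backend is expected to return `{ "signed_url": "wss://..." }`. A hypothetical extraction helper is sketched below; a real implementation would parse the body with the engine's Json module, and the naive find()-based scan here only illustrates the expected payload shape:

```cpp
#include <string>

// Pulls the "signed_url" string value out of a simple JSON response body.
// Returns an empty string if the key is missing or malformed.
std::string ExtractSignedUrl(const std::string& body)
{
    const std::string key = "\"signed_url\"";
    const size_t k = body.find(key);
    if (k == std::string::npos) return {};
    const size_t start = body.find('"', body.find(':', k + key.size()));
    if (start == std::string::npos) return {};
    const size_t end = body.find('"', start + 1);
    if (end == std::string::npos) return {};
    return body.substr(start + 1, end - start - 1);
}
```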
// ─────────────────────────────────────────────────────────────────────────────
// Module
// ─────────────────────────────────────────────────────────────────────────────
class PS_AI_AGENT_ELEVENLABS_API FPS_AI_Agent_ElevenLabsModule : public IModuleInterface
{
public:
/** IModuleInterface implementation */
virtual void StartupModule() override;
virtual void ShutdownModule() override;
virtual bool IsGameModule() const override { return true; }
/** Singleton access */
static inline FPS_AI_Agent_ElevenLabsModule& Get()
{
return FModuleManager::LoadModuleChecked<FPS_AI_Agent_ElevenLabsModule>("PS_AI_Agent_ElevenLabs");
}
static inline bool IsAvailable()
{
return FModuleManager::Get().IsModuleLoaded("PS_AI_Agent_ElevenLabs");
}
/** Access the settings object at runtime */
UElevenLabsSettings* GetSettings() const;
private:
UElevenLabsSettings* Settings = nullptr;
};