VoxtaAudioClient
The VoxtaAudioClient is a specialized client designed for high-throughput, low-latency binary audio streaming. While the main VoxtaClient handles text and control messages, the VoxtaAudioClient establishes a dedicated WebSocket for raw PCM data.
Typical Use Case
Use this client when building a voice-to-voice application that streams the user's microphone to Voxta for speech-to-text (STT) and receives the character's voice as a text-to-speech (TTS) audio stream.
Dedicated client for handling binary PCM audio streaming to/from Voxta.
Initialize the audio client.
Parameters:
- url (str) – The base URL of the Voxta server.
- logger (Optional[Logger], default: None) – Optional logger instance.
Functions
connect (async)
Connect to the audio stream WebSocket and authenticate.
Parameters:
- connection_token (str) – The SignalR connection token obtained from negotiation.
- cookies (Optional[dict[str, str]], default: None) – Optional cookies to include in the connection request.
send_audio (async)
Send binary PCM data to the server.
Parameters:
- pcm_data (bytes) – Binary PCM audio data.
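Microphone capture often arrives as one large buffer, while a streaming socket tends to work better with small, evenly sized frames. The following is a minimal sketch and not part of the documented API: `iter_pcm_chunks` and `send_in_chunks` are hypothetical helpers, and the 20 ms frame length and 16 kHz default rate are assumptions. Any coroutine that accepts bytes, such as `audio_client.send_audio`, could be passed as `send`.

```python
import asyncio
from typing import Awaitable, Callable, Iterator

BYTES_PER_SAMPLE = 2  # 16-bit signed mono PCM


def iter_pcm_chunks(pcm_data: bytes, sample_rate: int = 16000,
                    frame_ms: int = 20) -> Iterator[bytes]:
    """Yield frames of frame_ms milliseconds; the last frame may be shorter."""
    chunk_bytes = sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm_data), chunk_bytes):
        yield pcm_data[start:start + chunk_bytes]


async def send_in_chunks(send: Callable[[bytes], Awaitable[None]],
                         pcm_data: bytes, sample_rate: int = 16000,
                         frame_ms: int = 20) -> int:
    """Send pcm_data frame by frame and return the number of frames sent."""
    count = 0
    for chunk in iter_pcm_chunks(pcm_data, sample_rate, frame_ms):
        await send(chunk)
        count += 1
    return count
```

Inside an async function this might be called as `await send_in_chunks(audio_client.send_audio, microphone_bytes)`.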
on_audio
Register a callback for received audio data.
Parameters:
- callback (Callable[[bytes], None]) – Function that receives binary PCM data.
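Because the callback receives plain bytes, one option is to collect received audio in a buffer that playback code drains at its own pace. This is a minimal sketch under assumptions: `PcmBuffer` is a hypothetical helper, not part of the client, and the lock is there only in case the callback runs on a different thread than the consumer, which the documentation above does not specify.

```python
import threading


class PcmBuffer:
    """Accumulates PCM chunks; `push` has the Callable[[bytes], None] shape."""

    def __init__(self) -> None:
        self._data = bytearray()
        self._lock = threading.Lock()

    def push(self, pcm_data: bytes) -> None:
        """Append a received chunk to the end of the buffer."""
        with self._lock:
            self._data.extend(pcm_data)

    def pop(self, max_bytes: int) -> bytes:
        """Remove and return up to max_bytes from the front of the buffer."""
        with self._lock:
            chunk = bytes(self._data[:max_bytes])
            del self._data[:max_bytes]
            return chunk

    def __len__(self) -> int:
        with self._lock:
            return len(self._data)
```

Registration would then look like `audio_client.on_audio(buffer.push)`, with the playback loop calling `buffer.pop(...)` whenever it needs more samples.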
Audio Format
Currently, the audio client expects and delivers:
- Format: Raw PCM
- Sample Rate: Depends on server configuration (typically 16 kHz or 24 kHz)
- Channels: Mono
- Bit Depth: 16-bit signed integer
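Since the stream carries headerless PCM, anything that expects a regular audio file needs the format parameters supplied explicitly. The sketch below uses Python's standard `wave` module to wrap a buffer in a WAV container and to compute its duration. The helper names are hypothetical, and the 24 kHz default is an assumption that should be replaced with the rate your server is configured for.

```python
import io
import wave


def pcm_to_wav_bytes(pcm_data: bytes, sample_rate: int = 24000) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 16-bit samples are 2 bytes each
        wav_file.setframerate(sample_rate)  # must match the server's rate
        wav_file.writeframes(pcm_data)
    return buffer.getvalue()


def pcm_duration_seconds(pcm_data: bytes, sample_rate: int = 24000) -> float:
    """Duration of a 16-bit mono PCM buffer in seconds."""
    return len(pcm_data) / (2 * sample_rate)
```

Playing audio at the wrong sample rate changes its pitch and speed, so a mismatch between the assumed rate and the server's actual rate is usually audible right away.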
Implementation Example
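# VoxtaAudioClient must be imported from the Voxta client package
# (the import path depends on your installation).

async def stream_audio(token: str, microphone_bytes: bytes) -> None:
    audio_client = VoxtaAudioClient("http://localhost:5384")

    # Handle incoming audio from the AI
    @audio_client.on_audio
    def handle_tts(pcm_data: bytes) -> None:
        # Play the data via sounddevice or save to a buffer
        pass

    # Start connection (requires token from VoxtaClient.negotiate)
    await audio_client.connect(token)

    # Send user voice data
    await audio_client.send_audio(microphone_bytes)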