Web SDK

Integrate Millis AI’s voice agent capabilities directly into your web applications and browser extensions.

Installation

Install the SDK with npm:

npm install @millisai/web-sdk

Usage

Here’s how to quickly set up a voice agent in your web application:

1. Import the SDK:

import Millis from '@millisai/web-sdk';

2. Initialize the Client:

const msClient = Millis.createClient({publicKey: 'your_public_key', endPoint?: 'region-based-endpoint'});

Obtain your public key from your

Learn more about which endPoint to use .

3. Start a Conversation:

Starting from version 1.0.15, use the following format to initiate a call:

msClient.start({
  agent: {
    agent_id: agentId,          // Optionally pass agent_id
    agent_config: {}            // Optionally pass agent_config
  },
  metadata: {},                 // Optional metadata for personalized context
  include_metadata_in_prompt: true/false,  // Optional flag to include metadata in prompt
  session_continuation: { // Optional session ID for continuation
    session_id: "<previous session id>"
  }
});

msClient.start({
  agent: {
    agent_id: <agent-id>
  }
});

Replace agent-id with the ID of your agent obtained from the Playground.

The metadata is optional. You can pass any additional data to the session, which we will forward to your custom LLM and function webhooks. If you provide metadata, you can make it available to the agent by setting include_metadata_in_prompt to true. This will include the metadata in the agent’s system prompt, allowing the agent to use the data during the conversation.

You can also dynamically create a temporary voice agent with custom configurations using the code below:

msClient.start({
  agent: {
    agent_config: {
      prompt: "You're a helpful assistant.", // Example prompt
      voice: {
        provider: "elevenlabs", // Voice provider
        voice_id: "voice-id" // Replace 'voice-id' with the ID of the desired voice
      },
      language: "<language_code>", // optional - use language code such as en, es
      tools: [
        {
          name: "get_user_data",
          description: "",
          webhook: "https://...",
          header: {
            "Content-Type": "application/json",
            "Authorization": ""
          },
          params: [
            {
              name: "",
              type: "string" | "number" | "boolean",
              description: "",
              required: true
            }
          ]
        }
      ], // Replace with actual function calls you need
      custom_llm_websocket: "wss://...", // optional - enable custom llm
      llm: "", // optional - choose llm model. Ex: gpt-4o, llama-3-70b
    }
  }
});

When both agent_id and agent_config are provided, the session will use the configuration associated with agent_id but will override it with any settings provided in agent_config. This option allows for minor modifications to the agent’s default configuration on a per-session basis.

Metadata

metadata: Optional field to pass any additional information that may personalize the conversation. It can be used by the agent if include_metadata_in_prompt is set to true.

Including Metadata in Prompt

include_metadata_in_prompt: Boolean flag (true or false). If true, the metadata provided will be included in the prompt to give context to the agent.

Session Continuation

session_continuation: Provide session_id from a previous session to enable continuity in conversation. This allows the agent to reference previous interactions.

msClient.stop();

    msClient.on("onopen", () => {
      // When the client connected to the server
    });

    msClient.on("onready", () => {
      // When the conversation is ready
    });

    msClient.on("onaudio", (audio: Uint8Array) => {
      // Incoming audio chunk
    });

    msClient.on("analyzer", (analyzer: AnalyserNode) => {
      // AnalyserNode that you can use for audio animation
    });

    msClient.on("onclose", (event) => {
      // When the connection is closed
    });

    msClient.on("onerror", (error) => {
      // An error occurred
    });

Description: Emitted when the WebSocket connection is successfully opened.

Callback Signature:

'onopen': () => void;

Description: Emitted when the client is ready to start processing audio or other tasks.

Callback Signature:

'onready': () => void;

Description: Emitted when a session has ended.

Callback Signature:

'onsessionended': () => void;

Description: Emitted when audio data is received.

Callback Signature:

'onaudio': (audio: Uint8Array) => void;

Parameters:

audio - The received audio data in Uint8Array format.

Description: Emitted when agent’s response text is received.

Callback Signature:

'onresponsetext': (text: string, payload: { is_final?: boolean }) => void;

Parameters:

text - The received response text.
payload - An object containing additional information.
- is_final (optional) - A boolean indicating if the response text is final.

Description: Emitted when user’s transcript text is received.

Callback Signature:

'ontranscript': (text: string, payload: { is_final?: boolean }) => void;

Parameters:

text - The received transcript text.
payload - An object containing additional information.
- is_final (optional) - A boolean indicating if the transcript text is final.

Description: Emitted when agent triggered a function call.

Callback Signature:

'onfunction': (text: string, payload: { name: string, params: object }) => void;

Parameters:

text - Empty.
payload - Information about the triggered function.
- name - The function name.
- params - The params being used in the function call.

Description: Emitted with an AnalyserNode for agent’s audio analysis.

Callback Signature:

'analyzer': (analyzer: AnalyserNode) => void;

Parameters:

analyzer - The AnalyserNode used for audio analysis.

Description: Emitted when user audio is ready for processing.

Callback Signature:

'useraudioready': (data: { analyser: AnalyserNode, stream: MediaStream }) => void;

Parameters:

data - An object containing audio-related information.
- analyser - The AnalyserNode for user’s audio analysis.
- stream - The MediaStream containing the user’s audio data.

Description: Emitted to report latency information for debugging purpose.

Callback Signature:

'onlatency': (latency: number) => void;

Parameters:

latency - The measured latency in milliseconds.

Description: Emitted when the WebSocket connection is closed.

Callback Signature:

'onclose': (event: CloseEvent) => void;

Parameters:

event - The CloseEvent containing details about the WebSocket closure.

Description: Emitted when an error occurs in the WebSocket connection.

Callback Signature:

'onerror': (error: Event) => void;

Parameters:

error - The Event containing details about the error.

Here’s an example of how to listen to these events in the Client class:

const client = new Client(config);

client.on('onopen', () => {
  console.log('WebSocket connection opened.');
});

client.on('onready', () => {
  console.log('Client is ready.');
});

client.on('onsessionended', () => {
  console.log('Session ended.');
});

client.on('onaudio', (audio) => {
  console.log('Audio received:', audio);
});

client.on('onresponsetext', (text, payload) => {
  console.log('Response text:', text, 'Payload:', payload);
});

client.on('ontranscript', (text, payload) => {
  console.log('Transcript:', text, 'Payload:', payload);
});

client.on('analyzer', (analyzer) => {
  console.log('Analyzer node:', analyzer);
});

client.on('useraudioready', (data) => {
  console.log('User audio ready:', data);
});

client.on('onlatency', (latency) => {
  console.log('Latency:', latency);
});

client.on('onclose', (event) => {
  console.log('WebSocket connection closed:', event);
});

client.on('onerror', (error) => {
  console.error('WebSocket error:', error);
});

PreviousRunning Voice Agents in Different Regions NextCustom LLM

Last updated 3 months ago

Installation

Usage

1. Import the SDK:

2. Initialize the Client:

3. Start a Conversation:

Installation

Usage

1. Import the SDK:

2. Initialize the Client:

3. Start a Conversation:

4. Stop a Conversation:

5. Setup event listener:

Event List

onopen

onready

onsessionended

onaudio

onresponsetext

ontranscript

onfunction

analyzer

useraudioready

onlatency

onclose

onerror

Example Usage

Support

4. Stop a Conversation:

5. Setup event listener:

Event List

onopen

onready

onsessionended

onaudio

onresponsetext

ontranscript

onfunction

analyzer

useraudioready

onlatency

onclose

onerror

Example Usage

Support

`onopen`

`onready`

`onsessionended`

`onaudio`

`onresponsetext`

`ontranscript`

`onfunction`

`analyzer`

`useraudioready`

`onlatency`

`onclose`

`onerror`

`onopen`

`onready`

`onsessionended`

`onaudio`

`onresponsetext`

`ontranscript`

`onfunction`

`analyzer`

`useraudioready`

`onlatency`

`onclose`

`onerror`