@avatara/avatar-stream (npm, latest version 1.5.0)

Avatara Avatar Stream SDK

Avatara official SDK for avatar streaming.

Overview

This library provides both client-side and server-side functionalities.

  • Client-Side Functionality: These features can be integrated directly into your React applications, enabling interactive avatar streaming within your user interface. They must be used exclusively on the client side because they rely on browser-specific APIs such as getUserMedia and AudioContext. In Next.js, you may want to use dynamic imports with ssr: false to avoid server-side rendering issues.

  • Server-Side Functionality: These features must be called from your backend, such as Next.js API routes or other server-side frameworks, so that sensitive operations like API key management and server-side data processing stay secure.
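To make the split concrete, here is a hedged sketch of the core logic of a token-issuing API route; issueTokens and fetchTokenFromAvatara are hypothetical names, with the upstream call stubbed (a real backend would use StreamAvatarServerHelper.getToken, documented below).

```typescript
// Hypothetical sketch of a token-issuing API route's server-side logic.
// The browser sends only a remote_id; the API key stays on the server.
type TokenResponse = { token: string; stream_token: string };

// Stand-in for a real upstream call (e.g. StreamAvatarServerHelper.getToken).
async function fetchTokenFromAvatara(
  remoteId: string,
  apiKey: string
): Promise<TokenResponse> {
  // Stubbed for illustration; a real implementation calls the Avatara API.
  return { token: `token-for-${remoteId}`, stream_token: `stream-for-${remoteId}` };
}

// Core of a Next.js API route: validate the body, call upstream with the
// server-held key, and return only the tokens to the client.
async function issueTokens(body: { remote_id?: string }): Promise<TokenResponse> {
  if (!body.remote_id) throw new Error('remote_id is required');
  const apiKey = 'server-side-secret'; // in practice: process.env.AVATARA_API_KEY
  return fetchTokenFromAvatara(body.remote_id, apiKey);
}
```

The client then fetches this route (as in the AvatarStreamer example below) and never sees the API key itself.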

Table of Contents

  • Installation
  • Usage
    • Client-Side
      • useAvatarStream
      • StreamPlayer
      • useRecordAndSTTForHoldTap
      • useRecordAndSTTForContinuous
      • useMicrophoneVisualization
    • Server-Side
      • StreamAvatarServerHelper
  • Types

Installation

npm install @avatara/avatar-stream

Usage

Client-Side

useAvatarStream

'use client';

import { useAvatarStream } from '@avatara/avatar-stream';
import type { Star } from '@/types/common';

const {
  speak,
  pause,
  resume,
  initPlayer,
  isSpeechPaused,
  isSpeakLoading,
  isAvatarTalking,
  streamRefs,
} = useAvatarStream({
  star: yourStarObject,
  onSpeechEnd: () => {
    console.log('Speech has ended.');
  },
  onError: (error) => {
    console.error('Avatar Stream Error:', error);
  },
  onPause: () => {
    console.log('Speech has been paused.');
  },
  onPlaying: () => {
    console.log('Speech is playing.');
  },
});

Description

useAvatarStream is a React hook designed to manage an avatar's audio and video streaming sessions. It provides functionalities to handle speech playback, control media streams, and manage the avatar's interaction states. The hook integrates with the StreamPlayer for audio handling and manages video sources for the avatar's visual representation.

Parameters
  • params: Props

    Configuration options for the avatar stream.

    • star: Star
      The star object containing information about the avatar, including media sources and interactive configurations. You can get this object from getStarByUID.

    • onError?: (error: Error) => void
      Callback invoked when an error occurs during the streaming session.

    • onPause?: () => void
      Callback invoked when speech playback is paused.

    • onPlaying?: () => void
      Callback invoked when speech playback starts or resumes.

    • onSpeechEnd?: () => void
      Callback invoked when speech playback ends.

Returns
  • UseAvatarStreamReturn

    An object containing methods and state variables to manage the avatar's streaming session.

    • speak: (audioResponse: Response) => Promise<void>
      Initiates speech playback using the provided audio response.

    • pause: () => void
      Pauses the current speech playback.

    • resume: () => void
      Resumes paused speech playback.

    • initPlayer: () => Promise<void>
      Initializes the speech player. You must pass this function to useRecordAndSTTForHoldTap or useRecordAndSTTForContinuous to enable speech playback.

    • isSpeechPaused: boolean
      Indicates whether the speech playback is currently paused.

    • isSpeakLoading: boolean
      Indicates whether a speak action is in progress.

    • isAvatarTalking: boolean
      Indicates whether the avatar is currently engaged in speech playback.

    • streamRefs
      References to the media elements used in the streaming session. You must pass this to the StreamPlayer component.

Example
'use client';

import React, { useEffect, useMemo, useRef, useState } from 'react';
import { StreamPlayer, useAvatarStream } from '@avatara/avatar-stream';
import type { Star } from '@/types/common';

// ContinuousSessionButton, HoldSpeakButton, ContinuousSessionButtonRef, and
// AuthResponseData are app-level definitions built on the recording hooks (not shown here).
const AvatarStreamer: React.FC<{
  star: Star;
  interaction: 'continuous' | 'hold';
  enableInterrupt: boolean;
}> = ({ star, interaction, enableInterrupt }) => {
  const [isSessionActive, setIsSessionActive] = useState(false);
  const [isLoading, setIsLoading] = useState(false);
  const sessionButtonRef = useRef<ContinuousSessionButtonRef>(null);

  const {
    streamRefs,
    isSpeakLoading,
    isAvatarTalking,
    pause,
    speak,
    initPlayer,
  } = useAvatarStream({
    star,
    onSpeechEnd: () => {
      sessionButtonRef.current?.resumeListening();
    },
    onError: (error) => console.error('Avatar Stream Error:', error),
    onPause: () => console.log('Speech has been paused.'),
    onPlaying: () => console.log('Speech is playing.'),
  });

  const handleEndSession = async () => {
    pause();
    setIsSessionActive(false);
  };

  const handleStartSession = async () => {
    setIsLoading(true);
    try {
      const response = await fetch('/api/access-token', {
        method: 'POST',
        body: JSON.stringify({ remote_id: 'maz-playground-remote-id' }),
      });

      if (!response.ok) {
        throw new Error(`Error: ${response.status} ${response.statusText}`);
      }

      const data: AuthResponseData = await response.json();

      if (!data.stream_token || !data.token) {
        throw new Error('Invalid token response');
      }

      localStorage.setItem('stream_token', data.stream_token);
      localStorage.setItem('token', data.token);

      const conversationResponse = await fetch('/api/conversation', {
        method: 'POST',
        headers: {
          Authorization: `Bearer ${data.token}`,
        },
        body: JSON.stringify({ star_uid: star.uid }),
      });

      if (!conversationResponse.ok) {
        throw new Error(
          `Error: ${conversationResponse.status} ${conversationResponse.statusText}`
        );
      }

      const conversationData = await conversationResponse.json();

      if (!conversationData || !conversationData.uid) {
        throw new Error('Invalid conversation data');
      }

      localStorage.setItem('conversation_uid', conversationData.uid);

      setIsSessionActive(true);
    } catch (error) {
      console.error(error);
    } finally {
      setIsLoading(false);
    }
  };

  const uploadAudioFn = async (audioBlob: Blob): Promise<string> => {
    setIsLoading(true);
    sessionButtonRef.current?.pauseListening();
    try {
      const conversation_uid = localStorage.getItem('conversation_uid');

      if (!conversation_uid) {
        throw new Error('Conversation UID not found');
      }
      const token = localStorage.getItem('token');

      if (!token) {
        throw new Error('Token not found');
      }

      const formData = new FormData();

      formData.append('conversation_uid', conversation_uid || '');
      formData.append('audio', audioBlob);

      const response = await fetch('/api/sts', {
        method: 'POST',
        body: formData,
        headers: {
          Authorization: `Bearer ${token}`,
        },
      });

      if (!response.ok) {
        throw new Error('Failed to upload audio');
      }

      await speak(response);

      return '';
    } catch (error) {
      sessionButtonRef.current?.resumeListening();
      console.error('Error during STS request:', error);
      throw new Error('Failed to get transcription');
    } finally {
      setIsLoading(false);
      if (enableInterrupt) {
        sessionButtonRef.current?.resumeListening();
      }
    }
  };

  const disabled = useMemo(() => {
    const _disabled = isLoading;
    return enableInterrupt ? _disabled : _disabled || isSpeakLoading;
  }, [enableInterrupt, isLoading, isSpeakLoading]);

  useEffect(() => {
    handleEndSession();
  }, [enableInterrupt, interaction]);

  return (
    <div className="relative flex h-full w-full flex-col items-center">
      <div className="relative">
        <StreamPlayer
          containerStyle={{
            height: '640px',
            width: '360px',
            backgroundColor: 'white',
          }}
          isTalking={isAvatarTalking}
          streamRefs={streamRefs}
        />
        {interaction === 'continuous' ? (
          <ContinuousSessionButton
            ref={sessionButtonRef}
            endSession={handleEndSession}
            initPlayer={initPlayer}
            isLoading={isLoading || isSpeakLoading}
            isSessionActive={isSessionActive}
            startSession={handleStartSession}
            uploadAudioFn={uploadAudioFn}
          />
        ) : (
          <HoldSpeakButton
            disabled={disabled}
            endSession={handleEndSession}
            initPlayer={initPlayer}
            isLoading={isLoading || isSpeakLoading}
            isSessionActive={isSessionActive}
            startSession={handleStartSession}
            uploadAudioFn={uploadAudioFn}
          />
        )}
      </div>
    </div>
  );
};

export default AvatarStreamer;

StreamPlayer

Description

StreamPlayer is a React component responsible for rendering the avatar's audio and video streams.

Parameters
  • props: Props

    Configuration options for the StreamPlayer.

    • isTalking: boolean
      Determines whether the avatar is currently talking. This controls the visibility and layering of the active and idle video streams.

    • streamRefs
      References to the media elements used in the streaming session. These references are provided by the useAvatarStream hook.

    • containerStyle?: CSSProperties
      Inline styles applied to the container div. Defaults to { width: '100%', height: '100%' }.

    • containerClassName?: string
      CSS class names applied to the container div.

    • videoStyle?: CSSProperties
      Inline styles applied to video element.

    • videoClassName?: string
      CSS class names applied to video element.

    • onLoadStart?: () => void
      Callback invoked when the active video starts loading. Useful for showing loading indicators.

    • onLoadedData?: () => void
      Callback invoked when the active video has loaded data. Useful for handling video playback after loading is complete.

Example
'use client';

import React from 'react';
import { StreamPlayer, useAvatarStream } from '@avatara/avatar-stream';
import type { Star } from '@/types/common';

const ExampleStreamPlayer: React.FC<{ star: Star }> = ({ star }) => {
  const { streamRefs, isAvatarTalking, initPlayer } = useAvatarStream({
    star,
    onSpeechEnd: () => console.log('Speech ended.'),
    onError: (error) => console.error('Error:', error),
  });

  return (
    <>
      <StreamPlayer
        isTalking={isAvatarTalking}
        streamRefs={streamRefs}
        containerStyle={{ width: '640px', height: '480px' }}
        onLoadStart={() => console.log('Video loading started.')}
        onLoadedData={() => console.log('Video loaded.')}
      />
      {/* ContinuousSessionButton is an app-level component; pass the initPlayer
          function from useAvatarStream so it can enable speech playback. */}
      <ContinuousSessionButton
        initPlayer={initPlayer}
        // ref={sessionButtonRef}
        // endSession={handleEndSession}
        // isLoading={isLoading || isSpeakLoading}
        // isSessionActive={isSessionActive}
        // startSession={handleStartSession}
        // uploadAudioFn={uploadAudioFn}
      />
    </>
  );
};

export default ExampleStreamPlayer;

useRecordAndSTTForHoldTap

import { useRecordAndSTTForHoldTap } from '@avatara/avatar-stream';

const {
  isRecording,
  transcript,
  amplitude,
  start,
  stop,
  handlePressDown,
  handlePressUp,
  handleTap,
} = useRecordAndSTTForHoldTap({
  uploadAudioFn: async (audioBlob) => {
    // Upload audioBlob to your STT server and return the transcript
    return 'transcript';
  },
  onTranscription: (text) => console.log('Transcription:', text),
  onError: (err) => console.error('Error:', err),
  scenario: 'hold',
});

Description

useRecordAndSTTForHoldTap is a React hook that manages audio recording and speech-to-text (STT) conversion based on user interactions. It supports both "hold-to-talk" and "tap-to-talk" scenarios, providing amplitude monitoring for visual feedback during recording.

  • "hold" Scenario: Starts recording when the user presses down and stops (and uploads) when the user releases.
  • "tap" Scenario: Toggles between recording and not recording each time the user taps a button, uploading the audio once recording stops.

Additionally, it monitors the audio amplitude to provide visual feedback, enhancing the user experience during interactions.
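As a rough model (not the library's internals), the "tap" scenario is a toggle whose second tap triggers the upload; tapTransition and TapState are illustrative names:

```typescript
// Illustrative model of the "tap" scenario only; the real hook also manages
// MediaRecorder, amplitude monitoring, and the STT upload.
type TapState = { isRecording: boolean; uploads: number };

function tapTransition(state: TapState): TapState {
  return state.isRecording
    ? { isRecording: false, uploads: state.uploads + 1 } // second tap: stop and upload
    : { ...state, isRecording: true };                   // first tap: start recording
}
```

Two consecutive taps take the state from idle to recording and back, triggering exactly one upload.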

Parameters
  • params: UseRecordAndSTTForHoldTapParams

    Configuration options for the hook.

    • uploadAudioFn: (audioBlob: Blob) => Promise<string>
      A function that receives an audio Blob and returns the recognized text (STT). This function should handle uploading the audio to your STT service and return the transcribed text.

    • onTranscription?: (text: string) => void
      Callback invoked when transcription text is received from the STT service.

    • onError?: (err: Error) => void
      Callback invoked if there's an error (e.g., microphone permissions denied, STT upload issues).

    • scenario: 'hold' | 'tap'
      Determines the interaction pattern for recording:

      • 'hold': Recording starts on press down and stops on press up.
      • 'tap': Recording toggles on each tap.
Returns
  • UseRecordAndSttReturn

    An object containing the recording state, transcript, amplitude, and interaction handlers.

    • isRecording: boolean
      Indicates if recording is currently active.

    • transcript: string
      The transcribed text from the recording.

    • amplitude: number
      The amplitude of the audio signal, normalized between 0 and 1. Useful for visualizing audio levels.

    • start: () => void
      Function to manually start recording (useful for "tap" or continuous scenarios).

    • stop: () => void
      Function to manually stop recording.

    • handlePressDown: () => void
      Function to handle press down events (used in the "hold" scenario).

    • handlePressUp: () => void
      Function to handle press up events (used in the "hold" scenario).

    • handleTap: () => void
      Function to handle tap events (used in the "tap" scenario).

Example
import React from 'react';
import { useRecordAndSTTForHoldTap } from '@avatara/avatar-stream';

const RecordAndTranscribeComponent = () => {
  const {
    isRecording,
    transcript,
    amplitude,
    start,
    stop,
    handlePressDown,
    handlePressUp,
    handleTap,
  } = useRecordAndSTTForHoldTap({
    uploadAudioFn: async (audioBlob) => {
      // Replace with your STT service upload logic
      const formData = new FormData();
      formData.append('file', audioBlob, 'recording.webm');

      const response = await fetch('/api/stt', {
        method: 'POST',
        body: formData,
      });

      if (!response.ok) {
        throw new Error('Failed to upload audio for transcription');
      }

      const data = await response.json();
      return data.transcript;
    },
    onTranscription: (text) => {
      console.log('Transcription received:', text);
    },
    onError: (err) => {
      console.error('Recording error:', err);
    },
    scenario: 'hold', // Change to 'tap' for tap-to-talk
  });

  return (
    <div>
      {/* Note: do not set disabled={isRecording} here; a disabled button
          would swallow the mouse-up event and recording would never stop. */}
      <button
        onMouseDown={handlePressDown}
        onMouseUp={handlePressUp}
        onTouchStart={handlePressDown}
        onTouchEnd={handlePressUp}
      >
        {isRecording ? 'Recording...' : 'Hold to Talk'}
      </button>

      {/* For 'tap' scenario, use the handleTap function */}
      {/* 
      <button onClick={handleTap}>
        {isRecording ? 'Stop Recording' : 'Tap to Talk'}
      </button>
      */}

      <div>
        <p>Amplitude: {(amplitude * 100).toFixed(2)}%</p>
        <p>Transcript: {transcript}</p>
      </div>
    </div>
  );
};

export default RecordAndTranscribeComponent;

In the above example:

  • Hold Scenario: The user presses and holds the button to start recording and releases to stop. The amplitude is visualized, and upon stopping, the audio is uploaded for transcription.

  • Tap Scenario: Uncomment the tap button and comment out the hold button to switch to the tap-to-talk interaction.

useRecordAndSTTForContinuous

import { useRecordAndSTTForContinuous } from '@avatara/avatar-stream';

const {
  userSpeaking,
  transcript,
  amplitude,
  start,
  pause,
  stop,
  resetTranscript,
  pauseAmplitude,
  resumeAmplitude,
  pauseListening,
  resumeListening,
} = useRecordAndSTTForContinuous({
  uploadAudioFn: async (audioBlob) => {
    // Replace with your STT service upload logic
    const formData = new FormData();
    formData.append('file', audioBlob, 'recording.wav');

    const response = await fetch('/api/stt', {
      method: 'POST',
      body: formData,
    });

    if (!response.ok) {
      throw new Error('Failed to upload audio for transcription');
    }

    const data = await response.json();
    return data.transcript;
  },
  onTranscription: (text) => console.log('Transcription:', text),
  onError: (err) => console.error('Error:', err),
  startOnLoad: true,
});

Description

useRecordAndSTTForContinuous is a React hook that facilitates continuous audio recording and speech-to-text (STT) conversion. It leverages voice activity detection (VAD) to automatically detect when the user starts and stops speaking, enabling seamless and uninterrupted interactions.

This hook is ideal for scenarios where continuous listening and real-time transcription are required, such as live chat interfaces, voice-controlled applications, or interactive avatar interactions.
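The hook's VAD implementation is not documented here, but the general idea can be sketched as an energy threshold plus a few "hangover" frames of tolerated silence before a stop is emitted; detectSpeech, threshold, and hangoverFrames are illustrative names:

```typescript
// Illustrative energy-threshold VAD: emits a 'start' event when amplitude
// crosses the threshold, and a 'stop' only after hangoverFrames consecutive
// quiet frames (so brief pauses do not cut the utterance).
function detectSpeech(
  amplitudes: number[],   // per-frame amplitude in [0, 1]
  threshold = 0.1,
  hangoverFrames = 3
): Array<'start' | 'stop'> {
  const events: Array<'start' | 'stop'> = [];
  let speaking = false;
  let silentFrames = 0;
  for (const a of amplitudes) {
    if (a >= threshold) {
      silentFrames = 0;
      if (!speaking) {
        speaking = true;
        events.push('start');
      }
    } else if (speaking && ++silentFrames >= hangoverFrames) {
      speaking = false;
      events.push('stop');
    }
  }
  return events;
}
```

In the real hook, a 'stop' event is where the recorded audio would be handed to uploadAudioFn.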

Parameters
  • params: UseRecordAndSTTForContinuousParams

    Configuration options for the hook.

    • uploadAudioFn: (audioBlob: Blob) => Promise<string>
      A function that receives an audio Blob and returns the recognized text (STT). This function should handle uploading the audio to your STT service and return the transcribed text.

    • onTranscription?: (text: string) => void
      Callback invoked when partial or final transcription text is received from the STT service.

    • onError?: (error: Error) => void
      Callback invoked if there's an error (e.g., microphone access issues, VAD errors, STT upload failures).

    • startOnLoad?: boolean
      If true, the microphone and VAD start automatically when the component mounts.
      Default: false

Returns
  • UseRecordAndSttReturn

    An object containing the recording state, transcript, amplitude, and interaction handlers.

    • userSpeaking: boolean
      Indicates if the user is currently speaking, as detected by VAD.

    • transcript: string
      The cumulative transcribed text from the recording.

    • amplitude: number
      The amplitude of the audio signal, normalized between 0 and 1. Useful for visualizing audio levels.

    • start: () => void
      Function to manually start continuous recording and listening.

    • pause: () => void
      Function to pause the continuous recording and listening without stopping.

    • stop: () => void
      Function to completely stop recording and listening, releasing all resources.

    • resetTranscript: () => void
      Function to reset the accumulated transcript to an empty string.

    • pauseAmplitude: () => void
      Function to pause the amplitude monitoring.

    • resumeAmplitude: () => void
      Function to resume the amplitude monitoring.

    • pauseListening: () => void
      Function to pause both audio amplitude monitoring and VAD-based listening.

    • resumeListening: () => void
      Function to resume both audio amplitude monitoring and VAD-based listening.

Example
import React from 'react';
import { useRecordAndSTTForContinuous } from '@avatara/avatar-stream';

const ContinuousRecordingComponent = () => {
  const {
    userSpeaking,
    transcript,
    amplitude,
    start,
    pause,
    stop,
    resetTranscript,
    pauseAmplitude,
    resumeAmplitude,
    pauseListening,
    resumeListening,
  } = useRecordAndSTTForContinuous({
    uploadAudioFn: async (audioBlob) => {
      // Replace with your STT service upload logic
      const formData = new FormData();
      formData.append('file', audioBlob, 'recording.wav');

      const response = await fetch('/api/stt', {
        method: 'POST',
        body: formData,
      });

      if (!response.ok) {
        throw new Error('Failed to upload audio for transcription');
      }

      const data = await response.json();
      return data.transcript;
    },
    onTranscription: (text) => {
      console.log('Transcription received:', text);
    },
    onError: (err) => {
      console.error('Recording error:', err);
    },
    startOnLoad: true, // Automatically start listening on mount
  });

  return (
    <div>
      <p>User is {userSpeaking ? 'speaking' : 'silent'}.</p>
      <p>Amplitude: {(amplitude * 100).toFixed(2)}%</p>
      <p>Transcript: {transcript}</p>

      <button onClick={start}>Start Listening</button>
      <button onClick={pause}>Pause Listening</button>
      <button onClick={resumeListening}>Resume Listening</button>
      <button onClick={stop}>Stop Listening</button>
      <button onClick={resetTranscript}>Reset Transcript</button>
    </div>
  );
};

export default ContinuousRecordingComponent;

In the above example:

  • Automatic Start: Since startOnLoad is set to true, the hook begins listening and recording as soon as the component mounts.

  • Visual Feedback: The amplitude is visualized to provide real-time feedback on the audio levels, and the userSpeaking state indicates whether the user is currently speaking.

  • Transcript Management: The transcript accumulates recognized text, which can be reset using the resetTranscript function.

  • Control Buttons: Users can manually start, pause, resume, and stop the listening process, offering flexibility in managing interactions.

useMicrophoneVisualization

import { useMicrophoneVisualization } from '@avatara/avatar-stream';

const {
  amplitude,
  startVisualization,
  stopVisualization,
  pauseVisualization,
  resumeVisualization,
  isPaused,
} = useMicrophoneVisualization();

Description

useMicrophoneVisualization is a React hook that provides real-time visualization of the user's microphone input by calculating and exposing the audio amplitude. This hook is particularly useful for creating visual feedback components such as audio level meters or animations that respond to the user's voice activity.

The hook manages microphone access, sets up the audio context, and continuously updates the amplitude based on incoming audio data. It also offers control functions to start, stop, pause, and resume the visualization, allowing developers to manage the visualization state as needed.
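The hook's exact amplitude math is not documented, but a 0–1 normalized amplitude is commonly an RMS over time-domain samples; a minimal sketch, assuming byte samples as produced by AnalyserNode.getByteTimeDomainData (0..255, with 128 as silence):

```typescript
// Illustrative RMS amplitude from byte time-domain samples (0..255, 128 = silence).
// Not necessarily the library's implementation.
function computeAmplitude(samples: Uint8Array): number {
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    const centered = (samples[i] - 128) / 128; // map 0..255 to roughly -1..1
    sumSquares += centered * centered;
  }
  return Math.sqrt(sumSquares / samples.length); // RMS in [0, 1]
}
```

Pure silence (all samples at 128) yields 0, and a full-scale signal approaches 1, which is what makes the value directly usable for level meters.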

Parameters

This hook does not accept any parameters.

Returns
  • UseMicrophoneVisualizationReturn

    An object containing the current amplitude, control functions for the visualization, and the paused state.

    • amplitude: number
      The current amplitude of the audio signal, normalized between 0 and 1. This value can be used to drive visual elements that respond to audio levels.

    • startVisualization: () => void
      Initiates the microphone visualization by accessing the user's microphone and starting the amplitude calculation.

    • stopVisualization: () => void
      Terminates the microphone visualization by stopping the audio processing and releasing microphone resources.

    • pauseVisualization: () => void
      Pauses the amplitude updates without releasing the microphone resources. Useful for temporarily halting visualization while maintaining the current state.

    • resumeVisualization: () => void
      Resumes the amplitude updates after a pause.

    • isPaused: boolean
      Indicates whether the visualization is currently paused.

Example
import React, { useEffect } from 'react';
import { useMicrophoneVisualization } from '@avatara/avatar-stream';

const MicrophoneVisualizer = () => {
  const {
    amplitude,
    startVisualization,
    stopVisualization,
    pauseVisualization,
    resumeVisualization,
    isPaused,
  } = useMicrophoneVisualization();

  useEffect(() => {
    // Start the visualization when the component mounts
    startVisualization();

    // Cleanup: Stop the visualization when the component unmounts
    return () => {
      stopVisualization();
    };
    // eslint-disable-next-line react-hooks/exhaustive-deps
  }, []);

  const handlePauseResume = () => {
    if (isPaused) {
      resumeVisualization();
    } else {
      pauseVisualization();
    }
  };

  const handleStop = () => {
    stopVisualization();
  };

  return (
    <div>
      <div style={{ display: 'flex', alignItems: 'center', marginBottom: '10px' }}>
        <div
          style={{
            width: '100px',
            height: '20px',
            backgroundColor: '#ddd',
            marginRight: '10px',
            position: 'relative',
          }}
        >
          <div
            style={{
              width: `${amplitude * 100}%`,
              height: '100%',
              backgroundColor: '#4caf50',
              transition: 'width 0.1s ease-out',
            }}
          />
        </div>
        <span>{(amplitude * 100).toFixed(2)}%</span>
      </div>

      <button onClick={handlePauseResume} style={{ marginRight: '10px' }}>
        {isPaused ? 'Resume Visualization' : 'Pause Visualization'}
      </button>
      <button onClick={handleStop}>Stop Visualization</button>
    </div>
  );
};

export default MicrophoneVisualizer;

Additional Notes
  • Permissions: When startVisualization is invoked, the browser will prompt the user for microphone access. Ensure that your application handles permission denials gracefully.

  • Performance: The hook uses requestAnimationFrame for efficient amplitude updates, ensuring smooth visual feedback without unnecessary performance overhead.

  • Error Handling: Errors during microphone access or audio processing are logged to the console, and an alert is shown to inform the user. You may customize this behavior based on your application's requirements.

Server-Side

StreamAvatarServerHelper

import { StreamAvatarServerHelper } from '@avatara/avatar-stream';

const helper = new StreamAvatarServerHelper({
  apikey: 'your-api-key',
});

const token = await helper.getToken({
  remote_id: 'user-id',
  name: 'User Name',
});

const stars = await helper.getStars({ page: 1, limit: 10 });
const star = await helper.getStarByUID('star-uid');
const conversationUID = await helper.getConversationUID('star-uid', 'token');
const chatReply = await helper.getChatReply(
  { conversation_uid: 'conversation-uid', message: 'Hello' },
  'token'
);
const transcript = await helper.stt({ audio_file: new ArrayBuffer(0) }, 'token');

Public Methods

Note: All of these methods should be called on the server side to avoid exposing the API key.
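Since the constructor takes the raw apikey, a common pattern is to read it from server-only configuration; a small hedged sketch (AVATARA_API_KEY is an assumed variable name, not one mandated by the SDK):

```typescript
// Read the Avatara API key from server-side configuration only.
// AVATARA_API_KEY is an assumed name; adapt it to your deployment.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.AVATARA_API_KEY;
  if (!key) throw new Error('AVATARA_API_KEY is not set');
  return key;
}

// Usage (server side only):
// const helper = new StreamAvatarServerHelper({ apikey: requireApiKey(process.env) });
```

Failing fast on a missing key surfaces misconfiguration at startup instead of as opaque 401s from the API.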

getToken(request: AuthRequest): Promise<AuthResponseData>

Description
Obtains an authentication token (and optionally a stream token) for a user.

  • Parameters:

    • request: AuthRequest
      • remote_id (string): A unique identifier for the user on your system.
      • name (string, optional): The user’s full name.
      • dob (string, optional): Date of birth in YYYY-MM-DD format.
      • gender (string, optional): The user’s gender.
      • nickname (string, optional): The user’s nickname.
  • Returns:
    A Promise<AuthResponseData> that resolves with authentication data, including a token and stream_token.

  • Throws:

    • Error if the request fails or if the server returns a non-OK status or an error status.

Example:

const tokenData = await helper.getToken({
  remote_id: 'unique-user-id',
  name: 'Jane Doe',
});
console.log(tokenData.token); // Bearer token

getStars(param: GetStarsParam): Promise<StarBase[]>

Description
Retrieves a paginated list of available Stars (characters) from the server. No Bearer token required, but the API key is needed.

  • Parameters:

    • param: GetStarsParam
      • page (number, optional): Page number to fetch (default is 1).
      • limit (number, optional): Number of items per page (default is 10).
  • Returns:
    A Promise<StarBase[]> that resolves with an array of StarBase objects.

  • Throws:

    • Error if the request fails or if the server returns a non-OK status or an error status.

Example:

const stars = await helper.getStars({ page: 1, limit: 10 });
console.log(stars);

getStarByUID(uid: string): Promise<Star>

Description
Retrieves detailed information for a specific Star. This includes interactive_config if an associated interactive character exists.

  • Parameters:

    • uid: string — The unique identifier of the Star.
  • Returns:
    A Promise<Star> that resolves with the full Star object, including interactive_config if applicable.

  • Throws:

    • Error if the request fails or if the server returns a non-OK status or an error status.

Example:

const starDetail = await helper.getStarByUID('star-uid');
console.log(starDetail.interactive_config);

getConversationUID(starUID: string, token: string): Promise<string>

Description
Retrieves or creates a conversation for a given Star and returns its UID. If no existing conversation is found, a new one is created.

  • Parameters:

    • starUID: string: The unique identifier of the Star.
    • token: string: A Bearer token (from getToken).
  • Returns:
    A Promise<string> that resolves with the conversation UID.

  • Throws:

    • Error if the request fails or if the server returns a non-OK status or an error status.

Example:

const conversationUID = await helper.getConversationUID('star-uid', tokenData.token);
console.log(conversationUID);

getChatReply(param: ChatReplyParam, token: string): Promise<string>

Description
Sends a message to the Star within a conversation and retrieves the AI-generated reply.

  • Parameters:

    • param: ChatReplyParam
      • conversation_uid (string): The conversation UID.
      • message (string): The user’s message.
    • token: string: A Bearer token (from getToken).
  • Returns:
    A Promise<string> that resolves with the Star’s chat reply.

  • Throws:

    • Error if the request fails or if the server returns a non-OK status or an error status.

Example:

const reply = await helper.getChatReply(
  { conversation_uid: conversationUID, message: 'Hello!' },
  tokenData.token
);
console.log(reply);

stt(param: STTParam, token: string): Promise<string>

Description
Uploads audio data for speech-to-text (STT) processing and returns the transcribed text.

  • Parameters:

    • param: STTParam
      • audio_file (ArrayBuffer): The raw audio data to transcribe.
      • language (string, optional): Language code (e.g., 'en', 'ja') for transcription.
    • token: string: A Bearer token (from getToken).
  • Returns:
    A Promise<string> that resolves with the transcribed text from the audio file.

  • Throws:

    • Error if the request fails or if the server returns a non-OK status or an error status.

Example:

const transcript = await helper.stt({ audio_file: myAudioBuffer }, tokenData.token);
console.log(transcript);

sts(form: FormData, token: string): Promise<STSResponse>

Description
Uploads form data for speech-to-speech (STS) processing and returns the processed audio stream.

  • Parameters:

    • form: FormData
      • Must include:
        • conversation_uid (string): The unique identifier for the conversation.
        • audio (Blob): The audio data to be processed.
    • token: string: A Bearer token (from getToken).
  • Returns:
    A Promise<STSResponse> that resolves with the processed audio stream from the STS service.

  • Throws:

    • Error if the request fails, if the server returns a non-OK status, or if an error occurs during the fetch operation.

Example:

const formData = new FormData();
formData.append('conversation_uid', conversationUID);
formData.append('audio', audioBlob); // audioBlob is a Blob representing the audio data

try {
  const stsResponse = await helper.sts(formData, tokenData.token);
  const audioStream = stsResponse; // Handle the audio stream as needed
  // For example, you might play the audio or process it further
} catch (error) {
  console.error('STS processing failed:', error);
}

Types

There are two sets of types: one for the client side and another for the server side. You can import the relevant types directly from the package:

import type { StreamingEventsCallbacks } from '@avatara/avatar-stream';
import type { AuthRequest } from '@avatara/avatar-stream/server';

Keywords

avatar

Package last updated on 19 Mar 2025