API Reference
MicVAD
The MicVAD
API is for recording user audio in the browser and running callbacks on speech segments and related events.
Support
Package |
Supported |
@ricky0123/vad-web |
Yes |
@ricky0123/vad-node |
No |
@ricky0123/vad-react |
No, use the useMicVAD hook |
Example
| import { MicVAD } from "@ricky0123/vad-web"
const myvad = await MicVAD.new({
onSpeechEnd: (audio) => {
// do something with `audio` (Float32Array of audio samples at sample rate 16000)...
},
})
myvad.start()
|
Options
New instances of MicVAD
are created by calling the async static method MicVAD.new(options)
. The options object can contain the following fields (all are optional).
Option |
Type |
Description |
additionalAudioConstraints |
|
constraints to pass to getUserMedia via the audio field |
onFrameProcessed |
(probabilities: {isSpeech: float; notSpeech: float}) => any |
Callback to run after each frame. |
onVADMisfire |
() => any |
Callback to run if speech start was detected but onSpeechEnd will not be run because the audio segment is smaller than minSpeechFrames |
onSpeechStart |
() => any |
Callback to run when speech start is detected |
onSpeechEnd |
(audio: Float32Array) => any |
Callback to run when speech end is detected. Takes as arg a Float32Array of audio samples between -1 and 1, sample rate 16000. This will not run if the audio segment is smaller than minSpeechFrames |
positiveSpeechThreshold |
number |
see algorithm configuration |
negativeSpeechThreshold |
number |
see algorithm configuration |
redemptionFrames |
number |
see algorithm configuration |
frameSamples |
number |
see algorithm configuration |
preSpeechPadFrames |
number |
see algorithm configuration |
minSpeechFrames |
number |
see algorithm configuration |
Attributes
Attributes |
Type |
Description |
listening |
boolean |
Is the VAD listening to mic input or is it paused? |
pause |
() => void |
Stop listening to mic input |
start |
() => void |
Start listening to mic input |
NonRealTimeVAD
The NonRealTimeVAD
API is for identifying segments of user speech if you already have a Float32Array of audio samples.
Support
Package |
Supported |
@ricky0123/vad-web |
Yes |
@ricky0123/vad-node |
Yes |
@ricky0123/vad-react |
No |
Example
| const vad = require("@ricky0123/vad-node") // or @ricky0123/vad-web
const options: Partial<vad.NonRealTimeVADOptions> = { /* ... */ }
const myvad = await vad.NonRealTimeVAD.new(options)
const audioFileData, nativeSampleRate = ... // get audio and sample rate from file or something
for await (const {audio, start, end} of myvad.run(audioFileData, nativeSampleRate)) {
// do stuff with
// audio (float32array of audio)
// start (milliseconds into audio where speech starts)
// end (milliseconds into audio where speech ends)
}
|
Options
New instances of MicVAD
are created by calling the async static method MicVAD.new(options)
. The options object can contain the following fields (all are optional).
Attributes
Attributes |
Type |
Description |
run |
async function* (inputAudio: Float32Array, sampleRate: number): AsyncGenerator |
Run the VAD model on your audio |
useMicVAD
A React hook wrapper for MicVAD. Use this if you want to run the VAD model on mic input in a React application.
Support
Package |
Supported |
@ricky0123/vad-web |
No, use MicVAD |
@ricky0123/vad-node |
No |
@ricky0123/vad-react |
Yes |
Example
| import { useMicVAD } from "@ricky0123/vad-react"
const MyComponent = () => {
const vad = useMicVAD({
startOnLoad: true,
onSpeechEnd: (audio) => {
console.log("User stopped talking")
},
})
return <div>{vad.userSpeaking && "User is speaking"}</div>
}
|
Options
The useMicVAD
hook takes an options object with the following fields (all optional).
Option |
Type |
Description |
startOnLoad |
boolean |
Should the VAD start listening to mic input when it finishes loading? |
additionalAudioConstraints |
|
constraints to pass to getUserMedia via the audio field |
onFrameProcessed |
(probabilities: {isSpeech: float; notSpeech: float}) => any |
Callback to run after each frame. |
onVADMisfire |
() => any |
Callback to run if speech start was detected but onSpeechEnd will not be run because the audio segment is smaller than minSpeechFrames |
onSpeechStart |
() => any |
Callback to run when speech start is detected |
onSpeechEnd |
(audio: Float32Array) => any |
Callback to run when speech end is detected. Takes as arg a Float32Array of audio samples between -1 and 1, sample rate 16000. This will not run if the audio segment is smaller than minSpeechFrames |
positiveSpeechThreshold |
number |
see algorithm configuration |
negativeSpeechThreshold |
number |
see algorithm configuration |
redemptionFrames |
number |
see algorithm configuration |
frameSamples |
number |
see algorithm configuration |
preSpeechPadFrames |
number |
see algorithm configuration |
minSpeechFrames |
number |
see algorithm configuration |
Returns
Attributes |
Type |
Description |
listening |
boolean |
Is the VAD currently listening to mic input? |
errored |
false \| { message: string; } |
Did the VAD fail to load? |
loading |
boolean |
Did the VAD finish loading? |
userSpeaking |
boolean |
Is the user speaking? |
pause |
() => void |
Stop the VAD from running on mic input |
start |
() => void |
Start running the VAD on mic input |