Table of Contents
This page is generated via scout. For now it just shows the project README.
1. boot
Boot builds voice bots by listening to human conversations, without any supervision.
1.1. Transcribing
The first step is to transcribe a set of human-human conversations.
# May need a few ENV vars or config to select the right ASR/VAD
boot transcribe --audios-dir=<audios-dir> --conversations-json=<conversations-json>
--audios-dir is a directory of stereo human-human calls, one file per call.
--conversations-json is the output file where transcribed calls are kept as lists of turns, in the following structure:
{
  "audio-file-name": [<utterance>, ...],
  ...
}

<utterance>   := {"start": float, "end": float, "channel": str, "alternatives": [<alternative>], "text": str}
<alternative> := {"confidence": float, "transcript": str}
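The structure above can be checked mechanically. Below is a minimal sketch (not part of boot; the `validate_conversations` helper and the example data are illustrative) that asserts a parsed conversations JSON matches the documented fields:

```python
def validate_conversations(conversations):
    """Check every utterance against the documented schema.

    `conversations` maps an audio file name to its list of utterances,
    as produced by `boot transcribe`.
    """
    for audio_file, utterances in conversations.items():
        for utt in utterances:
            # Required utterance fields
            assert isinstance(utt["start"], float)
            assert isinstance(utt["end"], float)
            assert utt["end"] >= utt["start"]
            assert isinstance(utt["channel"], str)
            assert isinstance(utt["text"], str)
            # Each alternative carries a confidence and a transcript
            for alt in utt["alternatives"]:
                assert 0.0 <= alt["confidence"] <= 1.0
                assert isinstance(alt["transcript"], str)
    return True

# Hypothetical example of one transcribed call
example = {
    "call-001.wav": [
        {
            "start": 0.0,
            "end": 1.8,
            "channel": "left",
            "text": "hello, how can I help?",
            "alternatives": [
                {"confidence": 0.93, "transcript": "hello, how can I help?"}
            ],
        }
    ]
}

validate_conversations(example)
```

Running this kind of check right after `boot transcribe` catches schema drift before training.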
1.2. Training
boot train --conversations-json=<conversations-json> --output-model=<model.boot>
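Boot's actual training procedure is not documented here. As a purely illustrative sketch of the idea of learning a bot from conversations with no supervision, the toy code below pairs each turn with the reply from the other channel, then answers new prompts by word-overlap retrieval; the `train`/`respond` names and the data are hypothetical:

```python
def train(conversations):
    """Build (prompt, response) pairs from alternating channels."""
    pairs = []
    for utterances in conversations.values():
        turns = sorted(utterances, key=lambda u: u["start"])
        for prev, cur in zip(turns, turns[1:]):
            # A channel change marks a reply to the previous turn
            if prev["channel"] != cur["channel"]:
                pairs.append((prev["text"], cur["text"]))
    return pairs

def respond(model, prompt):
    """Return the response whose prompt shares the most words with the input."""
    words = set(prompt.lower().split())
    best = max(model, key=lambda p: len(words & set(p[0].lower().split())))
    return best[1]

# Hypothetical transcribed call in the documented structure (alternatives omitted)
conversations = {
    "call-001.wav": [
        {"start": 0.0, "end": 1.2, "channel": "caller",
         "text": "what are your opening hours"},
        {"start": 1.3, "end": 2.9, "channel": "agent",
         "text": "we are open nine to five"},
    ]
}

model = train(conversations)
print(respond(model, "tell me your opening hours"))  # -> we are open nine to five
```

The real `model.boot` artifact is opaque; this sketch only shows the shape of the data flowing into training.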
1.3. Serving
boot serve --model=<model.boot>     # Exposes an audio server
boot interact --model=<model.boot>  # Runs the audio server interactively on the CLI
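Putting the three steps together, a typical session might look like the following; the file and directory names are placeholders:

```shell
# Transcribe a directory of stereo human-human calls
boot transcribe --audios-dir=calls/ --conversations-json=conversations.json

# Train a model from the transcribed turns
boot train --conversations-json=conversations.json --output-model=model.boot

# Serve the trained model, or try it directly from the terminal
boot serve --model=model.boot
boot interact --model=model.boot
```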