AssemblyAI Audio Transcript
This covers how to load audio (and video) transcripts as document objects from a file using the AssemblyAI API.
Usage
First, you'll need to install the official AssemblyAI package:
- npm: npm install @langchain/community @langchain/core assemblyai
- Yarn: yarn add @langchain/community @langchain/core assemblyai
- pnpm: pnpm add @langchain/community @langchain/core assemblyai
To use the loaders, you need an AssemblyAI account and an API key, which you can get from the AssemblyAI dashboard.
Then, configure the API key as the ASSEMBLYAI_API_KEY environment variable or pass it as the apiKey options parameter.
import {
  AudioTranscriptLoader,
  // AudioTranscriptParagraphsLoader,
  // AudioTranscriptSentencesLoader
} from "@langchain/community/document_loaders/web/assemblyai";

// You can also use a local file path and the loader will upload it to AssemblyAI for you.
const audioUrl = "https://storage.googleapis.com/aai-docs-samples/espn.m4a";

// Use `AudioTranscriptParagraphsLoader` or `AudioTranscriptSentencesLoader` for splitting the transcript into paragraphs or sentences
const loader = new AudioTranscriptLoader(
  {
    audio: audioUrl,
    // any other parameters as documented here: https://www.assemblyai.com/docs/api-reference/transcripts/submit
  },
  {
    apiKey: "<ASSEMBLYAI_API_KEY>", // or set the `ASSEMBLYAI_API_KEY` env variable
  }
);

const docs = await loader.load();
console.dir(docs, { depth: Infinity });
API Reference:
- AudioTranscriptLoader from @langchain/community/document_loaders/web/assemblyai
info
- You can use the AudioTranscriptParagraphsLoader or AudioTranscriptSentencesLoader to split the transcript into paragraphs or sentences.
- The audio parameter can be a URL, a local file path, a buffer, or a stream.
- The audio can also be a video file. See the list of supported file types in the FAQ doc.
- If you don't pass in the apiKey option, the loader will use the ASSEMBLYAI_API_KEY environment variable.
- You can add more properties in addition to audio. Find the full list of request parameters in the AssemblyAI API docs.
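The API key fallback described in the notes above can be sketched as a small helper. This is only an illustration of the documented precedence, not the loader's actual implementation: `resolveApiKey`, `LoaderOptions`, and `Env` are hypothetical names, and the environment is passed in as a plain object so the snippet does not depend on Node.js typings.

```typescript
// Hypothetical illustration of the documented precedence:
// a non-empty `apiKey` option wins, otherwise ASSEMBLYAI_API_KEY is used.
interface LoaderOptions {
  apiKey?: string;
}

type Env = Record<string, string | undefined>;

function resolveApiKey(options: LoaderOptions | undefined, env: Env): string {
  const key = options?.apiKey || env["ASSEMBLYAI_API_KEY"];
  if (!key) {
    // Neither source provided a key, so fail early with a clear message.
    throw new Error(
      "No AssemblyAI API key found: pass `apiKey` or set ASSEMBLYAI_API_KEY."
    );
  }
  return key;
}
```

In a Node.js program, the helper could be called as resolveApiKey(options, process.env).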
You can also use the AudioSubtitleLoader to get srt or vtt subtitles as a document.
import { AudioSubtitleLoader } from "@langchain/community/document_loaders/web/assemblyai";

// You can also use a local file path and the loader will upload it to AssemblyAI for you.
const audioUrl = "https://storage.googleapis.com/aai-docs-samples/espn.m4a";

const loader = new AudioSubtitleLoader(
  {
    audio: audioUrl,
    // any other parameters as documented here: https://www.assemblyai.com/docs/api-reference/transcripts/submit
  },
  "srt", // srt or vtt
  {
    apiKey: "<ASSEMBLYAI_API_KEY>", // or set the `ASSEMBLYAI_API_KEY` env variable
  }
);

const docs = await loader.load();
console.dir(docs, { depth: Infinity });
API Reference:
- AudioSubtitleLoader from @langchain/community/document_loaders/web/assemblyai
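Once the srt subtitles are available as text, it can be handy to split them into individual cues. The following is a minimal sketch, assuming the subtitle text is held in the returned document's pageContent and follows the usual SRT layout (index line, "start --> end" line, then text lines, with blank lines between blocks). `SrtCue` and `parseSrt` are hypothetical names that are not part of LangChain or the AssemblyAI SDK, and the sample string is hand-written rather than real API output.

```typescript
// One subtitle cue parsed from SRT text.
interface SrtCue {
  index: number;
  start: string; // e.g. "00:00:00,240"
  end: string;
  text: string;
}

// Minimal SRT parser: blocks are separated by blank lines; each block is
// an index line, a "start --> end" line, then one or more text lines.
function parseSrt(srt: string): SrtCue[] {
  const cues: SrtCue[] = [];
  const blocks = srt.replace(/\r\n/g, "\n").trim().split(/\n{2,}/);
  for (const block of blocks) {
    const lines = block.split("\n");
    if (lines.length < 3) continue; // skip malformed blocks
    const index = Number.parseInt(lines[0], 10);
    const [start, end] = lines[1].split("-->").map((s) => s.trim());
    if (Number.isNaN(index) || !start || !end) continue;
    cues.push({ index, start, end, text: lines.slice(2).join("\n") });
  }
  return cues;
}

// Hand-written sample input, for illustration only.
const sampleSrt =
  "1\n00:00:00,240 --> 00:00:02,500\nHello there.\n\n" +
  "2\n00:00:02,500 --> 00:00:04,000\nSecond cue\nwraps here.";
const sampleCues = parseSrt(sampleSrt);
console.log(sampleCues.map((cue) => `${cue.start} ${cue.text}`));
```

With the loader above, the input would be something like parseSrt(docs[0].pageContent), under the same assumption about where the subtitle text lives. This sketch handles srt only; vtt uses a different header and timestamp format.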