A workflow for using AI embeddings to build a recommendation engine that suggests similar videos based on content
You can build a content-based recommendation system that suggests similar videos by converting video transcripts into AI embeddings and performing a nearest neighbor search.
The core concept is to convert text (your video transcripts) into high-dimensional vectors (embeddings) that capture semantic meaning. Videos with similar content will have embeddings that are close together in vector space, allowing you to find and recommend similar content.
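Closeness in vector space is usually measured with cosine similarity, and similarity scores like the ones in the example response below are typically computed this way. As a minimal sketch in plain JavaScript:

// Cosine similarity: values near 1.0 mean the two embeddings point in almost
// the same direction (very similar content); values near 0 mean unrelated content.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}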
The idea is that, for a given source video, your recommendation system will return a ranked list of similar videos, for example:
{
  "id": "abc123",
  "recommendations": [
    {
      "id": "def456",
      "title": "Similar Video Title",
      "similarity_score": 0.89,
      "mux_asset_id": "mux-asset-id-123",
      "mux_playback_id": "mux-playback-id-456"
    },
    {
      "id": "ghi789",
      "title": "Another Related Video",
      "similarity_score": 0.82,
      "mux_asset_id": "mux-asset-id-789",
      "mux_playback_id": "mux-playback-id-1011"
    }
  ]
}
The transcript data from Mux's auto-generated captions provides the textual content needed to create meaningful embeddings for your recommendation engine.
Here's how to build a content-based video recommendation system:
Start by collecting transcripts. Listen for the video.asset.track.ready webhook, which tells you that the auto-generated captions track has finished being created; once it fires, you can download the transcript text for that video.
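As a rough sketch of this step, a webhook handler might look like the following. It assumes an Express app, a hypothetical lookupPlaybackId helper backed by your own database for mapping a Mux asset to its playback ID, a hypothetical processTranscript helper that runs the embedding steps described below, and that the transcript is downloaded from Mux's plain-text text-track endpoint. Double-check the payload field names against the Mux webhooks documentation before relying on them.

import express from 'express';

const app = express();
app.use(express.json());

app.post('/webhooks/mux', async (req, res) => {
  const event = req.body;

  // Only act once a captions track has finished being created
  if (event.type === 'video.asset.track.ready') {
    const track = event.data;        // the track object (verify field names in the Mux docs)
    const assetId = track.asset_id;  // assumption: the payload exposes the parent asset ID

    // Hypothetical helper: look up the playback ID you stored for this asset
    const playbackId = await lookupPlaybackId(assetId);

    // Download the transcript as plain text from the text-track endpoint
    const transcriptUrl = `https://stream.mux.com/${playbackId}/text/${track.id}.txt`;
    const transcript = await (await fetch(transcriptUrl)).text();

    // Hypothetical helper: create and store the embedding (see the steps below)
    await processTranscript(assetId, transcript);
  }

  res.status(200).send('ok');
});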
Next, convert each transcript into an embedding. Embeddings are numerical representations of text that capture semantic meaning; words or phrases with similar meanings will have similar embedding vectors. There are many services, APIs, and models available for creating embeddings from text:
// Example with OpenAI
import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

async function createEmbedding(transcript) {
  const response = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: transcript,
    encoding_format: "float",
  });

  // The embedding vector for the transcript
  return response.data[0].embedding;
}
// Example with Amazon Bedrock (Titan Text Embeddings)
import { BedrockRuntimeClient, InvokeModelCommand } from "@aws-sdk/client-bedrock-runtime";

const client = new BedrockRuntimeClient({ region: "us-east-1" });

async function createEmbedding(transcript) {
  const command = new InvokeModelCommand({
    modelId: "amazon.titan-embed-text-v1",
    contentType: "application/json",
    body: JSON.stringify({
      inputText: transcript
    })
  });

  const response = await client.send(command);
  const responseBody = JSON.parse(new TextDecoder().decode(response.body));

  // The embedding vector for the transcript
  return responseBody.embedding;
}
You need to store your embeddings in a way that allows for efficient similarity search. Here are the main approaches:
For applications already using PostgreSQL, you can add vector similarity search with the pgvector extension.
For an example with Supabase, see this guide about using pgvector with Supabase.
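If you take the PostgreSQL route, a minimal sketch with the node-postgres (pg) client might look like this. It assumes a videos table you define yourself, with an embedding vector(1536) column (1536 matches text-embedding-3-small; use your model's dimension), and it uses pgvector's <=> cosine-distance operator for the search:

import pg from 'pg';

const pool = new pg.Pool({ connectionString: process.env.DATABASE_URL });

// One-time setup (assumes the pgvector extension is available on your database):
//   CREATE EXTENSION IF NOT EXISTS vector;
//   CREATE TABLE videos (
//     id text PRIMARY KEY,
//     title text,
//     mux_asset_id text,
//     mux_playback_id text,
//     embedding vector(1536)
//   );

async function storeEmbedding(video, embedding) {
  await pool.query(
    `INSERT INTO videos (id, title, mux_asset_id, mux_playback_id, embedding)
     VALUES ($1, $2, $3, $4, $5)
     ON CONFLICT (id) DO UPDATE SET embedding = EXCLUDED.embedding`,
    [video.id, video.title, video.muxAssetId, video.muxPlaybackId, JSON.stringify(embedding)]
  );
}

async function findSimilarVideos(sourceVideoId, queryEmbedding, topK = 5) {
  // <=> is pgvector's cosine distance; 1 - distance gives a similarity score
  const { rows } = await pool.query(
    `SELECT id, title, mux_asset_id, mux_playback_id,
            1 - (embedding <=> $1::vector) AS similarity_score
     FROM videos
     WHERE id <> $2
     ORDER BY embedding <=> $1::vector
     LIMIT $3`,
    [JSON.stringify(queryEmbedding), sourceVideoId, topK]
  );
  return rows;
}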
Dedicated vector databases are optimized for similarity search:
// Example with Pinecone
import { Pinecone } from '@pinecone-database/pinecone';

const pinecone = new Pinecone({
  apiKey: process.env.PINECONE_API_KEY,
});

const index = pinecone.index('video-recommendations');

async function storeEmbedding(videoId, embedding, metadata) {
  await index.upsert([{
    id: videoId,
    values: embedding,
    metadata: metadata
  }]);
}

async function findSimilarVideos(queryEmbedding, topK = 5) {
  const queryResponse = await index.query({
    vector: queryEmbedding,
    topK: topK,
    includeMetadata: true,
    includeValues: true
  });

  return queryResponse.matches;
}
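Putting the pieces together, a recommendation endpoint can query for a video's nearest neighbors and shape the result into the response format shown at the top of this guide. Here's a rough sketch reusing the Pinecone index above; it assumes the metadata stored with each vector includes title, mux_asset_id, and mux_playback_id, and it queries by record ID so the transcript doesn't need to be re-embedded:

async function getRecommendations(videoId) {
  // Query by record ID; topK is 6 because the source video itself
  // will come back as the closest match.
  const queryResponse = await index.query({
    id: videoId,
    topK: 6,
    includeMetadata: true,
  });

  const recommendations = queryResponse.matches
    .filter((match) => match.id !== videoId) // drop the source video
    .slice(0, 5)
    .map((match) => ({
      id: match.id,
      title: match.metadata.title,
      similarity_score: match.score,
      mux_asset_id: match.metadata.mux_asset_id,
      mux_playback_id: match.metadata.mux_playback_id,
    }));

  return { id: videoId, recommendations };
}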
This approach gives you a content-based recommendation system that understands the semantic meaning of your videos, helping viewers discover relevant content based on what they're actually watching.