Traditional keyword-based search is fundamentally limited: it cannot understand that "affordable laptop" and "budget computer" mean the same thing. Users searching for "how to fix a slow website" won't find your excellent article titled "Web Performance Optimization Techniques." This semantic gap has a real cost: relevant content goes undiscovered and users abandon the search. According to Algolia research, semantic search implementations show 40-60% improvements in search relevance over traditional keyword matching.
In this comprehensive guide, we'll explore how vector embeddings and semantic search transform user experiences. You'll learn to implement production-ready search systems using Elasticsearch, Pinecone, and Weaviate, create embedding pipelines with OpenAI, and build hybrid search systems that combine the best of keyword and semantic approaches.
Understanding Semantic Search
Semantic search represents a paradigm shift from matching strings to understanding meaning. While keyword search asks "which documents contain these exact words?", semantic search asks "which documents are about the same concepts as this query?"
The Evolution of Search
Search technology has evolved through distinct generations:
// search-evolution.ts
// Generation 1: Exact Match (1990s)
// "laptop" only finds documents with "laptop"
const exactMatch = (query: string, documents: string[]) => {
return documents.filter(doc =>
doc.toLowerCase().includes(query.toLowerCase())
);
};
// Generation 2: TF-IDF / BM25 (2000s)
// Ranks by term frequency and inverse document frequency
interface BM25Result {
document: string;
score: number;
}
const bm25Search = (query: string, documents: string[]): BM25Result[] => {
const queryTerms = query.toLowerCase().split(' ');
const k1 = 1.2;
const b = 0.75;
const avgDocLength = documents.reduce((sum, d) => sum + d.length, 0) / documents.length;
return documents.map(doc => {
const docTerms = doc.toLowerCase().split(' ');
let score = 0;
queryTerms.forEach(term => {
const tf = docTerms.filter(t => t === term).length;
const idf = Math.log((documents.length + 1) /
(documents.filter(d => d.toLowerCase().includes(term)).length + 1));
const docLength = docTerms.length;
score += idf * ((tf * (k1 + 1)) /
(tf + k1 * (1 - b + b * (docLength / avgDocLength))));
});
return { document: doc, score };
}).sort((a, b) => b.score - a.score);
};
// Generation 3: Semantic Search (2020s)
// Understands meaning through vector embeddings
interface SemanticResult {
document: string;
similarity: number;
embedding: number[];
}
const semanticSearch = async (
query: string,
documentEmbeddings: Map<string, number[]>,
embeddingModel: EmbeddingModel
): Promise<SemanticResult[]> => {
const queryEmbedding = await embeddingModel.embed(query);
const results: SemanticResult[] = [];
documentEmbeddings.forEach((embedding, document) => {
const similarity = cosineSimilarity(queryEmbedding, embedding);
results.push({ document, similarity, embedding });
});
return results.sort((a, b) => b.similarity - a.similarity);
};
// Cosine similarity: measures angle between vectors
const cosineSimilarity = (a: number[], b: number[]): number => {
let dotProduct = 0;
let normA = 0;
let normB = 0;
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
};
Vector Embeddings: The Foundation
Vector embeddings are the mathematical representation that makes semantic search possible. Modern embedding models like OpenAI's text-embedding-3 convert text into high-dimensional vectors where semantic similarity translates to spatial proximity.
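A toy illustration makes "spatial proximity" concrete. The three-dimensional vectors below are made up for demonstration (real models emit hundreds or thousands of dimensions), but the behavior is the same: related phrases score near 1.0 under cosine similarity, unrelated ones score far lower.

```typescript
// Hypothetical 3-d "embeddings" — real models like text-embedding-3-small use 1536 dims.
const toyEmbeddings: Record<string, number[]> = {
  'affordable laptop': [0.9, 0.1, 0.0],
  'budget computer':   [0.85, 0.15, 0.05],
  'chocolate cake':    [0.0, 0.2, 0.95]
};

// Cosine similarity: dot product divided by the product of vector norms.
const cosine = (a: number[], b: number[]): number => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

const similar = cosine(toyEmbeddings['affordable laptop'], toyEmbeddings['budget computer']);
const unrelated = cosine(toyEmbeddings['affordable laptop'], toyEmbeddings['chocolate cake']);
// similar lands near 1.0; unrelated lands near 0
```

This is the entire intuition behind semantic retrieval: ranking by cosine similarity to the query vector surfaces documents about the same concepts, regardless of word overlap.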
Creating Embeddings with OpenAI
// embedding-service.ts
import OpenAI from 'openai';
interface EmbeddingConfig {
model: 'text-embedding-3-small' | 'text-embedding-3-large' | 'text-embedding-ada-002';
dimensions?: number; // Only for text-embedding-3 models
batchSize: number;
}
interface EmbeddingResult {
text: string;
embedding: number[];
tokens: number;
}
class OpenAIEmbeddingService {
private client: OpenAI;
private config: EmbeddingConfig;
private cache: Map<string, number[]> = new Map();
constructor(apiKey: string, config: Partial<EmbeddingConfig> = {}) {
this.client = new OpenAI({ apiKey });
this.config = {
model: config.model || 'text-embedding-3-small',
dimensions: config.dimensions || 1536,
batchSize: config.batchSize || 100
};
}
// Generate embedding for single text
async embed(text: string): Promise<number[]> {
// Check cache first
const cacheKey = this.getCacheKey(text);
if (this.cache.has(cacheKey)) {
return this.cache.get(cacheKey)!;
}
const response = await this.client.embeddings.create({
model: this.config.model,
input: text,
dimensions: this.config.dimensions
});
const embedding = response.data[0].embedding;
this.cache.set(cacheKey, embedding);
return embedding;
}
// Batch embed multiple texts
async embedBatch(texts: string[]): Promise<EmbeddingResult[]> {
const results: EmbeddingResult[] = [];
const uncachedTexts: { text: string; index: number }[] = [];
// Check cache and identify uncached texts
texts.forEach((text, index) => {
const cacheKey = this.getCacheKey(text);
if (this.cache.has(cacheKey)) {
results[index] = {
text,
embedding: this.cache.get(cacheKey)!,
tokens: 0
};
} else {
uncachedTexts.push({ text, index });
}
});
// Process uncached texts in batches
for (let i = 0; i < uncachedTexts.length; i += this.config.batchSize) {
const batch = uncachedTexts.slice(i, i + this.config.batchSize);
const batchTexts = batch.map(item => item.text);
const response = await this.client.embeddings.create({
model: this.config.model,
input: batchTexts,
dimensions: this.config.dimensions
});
response.data.forEach((data, j) => {
const originalIndex = batch[j].index;
const text = batch[j].text;
results[originalIndex] = {
text,
embedding: data.embedding,
tokens: response.usage?.total_tokens || 0
};
this.cache.set(this.getCacheKey(text), data.embedding);
});
// Rate limiting: wait between batches
if (i + this.config.batchSize < uncachedTexts.length) {
await this.sleep(100);
}
}
return results;
}
// Chunk text for embedding (handles long documents)
chunkText(text: string, maxTokens: number = 8000): string[] {
const chunks: string[] = [];
const sentences = text.split(/[.!?]+\s+/);
let currentChunk = '';
for (const sentence of sentences) {
// Rough token estimate: 1 token ~= 4 characters
const estimatedTokens = (currentChunk + sentence).length / 4;
if (estimatedTokens > maxTokens && currentChunk) {
chunks.push(currentChunk.trim());
currentChunk = sentence;
} else {
currentChunk += (currentChunk ? '. ' : '') + sentence;
}
}
if (currentChunk) {
chunks.push(currentChunk.trim());
}
return chunks;
}
// Embed long document with chunking and averaging
async embedDocument(document: string): Promise<number[]> {
const chunks = this.chunkText(document);
const embeddings = await this.embedBatch(chunks);
// Average all chunk embeddings
const dimension = embeddings[0].embedding.length;
const averaged = new Array(dimension).fill(0);
embeddings.forEach(result => {
result.embedding.forEach((val, i) => {
averaged[i] += val / embeddings.length;
});
});
// Normalize the averaged vector
const norm = Math.sqrt(averaged.reduce((sum, val) => sum + val * val, 0));
return averaged.map(val => val / norm);
}
private getCacheKey(text: string): string {
return `${this.config.model}:${this.config.dimensions}:${text.substring(0, 100)}`;
}
private sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
}
// Usage example
const embeddingService = new OpenAIEmbeddingService(process.env.OPENAI_API_KEY!, {
model: 'text-embedding-3-small',
dimensions: 1536
});
// Embed a query
const queryEmbedding = await embeddingService.embed("best practices for React performance");
// Batch embed documents
const documents = [
"React optimization techniques including memoization and lazy loading",
"Vue.js component lifecycle and performance tips",
"Angular change detection strategies for faster apps"
];
const docEmbeddings = await embeddingService.embedBatch(documents);
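The chunking heuristic in `chunkText` is worth sanity-checking in isolation. This standalone version of the same logic (sentence splitting plus the rough 4-characters-per-token estimate) runs without any API key; the input text here is synthetic.

```typescript
// Standalone copy of the chunking heuristic: split on sentence boundaries,
// estimate tokens as length / 4, start a new chunk when the budget would overflow.
const chunkText = (text: string, maxTokens: number = 8000): string[] => {
  const chunks: string[] = [];
  const sentences = text.split(/[.!?]+\s+/);
  let current = '';
  for (const sentence of sentences) {
    const estimatedTokens = (current + sentence).length / 4;
    if (estimatedTokens > maxTokens && current) {
      chunks.push(current.trim());
      current = sentence;
    } else {
      current += (current ? '. ' : '') + sentence;
    }
  }
  if (current) chunks.push(current.trim());
  return chunks;
};

// 50 short sentences (~525 estimated tokens) against a 100-token budget
const longText = Array(50).fill('This sentence pads out the document body').join('. ') + '.';
const chunks = chunkText(longText, 100);
// every chunk stays within the 100-token estimate
```

Note the estimate is deliberately rough; for production pipelines a real tokenizer (e.g. tiktoken) gives exact counts, at the cost of an extra dependency.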
Vector Databases: Pinecone, Weaviate, and Qdrant
Vector databases are purpose-built for storing and querying embeddings at scale. They use specialized indexing algorithms like HNSW (Hierarchical Navigable Small World) to enable sub-second similarity searches across millions of vectors.
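For context, here is the exact brute-force search that indexes like HNSW approximate. It is O(n·d) per query — a full linear scan — which is precisely why approximate nearest-neighbor indexes matter at scale. The vectors and the `bruteForceTopK` helper below are illustrative, not part of any database API.

```typescript
// Exact (brute-force) top-k nearest neighbors by cosine similarity.
// ANN indexes such as HNSW avoid this linear scan, trading a little
// recall for sub-linear query time.
interface Neighbor { id: string; similarity: number; }

const cosine = (a: number[], b: number[]): number => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

const bruteForceTopK = (
  query: number[],
  vectors: Map<string, number[]>,
  k: number
): Neighbor[] =>
  [...vectors.entries()]
    .map(([id, v]) => ({ id, similarity: cosine(query, v) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k);

const vectors = new Map<string, number[]>([
  ['doc-a', [1, 0, 0]],
  ['doc-b', [0.9, 0.1, 0]],
  ['doc-c', [0, 0, 1]]
]);
const top2 = bruteForceTopK([1, 0, 0], vectors, 2);
// returns doc-a first, doc-b second
```

Brute force is actually fine up to tens of thousands of vectors; the databases below earn their keep when collections reach millions.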
Pinecone Implementation
// pinecone-search.ts
import { Pinecone } from '@pinecone-database/pinecone';
interface Document {
id: string;
content: string;
metadata: {
title: string;
category: string;
author: string;
createdAt: string;
tags: string[];
};
}
interface SearchResult {
id: string;
score: number;
metadata: Document['metadata'];
content?: string;
}
class PineconeSearchService {
private client: Pinecone;
private indexName: string;
private embeddingService: OpenAIEmbeddingService;
private namespace: string;
constructor(
apiKey: string,
indexName: string,
embeddingService: OpenAIEmbeddingService,
namespace: string = 'default'
) {
this.client = new Pinecone({ apiKey });
this.indexName = indexName;
this.embeddingService = embeddingService;
this.namespace = namespace;
}
// Initialize index (run once)
async createIndex(dimension: number = 1536): Promise<void> {
const existingIndexes = await this.client.listIndexes();
if (!existingIndexes.indexes?.find(idx => idx.name === this.indexName)) {
await this.client.createIndex({
name: this.indexName,
dimension,
metric: 'cosine',
spec: {
serverless: {
cloud: 'aws',
region: 'us-east-1'
}
}
});
// Wait for index to be ready
await this.waitForIndexReady();
}
}
private async waitForIndexReady(): Promise<void> {
let ready = false;
while (!ready) {
const description = await this.client.describeIndex(this.indexName);
ready = description.status?.ready || false;
if (!ready) {
await new Promise(resolve => setTimeout(resolve, 1000));
}
}
}
// Index documents
async indexDocuments(documents: Document[]): Promise<void> {
const index = this.client.index(this.indexName);
// Generate embeddings for all documents
const embeddings = await this.embeddingService.embedBatch(
documents.map(doc => doc.content)
);
// Prepare vectors for upsert
const vectors = documents.map((doc, i) => ({
id: doc.id,
values: embeddings[i].embedding,
metadata: {
...doc.metadata,
content: doc.content.substring(0, 1000) // Store truncated content
}
}));
// Upsert in batches of 100
const batchSize = 100;
for (let i = 0; i < vectors.length; i += batchSize) {
const batch = vectors.slice(i, i + batchSize);
await index.namespace(this.namespace).upsert(batch);
}
console.log(`Indexed ${documents.length} documents`);
}
// Semantic search
async search(
query: string,
options: {
topK?: number;
filter?: Record<string, any>;
includeMetadata?: boolean;
} = {}
): Promise<SearchResult[]> {
const { topK = 10, filter, includeMetadata = true } = options;
const index = this.client.index(this.indexName);
const queryEmbedding = await this.embeddingService.embed(query);
const results = await index.namespace(this.namespace).query({
vector: queryEmbedding,
topK,
filter,
includeMetadata
});
return results.matches?.map(match => ({
id: match.id,
score: match.score || 0,
metadata: match.metadata as Document['metadata'],
content: (match.metadata as any)?.content
})) || [];
}
// Search with metadata filtering
async searchWithFilters(
query: string,
filters: {
category?: string;
tags?: string[];
dateRange?: { start: string; end: string };
}
): Promise<SearchResult[]> {
const filter: Record<string, any> = {};
if (filters.category) {
filter.category = { $eq: filters.category };
}
if (filters.tags?.length) {
filter.tags = { $in: filters.tags };
}
if (filters.dateRange) {
filter.createdAt = {
$gte: filters.dateRange.start,
$lte: filters.dateRange.end
};
}
return this.search(query, { filter, topK: 20 });
}
// Delete documents
async deleteDocuments(ids: string[]): Promise<void> {
const index = this.client.index(this.indexName);
await index.namespace(this.namespace).deleteMany(ids);
}
// Update document
async updateDocument(document: Document): Promise<void> {
await this.deleteDocuments([document.id]);
await this.indexDocuments([document]);
}
}
// Usage
const searchService = new PineconeSearchService(
process.env.PINECONE_API_KEY!,
'my-search-index',
embeddingService
);
// Search with filters
const results = await searchService.searchWithFilters(
"React performance optimization",
{
category: "frontend",
tags: ["react", "performance"]
}
);
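The filter translation inside `searchWithFilters` is pure object construction, so it can be verified without a live index. This standalone sketch (the `buildPineconeFilter` name is mine) mirrors the mapping onto Pinecone's metadata-filter operators `$eq`, `$in`, `$gte`, and `$lte`.

```typescript
// Map high-level filter options onto Pinecone's metadata-filter operator syntax.
interface FilterInput {
  category?: string;
  tags?: string[];
  dateRange?: { start: string; end: string };
}

const buildPineconeFilter = (filters: FilterInput): Record<string, any> => {
  const filter: Record<string, any> = {};
  if (filters.category) filter.category = { $eq: filters.category };       // exact match
  if (filters.tags?.length) filter.tags = { $in: filters.tags };           // any-of match
  if (filters.dateRange) {
    filter.createdAt = { $gte: filters.dateRange.start, $lte: filters.dateRange.end };
  }
  return filter;
};

const filter = buildPineconeFilter({ category: 'frontend', tags: ['react'] });
// { category: { $eq: 'frontend' }, tags: { $in: ['react'] } }
```

Keeping filter construction in a pure function like this also makes it trivial to unit-test, independent of the Pinecone client.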
Weaviate Implementation
// weaviate-search.ts
import weaviate, { WeaviateClient, ApiKey } from 'weaviate-ts-client';
interface WeaviateConfig {
host: string;
apiKey?: string;
openAIKey: string;
}
class WeaviateSearchService {
private client: WeaviateClient;
private className: string;
constructor(config: WeaviateConfig, className: string = 'Document') {
this.client = weaviate.client({
scheme: 'https',
host: config.host,
apiKey: config.apiKey ? new ApiKey(config.apiKey) : undefined,
headers: {
'X-OpenAI-Api-Key': config.openAIKey
}
});
this.className = className;
}
// Create schema with vectorizer
async createSchema(): Promise<void> {
const schema = {
class: this.className,
vectorizer: 'text2vec-openai',
moduleConfig: {
'text2vec-openai': {
model: 'text-embedding-3-small',
dimensions: 1536,
type: 'text'
},
'generative-openai': {
model: 'gpt-4o-mini'
}
},
properties: [
{
name: 'title',
dataType: ['text'],
moduleConfig: {
'text2vec-openai': {
skip: false,
vectorizePropertyName: false
}
}
},
{
name: 'content',
dataType: ['text'],
moduleConfig: {
'text2vec-openai': {
skip: false,
vectorizePropertyName: false
}
}
},
{
name: 'category',
dataType: ['text'],
moduleConfig: {
'text2vec-openai': {
skip: true // Don't vectorize metadata fields
}
}
},
{
name: 'author',
dataType: ['text'],
moduleConfig: {
'text2vec-openai': { skip: true }
}
},
{
name: 'tags',
dataType: ['text[]']
},
{
name: 'createdAt',
dataType: ['date']
}
]
};
try {
await this.client.schema.classCreator().withClass(schema).do();
console.log(`Created class ${this.className}`);
} catch (error: any) {
if (error.message?.includes('already exists')) {
console.log(`Class ${this.className} already exists`);
} else {
throw error;
}
}
}
// Index documents (Weaviate handles vectorization automatically)
async indexDocuments(documents: Document[]): Promise<void> {
let batcher = this.client.batch.objectsBatcher();
let batchSize = 0;
for (const doc of documents) {
batcher = batcher.withObject({
class: this.className,
properties: {
title: doc.metadata.title,
content: doc.content,
category: doc.metadata.category,
author: doc.metadata.author,
tags: doc.metadata.tags,
createdAt: doc.metadata.createdAt
},
id: doc.id
});
batchSize++;
if (batchSize >= 100) {
await batcher.do();
batcher = this.client.batch.objectsBatcher();
batchSize = 0;
}
}
if (batchSize > 0) {
await batcher.do();
}
console.log(`Indexed ${documents.length} documents`);
}
// Semantic search with nearText
async search(
query: string,
options: {
limit?: number;
offset?: number;
filters?: {
category?: string;
tags?: string[];
};
} = {}
): Promise<SearchResult[]> {
const { limit = 10, offset = 0, filters } = options;
let queryBuilder = this.client.graphql
.get()
.withClassName(this.className)
.withFields('title content category author tags createdAt _additional { id certainty distance }')
.withNearText({ concepts: [query] })
.withLimit(limit)
.withOffset(offset);
// Add filters if provided
if (filters) {
const whereFilter = this.buildWhereFilter(filters);
if (whereFilter) {
queryBuilder = queryBuilder.withWhere(whereFilter);
}
}
const result = await queryBuilder.do();
return result.data.Get[this.className]?.map((item: any) => ({
id: item._additional.id,
score: item._additional.certainty,
metadata: {
title: item.title,
category: item.category,
author: item.author,
tags: item.tags,
createdAt: item.createdAt
},
content: item.content
})) || [];
}
// Hybrid search (combines BM25 + vector search)
async hybridSearch(
query: string,
options: {
limit?: number;
alpha?: number; // 0 = pure BM25, 1 = pure vector
} = {}
): Promise<SearchResult[]> {
const { limit = 10, alpha = 0.5 } = options;
const result = await this.client.graphql
.get()
.withClassName(this.className)
.withFields('title content category author _additional { id score }')
.withHybrid({
query,
alpha // Balance between keyword and semantic
})
.withLimit(limit)
.do();
return result.data.Get[this.className]?.map((item: any) => ({
id: item._additional.id,
score: item._additional.score,
metadata: {
title: item.title,
category: item.category,
author: item.author
},
content: item.content
})) || [];
}
// Generative search (RAG - search + generate answer)
async generateAnswer(
query: string,
options: { limit?: number } = {}
): Promise<{ answer: string; sources: SearchResult[] }> {
const { limit = 5 } = options;
const result = await this.client.graphql
.get()
.withClassName(this.className)
.withFields('title content _additional { id certainty }')
.withNearText({ concepts: [query] })
.withLimit(limit)
.withGenerate({
groupedTask: `Based on the following documents, answer this question: "${query}".
Provide a comprehensive answer and cite the relevant sources.`
})
.do();
const data = result.data.Get[this.className];
return {
answer: data?.[0]?._additional?.generate?.groupedResult || 'No answer generated',
sources: data?.map((item: any) => ({
id: item._additional.id,
score: item._additional.certainty,
metadata: { title: item.title },
content: item.content
})) || []
};
}
private buildWhereFilter(filters: { category?: string; tags?: string[] }): any {
const operands: any[] = [];
if (filters.category) {
operands.push({
path: ['category'],
operator: 'Equal',
valueText: filters.category
});
}
if (filters.tags?.length) {
operands.push({
path: ['tags'],
operator: 'ContainsAny',
valueTextArray: filters.tags
});
}
if (operands.length === 0) return null;
if (operands.length === 1) return operands[0];
return {
operator: 'And',
operands
};
}
}
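Weaviate computes the hybrid blend server-side, but the role of `alpha` is easy to see in isolation. As a conceptual sketch (not Weaviate's exact fusion algorithm, which also normalizes raw scores first), the fused score is roughly a weighted average of the two signals; `fuseScores` below is an illustrative helper, not part of the client API.

```typescript
// Conceptual hybrid fusion: alpha = 0 is pure keyword (BM25), alpha = 1 is
// pure vector. Both input scores are assumed already normalized to [0, 1].
const fuseScores = (keywordScore: number, vectorScore: number, alpha: number): number =>
  alpha * vectorScore + (1 - alpha) * keywordScore;

// A document that ranks well for keywords (0.9) but poorly semantically (0.2):
const keywordHeavy = fuseScores(0.9, 0.2, 0.25); // keyword-leaning blend: 0.725
const vectorHeavy = fuseScores(0.9, 0.2, 0.75);  // vector-leaning blend: 0.375
```

In practice, alpha around 0.5-0.75 is a common starting point: exact product names and codes still match via BM25, while conceptual queries lean on the vector side.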
Elasticsearch with Vector Search
Elasticsearch 8.x now supports native vector search alongside its powerful full-text capabilities, making it ideal for hybrid search implementations.
// elasticsearch-hybrid-search.ts
import { Client } from '@elastic/elasticsearch';
interface ElasticsearchConfig {
node: string;
auth?: {
username: string;
password: string;
};
cloud?: {
id: string;
};
}
class ElasticsearchHybridSearch {
private client: Client;
private indexName: string;
private embeddingService: OpenAIEmbeddingService;
constructor(
config: ElasticsearchConfig,
indexName: string,
embeddingService: OpenAIEmbeddingService
) {
this.client = new Client(config);
this.indexName = indexName;
this.embeddingService = embeddingService;
}
// Create index with vector field
async createIndex(): Promise<void> {
const indexExists = await this.client.indices.exists({ index: this.indexName });
if (indexExists) {
console.log(`Index ${this.indexName} already exists`);
return;
}
await this.client.indices.create({
index: this.indexName,
body: {
settings: {
number_of_shards: 1,
number_of_replicas: 1,
'index.knn': true
},
mappings: {
properties: {
title: {
type: 'text',
analyzer: 'english',
fields: {
keyword: { type: 'keyword' }
}
},
content: {
type: 'text',
analyzer: 'english'
},
category: { type: 'keyword' },
author: { type: 'keyword' },
tags: { type: 'keyword' },
createdAt: { type: 'date' },
embedding: {
type: 'dense_vector',
dims: 1536,
index: true,
similarity: 'cosine'
}
}
}
}
});
console.log(`Created index ${this.indexName}`);
}
// Index documents with embeddings
async indexDocuments(documents: Document[]): Promise<void> {
const embeddings = await this.embeddingService.embedBatch(
documents.map(doc => doc.content)
);
const operations = documents.flatMap((doc, i) => [
{ index: { _index: this.indexName, _id: doc.id } },
{
title: doc.metadata.title,
content: doc.content,
category: doc.metadata.category,
author: doc.metadata.author,
tags: doc.metadata.tags,
createdAt: doc.metadata.createdAt,
embedding: embeddings[i].embedding
}
]);
const result = await this.client.bulk({ operations, refresh: true });
if (result.errors) {
console.error('Bulk indexing errors:', result.items.filter(item => item.index?.error));
} else {
console.log(`Indexed ${documents.length} documents`);
}
}
// Pure keyword search (BM25)
async keywordSearch(query: string, limit: number = 10): Promise<SearchResult[]> {
const result = await this.client.search({
index: this.indexName,
body: {
query: {
multi_match: {
query,
fields: ['title^2', 'content', 'tags'],
type: 'best_fields',
fuzziness: 'AUTO'
}
},
size: limit
}
});
return this.formatResults(result);
}
// Pure vector search (semantic)
async vectorSearch(query: string, limit: number = 10): Promise<SearchResult[]> {
const queryEmbedding = await this.embeddingService.embed(query);
const result = await this.client.search({
index: this.indexName,
body: {
knn: {
field: 'embedding',
query_vector: queryEmbedding,
k: limit,
num_candidates: limit * 10
},
_source: {
excludes: ['embedding'] // Don't return the embedding
}
}
});
return this.formatResults(result);
}
// Hybrid search: combines BM25 and vector search
async hybridSearch(
query: string,
options: {
limit?: number;
keywordWeight?: number;
vectorWeight?: number;
filters?: {
category?: string;
tags?: string[];
dateRange?: { start: string; end: string };
};
} = {}
): Promise<SearchResult[]> {
const {
limit = 10,
keywordWeight = 0.3,
vectorWeight = 0.7,
filters
} = options;
const queryEmbedding = await this.embeddingService.embed(query);
// Build filter query
const filterClauses: any[] = [];
if (filters?.category) {
filterClauses.push({ term: { category: filters.category } });
}
if (filters?.tags?.length) {
filterClauses.push({ terms: { tags: filters.tags } });
}
if (filters?.dateRange) {
filterClauses.push({
range: {
createdAt: {
gte: filters.dateRange.start,
lte: filters.dateRange.end
}
}
});
}
const result = await this.client.search({
index: this.indexName,
body: {
query: {
bool: {
must: [
{
// Hybrid scoring using script_score
script_score: {
query: {
bool: {
should: [
{
multi_match: {
query,
fields: ['title^2', 'content'],
type: 'best_fields',
boost: keywordWeight
}
}
],
filter: filterClauses.length > 0 ? filterClauses : undefined
}
},
script: {
source: `
double vectorScore = cosineSimilarity(params.queryVector, 'embedding') + 1.0;
double keywordScore = _score;
return (params.vectorWeight * vectorScore) + (params.keywordWeight * keywordScore);
`,
params: {
queryVector: queryEmbedding,
vectorWeight,
keywordWeight
}
}
}
}
]
}
},
size: limit,
_source: {
excludes: ['embedding']
}
}
});
return this.formatResults(result);
}
// Reciprocal Rank Fusion (RRF) for hybrid search
async rrfHybridSearch(
query: string,
limit: number = 10,
k: number = 60 // RRF constant
): Promise<SearchResult[]> {
// Get results from both search methods
const [keywordResults, vectorResults] = await Promise.all([
this.keywordSearch(query, limit * 2),
this.vectorSearch(query, limit * 2)
]);
// Calculate RRF scores
const rrfScores = new Map<string, { score: number; data: SearchResult }>();
keywordResults.forEach((result, rank) => {
const rrfScore = 1 / (k + rank + 1);
rrfScores.set(result.id, {
score: rrfScore,
data: result
});
});
vectorResults.forEach((result, rank) => {
const rrfScore = 1 / (k + rank + 1);
const existing = rrfScores.get(result.id);
if (existing) {
existing.score += rrfScore;
} else {
rrfScores.set(result.id, {
score: rrfScore,
data: result
});
}
});
// Sort by combined RRF score
const combinedResults = Array.from(rrfScores.values())
.sort((a, b) => b.score - a.score)
.slice(0, limit)
.map(item => ({
...item.data,
score: item.score
}));
return combinedResults;
}
// Autocomplete with semantic boosting
async autocomplete(
prefix: string,
limit: number = 5
): Promise<{ suggestion: string; score: number }[]> {
const result = await this.client.search({
index: this.indexName,
body: {
query: {
bool: {
should: [
{
prefix: {
'title.keyword': {
value: prefix.toLowerCase(),
boost: 2
}
}
},
{
match_phrase_prefix: {
title: {
query: prefix,
boost: 1.5
}
}
}
]
}
},
size: limit,
_source: ['title']
}
});
return result.hits.hits.map((hit: any) => ({
suggestion: hit._source.title,
score: hit._score
}));
}
private formatResults(result: any): SearchResult[] {
return result.hits.hits.map((hit: any) => ({
id: hit._id,
score: hit._score,
metadata: {
title: hit._source.title,
category: hit._source.category,
author: hit._source.author,
tags: hit._source.tags,
createdAt: hit._source.createdAt
},
content: hit._source.content
}));
}
}
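The RRF formula used in `rrfHybridSearch` is worth seeing with concrete numbers: each list contributes 1 / (k + rank + 1) per document, so a document that ranks moderately in both lists can beat one that tops only a single list. The `rrfFuse` helper below is a distilled, illustrative version of that logic, operating on bare id lists.

```typescript
// Reciprocal Rank Fusion over ranked id lists (k = 60, matching the code above).
// rank is 0-based, so the top result of a list contributes 1 / (k + 1).
const rrfFuse = (lists: string[][], k: number = 60): Map<string, number> => {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return scores;
};

const keywordRanked = ['a', 'b', 'c']; // BM25 ranking
const vectorRanked = ['b', 'd', 'a'];  // semantic ranking
const fused = rrfFuse([keywordRanked, vectorRanked]);
// 'b' (rank 2 + rank 1) edges out 'a' (rank 1 + rank 3):
// score(a) = 1/61 + 1/63 ≈ 0.03227, score(b) = 1/62 + 1/61 ≈ 0.03252
```

Because RRF only uses ranks, it sidesteps the thorny problem of normalizing BM25 scores against cosine similarities, which is why it is often preferred over weighted score blending.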
Query Understanding and Expansion
Advanced search systems understand user intent and expand queries to improve recall. This involves synonym expansion, typo correction, and intent classification.
// query-understanding.ts
import OpenAI from 'openai';
interface QueryAnalysis {
originalQuery: string;
intent: 'informational' | 'navigational' | 'transactional' | 'ambiguous';
entities: Entity[];
expandedQueries: string[];
suggestedFilters: Record<string, string>;
correctedQuery?: string;
}
interface Entity {
text: string;
type: 'product' | 'category' | 'brand' | 'attribute' | 'location';
confidence: number;
}
class QueryUnderstandingService {
private openai: OpenAI;
private synonymMap: Map<string, string[]>;
private commonTypos: Map<string, string>;
constructor(apiKey: string) {
this.openai = new OpenAI({ apiKey });
this.synonymMap = this.loadSynonyms();
this.commonTypos = this.loadTypoCorrections();
}
async analyzeQuery(query: string): Promise<QueryAnalysis> {
const [
intentResult,
entities,
corrections
] = await Promise.all([
this.classifyIntent(query),
this.extractEntities(query),
this.correctTypos(query)
]);
const expandedQueries = this.expandQuery(corrections.corrected || query);
const suggestedFilters = this.inferFilters(entities);
return {
originalQuery: query,
intent: intentResult,
entities,
expandedQueries,
suggestedFilters,
correctedQuery: corrections.corrected !== query ? corrections.corrected : undefined
};
}
private async classifyIntent(query: string): Promise<QueryAnalysis['intent']> {
const response = await this.openai.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{
role: 'system',
content: `Classify the search query intent:
- informational: seeking knowledge or answers
- navigational: looking for a specific page/product
- transactional: intent to buy or take action
- ambiguous: unclear intent
Respond with only the intent type.`
},
{ role: 'user', content: query }
],
max_tokens: 20
});
const intent = response.choices[0].message.content?.trim().toLowerCase();
return ['informational', 'navigational', 'transactional'].includes(intent!)
? intent as QueryAnalysis['intent']
: 'ambiguous';
}
private async extractEntities(query: string): Promise<Entity[]> {
const response = await this.openai.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{
role: 'system',
content: `Extract entities from the search query. Return JSON array:
[{"text": "entity", "type": "product|category|brand|attribute|location", "confidence": 0.9}]
Only return the JSON array, no explanation.`
},
{ role: 'user', content: query }
],
max_tokens: 200
});
try {
return JSON.parse(response.choices[0].message.content || '[]');
} catch {
return [];
}
}
private correctTypos(query: string): { corrected: string; changes: string[] } {
const words = query.split(/\s+/);
const changes: string[] = [];
const correctedWords = words.map(word => {
const lowerWord = word.toLowerCase();
if (this.commonTypos.has(lowerWord)) {
const correction = this.commonTypos.get(lowerWord)!;
changes.push(`${word} -> ${correction}`);
return correction;
}
// Levenshtein distance check for near matches
for (const [typo, correction] of this.commonTypos.entries()) {
if (this.levenshteinDistance(lowerWord, typo) <= 1) {
changes.push(`${word} -> ${correction}`);
return correction;
}
}
return word;
});
return {
corrected: correctedWords.join(' '),
changes
};
}
private expandQuery(query: string): string[] {
const words = query.toLowerCase().split(/\s+/);
const expansions: Set<string> = new Set([query]);
// Add synonym expansions
words.forEach(word => {
const synonyms = this.synonymMap.get(word);
if (synonyms) {
synonyms.forEach(synonym => {
const expanded = query.replace(new RegExp(`\\b${word}\\b`, 'gi'), synonym);
expansions.add(expanded);
});
}
});
// Add common query variations
if (query.includes(' vs ') || query.includes(' versus ')) {
expansions.add(query.replace(/\s+(vs|versus)\s+/i, ' compared to '));
expansions.add(query.replace(/\s+(vs|versus)\s+/i, ' or '));
}
return Array.from(expansions);
}
private inferFilters(entities: Entity[]): Record<string, string> {
const filters: Record<string, string> = {};
entities.forEach(entity => {
switch (entity.type) {
case 'category':
filters.category = entity.text;
break;
case 'brand':
filters.brand = entity.text;
break;
case 'location':
filters.location = entity.text;
break;
}
});
return filters;
}
private loadSynonyms(): Map<string, string[]> {
return new Map([
['laptop', ['notebook', 'computer', 'portable pc']],
['cheap', ['affordable', 'budget', 'inexpensive', 'low-cost']],
['fast', ['quick', 'speedy', 'rapid', 'high-performance']],
['best', ['top', 'leading', 'premier', 'highest-rated']],
['buy', ['purchase', 'order', 'get', 'acquire']],
['fix', ['repair', 'solve', 'resolve', 'troubleshoot']]
]);
}
private loadTypoCorrections(): Map<string, string> {
return new Map([
['recieve', 'receive'],
['definately', 'definitely'],
['seperate', 'separate'],
['occured', 'occurred'],
['untill', 'until'],
['javascript', 'JavaScript'],
['pythn', 'Python'],
['recat', 'React']
]);
}
private levenshteinDistance(a: string, b: string): number {
const matrix: number[][] = [];
for (let i = 0; i <= b.length; i++) {
matrix[i] = [i];
}
for (let j = 0; j <= a.length; j++) {
matrix[0][j] = j;
}
for (let i = 1; i <= b.length; i++) {
for (let j = 1; j <= a.length; j++) {
if (b.charAt(i - 1) === a.charAt(j - 1)) {
matrix[i][j] = matrix[i - 1][j - 1];
} else {
matrix[i][j] = Math.min(
matrix[i - 1][j - 1] + 1,
matrix[i][j - 1] + 1,
matrix[i - 1][j] + 1
);
}
}
}
return matrix[b.length][a.length];
}
}
// Enhanced search with query understanding
class EnhancedSearchService {
private searchService: ElasticsearchHybridSearch;
private queryService: QueryUnderstandingService;
constructor(
searchService: ElasticsearchHybridSearch,
queryService: QueryUnderstandingService
) {
this.searchService = searchService;
this.queryService = queryService;
}
async search(query: string, options: { limit?: number } = {}): Promise<{
results: SearchResult[];
queryAnalysis: QueryAnalysis;
didYouMean?: string;
}> {
const queryAnalysis = await this.queryService.analyzeQuery(query);
const searchQuery = queryAnalysis.correctedQuery || query;
// Search with expanded queries and combine results
const allResults = await Promise.all(
queryAnalysis.expandedQueries.slice(0, 3).map(q =>
this.searchService.hybridSearch(q, {
limit: options.limit || 10,
filters: queryAnalysis.suggestedFilters
})
)
);
// Deduplicate and re-rank
const resultMap = new Map<string, SearchResult>();
allResults.flat().forEach(result => {
const existing = resultMap.get(result.id);
if (!existing || result.score > existing.score) {
resultMap.set(result.id, result);
}
});
const results = Array.from(resultMap.values())
.sort((a, b) => b.score - a.score)
.slice(0, options.limit || 10);
return {
results,
queryAnalysis,
didYouMean: queryAnalysis.correctedQuery
};
}
}
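The edit-distance check that powers the near-match branch of typo correction is self-contained and easy to verify: each insertion, deletion, or substitution costs exactly 1. Here is the same dynamic-programming algorithm as a standalone function.

```typescript
// Standard dynamic-programming Levenshtein distance, as used by correctTypos
// to treat any word within distance 1 of a known typo as a match.
const levenshtein = (a: string, b: string): number => {
  // m[i][j] = edit distance between b[0..i) and a[0..j)
  const m: number[][] = Array.from({ length: b.length + 1 }, (_, i) => [i]);
  for (let j = 0; j <= a.length; j++) m[0][j] = j;
  for (let i = 1; i <= b.length; i++) {
    for (let j = 1; j <= a.length; j++) {
      m[i][j] = b[i - 1] === a[j - 1]
        ? m[i - 1][j - 1]                                   // characters match: no cost
        : 1 + Math.min(m[i - 1][j - 1],                     // substitution
                       m[i][j - 1],                         // insertion
                       m[i - 1][j]);                        // deletion
    }
  }
  return m[b.length][a.length];
};
```

One caveat worth knowing: plain Levenshtein counts a transposition ("recat" vs "react") as two edits, so with a threshold of 1 such typos are only caught if they appear verbatim in the typo map; Damerau-Levenshtein would treat a swap as a single edit.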
Personalized Search Results
Combining semantic search with user preferences creates highly relevant, personalized experiences. This builds on concepts from our AI-Driven Personalization Engines article.
// personalized-search.ts
interface UserProfile {
userId: string;
preferences: {
categories: string[];
brands: string[];
priceRange?: { min: number; max: number };
};
behavior: {
recentSearches: string[];
viewedItems: string[];
purchasedItems: string[];
};
embedding?: number[]; // User preference embedding
}
class PersonalizedSearchService {
private searchService: ElasticsearchHybridSearch;
private embeddingService: OpenAIEmbeddingService;
private userProfiles: Map<string, UserProfile> = new Map();
constructor(
searchService: ElasticsearchHybridSearch,
embeddingService: OpenAIEmbeddingService
) {
this.searchService = searchService;
this.embeddingService = embeddingService;
}
async updateUserProfile(userId: string, profile: Partial<UserProfile>): Promise<void> {
const existing = this.userProfiles.get(userId) || {
userId,
preferences: { categories: [], brands: [] },
behavior: { recentSearches: [], viewedItems: [], purchasedItems: [] }
};
const updated = {
...existing,
...profile,
preferences: { ...existing.preferences, ...profile.preferences },
behavior: { ...existing.behavior, ...profile.behavior }
};
// Generate user preference embedding
const preferenceText = [
...updated.preferences.categories,
...updated.preferences.brands,
...updated.behavior.recentSearches.slice(0, 5) // searches are stored newest-first
].join(' ');
if (preferenceText) {
updated.embedding = await this.embeddingService.embed(preferenceText);
}
this.userProfiles.set(userId, updated);
}
async personalizedSearch(
userId: string,
query: string,
options: { limit?: number; personalizationWeight?: number } = {}
): Promise<SearchResult[]> {
const { limit = 10, personalizationWeight = 0.3 } = options;
const profile = this.userProfiles.get(userId);
// Get base search results
let results = await this.searchService.hybridSearch(query, { limit: limit * 2 });
if (profile) {
// Apply personalization boosts
results = results.map(result => {
let boost = 0;
// Category preference boost
if (profile.preferences.categories.includes(result.metadata.category)) {
boost += 0.2;
}
// Recently viewed items get a small boost
if (profile.behavior.viewedItems.includes(result.id)) {
boost += 0.1;
}
// Embedding similarity boost (if user has preference embedding)
if (profile.embedding && result.metadata.embedding) {
const similarity = this.cosineSimilarity(
profile.embedding,
result.metadata.embedding
);
boost += similarity * 0.15;
}
return {
...result,
score: result.score * (1 + personalizationWeight * boost)
};
});
// Re-sort by personalized score
results.sort((a, b) => b.score - a.score);
}
return results.slice(0, limit);
}
// Track user behavior for continuous personalization
trackSearch(userId: string, query: string): void {
const profile = this.userProfiles.get(userId);
if (profile) {
profile.behavior.recentSearches = [
query,
...profile.behavior.recentSearches.slice(0, 49)
];
}
}
trackView(userId: string, itemId: string): void {
const profile = this.userProfiles.get(userId);
if (profile) {
profile.behavior.viewedItems = [
itemId,
...profile.behavior.viewedItems.filter(id => id !== itemId).slice(0, 99)
];
}
}
private cosineSimilarity(a: number[], b: number[]): number {
let dotProduct = 0;
let normA = 0;
let normB = 0;
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
const denominator = Math.sqrt(normA) * Math.sqrt(normB);
return denominator === 0 ? 0 : dotProduct / denominator; // guard against zero vectors
}
}
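To make the boost arithmetic concrete: with the default `personalizationWeight` of 0.3, a result matching a preferred category (+0.2) that was also recently viewed (+0.1) has its score multiplied by 1 + 0.3 × 0.3 = 1.09. A minimal standalone sketch of that formula (names are illustrative, not part of the service above):

```typescript
// personalization-boost.ts
// Standalone illustration of the score adjustment used in
// personalizedSearch: baseScore * (1 + weight * sumOfBoosts).
function applyPersonalizationBoost(
  baseScore: number,
  boosts: number[], // e.g. [0.2 category match, 0.1 recently viewed]
  weight: number = 0.3
): number {
  const totalBoost = boosts.reduce((sum, b) => sum + b, 0);
  return baseScore * (1 + weight * totalBoost);
}
```

Keeping boosts multiplicative on top of the base score (rather than additive) means personalization reorders results of similar relevance without letting a strong preference override a clearly better match.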
Measuring Search Relevance
Quantifying search quality improvements requires proper metrics. Here's how to measure and compare search implementations:
// search-metrics.ts
interface SearchMetrics {
mrr: number; // Mean Reciprocal Rank
ndcg: number; // Normalized Discounted Cumulative Gain
precision: number; // Precision@K
recall: number; // Recall@K
map: number; // Mean Average Precision
}
interface RelevanceJudgment {
queryId: string;
documentId: string;
relevance: number; // 0 = not relevant, 1 = somewhat, 2 = very relevant
}
class SearchMetricsCalculator {
private judgments: Map<string, Map<string, number>> = new Map();
loadJudgments(judgments: RelevanceJudgment[]): void {
judgments.forEach(j => {
if (!this.judgments.has(j.queryId)) {
this.judgments.set(j.queryId, new Map());
}
this.judgments.get(j.queryId)!.set(j.documentId, j.relevance);
});
}
// Mean Reciprocal Rank: position of first relevant result
calculateMRR(queryId: string, results: SearchResult[]): number {
const queryJudgments = this.judgments.get(queryId);
if (!queryJudgments) return 0;
for (let i = 0; i < results.length; i++) {
const relevance = queryJudgments.get(results[i].id) || 0;
if (relevance > 0) {
return 1 / (i + 1);
}
}
return 0;
}
// Normalized Discounted Cumulative Gain
calculateNDCG(queryId: string, results: SearchResult[], k: number = 10): number {
const queryJudgments = this.judgments.get(queryId);
if (!queryJudgments) return 0;
const dcg = this.calculateDCG(queryId, results.slice(0, k));
const idealDCG = this.calculateIdealDCG(queryId, k);
return idealDCG === 0 ? 0 : dcg / idealDCG;
}
private calculateDCG(queryId: string, results: SearchResult[]): number {
const queryJudgments = this.judgments.get(queryId)!;
let dcg = 0;
results.forEach((result, i) => {
const relevance = queryJudgments.get(result.id) || 0;
dcg += (Math.pow(2, relevance) - 1) / Math.log2(i + 2);
});
return dcg;
}
private calculateIdealDCG(queryId: string, k: number): number {
const queryJudgments = this.judgments.get(queryId)!;
const sortedRelevances = Array.from(queryJudgments.values())
.sort((a, b) => b - a)
.slice(0, k);
let idcg = 0;
sortedRelevances.forEach((relevance, i) => {
idcg += (Math.pow(2, relevance) - 1) / Math.log2(i + 2);
});
return idcg;
}
// Precision at K: fraction of retrieved docs that are relevant
calculatePrecision(queryId: string, results: SearchResult[], k: number = 10): number {
const queryJudgments = this.judgments.get(queryId);
if (!queryJudgments) return 0;
const topK = results.slice(0, k);
const relevant = topK.filter(r => (queryJudgments.get(r.id) || 0) > 0);
return relevant.length / k;
}
// Full evaluation across multiple queries
async evaluateSearchSystem(
searchFn: (query: string) => Promise<SearchResult[]>,
queries: { id: string; text: string }[],
k: number = 10
): Promise<SearchMetrics> {
let totalMRR = 0;
let totalNDCG = 0;
let totalPrecision = 0;
let totalMAP = 0;
for (const query of queries) {
const results = await searchFn(query.text);
totalMRR += this.calculateMRR(query.id, results);
totalNDCG += this.calculateNDCG(query.id, results, k);
totalPrecision += this.calculatePrecision(query.id, results, k);
totalMAP += this.calculateAveragePrecision(query.id, results);
}
const n = queries.length;
return {
mrr: totalMRR / n,
ndcg: totalNDCG / n,
precision: totalPrecision / n,
recall: 0, // Requires knowing total relevant docs
map: totalMAP / n
};
}
private calculateAveragePrecision(queryId: string, results: SearchResult[]): number {
const queryJudgments = this.judgments.get(queryId);
if (!queryJudgments) return 0;
let sum = 0;
let relevantCount = 0;
results.forEach((result, i) => {
const relevance = queryJudgments.get(result.id) || 0;
if (relevance > 0) {
relevantCount++;
sum += relevantCount / (i + 1);
}
});
const totalRelevant = Array.from(queryJudgments.values()).filter(r => r > 0).length;
return totalRelevant === 0 ? 0 : sum / totalRelevant;
}
// A/B test comparison
async compareSearchSystems(
systemA: (query: string) => Promise<SearchResult[]>,
systemB: (query: string) => Promise<SearchResult[]>,
queries: { id: string; text: string }[]
): Promise<{
systemA: SearchMetrics;
systemB: SearchMetrics;
improvement: { metric: string; percentage: number }[];
}> {
const metricsA = await this.evaluateSearchSystem(systemA, queries);
const metricsB = await this.evaluateSearchSystem(systemB, queries);
const improvement = [
{ metric: 'MRR', percentage: ((metricsB.mrr - metricsA.mrr) / metricsA.mrr) * 100 },
{ metric: 'NDCG', percentage: ((metricsB.ndcg - metricsA.ndcg) / metricsA.ndcg) * 100 },
{ metric: 'Precision', percentage: ((metricsB.precision - metricsA.precision) / metricsA.precision) * 100 },
{ metric: 'MAP', percentage: ((metricsB.map - metricsA.map) / metricsA.map) * 100 }
];
return { systemA: metricsA, systemB: metricsB, improvement };
}
}
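To sanity-check the NDCG math by hand: for a query with two judged documents (relevance 2 and 1), returning them in the wrong order gives DCG = (2^1 − 1)/log2(2) + (2^2 − 1)/log2(3) ≈ 2.893 against an ideal DCG ≈ 3.631, so NDCG ≈ 0.797. A self-contained version of that calculation, using the same gain and discount formulas as `SearchMetricsCalculator`:

```typescript
// ndcg-example.ts
// Standalone NDCG over a list of relevance grades in ranked order,
// for hand-checking small examples against the calculator above.
function ndcg(rankedRelevances: number[]): number {
  const dcg = (rels: number[]) =>
    rels.reduce((sum, rel, i) => sum + (Math.pow(2, rel) - 1) / Math.log2(i + 2), 0);
  const ideal = dcg([...rankedRelevances].sort((a, b) => b - a));
  return ideal === 0 ? 0 : dcg(rankedRelevances) / ideal;
}
```

`ndcg([2, 1])` is exactly 1 (perfect ordering), while `ndcg([1, 2])` drops to roughly 0.797, which matches the hand calculation above.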
Key Takeaways
Remember These Points
- Semantic search understands meaning: Unlike keyword matching, vector embeddings capture conceptual similarity, finding relevant results even without matching terms
- Hybrid search delivers best results: Combining BM25 keyword search with vector similarity (typically 30/70 or 50/50 weighting) outperforms either approach alone
- Choose the right vector database: Pinecone for serverless simplicity, Weaviate for built-in AI modules, Elasticsearch for full-text + vector hybrid
- Query understanding improves relevance: Intent classification, entity extraction, and query expansion help bridge the gap between user queries and document content
- Personalization adds context: User preferences and behavior signals can boost search relevance by 15-25% for returning users
- Measure improvements rigorously: Use NDCG, MRR, and precision metrics to quantify the 40%+ relevance improvements semantic search delivers
- Consider embedding costs: OpenAI text-embedding-3-small costs $0.02/1M tokens; cache embeddings and batch operations to optimize spend
Conclusion
AI-enhanced search with vector embeddings and semantic understanding represents a fundamental advancement in how users discover content. The 40-60% improvements in search relevance translate directly to better user experiences, higher conversion rates, and reduced bounce rates from failed searches.
Start with a hybrid approach combining Elasticsearch's proven BM25 with vector search capabilities. Add query understanding to handle user intent variations, then layer in personalization for returning users. The investment in modern search infrastructure pays dividends across all metrics that matter.
For deeper exploration, review our related articles on Building Custom AI Assistants with LangChain which covers RAG patterns, and AI-Driven Personalization Engines for recommendation system fundamentals. The Pinecone Learning Center and Weaviate Academy offer excellent deep-dives into vector database concepts.