Traditional keyword-based search is fundamentally limited: it cannot understand that "affordable laptop" and "budget computer" mean the same thing. Users searching for "how to fix a slow website" won't find your excellent article titled "Web Performance Optimization Techniques." This semantic gap has a real cost: relevant content goes undiscovered and users abandon the search. According to Algolia research, semantic search implementations show 40-60% improvements in search relevance over traditional keyword matching.
In this comprehensive guide, we'll explore how vector embeddings and semantic search transform user experiences. You'll learn to implement production-ready search systems using Elasticsearch, Pinecone, and Weaviate, create embedding pipelines with OpenAI, and build hybrid search systems that combine the best of keyword and semantic approaches.
Understanding Semantic Search
Semantic search represents a paradigm shift from matching strings to understanding meaning. While keyword search asks "which documents contain these exact words?", semantic search asks "which documents are about the same concepts as this query?"
The Evolution of Search
Search technology has evolved through distinct generations:
// search-evolution.ts
// Generation 1: Exact Match (1990s)
// "laptop" only finds documents with "laptop"
const exactMatch = (query: string, documents: string[]) => {
return documents.filter(doc =>
doc.toLowerCase().includes(query.toLowerCase())
);
};
// Generation 2: TF-IDF / BM25 (2000s)
// Ranks by term frequency and inverse document frequency
interface BM25Result {
document: string;
score: number;
}
const bm25Search = (query: string, documents: string[]): BM25Result[] => {
const queryTerms = query.toLowerCase().split(' ');
const k1 = 1.2;
const b = 0.75;
const avgDocLength = documents.reduce((sum, d) => sum + d.length, 0) / documents.length;
return documents.map(doc => {
const docTerms = doc.toLowerCase().split(' ');
let score = 0;
queryTerms.forEach(term => {
const tf = docTerms.filter(t => t === term).length;
const idf = Math.log((documents.length + 1) /
(documents.filter(d => d.toLowerCase().includes(term)).length + 1));
const docLength = docTerms.length;
score += idf * ((tf * (k1 + 1)) /
(tf + k1 * (1 - b + b * (docLength / avgDocLength))));
});
return { document: doc, score };
}).sort((a, b) => b.score - a.score);
};
// Generation 3: Semantic Search (2020s)
// Understands meaning through vector embeddings
interface SemanticResult {
document: string;
similarity: number;
embedding: number[];
}
const semanticSearch = async (
query: string,
documentEmbeddings: Map<string, number[]>,
embeddingModel: EmbeddingModel
): Promise<SemanticResult[]> => {
const queryEmbedding = await embeddingModel.embed(query);
const results: SemanticResult[] = [];
documentEmbeddings.forEach((embedding, document) => {
const similarity = cosineSimilarity(queryEmbedding, embedding);
results.push({ document, similarity, embedding });
});
return results.sort((a, b) => b.similarity - a.similarity);
};
// Cosine similarity: measures angle between vectors
const cosineSimilarity = (a: number[], b: number[]): number => {
let dotProduct = 0;
let normA = 0;
let normB = 0;
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
return dotProduct / (Math.sqrt(normA) * Math.sqrt(normB));
};
Vector Embeddings: The Foundation
Vector embeddings are the mathematical representation that makes semantic search possible. Modern embedding models like OpenAI's text-embedding-3 convert text into high-dimensional vectors where semantic similarity translates to spatial proximity.
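A toy illustration makes "spatial proximity" concrete. The three-dimensional vectors below are made up for demonstration (real models emit hundreds or thousands of dimensions), but the behavior is the same: related phrases score near 1.0 under cosine similarity, unrelated ones score far lower.

```typescript
// Hypothetical 3-d "embeddings" — real models like text-embedding-3-small use 1536 dims.
const toyEmbeddings: Record<string, number[]> = {
  'affordable laptop': [0.9, 0.1, 0.0],
  'budget computer':   [0.85, 0.15, 0.05],
  'chocolate cake':    [0.0, 0.2, 0.95]
};

// Cosine similarity: dot product divided by the product of vector norms.
const cosine = (a: number[], b: number[]): number => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

const similar = cosine(toyEmbeddings['affordable laptop'], toyEmbeddings['budget computer']);
const unrelated = cosine(toyEmbeddings['affordable laptop'], toyEmbeddings['chocolate cake']);
// similar lands near 1.0; unrelated lands near 0
```

This is the entire intuition behind semantic retrieval: ranking by cosine similarity to the query vector surfaces documents about the same concepts, regardless of word overlap.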
Creating Embeddings with OpenAI
// embedding-service.ts
import OpenAI from 'openai';
interface EmbeddingConfig {
model: 'text-embedding-3-small' | 'text-embedding-3-large' | 'text-embedding-ada-002';
dimensions?: number; // Only for text-embedding-3 models
batchSize: number;
}
interface EmbeddingResult {
text: string;
embedding: number[];
tokens: number;
}
class OpenAIEmbeddingService {
private client: OpenAI;
private config: EmbeddingConfig;
private cache: Map<string, number[]> = new Map();
constructor(apiKey: string, config: Partial<EmbeddingConfig> = {}) {
this.client = new OpenAI({ apiKey });
this.config = {
model: config.model || 'text-embedding-3-small',
dimensions: config.dimensions || 1536,
batchSize: config.batchSize || 100
};
}
// Generate embedding for single text
async embed(text: string): Promise<number[]> {
// Check cache first
const cacheKey = this.getCacheKey(text);
if (this.cache.has(cacheKey)) {
return this.cache.get(cacheKey)!;
}
const response = await this.client.embeddings.create({
model: this.config.model,
input: text,
dimensions: this.config.dimensions
});
const embedding = response.data[0].embedding;
this.cache.set(cacheKey, embedding);
return embedding;
}
// Batch embed multiple texts
async embedBatch(texts: string[]): Promise<EmbeddingResult[]> {
const results: EmbeddingResult[] = [];
const uncachedTexts: { text: string; index: number }[] = [];
// Check cache and identify uncached texts
texts.forEach((text, index) => {
const cacheKey = this.getCacheKey(text);
if (this.cache.has(cacheKey)) {
results[index] = {
text,
embedding: this.cache.get(cacheKey)!,
tokens: 0
};
} else {
uncachedTexts.push({ text, index });
}
});
// Process uncached texts in batches
for (let i = 0; i < uncachedTexts.length; i += this.config.batchSize) {
const batch = uncachedTexts.slice(i, i + this.config.batchSize);
const batchTexts = batch.map(item => item.text);
const response = await this.client.embeddings.create({
model: this.config.model,
input: batchTexts,
dimensions: this.config.dimensions
});
response.data.forEach((data, j) => {
const originalIndex = batch[j].index;
const text = batch[j].text;
results[originalIndex] = {
text,
embedding: data.embedding,
tokens: response.usage?.total_tokens || 0
};
this.cache.set(this.getCacheKey(text), data.embedding);
});
// Rate limiting: wait between batches
if (i + this.config.batchSize < uncachedTexts.length) {
await this.sleep(100);
}
}
return results;
}
// Chunk text for embedding (handles long documents)
chunkText(text: string, maxTokens: number = 8000): string[] {
const chunks: string[] = [];
const sentences = text.split(/[.!?]+\s+/);
let currentChunk = '';
for (const sentence of sentences) {
// Rough token estimate: 1 token ~= 4 characters
const estimatedTokens = (currentChunk + sentence).length / 4;
if (estimatedTokens > maxTokens && currentChunk) {
chunks.push(currentChunk.trim());
currentChunk = sentence;
} else {
currentChunk += (currentChunk ? '. ' : '') + sentence;
}
}
if (currentChunk) {
chunks.push(currentChunk.trim());
}
return chunks;
}
// Embed long document with chunking and averaging
async embedDocument(document: string): Promise<number[]> {
const chunks = this.chunkText(document);
const embeddings = await this.embedBatch(chunks);
// Average all chunk embeddings
const dimension = embeddings[0].embedding.length;
const averaged = new Array(dimension).fill(0);
embeddings.forEach(result => {
result.embedding.forEach((val, i) => {
averaged[i] += val / embeddings.length;
});
});
// Normalize the averaged vector
const norm = Math.sqrt(averaged.reduce((sum, val) => sum + val * val, 0));
return averaged.map(val => val / norm);
}
private getCacheKey(text: string): string {
return `${this.config.model}:${this.config.dimensions}:${text.substring(0, 100)}`;
}
private sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
}
// Usage example
const embeddingService = new OpenAIEmbeddingService(process.env.OPENAI_API_KEY!, {
model: 'text-embedding-3-small',
dimensions: 1536
});
// Embed a query
const queryEmbedding = await embeddingService.embed("best practices for React performance");
// Batch embed documents
const documents = [
"React optimization techniques including memoization and lazy loading",
"Vue.js component lifecycle and performance tips",
"Angular change detection strategies for faster apps"
];
const docEmbeddings = await embeddingService.embedBatch(documents);
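The chunking heuristic in `chunkText` is worth sanity-checking in isolation. This standalone version of the same logic (sentence splitting plus the rough 4-characters-per-token estimate) runs without any API key; the input text here is synthetic.

```typescript
// Standalone copy of the chunking heuristic: split on sentence boundaries,
// estimate tokens as length / 4, start a new chunk when the budget would overflow.
const chunkText = (text: string, maxTokens: number = 8000): string[] => {
  const chunks: string[] = [];
  const sentences = text.split(/[.!?]+\s+/);
  let current = '';
  for (const sentence of sentences) {
    const estimatedTokens = (current + sentence).length / 4;
    if (estimatedTokens > maxTokens && current) {
      chunks.push(current.trim());
      current = sentence;
    } else {
      current += (current ? '. ' : '') + sentence;
    }
  }
  if (current) chunks.push(current.trim());
  return chunks;
};

// 50 short sentences (~525 estimated tokens) against a 100-token budget
const longText = Array(50).fill('This sentence pads out the document body').join('. ') + '.';
const chunks = chunkText(longText, 100);
// every chunk stays within the 100-token estimate
```

Note the estimate is deliberately rough; for production pipelines a real tokenizer (e.g. tiktoken) gives exact counts, at the cost of an extra dependency.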
Vector Databases: Pinecone, Weaviate, and Qdrant
Vector databases are purpose-built for storing and querying embeddings at scale. They use specialized indexing algorithms like HNSW (Hierarchical Navigable Small World) to enable sub-second similarity searches across millions of vectors.
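For context, here is the exact brute-force search that indexes like HNSW approximate. It is O(n·d) per query — a full linear scan — which is precisely why approximate nearest-neighbor indexes matter at scale. The vectors and the `bruteForceTopK` helper below are illustrative, not part of any database API.

```typescript
// Exact (brute-force) top-k nearest neighbors by cosine similarity.
// ANN indexes such as HNSW avoid this linear scan, trading a little
// recall for sub-linear query time.
interface Neighbor { id: string; similarity: number; }

const cosine = (a: number[], b: number[]): number => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

const bruteForceTopK = (
  query: number[],
  vectors: Map<string, number[]>,
  k: number
): Neighbor[] =>
  [...vectors.entries()]
    .map(([id, v]) => ({ id, similarity: cosine(query, v) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k);

const vectors = new Map<string, number[]>([
  ['doc-a', [1, 0, 0]],
  ['doc-b', [0.9, 0.1, 0]],
  ['doc-c', [0, 0, 1]]
]);
const top2 = bruteForceTopK([1, 0, 0], vectors, 2);
// returns doc-a first, doc-b second
```

Brute force is actually fine up to tens of thousands of vectors; the databases below earn their keep when collections reach millions.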
Pinecone Implementation
// pinecone-search.ts
import { Pinecone } from '@pinecone-database/pinecone';
interface Document {
id: string;
content: string;
metadata: {
title: string;
category: string;
author: string;
createdAt: string;
tags: string[];
};
}
interface SearchResult {
id: string;
score: number;
metadata: Document['metadata'];
content?: string;
}
class PineconeSearchService {
private client: Pinecone;
private indexName: string;
private embeddingService: OpenAIEmbeddingService;
private namespace: string;
constructor(
apiKey: string,
indexName: string,
embeddingService: OpenAIEmbeddingService,
namespace: string = 'default'
) {
this.client = new Pinecone({ apiKey });
this.indexName = indexName;
this.embeddingService = embeddingService;
this.namespace = namespace;
}
// Initialize index (run once)
async createIndex(dimension: number = 1536): Promise<void> {
const existingIndexes = await this.client.listIndexes();
if (!existingIndexes.indexes?.find(idx => idx.name === this.indexName)) {
await this.client.createIndex({
name: this.indexName,
dimension,
metric: 'cosine',
spec: {
serverless: {
cloud: 'aws',
region: 'us-east-1'
}
}
});
// Wait for index to be ready
await this.waitForIndexReady();
}
}
private async waitForIndexReady(): Promise<void> {
let ready = false;
while (!ready) {
const description = await this.client.describeIndex(this.indexName);
ready = description.status?.ready || false;
if (!ready) {
await new Promise(resolve => setTimeout(resolve, 1000));
}
}
}
// Index documents
async indexDocuments(documents: Document[]): Promise<void> {
const index = this.client.index(this.indexName);
// Generate embeddings for all documents
const embeddings = await this.embeddingService.embedBatch(
documents.map(doc => doc.content)
);
// Prepare vectors for upsert
const vectors = documents.map((doc, i) => ({
id: doc.id,
values: embeddings[i].embedding,
metadata: {
...doc.metadata,
content: doc.content.substring(0, 1000) // Store truncated content
}
}));
// Upsert in batches of 100
const batchSize = 100;
for (let i = 0; i < vectors.length; i += batchSize) {
const batch = vectors.slice(i, i + batchSize);
await index.namespace(this.namespace).upsert(batch);
}
console.log(`Indexed ${documents.length} documents`);
}
// Semantic search
async search(
query: string,
options: {
topK?: number;
filter?: Record<string, any>;
includeMetadata?: boolean;
} = {}
): Promise<SearchResult[]> {
const { topK = 10, filter, includeMetadata = true } = options;
const index = this.client.index(this.indexName);
const queryEmbedding = await this.embeddingService.embed(query);
const results = await index.namespace(this.namespace).query({
vector: queryEmbedding,
topK,
filter,
includeMetadata
});
return results.matches?.map(match => ({
id: match.id,
score: match.score || 0,
metadata: match.metadata as Document['metadata'],
content: (match.metadata as any)?.content
})) || [];
}
// Search with metadata filtering
async searchWithFilters(
query: string,
filters: {
category?: string;
tags?: string[];
dateRange?: { start: string; end: string };
}
): Promise<SearchResult[]> {
const filter: Record<string, any> = {};
if (filters.category) {
filter.category = { $eq: filters.category };
}
if (filters.tags?.length) {
filter.tags = { $in: filters.tags };
}
if (filters.dateRange) {
filter.createdAt = {
$gte: filters.dateRange.start,
$lte: filters.dateRange.end
};
}
return this.search(query, { filter, topK: 20 });
}
// Delete documents
async deleteDocuments(ids: string[]): Promise<void> {
const index = this.client.index(this.indexName);
await index.namespace(this.namespace).deleteMany(ids);
}
// Update document
async updateDocument(document: Document): Promise<void> {
await this.deleteDocuments([document.id]);
await this.indexDocuments([document]);
}
}
// Usage
const searchService = new PineconeSearchService(
process.env.PINECONE_API_KEY!,
'my-search-index',
embeddingService
);
// Search with filters
const results = await searchService.searchWithFilters(
"React performance optimization",
{
category: "frontend",
tags: ["react", "performance"]
}
);
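The filter translation inside `searchWithFilters` is pure object construction, so it can be verified without a live index. This standalone sketch (the `buildPineconeFilter` name is mine) mirrors the mapping onto Pinecone's metadata-filter operators `$eq`, `$in`, `$gte`, and `$lte`.

```typescript
// Map high-level filter options onto Pinecone's metadata-filter operator syntax.
interface FilterInput {
  category?: string;
  tags?: string[];
  dateRange?: { start: string; end: string };
}

const buildPineconeFilter = (filters: FilterInput): Record<string, any> => {
  const filter: Record<string, any> = {};
  if (filters.category) filter.category = { $eq: filters.category };       // exact match
  if (filters.tags?.length) filter.tags = { $in: filters.tags };           // any-of match
  if (filters.dateRange) {
    filter.createdAt = { $gte: filters.dateRange.start, $lte: filters.dateRange.end };
  }
  return filter;
};

const filter = buildPineconeFilter({ category: 'frontend', tags: ['react'] });
// { category: { $eq: 'frontend' }, tags: { $in: ['react'] } }
```

Keeping filter construction in a pure function like this also makes it trivial to unit-test, independent of the Pinecone client.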
Weaviate Implementation
// weaviate-search.ts
import weaviate, { WeaviateClient, ApiKey } from 'weaviate-ts-client';
interface WeaviateConfig {
host: string;
apiKey?: string;
openAIKey: string;
}
class WeaviateSearchService {
private client: WeaviateClient;
private className: string;
constructor(config: WeaviateConfig, className: string = 'Document') {
this.client = weaviate.client({
scheme: 'https',
host: config.host,
apiKey: config.apiKey ? new ApiKey(config.apiKey) : undefined,
headers: {
'X-OpenAI-Api-Key': config.openAIKey
}
});
this.className = className;
}
// Create schema with vectorizer
async createSchema(): Promise<void> {
const schema = {
class: this.className,
vectorizer: 'text2vec-openai',
moduleConfig: {
'text2vec-openai': {
model: 'text-embedding-3-small',
dimensions: 1536,
type: 'text'
},
'generative-openai': {
model: 'gpt-4o-mini'
}
},
properties: [
{
name: 'title',
dataType: ['text'],
moduleConfig: {
'text2vec-openai': {
skip: false,
vectorizePropertyName: false
}
}
},
{
name: 'content',
dataType: ['text'],
moduleConfig: {
'text2vec-openai': {
skip: false,
vectorizePropertyName: false
}
}
},
{
name: 'category',
dataType: ['text'],
moduleConfig: {
'text2vec-openai': {
skip: true // Don't vectorize metadata fields
}
}
},
{
name: 'author',
dataType: ['text'],
moduleConfig: {
'text2vec-openai': { skip: true }
}
},
{
name: 'tags',
dataType: ['text[]']
},
{
name: 'createdAt',
dataType: ['date']
}
]
};
try {
await this.client.schema.classCreator().withClass(schema).do();
console.log(`Created class ${this.className}`);
} catch (error: any) {
if (error.message?.includes('already exists')) {
console.log(`Class ${this.className} already exists`);
} else {
throw error;
}
}
}
// Index documents (Weaviate handles vectorization automatically)
async indexDocuments(documents: Document[]): Promise<void> {
let batcher = this.client.batch.objectsBatcher();
let batchSize = 0;
for (const doc of documents) {
batcher = batcher.withObject({
class: this.className,
properties: {
title: doc.metadata.title,
content: doc.content,
category: doc.metadata.category,
author: doc.metadata.author,
tags: doc.metadata.tags,
createdAt: doc.metadata.createdAt
},
id: doc.id
});
batchSize++;
if (batchSize >= 100) {
await batcher.do();
batcher = this.client.batch.objectsBatcher();
batchSize = 0;
}
}
if (batchSize > 0) {
await batcher.do();
}
console.log(`Indexed ${documents.length} documents`);
}
// Semantic search with nearText
async search(
query: string,
options: {
limit?: number;
offset?: number;
filters?: {
category?: string;
tags?: string[];
};
} = {}
): Promise<SearchResult[]> {
const { limit = 10, offset = 0, filters } = options;
let queryBuilder = this.client.graphql
.get()
.withClassName(this.className)
.withFields('title content category author tags createdAt _additional { id certainty distance }')
.withNearText({ concepts: [query] })
.withLimit(limit)
.withOffset(offset);
// Add filters if provided
if (filters) {
const whereFilter = this.buildWhereFilter(filters);
if (whereFilter) {
queryBuilder = queryBuilder.withWhere(whereFilter);
}
}
const result = await queryBuilder.do();
return result.data.Get[this.className]?.map((item: any) => ({
id: item._additional.id,
score: item._additional.certainty,
metadata: {
title: item.title,
category: item.category,
author: item.author,
tags: item.tags,
createdAt: item.createdAt
},
content: item.content
})) || [];
}
// Hybrid search (combines BM25 + vector search)
async hybridSearch(
query: string,
options: {
limit?: number;
alpha?: number; // 0 = pure BM25, 1 = pure vector
} = {}
): Promise<SearchResult[]> {
const { limit = 10, alpha = 0.5 } = options;
const result = await this.client.graphql
.get()
.withClassName(this.className)
.withFields('title content category author _additional { id score }')
.withHybrid({
query,
alpha // Balance between keyword and semantic
})
.withLimit(limit)
.do();
return result.data.Get[this.className]?.map((item: any) => ({
id: item._additional.id,
score: item._additional.score,
metadata: {
title: item.title,
category: item.category,
author: item.author
},
content: item.content
})) || [];
}
// Generative search (RAG - search + generate answer)
async generateAnswer(
query: string,
options: { limit?: number } = {}
): Promise<{ answer: string; sources: SearchResult[] }> {
const { limit = 5 } = options;
const result = await this.client.graphql
.get()
.withClassName(this.className)
.withFields('title content _additional { id certainty }')
.withNearText({ concepts: [query] })
.withLimit(limit)
.withGenerate({
groupedTask: `Based on the following documents, answer this question: "${query}".
Provide a comprehensive answer and cite the relevant sources.`
})
.do();
const data = result.data.Get[this.className];
return {
answer: data?.[0]?._additional?.generate?.groupedResult || 'No answer generated',
sources: data?.map((item: any) => ({
id: item._additional.id,
score: item._additional.certainty,
metadata: { title: item.title },
content: item.content
})) || []
};
}
private buildWhereFilter(filters: { category?: string; tags?: string[] }): any {
const operands: any[] = [];
if (filters.category) {
operands.push({
path: ['category'],
operator: 'Equal',
valueText: filters.category
});
}
if (filters.tags?.length) {
operands.push({
path: ['tags'],
operator: 'ContainsAny',
valueTextArray: filters.tags
});
}
if (operands.length === 0) return null;
if (operands.length === 1) return operands[0];
return {
operator: 'And',
operands
};
}
}
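Weaviate computes the hybrid blend server-side, but the role of `alpha` is easy to see in isolation. As a conceptual sketch (not Weaviate's exact fusion algorithm, which also normalizes raw scores first), the fused score is roughly a weighted average of the two signals; `fuseScores` below is an illustrative helper, not part of the client API.

```typescript
// Conceptual hybrid fusion: alpha = 0 is pure keyword (BM25), alpha = 1 is
// pure vector. Both input scores are assumed already normalized to [0, 1].
const fuseScores = (keywordScore: number, vectorScore: number, alpha: number): number =>
  alpha * vectorScore + (1 - alpha) * keywordScore;

// A document that ranks well for keywords (0.9) but poorly semantically (0.2):
const keywordHeavy = fuseScores(0.9, 0.2, 0.25); // keyword-leaning blend: 0.725
const vectorHeavy = fuseScores(0.9, 0.2, 0.75);  // vector-leaning blend: 0.375
```

In practice, alpha around 0.5-0.75 is a common starting point: exact product names and codes still match via BM25, while conceptual queries lean on the vector side.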
Elasticsearch with Vector Search
Elasticsearch 8.x now supports native vector search alongside its powerful full-text capabilities, making it ideal for hybrid search implementations.
// elasticsearch-hybrid-search.ts
import { Client } from '@elastic/elasticsearch';
interface ElasticsearchConfig {
node: string;
auth?: {
username: string;
password: string;
};
cloud?: {
id: string;
};
}
class ElasticsearchHybridSearch {
private client: Client;
private indexName: string;
private embeddingService: OpenAIEmbeddingService;
constructor(
config: ElasticsearchConfig,
indexName: string,
embeddingService: OpenAIEmbeddingService
) {
this.client = new Client(config);
this.indexName = indexName;
this.embeddingService = embeddingService;
}
// Create index with vector field
async createIndex(): Promise<void> {
const indexExists = await this.client.indices.exists({ index: this.indexName });
if (indexExists) {
console.log(`Index ${this.indexName} already exists`);
return;
}
await this.client.indices.create({
index: this.indexName,
body: {
settings: {
number_of_shards: 1,
number_of_replicas: 1,
'index.knn': true
},
mappings: {
properties: {
title: {
type: 'text',
analyzer: 'english',
fields: {
keyword: { type: 'keyword' }
}
},
content: {
type: 'text',
analyzer: 'english'
},
category: { type: 'keyword' },
author: { type: 'keyword' },
tags: { type: 'keyword' },
createdAt: { type: 'date' },
embedding: {
type: 'dense_vector',
dims: 1536,
index: true,
similarity: 'cosine'
}
}
}
}
});
console.log(`Created index ${this.indexName}`);
}
// Index documents with embeddings
async indexDocuments(documents: Document[]): Promise<void> {
const embeddings = await this.embeddingService.embedBatch(
documents.map(doc => doc.content)
);
const operations = documents.flatMap((doc, i) => [
{ index: { _index: this.indexName, _id: doc.id } },
{
title: doc.metadata.title,
content: doc.content,
category: doc.metadata.category,
author: doc.metadata.author,
tags: doc.metadata.tags,
createdAt: doc.metadata.createdAt,
embedding: embeddings[i].embedding
}
]);
const result = await this.client.bulk({ operations, refresh: true });
if (result.errors) {
console.error('Bulk indexing errors:', result.items.filter(item => item.index?.error));
} else {
console.log(`Indexed ${documents.length} documents`);
}
}
// Pure keyword search (BM25)
async keywordSearch(query: string, limit: number = 10): Promise<SearchResult[]> {
const result = await this.client.search({
index: this.indexName,
body: {
query: {
multi_match: {
query,
fields: ['title^2', 'content', 'tags'],
type: 'best_fields',
fuzziness: 'AUTO'
}
},
size: limit
}
});
return this.formatResults(result);
}
// Pure vector search (semantic)
async vectorSearch(query: string, limit: number = 10): Promise<SearchResult[]> {
const queryEmbedding = await this.embeddingService.embed(query);
const result = await this.client.search({
index: this.indexName,
body: {
knn: {
field: 'embedding',
query_vector: queryEmbedding,
k: limit,
num_candidates: limit * 10
},
_source: {
excludes: ['embedding'] // Don't return the embedding
}
}
});
return this.formatResults(result);
}
// Hybrid search: combines BM25 and vector search
async hybridSearch(
query: string,
options: {
limit?: number;
keywordWeight?: number;
vectorWeight?: number;
filters?: {
category?: string;
tags?: string[];
dateRange?: { start: string; end: string };
};
} = {}
): Promise<SearchResult[]> {
const {
limit = 10,
keywordWeight = 0.3,
vectorWeight = 0.7,
filters
} = options;
const queryEmbedding = await this.embeddingService.embed(query);
// Build filter query
const filterClauses: any[] = [];
if (filters?.category) {
filterClauses.push({ term: { category: filters.category } });
}
if (filters?.tags?.length) {
filterClauses.push({ terms: { tags: filters.tags } });
}
if (filters?.dateRange) {
filterClauses.push({
range: {
createdAt: {
gte: filters.dateRange.start,
lte: filters.dateRange.end
}
}
});
}
const result = await this.client.search({
index: this.indexName,
body: {
query: {
bool: {
must: [
{
// Hybrid scoring using script_score
script_score: {
query: {
bool: {
should: [
{
multi_match: {
query,
fields: ['title^2', 'content'],
type: 'best_fields',
boost: keywordWeight
}
}
],
filter: filterClauses.length > 0 ? filterClauses : undefined
}
},
script: {
source: `
double vectorScore = cosineSimilarity(params.queryVector, 'embedding') + 1.0;
double keywordScore = _score;
return (params.vectorWeight * vectorScore) + (params.keywordWeight * keywordScore);
`,
params: {
queryVector: queryEmbedding,
vectorWeight,
keywordWeight
}
}
}
}
]
}
},
size: limit,
_source: {
excludes: ['embedding']
}
}
});
return this.formatResults(result);
}
// Reciprocal Rank Fusion (RRF) for hybrid search
async rrfHybridSearch(
query: string,
limit: number = 10,
k: number = 60 // RRF constant
): Promise<SearchResult[]> {
// Get results from both search methods
const [keywordResults, vectorResults] = await Promise.all([
this.keywordSearch(query, limit * 2),
this.vectorSearch(query, limit * 2)
]);
// Calculate RRF scores
const rrfScores = new Map<string, { score: number; data: SearchResult }>();
keywordResults.forEach((result, rank) => {
const rrfScore = 1 / (k + rank + 1);
rrfScores.set(result.id, {
score: rrfScore,
data: result
});
});
vectorResults.forEach((result, rank) => {
const rrfScore = 1 / (k + rank + 1);
const existing = rrfScores.get(result.id);
if (existing) {
existing.score += rrfScore;
} else {
rrfScores.set(result.id, {
score: rrfScore,
data: result
});
}
});
// Sort by combined RRF score
const combinedResults = Array.from(rrfScores.values())
.sort((a, b) => b.score - a.score)
.slice(0, limit)
.map(item => ({
...item.data,
score: item.score
}));
return combinedResults;
}
// Autocomplete with semantic boosting
async autocomplete(
prefix: string,
limit: number = 5
): Promise<{ suggestion: string; score: number }[]> {
const result = await this.client.search({
index: this.indexName,
body: {
query: {
bool: {
should: [
{
prefix: {
'title.keyword': {
value: prefix.toLowerCase(),
boost: 2
}
}
},
{
match_phrase_prefix: {
title: {
query: prefix,
boost: 1.5
}
}
}
]
}
},
size: limit,
_source: ['title']
}
});
return result.hits.hits.map((hit: any) => ({
suggestion: hit._source.title,
score: hit._score
}));
}
private formatResults(result: any): SearchResult[] {
return result.hits.hits.map((hit: any) => ({
id: hit._id,
score: hit._score,
metadata: {
title: hit._source.title,
category: hit._source.category,
author: hit._source.author,
tags: hit._source.tags,
createdAt: hit._source.createdAt
},
content: hit._source.content
}));
}
}
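The RRF formula used in `rrfHybridSearch` is worth seeing with concrete numbers: each list contributes 1 / (k + rank + 1) per document, so a document that ranks moderately in both lists can beat one that tops only a single list. The `rrfFuse` helper below is a distilled, illustrative version of that logic, operating on bare id lists.

```typescript
// Reciprocal Rank Fusion over ranked id lists (k = 60, matching the code above).
// rank is 0-based, so the top result of a list contributes 1 / (k + 1).
const rrfFuse = (lists: string[][], k: number = 60): Map<string, number> => {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return scores;
};

const keywordRanked = ['a', 'b', 'c']; // BM25 ranking
const vectorRanked = ['b', 'd', 'a'];  // semantic ranking
const fused = rrfFuse([keywordRanked, vectorRanked]);
// 'b' (rank 2 + rank 1) edges out 'a' (rank 1 + rank 3):
// score(a) = 1/61 + 1/63 ≈ 0.03227, score(b) = 1/62 + 1/61 ≈ 0.03252
```

Because RRF only uses ranks, it sidesteps the thorny problem of normalizing BM25 scores against cosine similarities, which is why it is often preferred over weighted score blending.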
Query Understanding and Expansion
Advanced search systems understand user intent and expand queries to improve recall. This involves synonym expansion, typo correction, and intent classification.
// query-understanding.ts
import OpenAI from 'openai';
interface QueryAnalysis {
originalQuery: string;
intent: 'informational' | 'navigational' | 'transactional' | 'ambiguous';
entities: Entity[];
expandedQueries: string[];
suggestedFilters: Record<string, string>;
correctedQuery?: string;
}
interface Entity {
text: string;
type: 'product' | 'category' | 'brand' | 'attribute' | 'location';
confidence: number;
}
class QueryUnderstandingService {
private openai: OpenAI;
private synonymMap: Map<string, string[]>;
private commonTypos: Map<string, string>;
constructor(apiKey: string) {
this.openai = new OpenAI({ apiKey });
this.synonymMap = this.loadSynonyms();
this.commonTypos = this.loadTypoCorrections();
}
async analyzeQuery(query: string): Promise<QueryAnalysis> {
const [
intentResult,
entities,
corrections
] = await Promise.all([
this.classifyIntent(query),
this.extractEntities(query),
this.correctTypos(query)
]);
const expandedQueries = this.expandQuery(corrections.corrected || query);
const suggestedFilters = this.inferFilters(entities);
return {
originalQuery: query,
intent: intentResult,
entities,
expandedQueries,
suggestedFilters,
correctedQuery: corrections.corrected !== query ? corrections.corrected : undefined
};
}
private async classifyIntent(query: string): Promise<QueryAnalysis['intent']> {
const response = await this.openai.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{
role: 'system',
content: `Classify the search query intent:
- informational: seeking knowledge or answers
- navigational: looking for a specific page/product
- transactional: intent to buy or take action
- ambiguous: unclear intent
Respond with only the intent type.`
},
{ role: 'user', content: query }
],
max_tokens: 20
});
const intent = response.choices[0].message.content?.trim().toLowerCase();
return ['informational', 'navigational', 'transactional'].includes(intent!)
? intent as QueryAnalysis['intent']
: 'ambiguous';
}
private async extractEntities(query: string): Promise<Entity[]> {
const response = await this.openai.chat.completions.create({
model: 'gpt-4o-mini',
messages: [
{
role: 'system',
content: `Extract entities from the search query. Return JSON array:
[{"text": "entity", "type": "product|category|brand|attribute|location", "confidence": 0.9}]
Only return the JSON array, no explanation.`
},
{ role: 'user', content: query }
],
max_tokens: 200
});
try {
return JSON.parse(response.choices[0].message.content || '[]');
} catch {
return [];
}
}
private correctTypos(query: string): { corrected: string; changes: string[] } {
const words = query.split(/\s+/);
const changes: string[] = [];
const correctedWords = words.map(word => {
const lowerWord = word.toLowerCase();
if (this.commonTypos.has(lowerWord)) {
const correction = this.commonTypos.get(lowerWord)!;
changes.push(`${word} -> ${correction}`);
return correction;
}
// Levenshtein distance check for near matches
for (const [typo, correction] of this.commonTypos.entries()) {
if (this.levenshteinDistance(lowerWord, typo) <= 1) {
changes.push(`${word} -> ${correction}`);
return correction;
}
}
return word;
});
return {
corrected: correctedWords.join(' '),
changes
};
}
private expandQuery(query: string): string[] {
const words = query.toLowerCase().split(/\s+/);
const expansions: Set<string> = new Set([query]);
// Add synonym expansions
words.forEach(word => {
const synonyms = this.synonymMap.get(word);
if (synonyms) {
synonyms.forEach(synonym => {
const expanded = query.replace(new RegExp(`\\b${word}\\b`, 'gi'), synonym);
expansions.add(expanded);
});
}
});
// Add common query variations
if (query.includes(' vs ') || query.includes(' versus ')) {
expansions.add(query.replace(/\s+(vs|versus)\s+/i, ' compared to '));
expansions.add(query.replace(/\s+(vs|versus)\s+/i, ' or '));
}
return Array.from(expansions);
}
private inferFilters(entities: Entity[]): Record<string, string> {
const filters: Record<string, string> = {};
entities.forEach(entity => {
switch (entity.type) {
case 'category':
filters.category = entity.text;
break;
case 'brand':
filters.brand = entity.text;
break;
case 'location':
filters.location = entity.text;
break;
}
});
return filters;
}
private loadSynonyms(): Map<string, string[]> {
return new Map([
['laptop', ['notebook', 'computer', 'portable pc']],
['cheap', ['affordable', 'budget', 'inexpensive', 'low-cost']],
['fast', ['quick', 'speedy', 'rapid', 'high-performance']],
['best', ['top', 'leading', 'premier', 'highest-rated']],
['buy', ['purchase', 'order', 'get', 'acquire']],
['fix', ['repair', 'solve', 'resolve', 'troubleshoot']]
]);
}
private loadTypoCorrections(): Map<string, string> {
return new Map([
['recieve', 'receive'],
['definately', 'definitely'],
['seperate', 'separate'],
['occured', 'occurred'],
['untill', 'until'],
['javascript', 'JavaScript'],
['pythn', 'Python'],
['recat', 'React']
]);
}
private levenshteinDistance(a: string, b: string): number {
const matrix: number[][] = [];
for (let i = 0; i <= b.length; i++) {
matrix[i] = [i];
}
for (let j = 0; j <= a.length; j++) {
matrix[0][j] = j;
}
for (let i = 1; i <= b.length; i++) {
for (let j = 1; j <= a.length; j++) {
if (b.charAt(i - 1) === a.charAt(j - 1)) {
matrix[i][j] = matrix[i - 1][j - 1];
} else {
matrix[i][j] = Math.min(
matrix[i - 1][j - 1] + 1,
matrix[i][j - 1] + 1,
matrix[i - 1][j] + 1
);
}
}
}
return matrix[b.length][a.length];
}
}
// Enhanced search with query understanding
class EnhancedSearchService {
private searchService: ElasticsearchHybridSearch;
private queryService: QueryUnderstandingService;
constructor(
searchService: ElasticsearchHybridSearch,
queryService: QueryUnderstandingService
) {
this.searchService = searchService;
this.queryService = queryService;
}
async search(query: string, options: { limit?: number } = {}): Promise<{
results: SearchResult[];
queryAnalysis: QueryAnalysis;
didYouMean?: string;
}> {
const queryAnalysis = await this.queryService.analyzeQuery(query);
const searchQuery = queryAnalysis.correctedQuery || query;
// Search with expanded queries and combine results
const allResults = await Promise.all(
queryAnalysis.expandedQueries.slice(0, 3).map(q =>
this.searchService.hybridSearch(q, {
limit: options.limit || 10,
filters: queryAnalysis.suggestedFilters
})
)
);
// Deduplicate and re-rank
const resultMap = new Map<string, SearchResult>();
allResults.flat().forEach(result => {
const existing = resultMap.get(result.id);
if (!existing || result.score > existing.score) {
resultMap.set(result.id, result);
}
});
const results = Array.from(resultMap.values())
.sort((a, b) => b.score - a.score)
.slice(0, options.limit || 10);
return {
results,
queryAnalysis,
didYouMean: queryAnalysis.correctedQuery
};
}
}
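The edit-distance check that powers the near-match branch of typo correction is self-contained and easy to verify: each insertion, deletion, or substitution costs exactly 1. Here is the same dynamic-programming algorithm as a standalone function.

```typescript
// Standard dynamic-programming Levenshtein distance, as used by correctTypos
// to treat any word within distance 1 of a known typo as a match.
const levenshtein = (a: string, b: string): number => {
  // m[i][j] = edit distance between b[0..i) and a[0..j)
  const m: number[][] = Array.from({ length: b.length + 1 }, (_, i) => [i]);
  for (let j = 0; j <= a.length; j++) m[0][j] = j;
  for (let i = 1; i <= b.length; i++) {
    for (let j = 1; j <= a.length; j++) {
      m[i][j] = b[i - 1] === a[j - 1]
        ? m[i - 1][j - 1]                                   // characters match: no cost
        : 1 + Math.min(m[i - 1][j - 1],                     // substitution
                       m[i][j - 1],                         // insertion
                       m[i - 1][j]);                        // deletion
    }
  }
  return m[b.length][a.length];
};
```

One caveat worth knowing: plain Levenshtein counts a transposition ("recat" vs "react") as two edits, so with a threshold of 1 such typos are only caught if they appear verbatim in the typo map; Damerau-Levenshtein would treat a swap as a single edit.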
Personalized Search Results
Combining semantic search with user preferences creates highly relevant, personalized experiences. This builds on concepts from our AI-Driven Personalization Engines article.
// personalized-search.ts
interface UserProfile {
userId: string;
preferences: {
categories: string[];
brands: string[];
priceRange?: { min: number; max: number };
};
behavior: {
recentSearches: string[];
viewedItems: string[];
purchasedItems: string[];
};
embedding?: number[]; // User preference embedding
}
class PersonalizedSearchService {
private searchService: ElasticsearchHybridSearch;
private embeddingService: OpenAIEmbeddingService;
private userProfiles: Map<string, UserProfile> = new Map();
constructor(
searchService: ElasticsearchHybridSearch,
embeddingService: OpenAIEmbeddingService
) {
this.searchService = searchService;
this.embeddingService = embeddingService;
}
async updateUserProfile(userId: string, profile: Partial<UserProfile>): Promise<void> {
const existing = this.userProfiles.get(userId) || {
userId,
preferences: { categories: [], brands: [] },
behavior: { recentSearches: [], viewedItems: [], purchasedItems: [] }
};
const updated = {
...existing,
...profile,
preferences: { ...existing.preferences, ...profile.preferences },
behavior: { ...existing.behavior, ...profile.behavior }
};
// Generate user preference embedding
const preferenceText = [
...updated.preferences.categories,
...updated.preferences.brands,
...updated.behavior.recentSearches.slice(0, 5) // searches are stored newest-first
].join(' ');
if (preferenceText) {
updated.embedding = await this.embeddingService.embed(preferenceText);
}
this.userProfiles.set(userId, updated);
}
async personalizedSearch(
userId: string,
query: string,
options: { limit?: number; personalizationWeight?: number } = {}
): Promise<SearchResult[]> {
const { limit = 10, personalizationWeight = 0.3 } = options;
const profile = this.userProfiles.get(userId);
// Get base search results
let results = await this.searchService.hybridSearch(query, { limit: limit * 2 });
if (profile) {
// Apply personalization boosts
results = results.map(result => {
let boost = 0;
// Category preference boost
if (profile.preferences.categories.includes(result.metadata.category)) {
boost += 0.2;
}
// Recently viewed items get a small boost
if (profile.behavior.viewedItems.includes(result.id)) {
boost += 0.1;
}
// Embedding similarity boost (if user has preference embedding)
if (profile.embedding && result.metadata.embedding) {
const similarity = this.cosineSimilarity(
profile.embedding,
result.metadata.embedding
);
boost += similarity * 0.15;
}
return {
...result,
score: result.score * (1 + personalizationWeight * boost)
};
});
// Re-sort by personalized score
results.sort((a, b) => b.score - a.score);
}
return results.slice(0, limit);
}
// Track user behavior for continuous personalization
trackSearch(userId: string, query: string): void {
const profile = this.userProfiles.get(userId);
if (profile) {
profile.behavior.recentSearches = [
query,
...profile.behavior.recentSearches.slice(0, 49)
];
}
}
trackView(userId: string, itemId: string): void {
const profile = this.userProfiles.get(userId);
if (profile) {
profile.behavior.viewedItems = [
itemId,
...profile.behavior.viewedItems.filter(id => id !== itemId).slice(0, 99)
];
}
}
private cosineSimilarity(a: number[], b: number[]): number {
let dotProduct = 0;
let normA = 0;
let normB = 0;
for (let i = 0; i < a.length; i++) {
dotProduct += a[i] * b[i];
normA += a[i] * a[i];
normB += b[i] * b[i];
}
const denominator = Math.sqrt(normA) * Math.sqrt(normB);
return denominator === 0 ? 0 : dotProduct / denominator; // guard against zero vectors
}
}
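To make the boost arithmetic concrete: with the default `personalizationWeight` of 0.3, a result matching a preferred category (+0.2) that was also recently viewed (+0.1) has its score multiplied by 1 + 0.3 × 0.3 = 1.09. A minimal standalone sketch of that formula (names are illustrative, not part of the service above):

```typescript
// personalization-boost.ts
// Standalone illustration of the score adjustment used in
// personalizedSearch: baseScore * (1 + weight * sumOfBoosts).
function applyPersonalizationBoost(
  baseScore: number,
  boosts: number[], // e.g. [0.2 category match, 0.1 recently viewed]
  weight: number = 0.3
): number {
  const totalBoost = boosts.reduce((sum, b) => sum + b, 0);
  return baseScore * (1 + weight * totalBoost);
}
```

Keeping boosts multiplicative on top of the base score (rather than additive) means personalization reorders results of similar relevance without letting a strong preference override a clearly better match.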
Measuring Search Relevance
Quantifying search quality improvements requires proper metrics. Here's how to measure and compare search implementations:
// search-metrics.ts
interface SearchMetrics {
mrr: number; // Mean Reciprocal Rank
ndcg: number; // Normalized Discounted Cumulative Gain
precision: number; // Precision@K
recall: number; // Recall@K
map: number; // Mean Average Precision
}
interface RelevanceJudgment {
queryId: string;
documentId: string;
relevance: number; // 0 = not relevant, 1 = somewhat, 2 = very relevant
}
class SearchMetricsCalculator {
private judgments: Map<string, Map<string, number>> = new Map();
loadJudgments(judgments: RelevanceJudgment[]): void {
judgments.forEach(j => {
if (!this.judgments.has(j.queryId)) {
this.judgments.set(j.queryId, new Map());
}
this.judgments.get(j.queryId)!.set(j.documentId, j.relevance);
});
}
// Mean Reciprocal Rank: position of first relevant result
calculateMRR(queryId: string, results: SearchResult[]): number {
const queryJudgments = this.judgments.get(queryId);
if (!queryJudgments) return 0;
for (let i = 0; i < results.length; i++) {
const relevance = queryJudgments.get(results[i].id) || 0;
if (relevance > 0) {
return 1 / (i + 1);
}
}
return 0;
}
// Normalized Discounted Cumulative Gain
calculateNDCG(queryId: string, results: SearchResult[], k: number = 10): number {
const queryJudgments = this.judgments.get(queryId);
if (!queryJudgments) return 0;
const dcg = this.calculateDCG(queryId, results.slice(0, k));
const idealDCG = this.calculateIdealDCG(queryId, k);
return idealDCG === 0 ? 0 : dcg / idealDCG;
}
private calculateDCG(queryId: string, results: SearchResult[]): number {
const queryJudgments = this.judgments.get(queryId)!;
let dcg = 0;
results.forEach((result, i) => {
const relevance = queryJudgments.get(result.id) || 0;
dcg += (Math.pow(2, relevance) - 1) / Math.log2(i + 2);
});
return dcg;
}
private calculateIdealDCG(queryId: string, k: number): number {
const queryJudgments = this.judgments.get(queryId)!;
const sortedRelevances = Array.from(queryJudgments.values())
.sort((a, b) => b - a)
.slice(0, k);
let idcg = 0;
sortedRelevances.forEach((relevance, i) => {
idcg += (Math.pow(2, relevance) - 1) / Math.log2(i + 2);
});
return idcg;
}
// Precision at K: fraction of retrieved docs that are relevant
calculatePrecision(queryId: string, results: SearchResult[], k: number = 10): number {
const queryJudgments = this.judgments.get(queryId);
if (!queryJudgments) return 0;
const topK = results.slice(0, k);
const relevant = topK.filter(r => (queryJudgments.get(r.id) || 0) > 0);
return relevant.length / k;
}
// Full evaluation across multiple queries
async evaluateSearchSystem(
searchFn: (query: string) => Promise<SearchResult[]>,
queries: { id: string; text: string }[],
k: number = 10
): Promise<SearchMetrics> {
let totalMRR = 0;
let totalNDCG = 0;
let totalPrecision = 0;
let totalMAP = 0;
for (const query of queries) {
const results = await searchFn(query.text);
totalMRR += this.calculateMRR(query.id, results);
totalNDCG += this.calculateNDCG(query.id, results, k);
totalPrecision += this.calculatePrecision(query.id, results, k);
totalMAP += this.calculateAveragePrecision(query.id, results);
}
const n = queries.length;
return {
mrr: totalMRR / n,
ndcg: totalNDCG / n,
precision: totalPrecision / n,
recall: 0, // Requires knowing total relevant docs
map: totalMAP / n
};
}
private calculateAveragePrecision(queryId: string, results: SearchResult[]): number {
const queryJudgments = this.judgments.get(queryId);
if (!queryJudgments) return 0;
let sum = 0;
let relevantCount = 0;
results.forEach((result, i) => {
const relevance = queryJudgments.get(result.id) || 0;
if (relevance > 0) {
relevantCount++;
sum += relevantCount / (i + 1);
}
});
const totalRelevant = Array.from(queryJudgments.values()).filter(r => r > 0).length;
return totalRelevant === 0 ? 0 : sum / totalRelevant;
}
// A/B test comparison
async compareSearchSystems(
systemA: (query: string) => Promise<SearchResult[]>,
systemB: (query: string) => Promise<SearchResult[]>,
queries: { id: string; text: string }[]
): Promise<{
systemA: SearchMetrics;
systemB: SearchMetrics;
improvement: { metric: string; percentage: number }[];
}> {
const metricsA = await this.evaluateSearchSystem(systemA, queries);
const metricsB = await this.evaluateSearchSystem(systemB, queries);
const improvement = [
{ metric: 'MRR', percentage: ((metricsB.mrr - metricsA.mrr) / metricsA.mrr) * 100 },
{ metric: 'NDCG', percentage: ((metricsB.ndcg - metricsA.ndcg) / metricsA.ndcg) * 100 },
{ metric: 'Precision', percentage: ((metricsB.precision - metricsA.precision) / metricsA.precision) * 100 },
{ metric: 'MAP', percentage: ((metricsB.map - metricsA.map) / metricsA.map) * 100 }
];
return { systemA: metricsA, systemB: metricsB, improvement };
}
}
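To sanity-check the NDCG math by hand: for a query with two judged documents (relevance 2 and 1), returning them in the wrong order gives DCG = (2^1 − 1)/log2(2) + (2^2 − 1)/log2(3) ≈ 2.893 against an ideal DCG ≈ 3.631, so NDCG ≈ 0.797. A self-contained version of that calculation, using the same gain and discount formulas as `SearchMetricsCalculator`:

```typescript
// ndcg-example.ts
// Standalone NDCG over a list of relevance grades in ranked order,
// for hand-checking small examples against the calculator above.
function ndcg(rankedRelevances: number[]): number {
  const dcg = (rels: number[]) =>
    rels.reduce((sum, rel, i) => sum + (Math.pow(2, rel) - 1) / Math.log2(i + 2), 0);
  const ideal = dcg([...rankedRelevances].sort((a, b) => b - a));
  return ideal === 0 ? 0 : dcg(rankedRelevances) / ideal;
}
```

`ndcg([2, 1])` is exactly 1 (perfect ordering), while `ndcg([1, 2])` drops to roughly 0.797, which matches the hand calculation above.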
Key Takeaways
Remember These Points
- Semantic search understands meaning: Unlike keyword matching, vector embeddings capture conceptual similarity, finding relevant results even without matching terms
- Hybrid search delivers best results: Combining BM25 keyword search with vector similarity (typically 30/70 or 50/50 weighting) outperforms either approach alone
- Choose the right vector database: Pinecone for serverless simplicity, Weaviate for built-in AI modules, Elasticsearch for full-text + vector hybrid
- Query understanding improves relevance: Intent classification, entity extraction, and query expansion help bridge the gap between user queries and document content
- Personalization adds context: User preferences and behavior signals can boost search relevance by 15-25% for returning users
- Measure improvements rigorously: Use NDCG, MRR, and precision metrics to quantify the 40%+ relevance improvements semantic search delivers
- Consider embedding costs: OpenAI text-embedding-3-small costs $0.02/1M tokens; cache embeddings and batch operations to optimize spend
Conclusion
AI-enhanced search with vector embeddings and semantic understanding represents a fundamental advancement in how users discover content. The 40-60% improvements in search relevance translate directly to better user experiences, higher conversion rates, and reduced bounce rates from failed searches.
Start with a hybrid approach combining Elasticsearch's proven BM25 with vector search capabilities. Add query understanding to handle user intent variations, then layer in personalization for returning users. The investment in modern search infrastructure pays dividends across all metrics that matter.
For deeper exploration, review our related articles on Building Custom AI Assistants with LangChain which covers RAG patterns, and AI-Driven Personalization Engines for recommendation system fundamentals. The Pinecone Learning Center and Weaviate Academy offer excellent deep-dives into vector database concepts.