Database performance is often the critical bottleneck in web applications. A single slow query can bring down an entire system, yet optimizing SQL requires deep expertise in execution plans, indexing strategies, and database internals. AI assistants are revolutionizing this domain, making expert-level query optimization accessible to every developer.
In this comprehensive guide, you'll learn how to leverage AI tools for analyzing EXPLAIN plans, identifying missing indexes, detecting N+1 query problems, and rewriting queries for optimal performance. We'll work through real PostgreSQL examples with measurable performance improvements, and integrate these techniques into your development workflow.
Why AI Excels at Query Optimization
Query optimization is fundamentally a pattern recognition problem. Given a query, schema, and execution plan, the optimizer must recognize which access patterns are inefficient and recommend improvements. This plays directly to AI's strengths:
- Pattern matching - AI recognizes common anti-patterns across millions of queries it was trained on
- Contextual reasoning - AI understands how schema design, data distribution, and query structure interact
- Explain plan interpretation - AI can parse complex execution plans and identify bottlenecks faster than humans
- Index recommendation - AI suggests covering indexes based on query access patterns
However, AI has limitations. It cannot execute queries to measure actual performance, doesn't know your specific data distribution, and may suggest indexes that hurt write performance. Always validate AI suggestions with benchmarks.
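That last point deserves tooling: measure before and after applying any AI suggestion. A minimal benchmark harness might look like this (a sketch; `runQuery` is a placeholder for whatever query call you want to time):

```typescript
// Minimal benchmark sketch: time a query function over several runs and
// report median/min/max, so AI suggestions are validated with numbers.
// `runQuery` is a placeholder for your own database call.
function median(times: number[]): number {
  const sorted = [...times].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

async function benchmark(
  runQuery: () => Promise<unknown>,
  iterations = 20
): Promise<{ medianMs: number; minMs: number; maxMs: number }> {
  const times: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = Date.now();
    await runQuery();
    times.push(Date.now() - start);
  }
  return {
    medianMs: median(times),
    minMs: Math.min(...times),
    maxMs: Math.max(...times),
  };
}
```

The median is reported rather than the mean so a single cold-cache run does not dominate the result.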
Analyzing EXPLAIN Plans with AI
The EXPLAIN command reveals how PostgreSQL executes your query. Understanding this output is crucial for optimization, but it can be overwhelming. Here's how to use AI to interpret it effectively.
Understanding EXPLAIN Output
Let's start with a problematic query and its execution plan:
-- Sample schema
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_id INTEGER NOT NULL,
order_date TIMESTAMP DEFAULT NOW(),
total_amount DECIMAL(10,2),
status VARCHAR(20)
);
CREATE TABLE order_items (
id SERIAL PRIMARY KEY,
order_id INTEGER REFERENCES orders(id),
product_id INTEGER NOT NULL,
quantity INTEGER,
unit_price DECIMAL(10,2)
);
CREATE TABLE customers (
id SERIAL PRIMARY KEY,
email VARCHAR(255) UNIQUE,
name VARCHAR(255),
created_at TIMESTAMP DEFAULT NOW()
);
-- Insert sample data (100K customers, 1 million orders, 5 million order items)
INSERT INTO customers (email, name)
SELECT
'customer' || i || '@example.com',
'Customer ' || i
FROM generate_series(1, 100000) i;
INSERT INTO orders (customer_id, order_date, total_amount, status)
SELECT
(random() * 99999 + 1)::int,
NOW() - (random() * 365 || ' days')::interval,
(random() * 1000)::decimal(10,2),
CASE (random() * 3)::int
WHEN 0 THEN 'pending'
WHEN 1 THEN 'shipped'
ELSE 'delivered'
END
FROM generate_series(1, 1000000);
INSERT INTO order_items (order_id, product_id, quantity, unit_price)
SELECT
(random() * 999999 + 1)::int,
(random() * 10000 + 1)::int,
(random() * 10 + 1)::int,
(random() * 100)::decimal(10,2)
FROM generate_series(1, 5000000);
Now let's analyze a slow query:
-- Slow query: Find high-value orders from the last month
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT
c.name,
c.email,
o.id as order_id,
o.total_amount,
o.order_date
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.order_date > NOW() - INTERVAL '30 days'
AND o.total_amount > 500
ORDER BY o.total_amount DESC
LIMIT 100;
-- Output:
/*
Limit (cost=45234.12..45234.37 rows=100 width=96)
(actual time=892.451..892.468 rows=100 loops=1)
Buffers: shared hit=12453 read=8234
-> Sort (cost=45234.12..45312.45 rows=31334 width=96)
(actual time=892.449..892.459 rows=100 loops=1)
Sort Key: o.total_amount DESC
Sort Method: top-N heapsort Memory: 51kB
Buffers: shared hit=12453 read=8234
-> Hash Join (cost=3254.12..44123.89 rows=31334 width=96)
(actual time=45.234..878.123 rows=28456 loops=1)
Hash Cond: (o.customer_id = c.id)
Buffers: shared hit=12453 read=8234
-> Seq Scan on orders o (cost=0.00..40234.00 rows=31334 width=28)
(actual time=0.023..834.567 rows=28456 loops=1)
Filter: ((order_date > ...) AND (total_amount > 500))
Rows Removed by Filter: 971544
Buffers: shared hit=10234 read=8234
-> Hash (cost=1726.00..1726.00 rows=100000 width=72)
(actual time=44.234..44.234 rows=100000 loops=1)
Buckets: 131072 Batches: 1 Memory Usage: 8234kB
Buffers: shared hit=2219
-> Seq Scan on customers c (cost=0.00..1726.00 rows=100000 width=72)
(actual time=0.012..21.456 rows=100000 loops=1)
Buffers: shared hit=2219
Planning Time: 0.234 ms
Execution Time: 892.567 ms
*/
Prompting AI to Analyze EXPLAIN Plans
Here's an effective prompt template for AI-assisted EXPLAIN analysis:
// AI Prompt Template for EXPLAIN Analysis
`Analyze this PostgreSQL EXPLAIN ANALYZE output and provide optimization recommendations.
Schema:
${schemaDefinition}
Query:
${queryText}
EXPLAIN ANALYZE output:
${explainOutput}
Please provide:
1. Summary of the execution plan in plain English
2. Identify the main performance bottlenecks
3. Calculate the selectivity of each filter
4. Recommend specific indexes to create
5. Suggest query rewrites if applicable
6. Estimate the expected improvement
Focus on:
- Sequential scans on large tables
- High row estimates vs actual rows (cardinality issues)
- Sort operations in memory vs disk
- Hash/Merge join efficiency
- Buffer statistics (shared hit vs read)`
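In practice you'll want to assemble this prompt programmatically. A small helper (illustrative only; the function name and input shape are our own, not any tool's API):

```typescript
// Builds the EXPLAIN-analysis prompt above from its three inputs.
interface ExplainPromptInput {
  schemaDefinition: string;
  queryText: string;
  explainOutput: string;
}

function buildExplainPrompt(input: ExplainPromptInput): string {
  return [
    "Analyze this PostgreSQL EXPLAIN ANALYZE output and provide optimization recommendations.",
    "",
    "Schema:",
    input.schemaDefinition,
    "",
    "Query:",
    input.queryText,
    "",
    "EXPLAIN ANALYZE output:",
    input.explainOutput,
    "",
    "Please provide:",
    "1. Summary of the execution plan in plain English",
    "2. The main performance bottlenecks",
    "3. The selectivity of each filter",
    "4. Specific indexes to create",
    "5. Query rewrites if applicable",
    "6. The expected improvement",
  ].join("\n");
}
```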
AI analysis of our example query would identify these issues:
AI-Identified Bottlenecks
- Sequential scan on orders - Scanning 1M rows to find 28K matches (2.8% selectivity)
- Missing index on order_date + total_amount - Both filter columns lack indexes
- Full customer table hash - Loading all 100K customers when only ~20K are needed
- High buffer reads - 8,234 pages read from disk indicate a cold cache or a missing index
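The figures in that list come straight from the plan; the two underlying calculations can be sketched as:

```typescript
// Selectivity: what fraction of scanned rows survive the filter.
function selectivityPct(matchedRows: number, totalRows: number): number {
  return (matchedRows / totalRows) * 100;
}

// Cache-hit ratio: shared buffer hits vs. total buffer accesses.
function cacheHitPct(sharedHit: number, sharedRead: number): number {
  const total = sharedHit + sharedRead;
  return total === 0 ? 100 : (sharedHit / total) * 100;
}

// From the plan above: 28,456 of 1,000,000 rows match (~2.8% selectivity),
// and 12,453 of 20,687 buffer accesses were cache hits (~60%).
```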
Implementing AI Recommendations
Based on AI analysis, let's create the recommended indexes:
-- AI-recommended composite index for the filter conditions
CREATE INDEX CONCURRENTLY idx_orders_date_amount
ON orders (order_date, total_amount)
WHERE total_amount > 100; -- Partial index for high-value orders
-- AI-recommended index for the join condition with included columns.
-- Note: a partial-index predicate must use immutable expressions, so NOW()
-- is not allowed here; use a literal cutoff date (and rebuild periodically)
CREATE INDEX CONCURRENTLY idx_orders_customer_covering
ON orders (customer_id)
INCLUDE (order_date, total_amount, status)
WHERE order_date > '2025-01-01';
-- Run ANALYZE to update statistics
ANALYZE orders;
ANALYZE customers;
After creating indexes, re-run EXPLAIN ANALYZE:
-- Optimized execution plan
/*
Limit (cost=234.12..234.37 rows=100 width=96)
(actual time=12.451..12.468 rows=100 loops=1)
Buffers: shared hit=345
-> Sort (cost=234.12..312.45 rows=31334 width=96)
(actual time=12.449..12.459 rows=100 loops=1)
Sort Key: o.total_amount DESC
Sort Method: top-N heapsort Memory: 51kB
Buffers: shared hit=345
-> Nested Loop (cost=0.87..123.89 rows=31334 width=96)
(actual time=0.034..11.123 rows=28456 loops=1)
Buffers: shared hit=345
-> Index Scan using idx_orders_date_amount on orders o
(cost=0.43..89.00 rows=31334 width=28)
(actual time=0.023..4.567 rows=28456 loops=1)
Index Cond: ((order_date > ...) AND (total_amount > 500))
Buffers: shared hit=123
-> Index Scan using customers_pkey on customers c
(cost=0.44..0.46 rows=1 width=72)
(actual time=0.001..0.001 rows=1 loops=28456)
Index Cond: (id = o.customer_id)
Buffers: shared hit=222
Planning Time: 0.156 ms
Execution Time: 12.567 ms
*/
Performance improved from 892 ms to 12 ms, a roughly 74x improvement.
Detecting and Fixing N+1 Query Problems
The N+1 query problem is one of the most common performance killers in web applications. AI excels at identifying these patterns in ORM code.
Understanding N+1 Queries
Here's a typical N+1 problem in a Node.js application using Prisma:
// BAD: N+1 Query Pattern
async function getOrdersWithCustomers() {
const thirtyDaysAgo = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000);
// Query 1: Fetch all orders
const orders = await prisma.order.findMany({
where: {
orderDate: { gte: thirtyDaysAgo }
}
});
// N Queries: Fetch customer for each order
const ordersWithCustomers = await Promise.all(
orders.map(async (order) => {
const customer = await prisma.customer.findUnique({
where: { id: order.customerId }
});
return { ...order, customer };
})
);
return ordersWithCustomers;
}
// This generates:
// SELECT * FROM orders WHERE order_date >= '2025-01-01'; -- 1 query
// SELECT * FROM customers WHERE id = 1; -- N queries
// SELECT * FROM customers WHERE id = 2;
// SELECT * FROM customers WHERE id = 3;
// ... (repeated for each order)
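You can also catch this pattern at runtime rather than by reading code. A simple heuristic (a sketch, assuming you collect each request's SQL statements from your ORM's query log): normalize away literal values, then flag any statement shape that repeats many times within a single request.

```typescript
// Flags statement shapes repeated >= threshold times in one request's log.
function detectN1(queries: string[], threshold = 10): string[] {
  const counts = new Map<string, number>();
  for (const q of queries) {
    // Strip quoted strings and numbers so "id = 1" and "id = 2" collapse
    // into the same shape
    const shape = q.replace(/'[^']*'/g, "?").replace(/\d+/g, "?");
    counts.set(shape, (counts.get(shape) ?? 0) + 1);
  }
  return [...counts.entries()]
    .filter(([, n]) => n >= threshold)
    .map(([shape]) => shape);
}
```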
AI Prompt for N+1 Detection
// Prompt for N+1 query detection
`Analyze this code for N+1 query problems and suggest optimizations.
ORM: Prisma with PostgreSQL
Code:
${codeSnippet}
Database schema:
${schemaDefinition}
Please identify:
1. Any N+1 query patterns
2. The total number of queries executed
3. Recommended eager loading strategy
4. Optimized code with proper includes/joins
5. Expected query count after optimization`
AI-Recommended Fix
// GOOD: Single query with eager loading
async function getOrdersWithCustomers() {
const thirtyDaysAgo = new Date(Date.now() - 30 * 24 * 60 * 60 * 1000);
const orders = await prisma.order.findMany({
where: {
orderDate: { gte: thirtyDaysAgo }
},
include: {
customer: {
select: {
id: true,
name: true,
email: true
}
}
}
});
return orders;
}
// This generates a single query with JOIN:
// SELECT o.*, c.id, c.name, c.email
// FROM orders o
// LEFT JOIN customers c ON c.id = o.customer_id
// WHERE o.order_date >= '2025-01-01';
For more complex scenarios with multiple relations:
// Complex eager loading with nested relations
async function getOrderDetails(orderId: string) {
return prisma.order.findUnique({
where: { id: orderId },
include: {
customer: true,
orderItems: {
include: {
product: {
include: {
category: true,
reviews: {
take: 5,
orderBy: { createdAt: 'desc' }
}
}
}
}
},
shipments: {
orderBy: { createdAt: 'desc' },
take: 1
}
}
});
}
// AI recommendation: Use select to limit fields and reduce payload
async function getOrderDetailsOptimized(orderId: string) {
return prisma.order.findUnique({
where: { id: orderId },
select: {
id: true,
totalAmount: true,
status: true,
orderDate: true,
customer: {
select: {
name: true,
email: true
}
},
orderItems: {
select: {
quantity: true,
unitPrice: true,
product: {
select: {
name: true,
sku: true,
category: {
select: { name: true }
}
}
}
}
}
}
});
}
Using pg_stat_statements with AI
The pg_stat_statements extension collects execution statistics for all SQL statements. Combined with AI analysis, it becomes a powerful optimization tool.
Enabling pg_stat_statements
-- Enable the extension
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- Configure in postgresql.conf
-- shared_preload_libraries = 'pg_stat_statements'
-- pg_stat_statements.track = all
-- pg_stat_statements.max = 10000
-- Reset statistics (do this after making changes)
SELECT pg_stat_statements_reset();
Extracting Query Statistics for AI Analysis
-- Get top 20 queries by total execution time
SELECT
substring(query, 1, 100) as query_preview,
calls,
total_exec_time::numeric(10,2) as total_time_ms,
mean_exec_time::numeric(10,2) as avg_time_ms,
rows,
shared_blks_hit,
shared_blks_read,
ROUND(100.0 * shared_blks_hit /
NULLIF(shared_blks_hit + shared_blks_read, 0), 2) as cache_hit_ratio
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 20;
-- Get queries with low cache hit ratio (disk-heavy)
SELECT
substring(query, 1, 200) as query,
calls,
mean_exec_time::numeric(10,2) as avg_ms,
shared_blks_read as disk_reads,
ROUND(100.0 * shared_blks_hit /
NULLIF(shared_blks_hit + shared_blks_read, 0), 2) as cache_hit_pct
FROM pg_stat_statements
WHERE shared_blks_read > 1000
ORDER BY shared_blks_read DESC
LIMIT 10;
-- Get frequently called queries with the highest average execution time
SELECT
substring(query, 1, 150) as query,
calls,
rows / NULLIF(calls, 0) as avg_rows_returned,
mean_exec_time::numeric(10,2) as avg_ms
FROM pg_stat_statements
WHERE calls > 100
ORDER BY mean_exec_time DESC
LIMIT 20;
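Before pasting these rows into a prompt, it helps to pre-rank them by their share of total execution time so the AI focuses on the biggest wins. A sketch of that ranking (the `StatRow` shape is our own mapping of the columns above):

```typescript
interface StatRow {
  query: string;
  calls: number;
  totalMs: number;
  meanMs: number;
}

// Sorts queries by their percentage share of total execution time.
function rankByTimeShare(rows: StatRow[]): Array<StatRow & { sharePct: number }> {
  const grandTotal = rows.reduce((sum, r) => sum + r.totalMs, 0);
  return rows
    .map((r) => ({
      ...r,
      sharePct: grandTotal === 0 ? 0 : (r.totalMs / grandTotal) * 100,
    }))
    .sort((a, b) => b.sharePct - a.sharePct);
}
```

A query called once a day that takes 5 seconds usually matters less than a 50 ms query called a thousand times a minute; the time-share ranking captures that.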
AI Prompt for pg_stat_statements Analysis
// Prompt template for pg_stat_statements analysis
`Analyze these PostgreSQL query statistics and prioritize optimization efforts.
Query Statistics:
${pgStatOutput}
Database Schema:
${schemaDefinition}
Current Indexes:
${indexList}
Please provide:
1. Priority ranking of queries to optimize (based on total time impact)
2. For each top query:
- Root cause analysis
- Specific index recommendations
- Query rewrite suggestions
3. Identify any redundant or unused indexes
4. Overall database health assessment
5. Estimated time savings from recommended optimizations`
AI-Assisted Index Optimization
Creating the right indexes is crucial for query performance. AI can analyze query patterns and recommend optimal indexing strategies.
Understanding Index Types
-- B-tree index (default): Best for equality and range queries
CREATE INDEX idx_orders_date ON orders (order_date);
-- Partial index: Only index rows matching a condition
CREATE INDEX idx_orders_pending ON orders (customer_id, order_date)
WHERE status = 'pending';
-- Covering index: Include non-key columns to enable index-only scans
CREATE INDEX idx_orders_covering ON orders (customer_id)
INCLUDE (order_date, total_amount, status);
-- Composite index: Multiple columns for multi-column queries
CREATE INDEX idx_orders_composite ON orders (customer_id, order_date DESC);
-- Expression index: Index computed values
CREATE INDEX idx_customers_email_lower ON customers (LOWER(email));
-- GIN index: For full-text search and array operations
CREATE INDEX idx_products_tags ON products USING GIN (tags);
-- BRIN index: For large tables with natural ordering
CREATE INDEX idx_logs_timestamp ON logs USING BRIN (created_at);
AI Prompt for Index Recommendations
// Prompt for index optimization
`Analyze these queries and recommend optimal indexes.
Queries and their frequency:
${queryList}
Current schema:
${schemaDefinition}
Existing indexes:
${existingIndexes}
Table statistics:
${tableStats}
Please provide:
1. Recommended indexes with exact CREATE INDEX statements
2. Explanation for each recommendation
3. Indexes that should be dropped (unused or redundant)
4. Estimated storage impact
5. Potential write performance impact
6. Priority order for creating indexes (highest impact first)`
Finding Unused and Missing Indexes
-- Find unused indexes
SELECT
schemaname,
relname as table_name,
indexrelname as index_name,
pg_size_pretty(pg_relation_size(indexrelid)) as index_size,
idx_scan as number_of_scans,
idx_tup_read as tuples_read,
idx_tup_fetch as tuples_fetched
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelname NOT LIKE '%_pkey'
ORDER BY pg_relation_size(indexrelid) DESC;
-- Find missing indexes (tables with high sequential scans)
SELECT
schemaname,
relname as table_name,
seq_scan,
seq_tup_read,
idx_scan,
idx_tup_fetch,
pg_size_pretty(pg_relation_size(relid)) as table_size,
CASE
WHEN seq_scan > 0 THEN
ROUND(100.0 * idx_scan / (seq_scan + idx_scan), 2)
ELSE 100
END as index_usage_pct
FROM pg_stat_user_tables
WHERE seq_scan > 1000
ORDER BY seq_tup_read DESC
LIMIT 20;
-- Find duplicate indexes
SELECT
pg_size_pretty(sum(pg_relation_size(idx))::bigint) as size,
(array_agg(idx))[1] as idx1,
(array_agg(idx))[2] as idx2
FROM (
SELECT
indexrelid::regclass as idx,
(indrelid::text || E'\n' || indclass::text || E'\n' ||
indkey::text || E'\n' || coalesce(indexprs::text, '') ||
E'\n' || coalesce(indpred::text, '')) as key
FROM pg_index
) sub
GROUP BY key
HAVING count(*) > 1
ORDER BY sum(pg_relation_size(idx)) DESC;
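The query above only finds exact duplicates. A B-tree index is also usually redundant when its columns are a strict leading prefix of another index on the same table; a sketch of that check, run over index metadata you would pull from pg_index:

```typescript
interface IndexMeta {
  name: string;
  columns: string[];
}

// Flags indexes whose column list is a strict leading prefix of another
// index's columns (indexes on the same table assumed).
function findPrefixRedundant(indexes: IndexMeta[]): string[] {
  const redundant: string[] = [];
  for (const a of indexes) {
    const covered = indexes.some(
      (b) =>
        b !== a &&
        a.columns.length < b.columns.length &&
        a.columns.every((col, i) => col === b.columns[i])
    );
    if (covered) redundant.push(a.name);
  }
  return redundant;
}
```

Prefix-redundant indexes still cost writes and storage, but drop them only after confirming the wider index serves the same queries; uniqueness constraints or a much smaller size can make a narrower index worth keeping.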
AI-Powered Query Rewriting
Sometimes the best optimization is rewriting the query entirely. AI can suggest more efficient query structures.
Common Query Rewrites
-- BEFORE: Correlated subquery (slow)
SELECT c.name, c.email,
(SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id) as order_count,
(SELECT SUM(total_amount) FROM orders o WHERE o.customer_id = c.id) as total_spent
FROM customers c
WHERE c.created_at > '2024-01-01';
-- AFTER: Single join with aggregation (fast)
SELECT
c.name,
c.email,
COUNT(o.id) as order_count,
COALESCE(SUM(o.total_amount), 0) as total_spent
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE c.created_at > '2024-01-01'
GROUP BY c.id, c.name, c.email;
-- BEFORE: NOT IN with subquery (can be slow with NULLs)
SELECT * FROM products
WHERE id NOT IN (
SELECT product_id FROM order_items WHERE order_id IN (
SELECT id FROM orders WHERE order_date > '2024-01-01'
)
);
-- AFTER: LEFT JOIN with NULL check (faster and NULL-safe)
SELECT p.*
FROM products p
LEFT JOIN (
SELECT DISTINCT oi.product_id
FROM order_items oi
JOIN orders o ON o.id = oi.order_id
WHERE o.order_date > '2024-01-01'
) recent_products ON recent_products.product_id = p.id
WHERE recent_products.product_id IS NULL;
-- BEFORE: DISTINCT with large result set
SELECT DISTINCT customer_id FROM orders WHERE order_date > '2024-01-01';
-- AFTER: Use EXISTS for checking presence
SELECT c.id
FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders o
WHERE o.customer_id = c.id
AND o.order_date > '2024-01-01'
);
-- Or use GROUP BY, which the planner can execute as a HashAggregate
SELECT customer_id
FROM orders
WHERE order_date > '2024-01-01'
GROUP BY customer_id;
Optimizing with Window Functions
-- BEFORE: Multiple self-joins for running totals
SELECT
o1.id,
o1.order_date,
o1.total_amount,
(SELECT SUM(o2.total_amount)
FROM orders o2
WHERE o2.customer_id = o1.customer_id
AND o2.order_date <= o1.order_date) as running_total
FROM orders o1
WHERE o1.customer_id = 123
ORDER BY o1.order_date;
-- AFTER: Window function (much faster)
SELECT
id,
order_date,
total_amount,
SUM(total_amount) OVER (
PARTITION BY customer_id
ORDER BY order_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) as running_total
FROM orders
WHERE customer_id = 123
ORDER BY order_date;
-- BEFORE: Subquery for ranking
SELECT * FROM (
SELECT
o.*,
(SELECT COUNT(*) + 1
FROM orders o2
WHERE o2.customer_id = o.customer_id
AND o2.total_amount > o.total_amount) as rank
FROM orders o
) ranked
WHERE rank <= 3;
-- AFTER: ROW_NUMBER window function
SELECT * FROM (
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY total_amount DESC
) as rank
FROM orders
) ranked
WHERE rank <= 3;
Optimizing Batch Operations
Bulk inserts and updates require special optimization strategies. AI can help design efficient batch operations.
// BEFORE: Individual inserts (slow)
async function insertOrders(orders: Order[]) {
for (const order of orders) {
await prisma.order.create({ data: order });
}
}
// AFTER: Batch insert with createMany
async function insertOrdersBatch(orders: Order[]) {
await prisma.order.createMany({
data: orders,
skipDuplicates: true
});
}
// For even better performance with raw SQL. Caution: interpolating values
// into the SQL string is vulnerable to SQL injection; only do this with
// trusted, validated data
async function insertOrdersRaw(orders: Order[]) {
const values = orders.map(o =>
`(${o.customerId}, '${o.orderDate}', ${o.totalAmount}, '${o.status}')`
).join(',');
await prisma.$executeRawUnsafe(`
INSERT INTO orders (customer_id, order_date, total_amount, status)
VALUES ${values}
`);
}
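If untrusted data can ever reach the batch, a safer pattern is to pass column arrays as query parameters and unnest them server-side. A sketch (the builder returns the SQL text plus a parameter array to hand to a parameterized query call such as node-postgres `pool.query(text, params)`):

```typescript
interface OrderRow {
  customerId: number;
  orderDate: string;
  totalAmount: number;
  status: string;
}

// Builds a parameterized multi-row INSERT using unnest over column arrays,
// so values never appear in the SQL string itself.
function buildUnnestInsert(orders: OrderRow[]): { text: string; params: unknown[] } {
  return {
    text:
      "INSERT INTO orders (customer_id, order_date, total_amount, status) " +
      "SELECT * FROM unnest($1::int[], $2::timestamp[], $3::decimal[], $4::varchar[])",
    params: [
      orders.map((o) => o.customerId),
      orders.map((o) => o.orderDate),
      orders.map((o) => o.totalAmount),
      orders.map((o) => o.status),
    ],
  };
}
```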
-- Bulk update with CASE expression
UPDATE orders
SET status = CASE id
WHEN 1 THEN 'shipped'
WHEN 2 THEN 'delivered'
WHEN 3 THEN 'cancelled'
END,
updated_at = NOW()
WHERE id IN (1, 2, 3);
-- Bulk update from temporary table (for large updates)
CREATE TEMP TABLE order_updates (
id INTEGER PRIMARY KEY,
new_status VARCHAR(20)
);
INSERT INTO order_updates VALUES
(1, 'shipped'),
(2, 'delivered'),
(3, 'cancelled');
UPDATE orders o
SET
status = u.new_status,
updated_at = NOW()
FROM order_updates u
WHERE o.id = u.id;
DROP TABLE order_updates;
Integrating Query Optimization into Development Workflow
Make query optimization a continuous process, not a one-time effort.
CI/CD Integration
# GitHub Action for query analysis
# .github/workflows/db-analysis.yml
name: Database Query Analysis
on:
pull_request:
paths:
- 'prisma/**'
- 'src/**/*.sql'
jobs:
analyze-queries:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Start PostgreSQL
run: |
docker run -d \
--name postgres \
-e POSTGRES_PASSWORD=test \
-p 5432:5432 \
postgres:15
- name: Run migrations
run: npx prisma migrate deploy
- name: Analyze new queries
run: |
# Extract queries from changed files
# Run EXPLAIN ANALYZE on each
# Fail if any query has seq scan on large tables
node scripts/analyze-queries.js
- name: Check for N+1 patterns
run: |
# Static analysis for N+1 patterns
npx eslint --rule 'no-await-in-loop: error' src/
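The `scripts/analyze-queries.js` referenced above is a script you would write yourself. Its core check can be a pure function over `EXPLAIN (FORMAT JSON)` output; a sketch:

```typescript
interface PlanNode {
  "Node Type": string;
  "Relation Name"?: string;
  "Plan Rows"?: number;
  Plans?: PlanNode[];
}

// Recursively collects tables hit by a Seq Scan expected to touch many rows;
// the CI job fails if this returns anything.
function findLargeSeqScans(node: PlanNode, minRows = 10_000): string[] {
  const hits: string[] = [];
  if (node["Node Type"] === "Seq Scan" && (node["Plan Rows"] ?? 0) >= minRows) {
    hits.push(node["Relation Name"] ?? "unknown");
  }
  for (const child of node.Plans ?? []) {
    hits.push(...findLargeSeqScans(child, minRows));
  }
  return hits;
}
```

Keep in mind the CI database will have far less data than production, so plan-row estimates there are only a rough signal; a sequential scan that is fine on test data can still be a red flag for a table you know is large in production.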
Production Query Monitoring
// Query monitoring with automatic alerting
import { Pool } from 'pg';
interface SlowQuery {
query: string;
duration: number;
timestamp: Date;
}
class QueryMonitor {
private slowQueries: SlowQuery[] = [];
private threshold = 1000; // 1 second
constructor(private pool: Pool) {
this.wrapQuery();
}
private wrapQuery() {
// node-postgres does not emit per-query events, so wrap pool.query
// with timing instead
const originalQuery = this.pool.query.bind(this.pool);
(this.pool as any).query = async (...args: any[]) => {
const start = Date.now();
try {
return await (originalQuery as any)(...args);
} finally {
const duration = Date.now() - start;
if (duration > this.threshold) {
const text = typeof args[0] === 'string' ? args[0] : args[0]?.text;
this.slowQueries.push({ query: text, duration, timestamp: new Date() });
this.alertSlowQuery(text, duration);
}
}
};
}
private async alertSlowQuery(query: string, duration: number) {
// Send to monitoring service
console.warn(`Slow query detected (${duration}ms):`, query);
// Optional: Send to Slack/PagerDuty
// await this.sendAlert({ query, duration });
}
async getSlowQueryReport() {
const report = await this.pool.query(`
SELECT
query,
calls,
mean_exec_time::numeric(10,2) as avg_ms,
total_exec_time::numeric(10,2) as total_ms
FROM pg_stat_statements
WHERE mean_exec_time > $1
ORDER BY total_exec_time DESC
LIMIT 20
`, [this.threshold]);
return report.rows;
}
}
Real-World Benchmark Examples
Performance Improvements Achieved
- E-commerce order search: 892ms to 12ms (74x faster) with composite index
- Customer dashboard: 45 queries to 3 queries (N+1 fix) - 850ms to 120ms
- Report generation: 12 seconds to 340ms with window functions
- Bulk import: 45 minutes to 2 minutes with batch inserts
- Search autocomplete: 200ms to 15ms with trigram index
Best Practices for AI-Assisted Query Optimization
Key Recommendations
- Always provide schema context - AI needs table definitions, existing indexes, and relationships
- Include actual data statistics - Row counts, data distribution, and cardinality help AI make better recommendations
- Use EXPLAIN ANALYZE, not just EXPLAIN - Actual execution times reveal real bottlenecks
- Benchmark before and after - Validate AI suggestions with actual performance measurements
- Consider write performance - Every index slows down INSERT/UPDATE operations
- Monitor in production - Query patterns change over time; set up continuous monitoring
- Use pg_stat_statements religiously - It's the single best tool for finding optimization opportunities
- Test with production-like data - Query plans change dramatically with different data volumes
Conclusion
AI-assisted database query optimization democratizes expertise that traditionally required years of DBA experience. By providing AI with proper context - schema definitions, EXPLAIN plans, and query statistics - you can quickly identify bottlenecks, create optimal indexes, and rewrite inefficient queries.
The key insight is that AI excels at pattern recognition and can instantly spot common anti-patterns that humans might overlook. However, AI cannot replace understanding your specific data and workload patterns. Use AI as an expert advisor, but always validate recommendations with benchmarks on production-representative data.
Start by enabling pg_stat_statements in your production database today. Export the top 20 slowest queries weekly, analyze them with AI assistance, and implement the recommended optimizations. This simple workflow can yield dramatic performance improvements with minimal effort.
In our next article, we'll explore AI-Assisted CSS and Styling: From Design to Code, where you'll learn how to use AI tools to transform design mockups into production-ready CSS.