Why Your E-commerce Search Sucks
Generic search engines fail at understanding user intent. Here's what you can do about it.
Every e-commerce company I've audited has the same problem: users search for "running shoes" and get back dress shoes, hiking boots, and shoe polish. The search works technically—it returns results containing those words—but it fails at the one job it has: helping users find what they want.
The Keyword Matching Trap
Traditional search engines like Elasticsearch with BM25 do exactly what they're designed to do: find documents containing the query terms, weighted by term frequency and document length. This works great for exact matches.
```python
# BM25 scoring (simplified); term_frequency, inverse_document_frequency,
# doc_length, and avg_doc_length are assumed to be defined elsewhere
def bm25_score(query, document, k1=1.5, b=0.75):
    score = 0.0
    for term in query:
        tf = term_frequency(term, document)      # occurrences of term in this document
        idf = inverse_document_frequency(term)   # rarity of term across the corpus
        length_norm = 1 - b + b * (doc_length(document) / avg_doc_length())
        score += idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)
    return score
```

But users don't search the way documents are written. They search for:
- Synonyms: "sneakers" vs "athletic shoes" vs "trainers"
- Misspellings: "runing shoes" (we've all done it)
- Concepts: "shoes for marathon training" (no product contains these exact words)
The Semantic Search Overcorrection
Many teams respond by throwing vector embeddings at the problem. "Just embed everything with OpenAI and do cosine similarity!"
This helps with synonyms but creates new problems:
- Vocabulary mismatch: Generic embeddings don't know your product taxonomy
- Over-generalization: "Nike Air Max" and "Adidas Ultraboost" become too similar
- Lost precision: Exact matches get buried under semantic approximations
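To see why, here's roughly what "just embed and do cosine similarity" boils down to. This is a minimal sketch: `embed()` is a placeholder for whatever embedding API you call, and it's assumed to return one L2-normalized vector per string.

```python
import numpy as np

# Minimal sketch of pure vector search over product titles.
# embed() is a hypothetical embedding function assumed to return
# L2-normalized vectors, so a dot product equals cosine similarity.
def semantic_search(query, product_titles, embed, top_k=10):
    product_vecs = np.array([embed(t) for t in product_titles])  # (n_products, dim)
    query_vec = np.array(embed(query))                           # (dim,)
    scores = product_vecs @ query_vec        # cosine similarity per product
    ranked = np.argsort(-scores)[:top_k]     # highest similarity first
    return [(product_titles[i], float(scores[i])) for i in ranked]
```

Nothing in that loop knows your catalog: "Nike Air Max" and "Adidas Ultraboost" are only as different as a generic embedding model thinks they are, which is exactly where the precision problems come from.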
The Hybrid Approach
The solution isn't either/or—it's both. Hybrid search combines:
- BM25 for precision: When users search for "Nike Air Max 90 Size 10", give them exactly that
- Semantic search for recall: When users search for "comfortable shoes for standing all day", understand the intent
- Learned ranking: Use click data to learn what users actually want
Here's the architecture that actually works:
Query → Query Understanding → [BM25 Retrieval + Semantic Retrieval] → Fusion → Reranking → Results
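A common way to implement the fusion step is reciprocal rank fusion (RRF), which merges the BM25 and semantic candidate lists using only their ranks, so you don't have to reconcile incompatible score scales. This is a sketch, not tied to any particular search engine; `k=60` is the constant commonly used in the RRF literature, and the retriever functions in the usage comment are hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of product IDs, e.g. [bm25_ids, semantic_ids].

    Each document earns 1 / (k + rank) per list it appears in, so items
    ranked highly by either retriever float to the top of the fused list.
    """
    fused = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

# Usage (retrievers are placeholders for your own BM25 and vector search):
# bm25_ids = bm25_search("running shoes", top_k=100)
# semantic_ids = semantic_search_ids("running shoes", top_k=100)
# candidates = reciprocal_rank_fusion([bm25_ids, semantic_ids])[:50]
```

The fused candidate list then goes to the reranker, which is where click data and learned ranking come in.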
Measuring What Matters
Before you change anything, establish baselines:
- NDCG@10 (normalized discounted cumulative gain): How well are relevant results ranked within the top 10?
- MRR (mean reciprocal rank): How high up does the first relevant result appear?
- Click-through rate: Are users engaging with results?
- Zero-result rate: How often do users get nothing?
Without metrics, you're just guessing.
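If you don't have these instrumented yet, NDCG@10 and MRR are straightforward to compute offline from judged query results. A rough sketch, assuming you have graded relevance labels (e.g. 0/1/2) for each returned result, in ranked order:

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one query; `relevances` are graded labels in ranked order."""
    def dcg(rels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def reciprocal_rank(ranked_relevant_flags):
    """1 / rank of the first relevant result (1-indexed), else 0."""
    for rank, is_relevant in enumerate(ranked_relevant_flags, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0

# Average these per-query values over a sample of real queries to get your
# baseline, then track them after every relevance change.
```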
What's Next?
If this sounds like your search, you're not alone. Most e-commerce search is broken in exactly these ways. The good news: these are solved problems with proven solutions.
Start with an audit to quantify where you stand, then build a roadmap based on data, not assumptions.
Struggling with search relevance? Get an audit.
Book a Discovery Call