Why Your E-commerce Search Sucks
Generic search engines fail at understanding user intent. Here's what you can do about it.
Every e-commerce company I've audited has the same problem: users search for "running shoes" and get back dress shoes, hiking boots, and shoe polish. The search works technically—it returns results containing those words—but it fails at the one job it has: helping users find what they want.
The Keyword Matching Trap
Traditional search engines like Elasticsearch with BM25 do exactly what they're designed to do: find documents containing the query terms, weighted by term frequency and document length. This works great for exact matches.
```python
# BM25 scoring (simplified); term_frequency, inverse_document_frequency,
# doc_length, and avg_doc_length are assumed to be defined elsewhere
def bm25_score(query, document, k1=1.5, b=0.75):
    score = 0.0
    for term in query:
        tf = term_frequency(term, document)      # occurrences of term in this document
        idf = inverse_document_frequency(term)   # rarity of term across the corpus
        length_norm = 1 - b + b * (doc_length(document) / avg_doc_length())
        score += idf * (tf * (k1 + 1)) / (tf + k1 * length_norm)
    return score
```

But users don't search the way documents are written. They search for:
- Synonyms: "sneakers" vs "athletic shoes" vs "trainers"
- Misspellings: "runing shoes" (we've all done it)
- Concepts: "shoes for marathon training" (no product contains these exact words)
The Semantic Search Overcorrection
Many teams respond by throwing vector embeddings at the problem. "Just embed everything with OpenAI and do cosine similarity!"
This helps with synonyms but creates new problems:
- Vocabulary mismatch: Generic embeddings don't know your product taxonomy
- Over-generalization: "Nike Air Max" and "Adidas Ultraboost" become too similar
- Lost precision: Exact matches get buried under semantic approximations
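To see why, here's roughly what "just embed and do cosine similarity" boils down to. This is a minimal sketch: `embed()` is a placeholder for whatever embedding API you call, and it's assumed to return one L2-normalized vector per string.

```python
import numpy as np

# Minimal sketch of pure vector search over product titles.
# embed() is a hypothetical embedding function assumed to return
# L2-normalized vectors, so a dot product equals cosine similarity.
def semantic_search(query, product_titles, embed, top_k=10):
    product_vecs = np.array([embed(t) for t in product_titles])  # (n_products, dim)
    query_vec = np.array(embed(query))                           # (dim,)
    scores = product_vecs @ query_vec        # cosine similarity per product
    ranked = np.argsort(-scores)[:top_k]     # highest similarity first
    return [(product_titles[i], float(scores[i])) for i in ranked]
```

Nothing in that loop knows your catalog: "Nike Air Max" and "Adidas Ultraboost" are only as different as a generic embedding model thinks they are, which is exactly where the precision problems come from.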
The Hybrid Approach
The solution isn't either/or—it's both. Hybrid search combines:
- BM25 for precision: When users search for "Nike Air Max 90 Size 10", give them exactly that
- Semantic search for recall: When users search for "comfortable shoes for standing all day", understand the intent
- Learned ranking: Use click data to learn what users actually want
Here's the architecture that actually works:
Query → Query Understanding → [BM25 Retrieval + Semantic Retrieval] → Fusion → Reranking → Results
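A common way to implement the fusion step is reciprocal rank fusion (RRF), which merges the BM25 and semantic candidate lists using only their ranks, so you don't have to reconcile incompatible score scales. This is a sketch, not tied to any particular search engine; `k=60` is the constant commonly used in the RRF literature, and the retriever functions in the usage comment are hypothetical.

```python
from collections import defaultdict

def reciprocal_rank_fusion(result_lists, k=60):
    """Merge ranked lists of product IDs, e.g. [bm25_ids, semantic_ids].

    Each document earns 1 / (k + rank) per list it appears in, so items
    ranked highly by either retriever float to the top of the fused list.
    """
    fused = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)

# Usage (retrievers are placeholders for your own BM25 and vector search):
# bm25_ids = bm25_search("running shoes", top_k=100)
# semantic_ids = semantic_search_ids("running shoes", top_k=100)
# candidates = reciprocal_rank_fusion([bm25_ids, semantic_ids])[:50]
```

The fused candidate list then goes to the reranker, which is where click data and learned ranking come in.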
Measuring What Matters
Before you change anything, establish baselines:
- NDCG@10 (normalized discounted cumulative gain): How well are relevant results ranked within the top 10?
- MRR (mean reciprocal rank): How high up does the first relevant result appear?
- Click-through rate: Are users engaging with results?
- Zero-result rate: How often do users get nothing?
Without metrics, you're just guessing.
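If you don't have these instrumented yet, NDCG@10 and MRR are straightforward to compute offline from judged query results. A rough sketch, assuming you have graded relevance labels (e.g. 0/1/2) for each returned result, in ranked order:

```python
import math

def ndcg_at_k(relevances, k=10):
    """NDCG@k for one query; `relevances` are graded labels in ranked order."""
    def dcg(rels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels[:k]))
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

def reciprocal_rank(ranked_relevant_flags):
    """1 / rank of the first relevant result (1-indexed), else 0."""
    for rank, is_relevant in enumerate(ranked_relevant_flags, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0

# Average these per-query values over a sample of real queries to get your
# baseline, then track them after every relevance change.
```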
What's Next?
If this sounds like your search, you're not alone. Most e-commerce search is broken in exactly these ways. The good news: these are solved problems with proven solutions.
Start with an audit to quantify where you stand, then build a roadmap based on data, not assumptions.
Struggling with search relevance? Get an audit.
Book a Discovery Call