Fixing Vector Search Blind Spots: Hybrid Exclusions and LLM Reasoning

Vector search is great until it suggests chicken when you explicitly ask for 'no chicken'. Here is how I solved semantic blind spots by combining hard database tags with forced LLM chain-of-thought.

Semantic search is a double-edged sword. It's incredibly powerful at finding "relevant" content, but sometimes it's too relevant. I experienced this friction firsthand in my meal-planner project: I'd explicitly ask for "no chicken", but the underlying RAG (Retrieval-Augmented Generation) system would still fetch chicken recipes. Why? Because the embedding of any query that mentions "chicken", even a negated one, still lands close to chicken dishes, and vector similarity is notoriously blind to boolean logic like "NOT".

To fix this, I moved from a pure vector search to a hybrid model that supports hard exclusions via tagging. But updating the database wasn't enough; I had to force the LLM to actually think about its constraints before querying the RAG. Here is how I connected the dots.

The Architecture of Exclusion

In previous posts, I've talked about Taming the Pull and giving agents Agency. But agency without guardrails leads to context bloat and poor recommendations.

The solution required two pieces: a hard tagging layer in the database, and an updated tool schema for the agent. Here is the flow:

graph TD
    A[User Request] --> B[Analyst Agent]
    B --> C{Tool Schema: Reasoning}
    C -->|Identify Constraints| D[Execute Search Tool]
    D --> E[Tag-based Filter: SQL]
    D --> F[Vector Search: Embeddings]
    E --> G[Hybrid Search Service]
    F --> G
    G --> H[Filtered Search Results]

Schema Evolution & Bilingual Tagging

First, I needed a way to store these tags. I opted for a simple junction-like table to keep things flexible, with an index on the tag itself for fast filtering.

-- internal/database/migrations/008_add_recipe_tags.up.sql
CREATE TABLE IF NOT EXISTS recipe_tags (
    recipe_id TEXT NOT NULL,
    tag TEXT NOT NULL,
    PRIMARY KEY (recipe_id, tag),
    FOREIGN KEY (recipe_id) REFERENCES recipes(id) ON DELETE CASCADE
);

CREATE INDEX IF NOT EXISTS idx_recipe_tags_tag ON recipe_tags(tag);
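
Given this table, turning a list of excluded tags into recipe IDs comes down to one parameterized `IN` query. The sketch below is illustrative only: `buildExcludedIDsQuery` is a hypothetical helper, not part of the project, and the `?` placeholder style assumes a driver such as SQLite or MySQL.

```go
package main

import "strings"

// buildExcludedIDsQuery is a hypothetical helper (not from the original code).
// It returns a parameterized query selecting every recipe that carries at
// least one of the given tags, plus the arguments to bind to it.
// Assumption: the driver uses "?" placeholders (e.g. SQLite or MySQL).
func buildExcludedIDsQuery(tags []string) (string, []any) {
	if len(tags) == 0 {
		return "", nil // nothing to exclude, so the caller can skip the query
	}
	placeholders := make([]string, len(tags))
	args := make([]any, len(tags))
	for i, tag := range tags {
		placeholders[i] = "?"
		args[i] = tag
	}
	query := "SELECT DISTINCT recipe_id FROM recipe_tags WHERE tag IN (" +
		strings.Join(placeholders, ", ") + ")"
	return query, args
}
```

The returned pair can be handed to `db.QueryContext(ctx, query, args...)` from `database/sql`. Binding the tags as arguments rather than concatenating them keeps LLM-supplied strings out of the SQL text, and the lookup is served by `idx_recipe_tags_tag`.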

Because my recipes are often in Portuguese but my interactions are in English, I updated the Extractor Agent during ingestion to generate tags in both languages. This ensures a search for "chicken" or "frango" correctly hits the metadata without requiring expensive real-time translation loops.
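
Exact-match tag filtering only works if tags are stored in a consistent form. A minimal sketch of merging the two language lists before insertion follows. `normalizeTags` is a hypothetical helper, and lowercasing plus trimming is an assumed convention rather than something the original pipeline is known to do.

```go
package main

import "strings"

// normalizeTags is a hypothetical helper: it merges several tag lists
// (for example English and Portuguese), lowercases and trims each tag,
// and drops empty strings and duplicates while keeping first-seen order.
func normalizeTags(lists ...[]string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, list := range lists {
		for _, raw := range list {
			tag := strings.ToLower(strings.TrimSpace(raw))
			if tag == "" {
				continue
			}
			if _, dup := seen[tag]; dup {
				continue
			}
			seen[tag] = struct{}{}
			out = append(out, tag)
		}
	}
	return out
}
```

Applying the same normalization to the tags the agent sends at query time would keep both sides of the comparison aligned.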

Refactoring the Search Interface

The RecipeSearcher interface in Go needed to evolve from a simple query-based search to one that accepts strict filters.

// internal/shared/interfaces.go
type RecipeSearcher interface {
    RecipeSemanticSearch(
        ctx context.Context, 
        query string, 
        excludeIDs []string, 
        excludeTags []string, // The new hard exclusion filter
    ) ([]value.Recipe, error)
}

Enforcement: Filtering Before the Math

Passing the tags down to the database layer is the last piece of the retrieval path. To ensure the LLM never even sees the forbidden recipes, I enforce the exclusion before calculating cosine similarity.

First, the SearchService resolves the tags into concrete recipe_ids. Then, in the VectorRepository, I build an exclusion map and skip the similarity math entirely for those IDs:

// internal/llm/vector_repository.go
func (r *VectorRepository) FindSimilar(ctx context.Context, queryEmbedding []float32, limit int, excludeIDs []string) ([]string, error) {
    // ... (fetch all embeddings)

    // Create a map for O(1) exclusion lookup
    excludeMap := make(map[string]struct{})
    for _, id := range excludeIDs {
        excludeMap[id] = struct{}{}
    }

    // ...
    for _, dbEmbed := range allEmbeddings {
        // Hard-filter: Skip the math entirely if the ID is excluded
        if _, excluded := excludeMap[dbEmbed.RecipeID]; excluded {
            continue
        }

        // Only calculate similarity for allowed recipes.
        // Embedding is the assumed name of the vector field on dbEmbed.
        score := cosineSimilarity(queryEmbedding, dbEmbed.Embedding)
        // ...
    }
}

This guarantees that no matter how semantically similar "chicken" is to the query, it never makes it into the scored results.
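
The exclude-then-score loop can be sketched end to end as follows. This is a self-contained illustration, not the project's repository code: the `storedEmbedding` type, its `Embedding` field, the in-memory slice, and the top-k sorting step are all assumptions.

```go
package main

import (
	"math"
	"sort"
)

// storedEmbedding is a stand-in for whatever row type the repository loads.
type storedEmbedding struct {
	RecipeID  string
	Embedding []float32
}

// cosineSimilarity returns dot(a, b) / (|a| * |b|), or 0 when the vectors
// are empty, differ in length, or either one has zero magnitude.
func cosineSimilarity(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// findSimilar scores only non-excluded embeddings and returns up to limit
// recipe IDs, best match first.
func findSimilar(query []float32, all []storedEmbedding, limit int, excludeIDs []string) []string {
	excluded := make(map[string]struct{}, len(excludeIDs))
	for _, id := range excludeIDs {
		excluded[id] = struct{}{}
	}

	type scored struct {
		id    string
		score float64
	}
	var results []scored
	for _, e := range all {
		if _, skip := excluded[e.RecipeID]; skip {
			continue // hard filter: never scored, never returned
		}
		results = append(results, scored{e.RecipeID, cosineSimilarity(query, e.Embedding)})
	}

	// Highest similarity first; the stable sort keeps input order on ties.
	sort.SliceStable(results, func(i, j int) bool {
		return results[i].score > results[j].score
	})

	if limit < 0 {
		limit = 0
	}
	if limit > len(results) {
		limit = len(results)
	}
	ids := make([]string, 0, limit)
	for _, r := range results[:limit] {
		ids = append(ids, r.id)
	}
	return ids
}
```

With made-up two-dimensional vectors, a recipe whose embedding equals the query would normally rank first. Once its ID is in `excludeIDs`, it is skipped before scoring and the next-closest recipes take its place.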

The System Prompt as a Parser

Having a structured field in the tool is only half the battle. I needed to bridge the gap between a messy user request like "I'm tired of chicken, give me something else" and the structured exclude_tags: ["chicken"] parameter.

Instead of writing a complex regex-based parser in Go, I delegated this "translation" to the system prompt. In analyst_prompt.md, I added a specific directive that acts as the logic for my runtime query parser:

**Negative Constraints**: Strictly respect any "don't want", "exclude",
or "avoid" instructions. If a user asks to exclude an ingredient, use the 
`exclude_tags` parameter when searching. You MUST provide the exclusion 
tag in English (e.g., use 'chicken' even if the user says 'sem frango').

This instruction ensures the LLM doesn't just "try" to avoid chicken; it knows exactly which structured tool parameter to use and, crucially, handles the translation to my English-indexed tags automatically.

Forcing Chain-of-Thought via Tool Schemas

This was the real "aha" moment. Having the database filter is useless if the agent blindly passes empty arrays. I noticed the Analyst agent would sometimes forget to apply the exclusions it had just acknowledged in chat.

The fix was to update the JSON Schema for the search tool to require a reasoning field, listed before the other arguments.

// Example of the updated Tool Schema structure
{
  "name": "RecipeSemanticSearch",
  "parameters": {
    "type": "object",
    "properties": {
      "reasoning": {
        "type": "string",
        "description": "Explain WHY you are searching for this and why you chose specific excludeTags."
      },
      "query": { "type": "string" },
      "excludeTags": {
        "type": "array",
        "items": { "type": "string" }
      }
    },
    "required": ["reasoning", "query", "excludeTags"]
  }
}

By forcing the LLM to output its reasoning first, I engaged its "chain-of-thought." It essentially talks itself into using the right tags: "The user wants no chicken, so I must ensure 'chicken' and 'frango' are in the excludeTags array." In my experience, this sharply reduced the cases where the model hallucinated or ignored the constraints it had just acknowledged.
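
On the Go side, the tool handler can reject calls that skip the reasoning step or omit the exclusion list. The following is a minimal sketch, assuming the arguments arrive as a raw JSON string. `searchArgs` and `parseSearchArgs` are hypothetical names, and the validation rules are one possible policy, not the project's confirmed behavior.

```go
package main

import (
	"encoding/json"
	"errors"
	"strings"
)

// searchArgs mirrors the properties of the tool schema.
type searchArgs struct {
	Reasoning   string   `json:"reasoning"`
	Query       string   `json:"query"`
	ExcludeTags []string `json:"excludeTags"`
}

// parseSearchArgs is a hypothetical validator: it decodes the tool-call
// arguments and enforces the schema's "required" list on the server side.
func parseSearchArgs(raw string) (searchArgs, error) {
	var args searchArgs
	if err := json.Unmarshal([]byte(raw), &args); err != nil {
		return searchArgs{}, err
	}
	if strings.TrimSpace(args.Reasoning) == "" {
		return searchArgs{}, errors.New("reasoning must not be empty")
	}
	if strings.TrimSpace(args.Query) == "" {
		return searchArgs{}, errors.New("query must not be empty")
	}
	// A missing or null excludeTags decodes to a nil slice, while an explicit
	// [] decodes to an empty non-nil slice, so nil means the model skipped it.
	if args.ExcludeTags == nil {
		return searchArgs{}, errors.New("excludeTags is required (use [] for no exclusions)")
	}
	return args, nil
}
```

Returning the error to the model as the tool result gives it a chance to retry with the missing fields. Note that JSON Schema itself does not guarantee the order in which properties are generated; listing `reasoning` first relies on the model provider following the declared order.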

Regaining Control Over Discovery

Adding this hybrid layer changed the experience from "AI guessing" to "AI following instructions."

Vector search excels at fuzzy discovery but fails at boolean logic. By combining semantic search with hard metadata filtering, and forcing the agent to reason about those filters, we get the best of both worlds. Instead of wrestling with a RAG system about why it keeps fetching chicken dishes when I said "no chicken", I've simply removed them from its "vision" entirely.
