Hybrid Search in Azure AI Search

What is hybrid search?

Hybrid search combines full-text and vector queries into a single request, running against an index that holds both plain-text content and generated embeddings.

Vector search excels at surfacing information that is conceptually related to your query, even when no exact keywords match in the inverted index. Full-text search, on the other hand, offers precision, and can be paired with optional semantic ranking to further improve the quality of results.

Prerequisites

An Azure Cosmos DB database with sample documents to index
An Azure AI Search service
A search index containing searchable vector and nonvector fields
The semantic ranker enabled on the search service

Overview of the services

The Azure Cosmos DB database holds five documents, each with the following structure:

{
    "id": "doc-005",
    "title": "Azure OpenAI Integration Patterns",
    "category": "AI Architecture",
    "tags": [
        "OpenAI",
        "Azure",
        "Integration"
    ],
    "content": "Azure OpenAI can be integrated with backend systems using APIs, event-driven architectures, or orchestration frameworks like Semantic Kernel. It enables intelligent applications with natural language capabilities.",
    "summary": "Patterns for integrating Azure OpenAI into applications.",
    "author": "Michael Green",
    "createdDate": "2025-05-01T16:20:00Z",
    "region": "EU",
    "accessLevel": "public"
}

The Azure AI Search service contains an index with both text fields and a vector field.

Notice the text_vector field, which is used for the vector query.

Finally, the index defines a semantic configuration, which the code below references by name as rag-hybrid-semantic-configuration.

Rise of the working example

Let’s walk through how to build this in C#. Here’s the code:

using Azure;
using Azure.Identity;
using Azure.Search.Documents;
using Azure.Search.Documents.Models;
using Azure.AI.OpenAI;

string searchEndpoint = Environment.GetEnvironmentVariable("AZURE_SEARCH_ENDPOINT")!;
string indexName = "rag-hybrid";

string openAiEndpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string deploymentName = "text-embedding-ada-itt";

var azureOpenAIClient = new AzureOpenAIClient(
    new Uri(openAiEndpoint),
    new DefaultAzureCredential());

// Create embedding client
var embeddingClient = azureOpenAIClient.GetEmbeddingClient(deploymentName);

string query = "how to integrate RAG";

// Generate embedding
var embeddingResponse = await embeddingClient.GenerateEmbeddingAsync(query);

// Extract the embedding as ReadOnlyMemory<float>, the type VectorizedQuery accepts
var queryVector = embeddingResponse.Value.ToFloats();

string searchKey = Environment.GetEnvironmentVariable("AZURE_SEARCH_KEY")!;

SearchClient client = new(
    new Uri(searchEndpoint),
    indexName,
    new AzureKeyCredential(searchKey));

SearchResults<SearchDocument> results = await client.SearchAsync<SearchDocument>(
    query,
    new SearchOptions
    {
        VectorSearch = new()
        {
            Queries =
            {
                new VectorizedQuery(queryVector)
                {
                    Fields = { "text_vector" },
                    Exhaustive = true,
                    KNearestNeighborsCount = 5
                },
            }
        },
        QueryType = SearchQueryType.Semantic,
        SemanticSearch = new SemanticSearchOptions
        {
            SemanticConfigurationName = "rag-hybrid-semantic-configuration"
        },
        Select = { "summary", "chunk", "category", "author" },
        Size = 5
    });

await foreach (SearchResult<SearchDocument> result in results.GetResultsAsync())
{
    Console.WriteLine($"{result.Document["category"]} -> {result.Document["summary"]} \n" +
        $"{result.Document["chunk"]}");
    Console.WriteLine();
}

The core of the example is the SearchAsync call, which sends a single request that does four things.

First, the query string is passed as the first argument to SearchAsync, which triggers traditional keyword search.

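Second, the VectorizedQuery uses the embedding generated earlier to search the text_vector field in the index. KNearestNeighborsCount = 5 asks for the 5 nearest neighbors, and Exhaustive = true forces a full scan of the indexed vectors instead of an approximate search.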
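
To make the vector part of the query more concrete, here is a minimal sketch of what an exhaustive k-nearest-neighbor search does conceptually: score every stored vector against the query vector and keep the top k. This is not how Azure AI Search is implemented internally. Cosine similarity is an assumption (the actual metric depends on the index's vector configuration), and the Cosine and NearestNeighbors helpers and the toy vectors are hypothetical.

```csharp
using System;
using System.Linq;

// Cosine similarity between two vectors of equal length.
static double Cosine(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Exhaustive k-nearest-neighbor search: score every vector, keep the top k ids.
static string[] NearestNeighbors((string Id, float[] Vector)[] docs, float[] query, int k)
{
    return docs
        .OrderByDescending(d => Cosine(d.Vector, query))
        .Take(k)
        .Select(d => d.Id)
        .ToArray();
}

// Toy 3-dimensional vectors (hypothetical); real embeddings are far larger.
var toyDocs = new (string Id, float[] Vector)[]
{
    ("doc-001", new float[] { 1f, 0f, 0f }),
    ("doc-002", new float[] { 0.7f, 0.7f, 0f }),
    ("doc-003", new float[] { 0f, 1f, 0f }),
};
float[] toyQuery = { 1f, 0.05f, 0f };

string[] nearest = NearestNeighbors(toyDocs, toyQuery, k: 2);
Console.WriteLine(string.Join(", ", nearest));
```

With these toy vectors the query points almost exactly along the first axis, so doc-001 is the closest match and doc-002 comes second. An exhaustive scan like this is exact but grows linearly with the number of vectors, which is why approximate algorithms are usually preferred for large indexes.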

Third, the results from keyword search and vector search are merged using Reciprocal Rank Fusion (RRF), producing a single ranked list.
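
Azure AI Search documents Reciprocal Rank Fusion (RRF) as the method it uses to merge the keyword and vector rankings. The sketch below illustrates the idea only: each document earns 1 / (k + rank) from every list it appears in, and the contributions are summed. The FuseWithRrf helper and the two sample rankings are hypothetical, and k = 60 is the constant commonly used for RRF, treated here as an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reciprocal Rank Fusion: sum 1 / (k + rank) over every ranked list
// in which a document appears. Ranks are 1-based.
static Dictionary<string, double> FuseWithRrf(string[][] rankedLists, int k = 60)
{
    var scores = new Dictionary<string, double>();
    foreach (string[] list in rankedLists)
    {
        for (int i = 0; i < list.Length; i++)
        {
            int rank = i + 1;
            scores.TryGetValue(list[i], out double current); // 0 if not seen yet
            scores[list[i]] = current + 1.0 / (k + rank);
        }
    }
    return scores;
}

// Hypothetical rankings for the same query from the two retrievers.
string[] keywordRanking = { "doc-005", "doc-002", "doc-001" };
string[] vectorRanking = { "doc-003", "doc-005", "doc-002" };

var fused = FuseWithRrf(new[] { keywordRanking, vectorRanking });

// Print documents from highest to lowest fused score.
foreach (var pair in fused.OrderByDescending(p => p.Value))
{
    Console.WriteLine($"{pair.Key}: {pair.Value:F4}");
}
```

In this toy run doc-005 ends up first because it sits near the top of both lists, while doc-003 and doc-001 each score from only one list. That is the practical appeal of RRF: it only needs rank positions, so keyword scores and vector similarity scores never have to be put on a common scale.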

Fourth, the QueryType is set to Semantic, which activates Azure's semantic re-ranker. This re-ranks the fused results using a language model that understands context, not just keywords. The semantic configuration defines how the re-ranker interprets fields like title and content, guiding how relevance is computed.

Let's observe the results returned by running two different queries. The first sets string query = "how to integrate RAG"; and the second sets string query = "written by Michael Green";
