The Secret Behind Better RAG Results: Hybrid Search in Azure AI Search

Inside Azure AI Search: Hybrid Retrieval with Vector and Semantic Ranking



Hybrid search combines full-text and vector queries into a single request, running against an index that holds both plain-text content and generated embeddings.

Vector search excels at surfacing information that is conceptually related to your query, even when no exact keywords match in the inverted index. Full-text search, on the other hand, offers precision, and can be paired with optional semantic ranking to further improve the quality of results.


Prerequisites

  • An Azure Cosmos DB database with sample documents to index

  • An Azure AI Search service

  • A search index containing searchable vector and nonvector fields

  • The semantic ranker enabled on the search service


Overview of the services

The Azure Cosmos DB database holds five documents with the following structure:

{
    "id": "doc-005",
    "title": "Azure OpenAI Integration Patterns",
    "category": "AI Architecture",
    "tags": [
        "OpenAI",
        "Azure",
        "Integration"
    ],
    "content": "Azure OpenAI can be integrated with backend systems using APIs, event-driven architectures, or orchestration frameworks like Semantic Kernel. It enables intelligent applications with natural language capabilities.",
    "summary": "Patterns for integrating Azure OpenAI into applications.",
    "author": "Michael Green",
    "createdDate": "2025-05-01T16:20:00Z",
    "region": "EU",
    "accessLevel": "public"
}

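The Azure AI Search service hosts an index with both text fields and a vector field.

Notice the text_vector field, which is the target of the vector query.

Finally, the index defines a semantic configuration, referenced by name (rag-hybrid-semantic-configuration) in the query code later in this post.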
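
For readers who prefer code to the portal, an index of this general shape could be defined with the Azure.Search.Documents SDK roughly as follows. This is a sketch under assumptions, not the exact index used in this post: the field list is inferred from the Select clause and vector query shown later, while the chunk_id key field, the hybrid-profile and hybrid-hnsw names, and the 1536 vector dimensions are all hypothetical and should be matched to your own data and embedding model.

```csharp
using System;
using Azure.Search.Documents.Indexes.Models;

// Sketch only. Field names are inferred from the query code in this post;
// "chunk_id", "hybrid-profile", "hybrid-hnsw", and 1536 dimensions are assumptions.
var index = new SearchIndex("rag-hybrid")
{
    Fields =
    {
        new SimpleField("chunk_id", SearchFieldDataType.String) { IsKey = true },
        new SearchableField("title"),
        new SearchableField("chunk"),
        new SearchableField("summary"),
        new SearchableField("category") { IsFilterable = true },
        new SearchableField("author") { IsFilterable = true },
        // Vector field: name, dimensions of the embedding model, vector profile name.
        new VectorSearchField("text_vector", 1536, "hybrid-profile"),
    },
    VectorSearch = new VectorSearch
    {
        Algorithms = { new HnswAlgorithmConfiguration("hybrid-hnsw") },
        Profiles = { new VectorSearchProfile("hybrid-profile", "hybrid-hnsw") },
    },
    SemanticSearch = new SemanticSearch
    {
        Configurations =
        {
            new SemanticConfiguration(
                "rag-hybrid-semantic-configuration",
                new SemanticPrioritizedFields
                {
                    // Tell the semantic ranker which fields act as title and content.
                    TitleField = new SemanticField("title"),
                    ContentFields = { new SemanticField("chunk") },
                }),
        },
    },
};

Console.WriteLine($"{index.Name}: {index.Fields.Count} fields");
```

The object above only describes the index; sending it to the service would additionally require a SearchIndexClient and a call such as CreateOrUpdateIndexAsync.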


Rise of the working example

Let’s walk through how to build this in C#. Here’s the code:

using Azure;
using Azure.Identity;
using Azure.Search.Documents;
using Azure.Search.Documents.Models;
using Azure.AI.OpenAI;

string searchEndpoint = Environment.GetEnvironmentVariable("AZURE_SEARCH_ENDPOINT")!;
string indexName = "rag-hybrid";

string openAiEndpoint = Environment.GetEnvironmentVariable("AZURE_OPENAI_ENDPOINT")!;
string deploymentName = "text-embedding-ada-itt";

var azureOpenAIClient = new AzureOpenAIClient(
    new Uri(openAiEndpoint),
    new DefaultAzureCredential());

// Create embedding client
var embeddingClient = azureOpenAIClient.GetEmbeddingClient(deploymentName);

string query = "how to integrate RAG";

// Generate embedding
var embeddingResponse = await embeddingClient.GenerateEmbeddingAsync(query);

// Extract the embedding as ReadOnlyMemory<float>, the type VectorizedQuery accepts
var queryVector = embeddingResponse.Value.ToFloats();

string searchKey = Environment.GetEnvironmentVariable("AZURE_SEARCH_KEY")!;

SearchClient client = new(
    new Uri(searchEndpoint),
    indexName,
    new AzureKeyCredential(searchKey));

SearchResults<SearchDocument> results = await client.SearchAsync<SearchDocument>(
    query,
    new SearchOptions
    {
        VectorSearch = new()
        {
            Queries =
            {
                new VectorizedQuery(queryVector)
                {
                    Fields = { "text_vector" },
                    Exhaustive = true,
                    KNearestNeighborsCount = 5
                },
            }
        },
        QueryType = SearchQueryType.Semantic,
        SemanticSearch = new SemanticSearchOptions
        {
            SemanticConfigurationName = "rag-hybrid-semantic-configuration"
        },
        Select = { "summary", "chunk", "category", "author" },
        Size = 5
    });

await foreach (SearchResult<SearchDocument> result in results.GetResultsAsync())
{
    Console.WriteLine($"{result.Document["category"]} -> {result.Document["summary"]} \n" +
        $"{result.Document["chunk"]}");
    Console.WriteLine();
}

The core of the example is the SearchAsync call, which sends a single request that sets four things in motion.

First, the query string is passed as the first argument to SearchAsync, which triggers traditional keyword search.

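Second, the VectorizedQuery uses the embedding generated earlier to search the text_vector field in the index. KNearestNeighborsCount = 5 asks for the five nearest neighbors, and Exhaustive = true forces an exhaustive scan of all vectors instead of an approximate nearest-neighbor search.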
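
To make the idea of an exhaustive nearest-neighbor search concrete, here is a minimal, self-contained sketch that scores every stored vector against the query and keeps the top k. It is an illustration only, not what the service runs internally: the three-dimensional vectors and document ids are made up, and cosine similarity is assumed as the metric (the actual metric depends on how the vector field is configured).

```csharp
using System;
using System.Linq;

// Cosine similarity between two vectors of equal length.
static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Hypothetical toy "index": ids and 3-dimensional vectors are made up.
var toyIndex = new (string Id, float[] Vector)[]
{
    ("doc-001", new float[] { 1f, 0f, 0f }),
    ("doc-002", new float[] { 0.6f, 0.8f, 0f }),
    ("doc-003", new float[] { 0f, 1f, 0f }),
};

float[] toyQueryVector = { 1f, 0.05f, 0f };
int k = 2;

// Exhaustive search: score every vector, sort by similarity, keep the top k.
var nearest = toyIndex
    .Select(d => (d.Id, Score: CosineSimilarity(toyQueryVector, d.Vector)))
    .OrderByDescending(r => r.Score)
    .Take(k)
    .ToList();

foreach (var (id, score) in nearest)
{
    Console.WriteLine($"{id}: {score:F3}");
}
```

Scanning every vector gives exact results but costs time proportional to the index size, which is why approximate algorithms are the usual choice for large indexes and an exhaustive scan suits small ones like the five-document sample here.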

Third, the results from keyword search and vector search are merged using Reciprocal Rank Fusion (RRF), producing a single ranked list.
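
Azure AI Search performs this merge with Reciprocal Rank Fusion (RRF). The sketch below shows the general idea in a few lines: each document earns 1 / (k + rank) from every ranked list it appears in, and the sums decide the fused order. The document ids and rankings are hypothetical, k = 60 is the commonly cited constant and is an assumption here, and the function name FuseWithRrf is made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Reciprocal Rank Fusion: sum 1 / (k + rank) over all lists, with 1-based ranks.
static List<(string Id, double Score)> FuseWithRrf(int k, params string[][] rankedLists)
{
    var scores = new Dictionary<string, double>();
    foreach (var list in rankedLists)
    {
        for (int i = 0; i < list.Length; i++)
        {
            scores.TryGetValue(list[i], out double current);
            scores[list[i]] = current + 1.0 / (k + i + 1);
        }
    }
    return scores
        .Select(p => (Id: p.Key, Score: p.Value))
        .OrderByDescending(r => r.Score)
        .ToList();
}

// Hypothetical rankings from the two retrieval paths.
string[] keywordRanking = { "doc-005", "doc-002", "doc-001" };
string[] vectorRanking = { "doc-003", "doc-005", "doc-002" };

var fused = FuseWithRrf(60, keywordRanking, vectorRanking);

foreach (var (id, score) in fused)
{
    Console.WriteLine($"{id}: {score:F4}");
}
```

With these inputs, doc-005 ends up first because it ranks highly in both lists, followed by doc-002, doc-003, and doc-001. A document that appears in only one list can still surface, but agreement between the two retrieval paths is rewarded.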

Fourth, QueryType is set to Semantic, which activates the Azure AI Search semantic ranker. It re-ranks the fused results using a language model that takes context into account, not just keywords. The semantic configuration tells the ranker which index fields to treat as title and content, which guides how relevance is computed.

Let's compare the results returned by two different queries. The first is "how to integrate RAG"; the second is "written by Michael Green".

