
Router Query Engine

Route queries to different indices using the LlamaIndex RouterQueryEngine for multi-document search.

Ravi Theja
@ravi03071991
Published on March 1, 2024

Router Query Engine

In this notebook, we will look into the RouterQueryEngine, which routes user queries to one of the available query engine tools. These tools can wrap different indices or query engines built over the same documents or over different documents.

Installation

python
!pip install llama-index
!pip install llama-index-llms-anthropic
!pip install llama-index-embeddings-huggingface

Set Logging

python
# NOTE: This is ONLY necessary in jupyter notebook.
# Details: Jupyter runs an event-loop behind the scenes.
#          This results in nested event-loops when we start an event-loop to make async queries.
#          This is normally not allowed; we use nest_asyncio to allow it for convenience.
import nest_asyncio
 
nest_asyncio.apply()
 
import logging
import sys
 
# Set up the root logger
logger = logging.getLogger()
logger.setLevel(logging.INFO)  # Set logger level to INFO
 
# Clear out any existing handlers
logger.handlers = []
 
# Set up the StreamHandler to output to sys.stdout (Colab's output)
handler = logging.StreamHandler(sys.stdout)
handler.setLevel(logging.INFO)  # Set handler level to INFO
 
# Add the handler to the logger
logger.addHandler(handler)
 
from IPython.display import HTML, display

Set Claude API Key

python
import os
 
os.environ["ANTHROPIC_API_KEY"] = "YOUR ANTHROPIC API KEY"

Set LLM and Embedding Model

We will use Anthropic's Claude Opus LLM together with the BAAI/bge-base-en-v1.5 embedding model from Hugging Face.

python
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.anthropic import Anthropic
python
llm = Anthropic(temperature=0.0, model="claude-opus-4-1")
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-base-en-v1.5")
python
from llama_index.core import Settings
 
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512
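
These Settings act as global defaults, so the indices and query engines created below pick up this LLM, embedding model, and chunk size without passing them explicitly.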

Download Document

python
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/jerryjliu/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
--2024-03-08 07:04:27--  https://raw.githubusercontent.com/jerryjliu/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’

data/paul_graham/pa 100%[===================>]  73.28K  --.-KB/s    in 0.002s  

2024-03-08 07:04:27 (28.6 MB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]

Load Document

python
# load documents
from llama_index.core import SimpleDirectoryReader
 
documents = SimpleDirectoryReader("data/paul_graham").load_data()

Create Indices and Query Engines

python
from llama_index.core import SummaryIndex, VectorStoreIndex
 
# Summary Index for summarization questions
summary_index = SummaryIndex.from_documents(documents)
 
# Vector Index for answering specific context questions
vector_index = VectorStoreIndex.from_documents(documents)
python
# Summary Index Query Engine
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    use_async=True,
)
 
# Vector Index Query Engine
vector_query_engine = vector_index.as_query_engine()
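
The tree_summarize response mode builds a tree of summaries over the retrieved chunks and returns the root summary, which suits document-level summarization questions; use_async=True lets those summarization calls run concurrently, which is why nest_asyncio was applied earlier.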

Create Tools for Summary and Vector Query Engines

python
from llama_index.core.tools.query_engine import QueryEngineTool
 
# Summary Index tool
summary_tool = QueryEngineTool.from_defaults(
    query_engine=summary_query_engine,
    description="Useful for summarization questions related to Paul Graham eassy on What I Worked On.",
)
 
# Vector Index tool
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description="Useful for retrieving specific context from Paul Graham essay on What I Worked On.",
)
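
Tools can also carry an explicit name alongside the description; the selector sees both when choosing an engine. A minimal sketch (the name string here is illustrative, not part of the original notebook):

python
# Optionally give the tool a name; selectors see the name and description together.
named_vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    name="vector_search",  # illustrative name
    description="Useful for retrieving specific context from Paul Graham essay on What I Worked On.",
)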

Create Router Query Engine

python
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.selectors.llm_selectors import LLMSingleSelector
python
# Create Router Query Engine
query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
)
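
If a query could legitimately need more than one tool, you can swap the single selector for a multi-selector. A minimal sketch, assuming the same tools as above; when multiple engines are selected, the router aggregates their answers into a single response:

python
from llama_index.core.selectors.llm_selectors import LLMMultiSelector
 
# Router that may select one or more query engine tools per query
multi_query_engine = RouterQueryEngine(
    selector=LLMMultiSelector.from_defaults(),
    query_engine_tools=[
        summary_tool,
        vector_tool,
    ],
)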

Test Queries

python
response = query_engine.query("What is the summary of the document?")
HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
Selecting query engine 0: The question is asking for a summary of the document. Choice 1 specifically mentions that it is useful for summarization questions related to Paul Graham's essay on What I Worked On, making it the most relevant choice for answering the given question..
HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
python
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
python
response = query_engine.query("What did Paul Graham do growing up?")
HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
Selecting query engine 1: The question asks about specific details from Paul Graham's life, which would likely be found in the original essay. A summary of the essay may not include all the relevant details about what he did growing up..
HTTP Request: POST https://api.anthropic.com/v1/messages "HTTP/1.1 200 OK"
python
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
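
To confirm programmatically which engine handled a query, you can look at the response metadata; the RouterQueryEngine records the selector's decision there. A minimal sketch (the "selector_result" key reflects the current implementation and may vary across versions):

python
# Inspect the routing decision attached to the response
# ("selector_result" key assumed from the RouterQueryEngine implementation).
print(response.metadata.get("selector_result"))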