Traditional RAG is good at retrieving relevant chunks but struggles with questions that require multi-hop reasoning across entities and relationships. Graph RAG addresses this by combining a knowledge graph with vector search.
## When to Use Graph RAG

Graph RAG shines for queries involving:
- Entity relationships (“Who reported to X in 2023?”)
- Multi-hop reasoning (“What products do the suppliers of competitor Y offer?”)
- Temporal connections (“How did policy Z evolve?”)
- Complex hierarchies (org charts, dependency trees)
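
To make the multi-hop case concrete, here is a minimal sketch using a hand-built list of triples in plain Python. The data, the relation names (`COMPETES_WITH`, `OFFERS`), and the `follow` helper are all made up for illustration; a real system would run the equivalent traversal in a graph database.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, relation, object) triples.
TRIPLES = [
    ("Acme", "COMPETES_WITH", "Globex"),
    ("Acme", "COMPETES_WITH", "Initech"),
    ("Globex", "OFFERS", "Widget Pro"),
    ("Initech", "OFFERS", "TPS Cloud"),
    ("Acme", "OFFERS", "Rocket Skates"),
]


def build_index(triples):
    """Map (subject, relation) -> list of objects for fast edge lookups."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[(subject, relation)].append(obj)
    return index


def follow(index, start, relations):
    """Follow a chain of relation types from start, one hop per relation."""
    frontier = [start]
    for relation in relations:
        frontier = [obj for node in frontier for obj in index[(node, relation)]]
    return frontier


index = build_index(TRIPLES)
# Two hops: Acme -> its competitors -> the products they offer.
competitor_products = follow(index, "Acme", ["COMPETES_WITH", "OFFERS"])
print(competitor_products)
```

A vector search over chunks would have to find a passage that happens to state the full chain; the traversal composes it from two separate facts.
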
## Architecture Comparison

| Aspect | Vector RAG | Graph RAG |
|---|---|---|
| Retrieval | Semantic similarity | Graph traversal + similarity |
| Relationships | Implicit | Explicit, typed |
| Multi-hop | Poor | Excellent |
| Complexity | Low | Medium-High |
| Setup time | Hours | Days-Weeks |

## Knowledge Graph Construction

Extract entities and relationships from text using LLMs:
```python
import json

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment


def extract_graph(text: str) -> dict:
    """Ask the model for entities and relationships and parse its JSON reply."""
    prompt = f"""Extract entities and relationships from this text.

Text: {text}

Output only JSON in this shape:
{{
  "entities": [
    {{"id": "e1", "type": "Person", "name": "John Smith"}},
    {{"id": "e2", "type": "Company", "name": "Acme Corp"}}
  ],
  "relationships": [
    {{"from": "e1", "to": "e2", "type": "WORKS_AT", "properties": {{"since": "2020"}}}}
  ]
}}"""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=2000,
        messages=[{"role": "user", "content": prompt}],
    )
    # json.loads raises if the reply contains anything besides the JSON object.
    return json.loads(response.content[0].text)
```
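
Model output is not guaranteed to be clean JSON or internally consistent, so it can help to validate it before loading it into a graph. The following is a minimal sketch, not part of any library: `parse_graph_json` is a hypothetical helper that strips an optional Markdown code fence and drops relationships that reference unknown entity IDs.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker


def parse_graph_json(raw: str) -> dict:
    """Parse the model's reply and keep only relationships with known endpoints."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (which may carry a language tag)
        # and everything from the closing fence onward.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)

    entities = data.get("entities", [])
    known_ids = {entity["id"] for entity in entities}
    relationships = [
        rel
        for rel in data.get("relationships", [])
        if rel.get("from") in known_ids and rel.get("to") in known_ids
    ]
    return {"entities": entities, "relationships": relationships}


reply = (
    FENCE + "json\n"
    '{"entities": [{"id": "e1", "type": "Person", "name": "John Smith"}], '
    '"relationships": [{"from": "e1", "to": "e9", "type": "WORKS_AT"}]}\n'
    + FENCE
)
graph = parse_graph_json(reply)
print(graph["relationships"])  # the edge to the unknown "e9" is dropped
```
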
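## Storing in Neo4j

```python
import re

from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Labels and relationship types cannot be passed as Cypher parameters, so they
# are interpolated into the query text. Validate them first to avoid injection.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _checked(name: str) -> str:
    if not _IDENTIFIER.match(name):
        raise ValueError(f"Unsafe label or relationship type: {name!r}")
    return name


def insert_graph(graph_data):
    with driver.session() as session:
        # Create entities
        for entity in graph_data["entities"]:
            session.run(
                f"""
                MERGE (e:{_checked(entity["type"])} {{id: $id}})
                SET e.name = $name
                """,
                id=entity["id"],
                name=entity["name"],
            )
        # Create relationships
        for rel in graph_data["relationships"]:
            session.run(
                f"""
                MATCH (src {{id: $from_id}})
                MATCH (dst {{id: $to_id}})
                MERGE (src)-[r:{_checked(rel["type"])}]->(dst)
                SET r += $properties
                """,
                from_id=rel["from"],
                to_id=rel["to"],
                properties=rel.get("properties", {}),
            )
```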
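
Running one query per entity gets slow for large extractions. One common option is to group rows by label and send each group in a single `UNWIND` query. The sketch below only builds the query strings and parameters, so it can be checked without a database; `build_entity_batches` is a hypothetical helper, and each pair it returns would be passed to `session.run(query, rows=rows)`.

```python
import re
from collections import defaultdict

IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_entity_batches(entities):
    """Return (query, rows) pairs, one per entity label."""
    rows_by_label = defaultdict(list)
    for entity in entities:
        label = entity["type"]
        # Labels are interpolated into the query text, so reject anything unusual.
        if not IDENTIFIER.match(label):
            raise ValueError(f"Unsafe label: {label!r}")
        rows_by_label[label].append({"id": entity["id"], "name": entity["name"]})

    batches = []
    for label, rows in sorted(rows_by_label.items()):
        query = (
            "UNWIND $rows AS row "
            f"MERGE (e:{label} {{id: row.id}}) "
            "SET e.name = row.name"
        )
        batches.append((query, rows))
    return batches


entities = [
    {"id": "e1", "type": "Person", "name": "John Smith"},
    {"id": "e2", "type": "Company", "name": "Acme Corp"},
    {"id": "e3", "type": "Person", "name": "Jane Doe"},
]
batches = build_entity_batches(entities)
for query, rows in batches:
    print(query, len(rows))
```
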
## Hybrid Vector + Graph Search

Combine graph traversal with embeddings so that a query starts from semantically similar entities and then follows their relationships:

```python
# extract_entities, embed_text, vector_db, subgraph_to_text, and llm are
# placeholders for your own components; driver is the Neo4j driver.
def graph_rag_search(query: str):
    # 1. Extract entities from query
    query_entities = extract_entities(query)

    # 2. Find similar entities in graph using embeddings
    #    (assumes vector_db.search returns entity IDs)
    entity_matches = []
    for entity in query_entities:
        embedding = embed_text(entity["name"])
        similar = vector_db.search(embedding, top_k=5, filter={"type": entity["type"]})
        entity_matches.extend(similar)

    # 3. Graph traversal from matched entities
    subgraphs = []
    with driver.session() as session:
        for entity_id in entity_matches:
            # Get 2-hop neighborhood
            result = session.run(
                """
                MATCH path = (start {id: $entity_id})-[*1..2]-(connected)
                RETURN path
                """,
                entity_id=entity_id,
            )
            subgraphs.append(result.data())

    # 4. Convert subgraph to text context
    context = subgraph_to_text(subgraphs)

    # 5. Generate answer with LLM
    return llm.generate(f"Context:\n{context}\n\nQuestion: {query}")
```
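
The `subgraph_to_text` step is left undefined in the search function. One simple way to do it, assuming the traversal results have already been flattened into `(subject, relation, object)` triples, is to render each unique triple as a short sentence. The function name and input shape here are assumptions for illustration.

```python
def triples_to_text(triples):
    """Render unique (subject, relation, object) triples as one line each."""
    seen = set()
    lines = []
    for subject, relation, obj in triples:
        key = (subject, relation, obj)
        if key in seen:
            continue  # overlapping paths repeat the same edge
        seen.add(key)
        # Turn a type such as WORKS_AT into the phrase "works at".
        verb = relation.replace("_", " ").lower()
        lines.append(f"{subject} {verb} {obj}.")
    return "\n".join(lines)


triples = [
    ("John Smith", "WORKS_AT", "Acme Corp"),
    ("John Smith", "REPORTS_TO", "Jane Doe"),
    ("John Smith", "WORKS_AT", "Acme Corp"),  # duplicate from a second path
]
context = triples_to_text(triples)
print(context)
```

Deduplicating matters because 2-hop paths from nearby start nodes overlap heavily, and repeated edges waste context-window tokens.
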
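## GraphRAG by Microsoft

Microsoft’s GraphRAG automates graph construction and querying. The snippet below is illustrative pseudocode for the workflow, not the package’s actual interface: the `graphrag` package is driven mainly through its command-line tools (`graphrag index`, `graphrag query`) and a settings file, so check its documentation for the current API.

```python
# Illustrative pseudocode: the GraphRAG class, its arguments, and these
# methods are not the real graphrag API.
from graphrag import GraphRAG

# Initialize
rag = GraphRAG(
    llm="gpt-4",
    embedding_model="text-embedding-3-large",
    graph_db="neo4j://localhost:7687",
)

# Index documents (auto-extracts graph)
rag.index_documents(documents)

# Query with graph-aware retrieval
response = rag.query(
    "What products did competitors launch after our Q2 release?",
    depth=2,  # 2-hop traversal
)
```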
## Community Detection for Summarization

GraphRAG uses community detection (the Leiden algorithm) to group related entities and generate hierarchical summaries. The loop below sketches the idea; `detect_communities`, `get_texts_for_nodes`, `graph`, and `llm` are placeholders, not library functions:

```python
# detect_communities stands in for your clustering step, for example Leiden
# via the leidenalg package. It is assumed to return {community_id: [nodes]}.
communities = detect_communities(graph, algorithm="leiden")

# Generate summary for each community
for community_id, nodes in communities.items():
    # Get all text associated with community nodes (assumed to be a list of strings)
    texts = "\n".join(get_texts_for_nodes(nodes))
    # Generate community summary
    summary = llm.generate(f"Summarize the following related information:\n{texts}")
    # Store summary as community node
    graph.add_node(f"COMMUNITY_{community_id}", summary=summary, level=1)
```
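
To show how per-community summaries roll up into a higher level, here is a small sketch with the LLM call replaced by a plain function argument. `summarize_communities` and the toy `fake_summarize` are hypothetical names, and community membership is supplied by hand rather than computed by Leiden.

```python
from typing import Callable, Dict, List


def summarize_communities(
    communities: Dict[str, List[str]],
    node_texts: Dict[str, str],
    summarize: Callable[[str], str],
) -> Dict[str, str]:
    """Summarize each community, then summarize the summaries as a root level."""
    summaries = {}
    for community_id, nodes in sorted(communities.items()):
        texts = [node_texts[node] for node in nodes if node in node_texts]
        summaries[community_id] = summarize("\n".join(texts))
    # Level above: one summary built from the community summaries.
    child_summaries = [summaries[cid] for cid in sorted(communities)]
    summaries["ROOT"] = summarize("\n".join(child_summaries))
    return summaries


def fake_summarize(text: str) -> str:
    """Stand-in for an LLM call: keep the first word of each non-empty line."""
    return " | ".join(line.split()[0] for line in text.splitlines() if line.split())


communities = {"c1": ["alice", "bob"], "c2": ["acme"]}
node_texts = {
    "alice": "Alice leads the platform team",
    "bob": "Bob maintains the billing service",
    "acme": "Acme Corp sells industrial widgets",
}
summaries = summarize_communities(communities, node_texts, fake_summarize)
print(summaries)
```

Passing the summarizer in as a function keeps the roll-up logic testable without a model; in practice it would wrap the LLM call.
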
## Cypher Query Generation

Let LLMs generate Cypher queries from natural language:

```python
def nl_to_cypher(question: str, schema: str):
    prompt = f"""Given this graph schema:
{schema}

Generate a Cypher query to answer:
{question}

Return only the Cypher query.
Cypher query:"""
    cypher = llm.generate(prompt)
    # The generated query is untrusted input: run it in a read transaction
    # (execute_read in the 5.x driver), ideally as a read-only database user.
    with driver.session() as session:
        return session.execute_read(lambda tx: tx.run(cypher).data())


# Example
schema = """
Nodes: Person(name, title), Company(name, industry)
Relationships: WORKS_AT(since), REPORTS_TO
"""
results = nl_to_cypher("Who are the senior engineers at Acme Corp?", schema)
```
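
Because the query text comes from a model, it is worth screening it before execution. The sketch below is a keyword heuristic under stated assumptions, not a security boundary: `is_read_only` is a hypothetical helper, its clause list is not exhaustive, and procedures invoked through `CALL` can also write, so database-level read-only permissions remain the real safeguard.

```python
import re

# Clauses that modify data or schema; a heuristic list, not an exhaustive one.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|LOAD\s+CSV|FOREACH)\b",
    re.IGNORECASE,
)


def is_read_only(cypher: str) -> bool:
    """Return True if no write clause appears outside quoted string literals."""
    # Blank out quoted strings so a value like "create account" is not flagged.
    without_strings = re.sub(r"'[^']*'|\"[^\"]*\"", "''", cypher)
    return WRITE_CLAUSES.search(without_strings) is None


safe = is_read_only("MATCH (p:Person)-[:WORKS_AT]->(c:Company) RETURN p.name")
unsafe = is_read_only("MATCH (n) DETACH DELETE n")
print(safe, unsafe)
```
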
## Temporal Graphs

Model how relationships evolve over time:

```cypher
// Create versioned relationship
MATCH (p:Person {name: "Alice"})
MATCH (c:Company {name: "Acme"})
CREATE (p)-[r:WORKED_AT {
  from: date("2020-01-01"),
  to: date("2023-06-30"),
  title: "Engineer"
}]->(c);

// Query: Where did Alice work at any point in 2022?
// A job overlaps 2022 if it started by Dec 31 and ended on or after Jan 1.
MATCH (p:Person {name: "Alice"})-[r:WORKED_AT]->(c:Company)
WHERE r.from <= date("2022-12-31") AND r.to >= date("2022-01-01")
RETURN c.name, r.title;
```
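
The interval test behind a question like “where did Alice work in 2022?” can be sketched in plain Python. This is an illustration under assumptions: `overlaps` is a hypothetical helper, the Globex row is invented sample data, and an end date of `None` is treated as a job that is still ongoing.

```python
from datetime import date
from typing import Optional


def overlaps(start: date, end: Optional[date], window_start: date, window_end: date) -> bool:
    """True if [start, end] intersects [window_start, window_end]; end=None means ongoing."""
    effective_end = end if end is not None else date.max
    return start <= window_end and effective_end >= window_start


jobs = [
    ("Acme", date(2020, 1, 1), date(2023, 6, 30)),
    ("Globex", date(2023, 7, 1), None),  # hypothetical current job
]


def employers_in_year(year: int):
    """Names of employers whose interval overlaps the given calendar year."""
    first, last = date(year, 1, 1), date(year, 12, 31)
    return [name for name, start, end in jobs if overlaps(start, end, first, last)]


print(employers_in_year(2022))
print(employers_in_year(2023))
```

Comparing against the whole year rather than a single date avoids missing jobs that started or ended partway through it.
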
## Performance Optimization

Index frequently queried properties:

```cypher
CREATE INDEX person_name FOR (p:Person) ON (p.name);
CREATE INDEX company_industry FOR (c:Company) ON (c.industry);
```

Limit traversal depth:

```cypher
// Bad: unbounded traversal
MATCH (p:Person)-[*]-(connected) RETURN connected;

// Good: limited depth
MATCH (p:Person)-[*1..3]-(connected) RETURN connected;
```
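
To see why the depth bound matters, here is a rough Python analogue: a breadth-first search that stops expanding at a fixed number of hops. The adjacency data and the `neighborhood` helper are invented for illustration, and the sketch collects distinct nodes, whereas a Cypher variable-length pattern enumerates paths.

```python
from collections import deque


def neighborhood(adjacency, start, max_depth):
    """Breadth-first search that stops expanding at max_depth hops."""
    depth_of = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth_of[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in adjacency.get(node, []):
            if neighbor not in depth_of:
                depth_of[neighbor] = depth_of[node] + 1
                queue.append(neighbor)
    return depth_of


# Hypothetical undirected toy graph.
adjacency = {
    "alice": ["bob", "acme"],
    "bob": ["alice", "carol"],
    "acme": ["alice", "globex"],
    "carol": ["bob", "dave"],
    "globex": ["acme"],
    "dave": ["carol"],
}
one_hop = neighborhood(adjacency, "alice", 1)
two_hop = neighborhood(adjacency, "alice", 2)
print(len(one_hop), len(two_hop))
```

Even in this tiny graph the reachable set grows with each hop; in a dense production graph the growth is much steeper, which is why unbounded patterns are risky.
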
## Evaluation Metrics

Measure Graph RAG effectiveness:

- Entity extraction F1: precision and recall of extracted entities and relations against a hand-labeled sample
- Graph density: edges per node (rough target: 2-5)
- Query latency: aim for p95 under 500 ms for 2-hop queries
- Answer accuracy: compare answers to ground truth on multi-hop questions
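
The first and third metrics are easy to compute once you have a labeled sample and a list of latencies. The following is a minimal sketch with invented sample data; the function names are hypothetical, and the percentile uses the nearest-rank definition, which is one of several common conventions.

```python
def precision_recall_f1(predicted, gold):
    """Score extracted items (for example triples) against a gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def p95(latencies_ms):
    """Nearest-rank 95th percentile, using integer math to avoid float rounding."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


gold = {
    ("John Smith", "WORKS_AT", "Acme Corp"),
    ("John Smith", "REPORTS_TO", "Jane Doe"),
    ("Jane Doe", "WORKS_AT", "Acme Corp"),
}
predicted = {
    ("John Smith", "WORKS_AT", "Acme Corp"),
    ("Jane Doe", "WORKS_AT", "Acme Corp"),
    ("John Smith", "WORKS_AT", "Globex"),
    ("Jane Doe", "REPORTS_TO", "John Smith"),
}
precision, recall, f1 = precision_recall_f1(predicted, gold)
latency_p95 = p95(list(range(1, 101)))
print(precision, recall, f1, latency_p95)
```
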
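## Use Case: Financial Analysis

```python
# Query: "What companies in fintech raised Series B in 2025 and hired ex-Google employees?"
def complex_query():
    # Assumed relationship types for the hiring pattern: WORKS_AT (current
    # employer) and WORKED_AT (past employer), as in the earlier examples.
    cypher = """
    MATCH (c:Company)-[:IN_INDUSTRY]->(:Industry {name: "Fintech"})
    MATCH (c)-[:RAISED_ROUND]->(r:FundingRound {type: "Series B"})
    WHERE r.date >= date("2025-01-01") AND r.date < date("2026-01-01")
    MATCH (c)<-[:WORKS_AT]-(p:Person)-[:WORKED_AT]->(prev:Company {name: "Google"})
    RETURN DISTINCT c.name, r.amount, collect(p.name) AS hires
    ORDER BY r.amount DESC
    """
    with driver.session() as session:
        # Materialize rows before the session closes.
        results = session.run(cypher).data()
    # Format for LLM (format_results and llm are placeholders)
    context = format_results(results)
    return llm.generate(f"Based on this data:\n{context}\n\nProvide analysis...")
```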
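
The `format_results` helper is not defined in the example. A minimal sketch of one possible version follows, assuming the rows arrive as dictionaries keyed by the query’s return columns (`c.name`, `r.amount`, `hires`) and that amounts are integers; the sample rows are invented.

```python
def format_results(rows):
    """Turn query rows (a list of dicts) into a compact bullet list for the prompt."""
    lines = []
    for row in rows:
        hires = ", ".join(row["hires"]) or "none listed"
        lines.append(f"- {row['c.name']}: raised {row['r.amount']:,} (hires: {hires})")
    return "\n".join(lines)


# Hypothetical sample rows in the shape the query would return.
rows = [
    {"c.name": "PayFlow", "r.amount": 40000000, "hires": ["Ana", "Raj"]},
    {"c.name": "LedgerLoop", "r.amount": 25000000, "hires": []},
]
context = format_results(rows)
print(context)
```
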
## Hybrid Approach: Best of Both

Combine traditional RAG and Graph RAG:

- Use vector RAG for semantic search across document chunks
- Use Graph RAG for entity-relationship queries
- Route queries based on question type (classifier model)
- Fuse results when both approaches return relevant info

A hybrid system like this covers a wider range of query types than either approach alone.
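
The routing and fusion steps can be sketched in a few lines. This is an illustration under assumptions, not a prescribed design: the keyword router stands in for the classifier model, the cue list is invented, and the fusion step uses reciprocal rank fusion, a common technique that the text does not specifically name.

```python
def route(question: str) -> str:
    """Crude keyword router; a trained classifier would replace this."""
    relational_cues = ("reported to", "reports to", "works at", "connected", "related to")
    lowered = question.lower()
    return "graph" if any(cue in lowered for cue in relational_cues) else "vector"


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs; a higher fused score ranks first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc3", "doc1", "doc7"]
graph_hits = ["doc1", "doc9"]
fused = reciprocal_rank_fusion([vector_hits, graph_hits])
print(route("Who reported to X in 2023?"), fused)
```

Rank-based fusion is convenient here because vector similarity scores and graph results are not on a comparable scale, while ranks are.
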
## Frequently Asked Questions

### When does Graph RAG beat vector RAG?

On multi-hop questions like “which engineers worked on Project X and also report to Y?”. Vector search returns related chunks but can’t traverse relationships. A knowledge graph follows the chain of edges and gives a precise answer.

### Do I need to build a graph from scratch?

No. Tools like Neo4j’s LLM Graph Builder, LlamaIndex’s PropertyGraphIndex, or Microsoft’s GraphRAG library extract entities and relationships from your documents into a graph automatically.

### How big should my graph be?

Start with the documents you already use for vector RAG. Many production deployments end up with somewhere between 10K and 1M nodes. Beyond that, in-memory graph libraries get slow and you need a dedicated graph database.

### Is Graph RAG worth the extra complexity?

Only if your queries actually need it. For “what does this document say” questions, hybrid vector + BM25 still wins. Graph RAG pays off in domains with structured knowledge: medicine, law, enterprise IT, supply chains.

### Which graph database should I use?

Neo4j is the common default and is well supported by LLM tooling. ArangoDB is an alternative if you want a multi-model database, and Memgraph if you want an in-memory, Cypher-compatible one. For very small graphs, NetworkX in Python works fine without a real database.