Technical Blog
Hello GraphRAG: Building a Graph-Aware RAG Pipeline with Neo4j
graphrag, neo4j, python, rag
A step-by-step walkthrough of wiring a minimal GraphRAG prototype in Python using Neo4j and an LLM.
Retrieval-Augmented Generation (RAG) is great when your knowledge lives in documents. But once relationships start to matter — who depends on what, which component talks to which — you usually outgrow a simple vector store.
That is where GraphRAG comes in.
Why GraphRAG?
A graph lets you model:
- Entities: machines, components, people, documents
- Relations: DEPENDS_ON, BUILT_BY, LOCATED_IN
- Context: paths across the graph that tell you why something matters
Instead of retrieving a single chunk, you can walk the graph to gather a small, high-signal subgraph and feed that into your prompt.
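To make "walking the graph" concrete before a database is involved, here is a minimal in-memory sketch. The edge list, the entity names, and the `neighborhood` helper are all hypothetical and exist only for illustration; a real pipeline would run the traversal inside Neo4j.

```python
from collections import deque

# Hypothetical toy graph: (source, relation, target) triples.
EDGES = [
    ("pump-42", "DEPENDS_ON", "motor-7"),
    ("pump-42", "LOCATED_IN", "plant-a"),
    ("motor-7", "BUILT_BY", "acme"),
    ("valve-3", "LOCATED_IN", "plant-a"),
    ("acme", "LOCATED_IN", "berlin"),
]


def neighborhood(start, edges, depth=2):
    """Return triples reachable from `start` within `depth` hops, ignoring direction."""
    # Undirected adjacency index: node -> list of (neighbor, triple).
    adjacency = {}
    for triple in edges:
        src, _, dst = triple
        adjacency.setdefault(src, []).append((dst, triple))
        adjacency.setdefault(dst, []).append((src, triple))

    seen_nodes = {start}
    collected = []  # triples in discovery order, without duplicates
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == depth:
            continue  # do not expand past the hop limit
        for neighbor, triple in adjacency.get(node, []):
            if triple not in collected:
                collected.append(triple)
            if neighbor not in seen_nodes:
                seen_nodes.add(neighbor)
                queue.append((neighbor, hops + 1))
    return collected


subgraph = neighborhood("pump-42", EDGES, depth=2)
for src, rel, dst in subgraph:
    print(f"{src} -[{rel}]-> {dst}")
```

With a depth of 2, the edge from acme to berlin is left out because berlin sits three hops away from pump-42. That hop limit is what keeps the retrieved context small.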
Minimal Python setup
Below is a tiny (but realistic) starting point for working with Neo4j in Python.
    from dataclasses import dataclass
    from typing import List

    from neo4j import GraphDatabase


    @dataclass
    class GraphNode:
        id: str
        label: str
        properties: dict


    class GraphRAGClient:
        def __init__(self, uri: str, user: str, password: str):
            self._driver = GraphDatabase.driver(uri, auth=(user, password))

        def close(self) -> None:
            self._driver.close()

        def query_subgraph(self, component_id: str, depth: int = 2) -> List[GraphNode]:
            # Cypher does not accept a parameter inside variable-length bounds
            # ([*1..$depth]), so validate depth and inline it as a literal.
            if isinstance(depth, bool) or not isinstance(depth, int) or depth < 1:
                raise ValueError("depth must be a positive integer")
            cypher = f"""
            MATCH p = (c:Component {{id: $component_id}})-[*1..{depth}]-(n)
            WITH nodes(p) AS ns
            UNWIND ns AS n
            RETURN DISTINCT n
            """
            with self._driver.session() as session:
                records = session.run(cypher, component_id=component_id)
                # Build the list inside the session; the result cannot be
                # read once the session has closed.
                return [
                    GraphNode(
                        id=record["n"].get("id"),
                        # A node may have zero or several labels; pick one deterministically.
                        label=next(iter(sorted(record["n"].labels)), ""),
                        properties=dict(record["n"]),
                    )
                    for record in records
                ]


    if __name__ == "__main__":
        client = GraphRAGClient(
            uri="neo4j+s://demo-instance.databases.neo4j.io",
            user="neo4j",
            password="demo-password",
        )
        try:
            nodes = client.query_subgraph(component_id="pump-42", depth=2)
            for node in nodes:
                print(node.label, node.id, node.properties.get("name"))
        finally:
            # Release the driver's connections even if the query fails.
            client.close()

This is not production-ready, but it is intentionally compact:
- It gives you a GraphRAGClient abstraction you can grow over time.
- It demonstrates a useful graph query pattern for local neighborhoods.
- It is small enough to paste into a notebook and start experimenting today.
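One way to grow the neighborhood pattern is to restrict it to specific relationship types. The sketch below builds such a query string in plain Python. It assumes a Neo4j version in which relationship types and hop bounds cannot be passed as query parameters, so both are validated and then inlined. The helper `build_neighborhood_query`, the allow-list, and the `MAX_DEPTH` cap are hypothetical names introduced here, not part of any library.

```python
# Relationship types this sketch is willing to inline into a query.
ALLOWED_RELATIONS = {"DEPENDS_ON", "BUILT_BY", "LOCATED_IN"}
MAX_DEPTH = 5  # arbitrary safety cap, an assumption for this sketch


def build_neighborhood_query(depth, relations):
    """Return a Cypher string limited to `relations` and at most `depth` hops."""
    if isinstance(depth, bool) or not isinstance(depth, int) or not 1 <= depth <= MAX_DEPTH:
        raise ValueError(f"depth must be an integer between 1 and {MAX_DEPTH}")
    relations = set(relations)
    if not relations:
        raise ValueError("at least one relationship type is required")
    unknown = relations - ALLOWED_RELATIONS
    if unknown:
        raise ValueError(f"unsupported relationship types: {sorted(unknown)}")

    # Only allow-listed names and a checked integer reach the query text;
    # the component id stays a real query parameter.
    rel_pattern = "|".join(sorted(relations))
    return (
        "MATCH p = (c:Component {id: $component_id})"
        f"-[:{rel_pattern}*1..{depth}]-(n) "
        "UNWIND nodes(p) AS node "
        "RETURN DISTINCT node"
    )


query = build_neighborhood_query(2, ["LOCATED_IN", "DEPENDS_ON"])
print(query)
```

The resulting string can be passed to `session.run` with `component_id` supplied as a parameter. Because only allow-listed names and a checked integer are interpolated, user input never reaches the query text directly.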
From here you can:
- Add embedding-backed search to find the starting nodes.
- Expand the subgraph with domain-specific traversals.
- Format that subgraph into a prompt for your LLM of choice.
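The first and last of these steps can be sketched without any external service. The example below picks a starting node by cosine similarity and then renders a list of nodes as prompt text. The three-dimensional vectors are made-up stand-ins for real embeddings, and `cosine`, `pick_seed`, and `format_prompt` are hypothetical helpers; the prompt wording is only one possible layout.

```python
import math
from dataclasses import dataclass, field


@dataclass
class GraphNode:
    # Same shape as the GraphNode dataclass used earlier in this post.
    id: str
    label: str
    properties: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_seed(query_vec, node_vecs):
    """Return the node id whose vector is most similar to the query vector."""
    return max(node_vecs, key=lambda node_id: cosine(query_vec, node_vecs[node_id]))


def format_prompt(question, nodes):
    """Render one line per node, followed by the user's question."""
    lines = []
    for node in nodes:
        props = ", ".join(f"{k}={v}" for k, v in sorted(node.properties.items()))
        lines.append(f"- ({node.label}) {node.id}: {props}")
    context = "\n".join(lines)
    return f"Use only the graph context below.\n\nContext:\n{context}\n\nQuestion: {question}"


# Toy vectors standing in for real embeddings (assumed values).
NODE_VECS = {
    "pump-42": [0.9, 0.1, 0.0],
    "valve-3": [0.1, 0.9, 0.0],
}
seed = pick_seed([1.0, 0.0, 0.0], NODE_VECS)

# In the real pipeline these nodes would come from client.query_subgraph(seed).
nodes = [
    GraphNode("pump-42", "Component", {"name": "Feed pump"}),
    GraphNode("motor-7", "Component", {"name": "Drive motor"}),
]
prompt = format_prompt("Why does pump-42 matter?", nodes)
print(seed)
print(prompt)
```

Keeping seed selection and prompt formatting as plain functions makes it easy to swap in a real embedding model or a different prompt template later without touching the graph client.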