Engineering Deep Dive

Graph RAG
in Practice

How I wired Neo4j into my AI agent's memory — and why vector search alone wasn't enough.

Amine El Farssi  ·  March 2026  ·  8 min read

Vector Memory Has a Blind Spot

My AI agent PostSingular, running on OpenClaw, talks to me every day. It helps me build Luminar, manages my YouTube channel, and tracks infrastructure decisions across sessions. Memory was working, but in a subtly broken way.

"Which decision did we make about the auth system last month, and why?"

ChromaDB returned 6 chunks. All semantically similar. None connected to each other. No timeline. No causal chain. Just floating text.

The problem isn't retrieval quality — cosine similarity was fine. The problem is structural. Real knowledge has relationships. Facts connect to other facts. A vector store doesn't model that.

Vector RAG Fails At
  • Entity tracking: "What do I know about X?" → chunks, not entities
  • Temporal reasoning: "What changed this month?" → no timeline
  • Relationship queries: "What depends on decision Y?" → no traversal
  • Contradiction detection: "Did I say X before?" → no fact store
Graph RAG Handles
  • Entity graph: Persons, Projects, Decisions as nodes
  • Causal chains: Decision → caused_by → Bug → resolved_by → Fix
  • Timeline queries: All decisions in February, ordered
  • Multi-hop: "Everything linked to the auth system"

Standard RAG vs Graph RAG

Standard RAG

Query → Embed → Cosine Search → Top-K Chunks → LLM Answer

Graph RAG

Query → Extract Entities → BFS Traversal → Ranked Subgraph → LLM Answer

The difference is what you're searching. Vectors find text that looks like your query. Graphs find entities and facts that are structurally connected to your query.
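The traversal step is plain BFS over typed edges. A toy sketch of the idea, with illustrative entity names rather than my real graph:

```python
from collections import deque

# Toy entity graph (illustrative names): each edge is (relation, target).
graph = {
    "AuthSystem": [("decided_in", "Decision[jwt-rotation]")],
    "Decision[jwt-rotation]": [("caused_by", "Bug[token-replay]")],
    "Bug[token-replay]": [("resolved_by", "Fix[short-lived-tokens]")],
}

def bfs_subgraph(start, max_hops=3):
    """Collect every (source, relation, target) fact reachable
    from `start` within `max_hops` edges."""
    seen, facts = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for relation, target in graph.get(node, []):
            facts.append((node, relation, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return facts

# "Everything linked to the auth system" is one traversal:
for fact in bfs_subgraph("AuthSystem"):
    print(fact)
```

No embedding ever sees "token replay"; the fix still comes back because it is two hops from the entity you asked about.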

Three Layers, One Answer

My setup runs three retrieval layers, fused with RRF (Reciprocal Rank Fusion):

Layer       | Tool              | When It Wins
Neo4j BFS   | Graphiti v0.28.1  | Entity relationships, causal chains, timelines
ChromaDB    | all-MiniLM-L6-v2  | Semantically similar chunks, topic recall
grep        | ripgrep           | Exact strings, IDs, issue numbers, code

RRF gives each result a score based on its rank across all sources, not its raw similarity. A result ranking 3rd in KG and 2nd in ChromaDB beats one ranking 1st in only one source.
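RRF itself is a few lines. A minimal sketch with the conventional k=60 smoothing constant (result names are illustrative):

```python
def rrf(rankings, k=60):
    """Fuse ranked result lists: each source contributes
    1 / (k + rank) for every result it returned."""
    scores = {}
    for ranked in rankings:
        for rank, doc in enumerate(ranked, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# "auth-decision" ranks 3rd in the KG list and 2nd in ChromaDB;
# "ssh-guide" ranks 1st in one source only. Fusion prefers the former:
kg = ["tailscale", "docker", "auth-decision"]
chroma = ["ssh-guide", "auth-decision"]
print(rrf([kg, chroma])[0])  # → auth-decision
```

With k=60, two mid-list appearances (1/63 + 1/62 ≈ 0.032) outscore a single first place (1/61 ≈ 0.016), which is exactly the cross-source agreement you want rewarded.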

Current graph: 133 entities · 117 relationships · 23 episodes · 162 semantic chunks

Nightly Cron to Neo4j

Every night at 2 AM, a cron job ingests the day's notes into Neo4j via Graphiti. Graphiti extracts entities and relationships from raw markdown using an LLM, then stores them with timestamps and embeddings.

Daily Notes (memory/YYYY-MM-DD.md) → Graphiti (entity extraction) → Kimi K2 Turbo (LLM) → Neo4j (graph store)

// daily-ingest.sh

#!/bin/bash
YESTERDAY=$(date -d "yesterday" +%Y-%m-%d)
NOTES="memory/${YESTERDAY}.md"

if [ -f "$NOTES" ]; then
    .venv/bin/python memory.py ingest "$NOTES"
fi

# Also ingest MEMORY.md changes
.venv/bin/python memory.py ingest MEMORY.md
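Scheduling is a single crontab entry (the workspace path and log location here are illustrative, not my actual paths):

```shell
# Run ingestion at 2 AM daily; adjust the cd path to your agent workspace
0 2 * * * cd /home/amine/agent && ./daily-ingest.sh >> logs/ingest.log 2>&1
```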

The Patch That Took 4 Hours

Graphiti's default OpenAI client passes a JSON schema via response_format — which Moonshot's API doesn't support. I had to patch it with a custom client:

// KimiClient — inject schema into system prompt

import json

class KimiClient(OpenAIClient):
    """Moonshot-compatible client: moves the JSON schema out of
    response_format and into the system prompt."""

    async def generate_response(
        self, messages, response_model=None, **kwargs
    ):
        if response_model:
            schema = response_model.model_json_schema()
            schema_str = json.dumps(schema, indent=2)

            # Inject schema into the system message instead of response_format
            injection = (
                f"\n\nRespond with valid JSON matching:\n"
                f"```json\n{schema_str}\n```\n"
                f"Return ONLY the JSON, no other text."
            )
            for msg in messages:
                if msg["role"] == "system":
                    msg["content"] += injection
                    break

        kwargs.pop("response_format", None)  # Moonshot rejects it
        return await super().generate_response(messages, **kwargs)
This is the kind of thing you only find by reading the source code. The fix is 20 lines. The debugging is 4 hours.

Vector vs Graph: A Real Query

Query
"What infrastructure decisions did we make in February, and why?"

ChromaDB returns 5 chunks about infrastructure. Some from February, some not. No ordering. No causal relationships between decisions. No "why."

  • → Tailscale gateway config (Feb 19)
  • → Docker migration notes (Jan 31)
  • → SSH setup guide (Feb 8)

Neo4j returns a traversal with timestamps and a causal chain:

  • InfraDecision[Tailscale-fix]
  • → caused_by: BugReport[ws-security-block] (Feb 19)
  • → resolved_by: Fix[serve-https-proxy] (Feb 23)
  • InfraDecision[Gaming-server-migration]
  • → motivation: Mac-latency-issue (Feb 10)
  • → hardware: i5-9600K, 64GB, RTX 2080 Ti

The graph knows why the decision was made and what it connects to. The vector store just knows it's semantically close to "infrastructure."

Wired Into the Agent Runtime

The whole thing runs inside OpenClaw as my primary agent runtime. Memory search is a first-class tool — called automatically before every response:

// OpenClaw tool definition

{
  "tool": "memory_search",
  "description": "Neo4j KG + ChromaDB + grep. Returns ranked snippets.",
  "parameters": {
    "query": "string",
    "maxResults": "number",
    "minScore": "number"
  }
}
KG handles
  • What did we decide and why
  • Relationship traversal
  • Timeline queries
ChromaDB handles
  • What did we say about this topic
  • Semantic similarity
  • Broad recall
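The tool handler behind that definition is mostly fan-out plus fusion. A sketch with stubbed backends (the real layers call Neo4j, ChromaDB, and ripgrep; all names here are illustrative):

```python
def memory_search(query, max_results=5, min_score=0.0):
    """Fan the query out to all three layers, then fuse with RRF."""
    rankings = [
        kg_search(query),      # Neo4j BFS over entities (stubbed below)
        vector_search(query),  # ChromaDB similarity (stubbed below)
        grep_search(query),    # ripgrep exact match (stubbed below)
    ]
    scores = {}
    for ranked in rankings:
        for rank, snippet in enumerate(ranked, start=1):
            scores[snippet] = scores.get(snippet, 0.0) + 1.0 / (60 + rank)
    fused = sorted(scores, key=scores.get, reverse=True)
    return [s for s in fused if scores[s] >= min_score][:max_results]

# Stub layers standing in for the real backends:
def kg_search(q):     return ["decision-a", "fix-b"]
def vector_search(q): return ["note-c", "decision-a"]
def grep_search(q):   return ["fix-b"]

print(memory_search("auth", max_results=3))
```

Each layer only needs to return an ordered list of snippets; RRF makes the scores from three very different retrievers comparable without any weight tuning.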

What I'd Do Differently

Start with a schema
  • Graphiti auto-extracts but defining entity types (Person, Project, Decision) upfront gives cleaner traversals
Use a strong LLM for extraction
  • Tried Qwen3:8b first — noticeably worse quality. Kimi K2 Turbo is worth the cost for this step
Ingest daily, not in bulk
  • Bulk ingesting 3 months = rate limits + duplicate edges. Daily cron is the right cadence
Plan for deduplication
  • "Luminar" vs "luminar-labs" vs "LuminarLabs" become 3 separate entities without a dedup pass
Graph RAG isn't a silver bullet. It's extra infrastructure, extra latency, and you need an LLM good enough to extract entities cleanly. But for an agent that's your co-builder — not just a chatbot — it's the difference between short-term and long-term memory.