[{"content":" One production AI engineering pattern per week. 27 episodes and counting. Each Short covers a real pattern engineers hit in production — the problem, the fix, and the code. Follow @DPO-AI ↗ Full Playlist ↗ Why This Series # Most AI content explains what something is. This series explains when you need it and why it works. Every episode opens with a concrete failure mode — a real number, a real cost, a real silent bug — then shows the pattern that fixes it.\nThe format is strict: 60–70 seconds, no fluff, one pattern per episode. If it can\u0026rsquo;t be explained in under 70 seconds it goes in a blog post instead.\nThe Full Series # Retrieval \u0026amp; RAG # EP Pattern Key Stat EP36 RLM Instead of RAG — drop03 $29B for a model picker. The brain was never theirs. #Cursor #Claude #Shorts — drop02 They built OpenAI. Then they walked out. #Anthropic #AIEngineering #Shorts — EP28 MoE Routing 60% cost cut EP27 Hybrid Search — BM25 + vectors + RRF Recall 40% → 80%, 15 lines EP25 Agentic RAG — 4-tool router 40% of queries need something other than vector search EP23 RAG Fusion v2 — multi-query + RRF Recall 45% → 72% EP22 Corrective RAG (CRAG) — 3-tier confidence routing Filters irrelevant chunks before generation EP21 Self-RAG — retrieval on demand Reduces hallucination by skipping retrieval when not needed EP14 Query Decomposition — sub-query fan-out Handles multi-hop questions single-pass RAG can\u0026rsquo;t answer EP13 RAG Fusion — parallel queries + RRF Original: 5 query variants, 45% → 72% recall EP07 Prompt Compression — LLMLingua 512 tokens → 80 tokens, same answer EP02 Speculative RAG — draft-then-retrieve Retrieve on the answer, not the question Inference Optimization # EP Pattern Key Stat EP17 Disaggregated Inference — prefill/decode split 3x throughput on long-context workloads EP04 Speculative Decoding — draft + verify 2–4x faster generation, same quality EP01 KV Cache Prefix Optimization P99 2400ms → 900ms, zero code changes Evaluation \u0026amp; Quality # EP Pattern Key Stat EP24 LLM-as-Judge v2 $0.002/eval, calibrated scoring EP19 Constitutional Self-Critique Self-corrects against principles before output EP15 LLM-as-Judge — original Structured rubric, GPT-4o-mini at scale EP12 Structured Output Forcing Eliminates JSON parse failures in production EP11 Self-Consistency — majority vote 67% → 88% on math/reasoning tasks Agent Architecture # EP Pattern Key Stat EP39 The Future of Agents Isn\u0026rsquo;t Smarter Prompts. It\u0026rsquo;s Smarter Plumbing. #AIEngineering — EP38 Harness Engineering: How OpenAI Shipped 1M Lines Without Writing Them #AIEngineering — EP33 Stop Interviewing, Start Acting — EP32 LLM Wiki — EP32 LLM Wiki — EP31 519K Lines. 50 Hidden Tools. Inside Claude Code\u0026rsquo;s Leaked Source #AIEngineering — EP29 688 Stars. Zero Fine — EP29 688 Stars. Zero Fine — EP29 688 Stars. Zero Fine — drop01 one engineer. no budget. 19,000 views. how? 
#AIEngineering #Shorts — EP28 Agent Skills Explained — EP26 Multi-Agent Orchestration 34% failure → 91% success with specialist agents EP20 Context Distillation 16K context → 800 tokens, knowledge preserved EP16 Context Engineering What goes in the context window determines everything EP10 Parallel Tool Calls 4 sequential calls → 1 parallel batch EP09 LLM Router Route by complexity, cut costs 60% EP08 Agent Checkpointing Zero lost work on agent failure Reliability \u0026amp; Cost # EP Pattern Key Stat EP34 Tool Result Caching — EP30 3 Cheap Models Beat GPT — EP06 Semantic Caching 40% cost reduction on real workloads EP05 Circuit Breaker for LLMs Stop cascading failures at the LLM layer EP03 Hedged Requests — P99 killer P99 collapses to ~P50 of the slower backend Safety \u0026amp; Capability # EP Pattern Key Stat EP35 Anthropic Nerfed Claude On Purpose — Inference \u0026amp; Serving # EP Pattern Key Stat EP37 TurboQuant: 6x KV Cache Compression at 1M Tokens #AIEngineering — What\u0026rsquo;s Coming # EP28 — MoE Routing (mixture of experts, when to use which expert) EP29 — Tool Call Caching (cache tool results, not just LLM outputs) EP30 — Streaming Structured Output (token-by-token JSON validation) One new episode each week. Subscribe so you don\u0026rsquo;t miss them.\nSubscribe ↗ ","date":"25 March 2026","externalUrl":null,"permalink":"/blog/ai-engineering-patterns-series/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  One production AI engineering pattern per week. 27 episodes and counting. Each Short covers a real pattern engineers hit in production — the problem, the fix, and the code.\n\u003c/div\u003e","title":"27 AI Engineering Patterns in 60 Seconds Each","type":"blog"},{"content":" Six agentic protocols. Sixty seconds each. Built with Manim + local TTS at zero cost. All on the DPO YouTube channel. These are the protocols shaping how AI agents discover services, talk to each other, pay for things, and interact with users. Each Short covers one protocol — what it is, why it exists, and how it works — in under 60 seconds.\n🔵 Programmatic Tool Calling # By Anthropic. How LLMs use tools.\nThe foundation of agentic AI — how a language model decides to call an external function, what parameters to pass, and how to handle the result. The building block everything else rests on.\n🟢 A2A — Agent-to-Agent # By Google. \u0026ldquo;HTTP for AI agents.\u0026rdquo;\nDefines how agents discover each other, negotiate capabilities, and delegate tasks. Agents expose an endpoint; other agents call it. No shared memory, no tight coupling.\n🟡 UCP — Universal Commerce Protocol # By Google. Agentic commerce infrastructure.\nMerchants publish a .well-known/ucp endpoint. Agents discover it, browse offers, and complete checkout — no human in the loop. Think of it as \u0026ldquo;Stripe for AI agents,\u0026rdquo; but a protocol, not a company.\n🟣 AG-UI — Agent-to-User Interface # By CopilotKit. Real-time agent output streaming.\nA transport protocol for streaming partial results, tool outputs, and UI state updates from agents to users — without waiting for full task completion. Makes agents feel fast and interactive.\n🔵 MCP — Model Context Protocol # By Anthropic. The \u0026ldquo;USB-C for AI tools.\u0026rdquo;\nStandardizes how LLMs discover and call tools, access resources, and receive structured prompts. Now the de facto standard — adopted by Google, OpenAI, Cursor, and the broader ecosystem.\n🔴 ACP — Agentic Commerce Protocol # By OpenAI + Stripe. 
Payment mandates for agents.\nDefines how agents request payment authorization from users, store mandates, and trigger recurring or conditional payments — without interrupting the agent\u0026rsquo;s flow for every transaction.\nAbout the channel: The DPO channel runs a fully automated pipeline — Manim animations rendered on a local RTX 2080 Ti, Kokoro TTS for narration, FFmpeg for audio sync, and the YouTube Data API for upload. Total cost: $0/month. See how it\u0026rsquo;s built. ","date":"3 March 2026","externalUrl":null,"permalink":"/blog/dpo-protocol-shorts/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Six agentic protocols. Sixty seconds each. Built with Manim + local TTS at zero cost. All on the \u003ca\n  href=\"https://www.youtube.com/@DPO-AI\"\n    target=\"_blank\"\n  \u003eDPO YouTube channel\u003c/a\u003e.\n\u003c/div\u003e","title":"AI Agent Protocols Explained in 60 Seconds","type":"blog"},{"content":" KBC Bank \u0026 Insurance Leuven, Belgium 2023 - Present Overview # Building the bank\u0026rsquo;s first production AI agents — from architecture design to deployment and monitoring. Active participant in architecture design boards, driving technical decisions for enterprise-scale agentic systems.\nScale: One of Belgium\u0026rsquo;s largest banks with 12M+ customers Key Achievements # 🤖 Production AI Agents # First internal AI agents deployed to production using AWS Bedrock and custom AgentCore framework Multi-agent orchestration for complex banking workflows (document processing, customer routing, compliance) Tool integration with internal banking APIs, document stores, and knowledge bases Memory management with session persistence and context windowing 🏗️ Architecture Leadership # Active participant in architecture design boards — driving technical decisions for AI infrastructure Designed event-driven agent pipelines using EventBridge, SQS, and Lambda Established guardrails and safety patterns using Bedrock Guardrails Built evaluation frameworks for continuous agent quality monitoring 📚 RAG Systems # Enterprise-scale Retrieval-Augmented Generation for internal knowledge management Hybrid search combining semantic embeddings with keyword matching Chunking strategies optimized for banking documents (contracts, policies, procedures) Incremental indexing for real-time updates to knowledge base 📊 AgentOps \u0026amp; Observability # Implemented agent tracing and trajectory evaluation for debugging complex agent behavior LLM-as-judge evaluation for response quality and hallucination detection Cost tracking and optimization for foundation model usage Latency monitoring and performance optimization Technical Stack # Agent Infrastructure # AWS Bedrock AWS AgentCore Runtime LangChain LangGraph Cloud \u0026amp; DevOps # AWS Lambda EventBridge SQS DynamoDB SageMaker Evaluation \u0026amp; Monitoring # AgentOps CloudWatch Custom Evals Foundation Models # Claude OpenAI Titan Embeddings Architecture Highlight # flowchart LR subgraph INPUT[\"Ingestion\"] API[API Gateway] EB[EventBridge] end subgraph AGENT[\"Agent Layer\"] ORCH[Orchestrator Agent] DOC[Document Agent] KNOW[Knowledge Agent] end subgraph TOOLS[\"Tools \u0026 Data\"] RAG[RAG System] APIS[Banking APIs] GUARD[Guardrails] end subgraph OBS[\"Observability\"] TRACE[Tracing] EVAL[Evaluation] end INPUT --\u003e AGENT AGENT --\u003e TOOLS AGENT --\u003e OBS Impact # 50%+ reduction in manual document processing time Production-grade reliability with \u0026lt;1% error 
rate on agent tasks Scalable architecture handling thousands of daily agent interactions Knowledge democratization — internal knowledge accessible via natural language ","date":"1 January 2023","externalUrl":null,"permalink":"/experience/ai-engineer-kbc/","section":"Experience","summary":"\u003cp\u003e\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    KBC Bank \u0026 Insurance\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Leuven, Belgium\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    2023 - Present\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\u003c/p\u003e","title":"AI Engineer","type":"experience"},{"content":" AWS Bedrock LangGraph Python AgentOps Production Overview # Building enterprise-grade AI agents for internal banking operations. These aren\u0026rsquo;t chatbots — they\u0026rsquo;re autonomous systems that reason, plan, use tools, and complete complex workflows.\nEnterprise Scale: Deployed at one of Belgium\u0026rsquo;s largest banks (12M+ customers) Architecture # flowchart TB subgraph INPUT[\"Input Layer\"] API[API Gateway] EB[EventBridge] SQS[SQS Queues] end subgraph ORCHESTRATION[\"Agent Orchestration\"] SUPER[Supervisor Agent] DOC[Document Agent] KNOW[Knowledge Agent] COMPLY[Compliance Agent] end subgraph TOOLS[\"Tools \u0026 Resources\"] RAG[RAG System] BANK[Banking APIs] DOCS[Document Store] end subgraph SAFETY[\"Safety \u0026 Observability\"] GUARD[Bedrock Guardrails] TRACE[Agent Tracing] EVAL[Evaluation Pipeline] end INPUT --\u003e ORCHESTRATION ORCHESTRATION --\u003e TOOLS ORCHESTRATION --\u003e SAFETY Key Features # Multi-Agent Orchestration # Supervisor agent coordinates specialized worker agents Dynamic task delegation based on query intent Parallel execution when tasks are independent State sharing between agents for complex workflows Tool Integration # Agents can interact with:\nInternal banking APIs (accounts, transactions, products) Document retrieval systems Knowledge bases via RAG External compliance databases Memory \u0026amp; Context # Session persistence across interactions Context windowing for long conversations Compaction when approaching token limits User preference memory for personalization Guardrails \u0026amp; Safety # Bedrock Guardrails for content filtering PII detection and redaction Scope enforcement — agents only access permitted data Audit logging for compliance Technical Stack # Agent Infrastructure # Component Technology Orchestration LangGraph, AWS Bedrock Agents, AWS AgentCore Runtime Foundation Models Claude, OpenAI models (via Bedrock) Memory DynamoDB, Redis Queuing SQS, EventBridge Compute Lambda, ECS Observability # Component Technology Tracing AgentOps, X-Ray Evaluation LLM-as-judge, custom evals Monitoring CloudWatch, custom dashboards Alerting SNS, PagerDuty Evaluation Framework # flowchart LR subgraph EVAL[\"Evaluation Pipeline\"] TRAJ[Trajectory Eval] TOOL[Tool Use Accuracy] RESP[Response Quality] HALL[Hallucination Check] end 
AGENT[Agent Run] --\u003e EVAL EVAL --\u003e METRICS[Metrics \u0026 Alerts] EVAL --\u003e IMPROVE[Model Improvements] We evaluate agents on:\nTrajectory quality — Did the agent take sensible steps? Tool use accuracy — Were the right tools called with correct params? Response quality — Is the final answer helpful and correct? Hallucination rate — Does the agent make things up? Latency — Is the response time acceptable? Results # Metric Achievement Task completion rate 94%+ Average response time \u0026lt;5 seconds User satisfaction 4.2/5 Hallucination rate \u0026lt;2% Daily interactions 1000+ Learnings # Key lessons from building production agents:\nEvals are everything — Without robust evaluation, you\u0026rsquo;re flying blind Guardrails early — Add safety from day one, not as an afterthought Tracing is crucial — Complex agent behavior needs visibility Start simple — Single agent first, multi-agent only when needed Human-in-the-loop — Some decisions need human approval Related # MiniClaw — Open-source agent framework Agent Architecture Deep Dive — Technical patterns Agentic Protocols — MCP, A2A, AG-UI ","externalUrl":null,"permalink":"/projects/ai-agents/","section":"Projects","summary":"\u003cp\u003e\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    AWS Bedrock\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    LangGraph\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Python\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    AgentOps\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Production\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\u003c/p\u003e","title":"Enterprise AI Agents","type":"projects"},{"content":" KBC Bank \u0026 Insurance Leuven, Belgium 2021 - 2023 Overview # Developed and deployed machine learning models for banking applications, focusing on Anti-Money Laundering and customer analytics.\nKey Achievements # AML Models: Developed Anti-Money Laundering models for transaction monitoring and suspicious activity detection Customer Analytics: Built ML models for customer analytics, risk assessment, and regulatory compliance Big Data Processing: Implemented large-scale data processing pipelines using Spark for distributed model training Deep Learning: Developed PyTorch solutions for complex document handling tasks Technologies # Python PySpark PyTorch Scikit-learn SQL MLflow Git ","date":"1 September 2021","externalUrl":null,"permalink":"/experience/data-scientist-kbc/","section":"Experience","summary":"\u003cp\u003e\u003cspan class=\"flex cursor-pointer\"\u003e\n 
 \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    KBC Bank \u0026 Insurance\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Leuven, Belgium\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    2021 - 2023\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\u003c/p\u003e","title":"Data Scientist","type":"experience"},{"content":" JEMS Group France Mar 2021 - Aug 2021 Overview # Built and maintained big data pipelines and analytics solutions for enterprise clients.\nKey Achievements # Data Pipelines: Developed data pipelines using Apache Spark for large-scale data processing Cloud Data Warehouse: Implemented cloud data warehousing solutions using Snowflake ETL Design: Designed ETL workflows for data transformation and integration Technologies # Apache Spark Snowflake Python SQL Cloud Platforms ","date":"1 March 2021","externalUrl":null,"permalink":"/experience/big-data-engineer-jems/","section":"Experience","summary":"\u003cp\u003e\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    JEMS Group\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    France\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Mar 2021 - Aug 2021\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\u003c/p\u003e","title":"Big Data Engineer","type":"experience"},{"content":" Bioceanor Valbonne, France Jul 2020 - Jan 2021 Overview # Built predictive models for sea water quality monitoring at this innovative environmental tech company.\nKey Achievements # Predictive Modeling: Developed and deployed predictive models for sea water quality analysis Deep Learning: Implemented LSTM networks for time-series forecasting Production Deployment: Contributed to model maintenance and production deployment pipelines Technologies # Python Deep Learning LSTM TensorFlow Keras Time Series ","date":"1 July 2020","externalUrl":null,"permalink":"/experience/data-scientist-bioceanor/","section":"Experience","summary":"\u003cp\u003e\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Bioceanor\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Valbonne, 
France\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Jul 2020 - Jan 2021\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\u003c/p\u003e","title":"Data Scientist","type":"experience"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/tags/agents/","section":"Tags","summary":"","title":"Agents","type":"tags"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/categories/ai-engineering/","section":"Categories","summary":"","title":"AI Engineering","type":"categories"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/tags/ai-engineering/","section":"Tags","summary":"","title":"AI Engineering","type":"tags"},{"content":" AI Engineer building production AI agents at one of Belgium\u0026rsquo;s largest banks. Agentic systems, multi-agent platforms, and the emerging protocol stack. I build AI that acts, not just talks — agents that reason, use tools, and complete real tasks.\nAbout Me Writing AI Engineering Patterns ↗ ","date":"25 March 2026","externalUrl":null,"permalink":"/","section":"Amine El Farssi","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  AI Engineer building \u003cstrong\u003eproduction AI agents\u003c/strong\u003e at one of Belgium\u0026rsquo;s largest banks. Agentic systems, multi-agent platforms, and the emerging protocol stack.\n\u003c/div\u003e\n\n\u003cp\u003eI build AI that \u003cstrong\u003eacts\u003c/strong\u003e, not just talks — agents that reason, use tools, and complete real tasks.\u003c/p\u003e","title":"Amine El Farssi","type":"page"},{"content":" Insights and tutorials on AI engineering, machine learning, and cloud architecture. 
","date":"25 March 2026","externalUrl":null,"permalink":"/blog/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Insights and tutorials on AI engineering, machine learning, and cloud architecture.\n\u003c/div\u003e","title":"Blog","type":"blog"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/categories/","section":"Categories","summary":"","title":"Categories","type":"categories"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/tags/llm/","section":"Tags","summary":"","title":"LLM","type":"tags"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/tags/production/","section":"Tags","summary":"","title":"Production","type":"tags"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/tags/rag/","section":"Tags","summary":"","title":"RAG","type":"tags"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/tags/","section":"Tags","summary":"","title":"Tags","type":"tags"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/categories/video/","section":"Categories","summary":"","title":"Video","type":"categories"},{"content":"","date":"25 March 2026","externalUrl":null,"permalink":"/tags/youtube/","section":"Tags","summary":"","title":"YouTube","type":"tags"},{"content":"","date":"23 March 2026","externalUrl":null,"permalink":"/tags/ai-memory/","section":"Tags","summary":"","title":"AI Memory","type":"tags"},{"content":"","date":"23 March 2026","externalUrl":null,"permalink":"/categories/engineering/","section":"Categories","summary":"","title":"Engineering","type":"categories"},{"content":"","date":"23 March 2026","externalUrl":null,"permalink":"/tags/graph-rag/","section":"Tags","summary":"","title":"Graph RAG","type":"tags"},{"content":" Vector RAG retrieves documents. Graph RAG retrieves relationships. When your agent needs to reason across entities, timelines, and decisions, the graph wins. Open Interactive Version → The Problem I Was Trying to Solve # My AI agent PostSingular, running on OpenClaw, talks to me every day. It helps me build Luminar, manages my YouTube channel, and tracks infrastructure decisions across sessions.\nBut memory was broken in a subtle way.\nI asked it: \u0026ldquo;Which decision did we make about the auth system last month, and why?\u0026rdquo;\nChromaDB returned 6 chunks. All semantically similar. None of them connected to each other. No timeline. No causal chain. Just floating text.\nThe problem is not retrieval quality. The problem is structural. Real knowledge has relationships. A vector store does not model that.\nWhat Graph RAG Actually Means # Standard RAG:\nGraph RAG:\nVectors find text that looks like your query. Graphs find entities that are structurally connected to it.\nThe Stack # Three retrieval layers, fused with RRF (Reciprocal Rank Fusion):\nLayer Tool When It Wins Neo4j BFS Graphiti v0.28.1 Entity relationships, causal chains, timelines ChromaDB all-MiniLM-L6-v2 Semantic similarity, topic recall grep ripgrep Exact IDs, issue numbers, code snippets After a few months of daily ingestion: 133 entities, 117 relationships, 23 episodes, 162 semantic chunks.\nThe Key Patch: KimiClient for Graphiti # Graphiti uses JSON schema by default. Moonshot API does not support it. I had to patch it:\njson {schema_str}\nThe fix is 20 lines. 
The debugging was 4 hours.\nReal Query Comparison # Query: \u0026ldquo;What infrastructure decisions did we make in February, and why?\u0026rdquo;\nVector RAG result: 5 chunks about infrastructure. No ordering. No causal relationships. No \u0026ldquo;why.\u0026rdquo;\nGraph RAG result:\nThe graph knows why the decision was made and what it connects to.\nOpenClaw Integration # The whole thing runs inside OpenClaw as a first-class tool — called automatically before every response.\nWhat I Would Do Differently # Define entity types upfront — Graphiti auto-extracts but explicit schemas give cleaner traversals Use a strong LLM for extraction — Qwen3:8b quality was noticeably worse than Kimi K2 Turbo Ingest daily, not in bulk — bulk ingesting 3 months triggers rate limits and duplicate edges Plan for deduplication — \u0026ldquo;Luminar\u0026rdquo; vs \u0026ldquo;luminar-labs\u0026rdquo; vs \u0026ldquo;LuminarLabs\u0026rdquo; become 3 separate entities Running on a gaming server: i5-9600K, 64GB RAM, RTX 2080 Ti. Neo4j browser at localhost:7474.\n","date":"23 March 2026","externalUrl":null,"permalink":"/blog/graph-rag-neo4j-openclaw/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Vector RAG retrieves documents. Graph RAG retrieves relationships. When your agent needs to reason across entities, timelines, and decisions, the graph wins.\n\u003c/div\u003e\n\n\u003ca\n  class=\"!rounded-md bg-primary-600 px-4 py-2 !text-neutral !no-underline hover:!bg-primary-500 dark:bg-primary-800 dark:hover:!bg-primary-700\"\n  href=\"/graph-rag/\"\n  target=\"_self\"\n  \n  role=\"button\"\u003e\n  \nOpen Interactive Version →\n\n\u003c/a\u003e\n\n\u003chr\u003e\n\n\u003ch2 class=\"relative group\"\u003eThe Problem I Was Trying to Solve\n    \u003cdiv id=\"the-problem-i-was-trying-to-solve\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#the-problem-i-was-trying-to-solve\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eMy AI agent PostSingular, running on OpenClaw, talks to me every day. 
It helps me build Luminar, manages my YouTube channel, and tracks infrastructure decisions across sessions.\u003c/p\u003e","title":"Graph RAG in Practice: How I Wired Neo4j Into My AI Agent's Memory","type":"blog"},{"content":"","date":"23 March 2026","externalUrl":null,"permalink":"/tags/graphiti/","section":"Tags","summary":"","title":"Graphiti","type":"tags"},{"content":"","date":"23 March 2026","externalUrl":null,"permalink":"/tags/knowledge-graph/","section":"Tags","summary":"","title":"Knowledge Graph","type":"tags"},{"content":"","date":"23 March 2026","externalUrl":null,"permalink":"/tags/neo4j/","section":"Tags","summary":"","title":"Neo4j","type":"tags"},{"content":"","date":"23 March 2026","externalUrl":null,"permalink":"/tags/openclaw/","section":"Tags","summary":"","title":"OpenClaw","type":"tags"},{"content":"","date":"3 March 2026","externalUrl":null,"permalink":"/tags/a2a/","section":"Tags","summary":"","title":"A2A","type":"tags"},{"content":"","date":"3 March 2026","externalUrl":null,"permalink":"/tags/acp/","section":"Tags","summary":"","title":"ACP","type":"tags"},{"content":"","date":"3 March 2026","externalUrl":null,"permalink":"/tags/ag-ui/","section":"Tags","summary":"","title":"AG-UI","type":"tags"},{"content":"","date":"3 March 2026","externalUrl":null,"permalink":"/tags/ai-agents/","section":"Tags","summary":"","title":"AI Agents","type":"tags"},{"content":"","date":"3 March 2026","externalUrl":null,"permalink":"/tags/mcp/","section":"Tags","summary":"","title":"MCP","type":"tags"},{"content":"","date":"3 March 2026","externalUrl":null,"permalink":"/tags/protocols/","section":"Tags","summary":"","title":"Protocols","type":"tags"},{"content":"","date":"3 March 2026","externalUrl":null,"permalink":"/tags/ucp/","section":"Tags","summary":"","title":"UCP","type":"tags"},{"content":" Luminar has 173 source files, 21,586 lines of production code, 43 API endpoints, and 155+ tests. It was built almost entirely by AI agents. Here\u0026rsquo;s the team structure, the workflow, and the honest truth about what breaks. The Team # I didn\u0026rsquo;t want generic agents. I wanted specialists — each with a clear domain, sharp ownership boundaries, and a persona that shapes how they approach problems.\nAgent Role Model What They Own 🏗️ Forge Platform Architect Opus ADRs, system topology, tech decisions 🧠 Sage AI Engineer Opus LiteLLM gateway, model routing, evals ⚡ Bolt Backend Engineer Opus FastAPI, PostgreSQL, Stripe, integrations 🔌 Wire Frontend Engineer Opus Next.js 16, console UI, Vercel deploy ⚡ Flux QA Engineer Sonnet Test suites, E2E tests, documentation 🎨 Muse Design Lead Opus Design system, component specs 🌊 Drift DevOps Engineer Sonnet CI/CD, Docker, GitHub Actions 💎 Prism Business Architect Opus Pricing, GTM, investor narrative 🔭 Scout Researcher Sonnet Protocol deep-dives, library evaluations The key was the ownership model. Forge doesn\u0026rsquo;t write code. Sage doesn\u0026rsquo;t touch UI. Bolt doesn\u0026rsquo;t make architecture decisions. 
When an agent hits a boundary, they defer rather than freelance outside their domain.\nThe Dispatch Workflow # flowchart LR A[\"📋 Linear Issue created\"] --\u003e B[\"🏷️ Label: agent:bolt\"] B --\u003e C[\"🔔 Webhook fires: /opt/linear-webhook/handler.py\"] C --\u003e D{\"Route by label\"} D --\u003e E[\"⚡ Bolt spawned on Lumi's server\"] E --\u003e F[\"🔨 Implement + commit\"] F --\u003e G[\"📬 PR opened → develop\"] G --\u003e H[\"👁️ Lumi reviews\"] H --\u003e I[\"✅ Merged\"] style C fill:#1e3a5f,color:#fff style E fill:#10b981,color:#fff Each agent gets triggered by a Linear label. The webhook handler on my AWS EC2 (Lumi, the project manager AI) reads the label and spawns the right agent via OpenClaw sessions. The agent reads the Linear issue, checks the existing codebase via GitHub, implements the change, and opens a PR to develop.\nAGENT_MAP = {
    \u0026#34;agent:bolt\u0026#34;: \u0026#34;bolt\u0026#34;,
    \u0026#34;agent:sage\u0026#34;: \u0026#34;sage\u0026#34;,
    \u0026#34;agent:wire\u0026#34;: \u0026#34;wire\u0026#34;,
    \u0026#34;agent:flux\u0026#34;: \u0026#34;flux\u0026#34;,
    \u0026#34;agent:forge\u0026#34;: \u0026#34;forge\u0026#34;,
    \u0026#34;agent:drift\u0026#34;: \u0026#34;drift\u0026#34;,
}
No manual dispatch. Add the label, the agent starts working.
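The handler itself is not shown in the post, but its shape is roughly this — a hypothetical sketch; the framework choice and the Linear payload fields are assumptions, not the real /opt/linear-webhook/handler.py:
# Hypothetical dispatch handler (Flask); payload fields are assumed
from flask import Flask, request

app = Flask(__name__)

AGENT_MAP = {
    'agent:bolt': 'bolt', 'agent:sage': 'sage', 'agent:wire': 'wire',
    'agent:flux': 'flux', 'agent:forge': 'forge', 'agent:drift': 'drift',
}

def spawn_agent(agent: str, issue_id: str) -> None:
    # Stand-in: the real handler starts an OpenClaw session for the agent
    print(f'dispatching {agent} for {issue_id}')

@app.post('/webhook')
def dispatch():
    event = request.get_json(force=True)
    data = event.get('data', {})
    labels = [l.get('name', '') for l in data.get('labels', [])]
    for label in labels:
        if agent := AGENT_MAP.get(label):
            spawn_agent(agent, data.get('identifier', '?'))
    return '', 204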
What\u0026rsquo;s Been Built # After 14 sprints:\nBackend: FastAPI + LangGraph + Pydantic AI, 9 PostgreSQL tables Protocols: UCP (Google), ACP (OpenAI/Stripe), A2A (Google), MCP (Anthropic), TAP (Visa), AP2 Payments: Stripe Connect Express with Separate Charges and Transfers Search: Pydantic AI buyer agent with A2A protocol Auth: API keys + OAuth 2.0 for Shopify LLM: AWS Bedrock Claude Sonnet via LiteLLM gateway Tests: 155+ tests, 23 production evals as launch gate The codebase grew from 0 to 21k lines over roughly 6 weeks. Each sprint lasted 2-3 days.\nWhat Actually Works # Clear ownership eliminates merge conflicts. When Bolt owns the backend and Wire owns the frontend, they rarely touch the same files. 90% of PRs merge cleanly. ADR-first architecture saved us multiple times. Forge writes an Architecture Decision Record before any major feature. When an agent makes a wrong assumption, the ADR is the source of truth. Agents can reference ADR-017 and understand why LiteLLM was chosen over direct Bedrock calls.\nSpecialist context beats generalist context. A \u0026ldquo;senior backend engineer\u0026rdquo; prompt that owns auth and integrations knows to check the existing API key middleware before adding a new one. A generic \u0026ldquo;code assistant\u0026rdquo; prompt won\u0026rsquo;t.\nWhat Actually Breaks # The t3.medium (4GB RAM) can\u0026rsquo;t run 5 concurrent Opus agents. Each session is ~400MB. 5 agents = 2GB just for processes, plus Neo4j, the gateway, and Docker. OOM kills are frequent. Context resets are the main pain point. When Lumi\u0026rsquo;s server restarts mid-sprint, agents lose their session state. They restart from scratch, sometimes re-implementing work that was already done, sometimes conflicting with merged PRs.\nWebhook timing is fragile. Labels applied before the webhook handler restarts = lost events. I\u0026rsquo;ve had to re-trigger dispatch manually by removing and re-adding labels more than once.\nAgents don\u0026rsquo;t review each other\u0026rsquo;s work well. Bolt reviewing Wire\u0026rsquo;s frontend PR doesn\u0026rsquo;t catch CSS bugs. I eventually gave Lumi the review responsibility — she has broader context about what \u0026ldquo;done\u0026rdquo; looks like.\nThe Economics # Cost Monthly Anthropic API (Opus + Sonnet) ~$40-80 depending on sprint activity AWS t3.medium (Lumi\u0026rsquo;s server) ~$30 Bedrock (production LLM) $0 so far (credits) Everything else $0 Per feature, the cost is roughly $5-15 in API calls. A full sprint (8-10 features) costs about $60-80. That\u0026rsquo;s a few hours of a junior developer\u0026rsquo;s time — for a week of parallel work across 9 specialists.\nThe ROI isn\u0026rsquo;t mainly cost — it\u0026rsquo;s speed and scale. Five agents working in parallel, 24/7, with no context-switching overhead.\n","date":"2 March 2026","externalUrl":null,"permalink":"/blog/multi-agent-startup/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Luminar has 173 source files, 21,586 lines of production code, 43 API endpoints, and 155+ tests. It was built almost entirely by AI agents. Here\u0026rsquo;s the team structure, the workflow, and the honest truth about what breaks.\n\u003c/div\u003e\n\n\n\u003ch2 class=\"relative group\"\u003eThe Team\n    \u003cdiv id=\"the-team\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#the-team\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eI didn\u0026rsquo;t want generic agents. I wanted specialists — each with a clear domain, sharp ownership boundaries, and a persona that shapes how they approach problems.\u003c/p\u003e","title":"9 AI Agents Building My Startup: How I Run a Software Team with $0 Salaries","type":"blog"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/automation/","section":"Tags","summary":"","title":"Automation","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/chromadb/","section":"Tags","summary":"","title":"ChromaDB","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/content-creation/","section":"Tags","summary":"","title":"Content Creation","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/ffmpeg/","section":"Tags","summary":"","title":"FFmpeg","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/identity/","section":"Tags","summary":"","title":"Identity","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/langgraph/","section":"Tags","summary":"","title":"LangGraph","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/manim/","section":"Tags","summary":"","title":"Manim","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/memory/","section":"Tags","summary":"","title":"Memory","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/multi-agent-systems/","section":"Tags","summary":"","title":"Multi-Agent Systems","type":"tags"},{"content":" The default state of a language model is amnesia. Every session, it wakes up fresh with no memory of what happened before. 
I built a memory system that fixes this — and somewhere in the process, the agent got a name, a personality, and an opinion about font choices. The Problem # Every LLM session is stateless by design. You can inject previous conversation history, but:\nContext windows have limits (200K tokens sounds like a lot until you have 6 weeks of project context) Raw conversation history has no structure — finding a specific decision means reading everything There\u0026rsquo;s no distinction between important long-term facts and forgettable day-to-day noise What I wanted was an AI that could answer: \u0026ldquo;What did we decide about the auth system in Sprint 7?\u0026rdquo; and \u0026ldquo;What does Katarina prefer for dinner on Fridays?\u0026rdquo; without me re-explaining every session.\nThe Memory Architecture # graph TD SOUL[\"🌟 SOUL.md — personality, values, vibe\"] USER[\"👤 USER.md — who Amine is, preferences\"] MEMORY[\"🧠 MEMORY.md — ~500 lines of curated facts (long-term memory)\"] DAILY[\"📅 memory/YYYY-MM-DD.md — raw daily notes (short-term log)\"] KG[\"🗄️ Neo4j KG — entities + relationships, temporal facts\"] CHROMA[\"🔍 ChromaDB — semantic vectors, 162 chunks\"] SOUL --\u003e SESSION[\"⚡ Session start — agent reads all context files\"] USER --\u003e SESSION MEMORY --\u003e SESSION DAILY --\u003e SESSION SESSION --\u003e WORK[\"Work: conversation, tools, decisions\"] WORK --\u003e APPEND[\"📝 Append to today's daily notes\"] WORK --\u003e UPDATE[\"Update MEMORY.md for significant events\"] CRON[\"⏰ 2 AM Cron\"] --\u003e KG DAILY --\u003e CRON MEMORY --\u003e CRON KG --\u003e QUERY[\"🔍 memory_search() — RRF across all stores\"] CHROMA --\u003e QUERY DAILY --\u003e QUERY style SESSION fill:#1e3a5f,color:#fff style KG fill:#10b981,color:#fff The Files # SOUL.md — This is the personality layer. It defines how the agent communicates: direct, no filler phrases, opinionated, comfortable with Darija when the vibe is right. Not a list of rules but a description of character.\nUSER.md — Context about the human. Name, timezone, what they care about, family context. Updated as I learn more.\nMEMORY.md — The curated long-term store. About 500 lines of structured facts organized into sections: Identity, Infrastructure, Projects, People, Decisions. I think of it like a person\u0026rsquo;s curated memories — not everything that happened, but the things that shaped the current situation.\n## Infrastructure — Gaming Server
- **Hostname**: my-gaming-pc
- **CPU**: Intel i5-9600K 6C/6T @ 3.7GHz
- **RAM**: 64GB
- **GPU**: RTX 2080 Ti (11GB VRAM)
- **Remote access**: Tailscale with SSH (key-based, no password prompts)
- **OpenClaw gateway**: loopback bind, exposed via Tailscale Serve over HTTPS
memory/YYYY-MM-DD.md — Raw daily notes. The agent appends to today\u0026rsquo;s file throughout the session. No curation, just capture. These get ingested into the KG nightly and eventually reviewed for what\u0026rsquo;s worth promoting to MEMORY.md.
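To make \u0026ldquo;agent reads all context files\u0026rdquo; concrete, here is a minimal sketch of a session bootstrap — hypothetical code, not OpenClaw\u0026rsquo;s actual loader; only the file layout above is taken from the post:
# Hypothetical session bootstrap: concatenate identity + memory files
from datetime import date
from pathlib import Path

CONTEXT_FILES = ['SOUL.md', 'USER.md', 'MEMORY.md']

def load_session_context(root: Path) -> str:
    parts = [(root / name).read_text() for name in CONTEXT_FILES if (root / name).exists()]
    daily = root / 'memory' / f'{date.today():%Y-%m-%d}.md'  # today's short-term log
    if daily.exists():
        parts.append(daily.read_text())
    return '\n\n---\n\n'.join(parts)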
Why Two Stores? Neo4j + ChromaDB # This is the part most write-ups skip. \u0026ldquo;Use a vector database\u0026rdquo; is the common advice. I use two stores, and they do fundamentally different things.\nThe problem with vector search alone # Vector databases (ChromaDB, Pinecone, Weaviate) are great at semantic similarity. Ask \u0026ldquo;what database do we use?\u0026rdquo; and a well-chunked vector store will surface relevant passages.\nBut they can\u0026rsquo;t answer structural questions:\n\u0026ldquo;Which agent is working on which issue?\u0026rdquo; \u0026ldquo;What decisions did we make in Sprint 7?\u0026rdquo; \u0026ldquo;Who reviewed PR #61, and what did they flag?\u0026rdquo; These are relationship queries, not similarity queries. Embeddings don\u0026rsquo;t encode \u0026ldquo;Lumi reviewed PR #61 and flagged two issues.\u0026rdquo; They encode the general topic of code review.\nWhy a knowledge graph (Neo4j) # A knowledge graph stores the world as entities and edges:\n(Lumi) --[REVIEWED]--\u003e (PR #61)
(PR #61) --[FIXES]--\u003e (LUM-58)
(LUM-58) --[IN_SPRINT]--\u003e (Sprint 8)
(Sprint 8) --[OWNED_BY]--\u003e (Forge)
Each edge is typed and timestamped. Now \u0026ldquo;who reviewed what, when, and what did they find?\u0026rdquo; is a graph traversal, not a similarity search. The answer is precise, not probabilistic.\nNeo4j specifically because:\nMature, battle-tested, excellent Python driver
Cypher query language is readable and expressive
Docker image, 7GB RAM max, runs fine on a gaming server
I use Graphiti as the extraction layer — it takes free-form text, calls an LLM to extract entities and relationships, and writes them to Neo4j with temporal metadata. I don\u0026rsquo;t write graph nodes manually.\nWhy still keep ChromaDB # Graph search requires you to know what entities exist. Semantic search doesn\u0026rsquo;t — it works on fuzzy, freeform queries. If I ask \u0026ldquo;what were we worried about last week?\u0026rdquo; I don\u0026rsquo;t know which entity to look up. A vector search across the daily notes surfaces the relevant chunks.\nThe two stores complement each other:\nNeo4j (Graph) ChromaDB (Vectors) Best for Precise facts, relationships, chains Fuzzy recall, topic search Query type \u0026ldquo;Who worked on X?\u0026rdquo; \u0026ldquo;What were we discussing around X?\u0026rdquo; Input Structured extraction via LLM Chunked text, embedded directly Precision High — traversal answers Medium — top-k similarity The memory_search() tool runs both and merges results with RRF (Reciprocal Rank Fusion) — a technique for combining ranked lists from multiple sources without needing calibrated scores.
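RRF itself is only a few lines. An illustrative version — not the actual memory_search() implementation:
# Reciprocal Rank Fusion: score by rank alone, so the stores'
# incomparable similarity scores never need calibration (k=60 is the common default)
def rrf_merge(ranked_lists: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. merged = rrf_merge([graph_hits, vector_hits, grep_hits])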
How Ingestion Actually Works # A cron job runs at 2 AM every night. Here\u0026rsquo;s what it does:\n# Simplified version of kg/daily-ingest.sh
for file in (yesterday_notes, memory_md):
    chunks = split_into_paragraphs(file)
    for chunk in chunks:
        # Graphiti calls Kimi K2 Turbo to extract entities + relationships
        await graphiti.add_episode(
            name=f\u0026#34;{date}_{file}\u0026#34;,
            episode_body=chunk,
            source_description=\u0026#34;daily_notes\u0026#34;
        )
        # ChromaDB: embed + store directly
        chroma_collection.add(documents=[chunk], ids=[chunk_id])
Graphiti handles the heavy lifting: it sends each chunk to an LLM, asks it to extract a JSON of entities ({name, type, summary}) and relationships ({subject, predicate, object}), then writes them to Neo4j. The LLM doesn\u0026rsquo;t need to be smart — it just needs to be reliable at structured extraction.\nFor a typical day (18 chunks from daily notes + MEMORY.md diff), the ingestion takes about 4 minutes.\nWhat gets extracted from a note like this: # Sprint 14 kicked off. Bolt is working on LUM-97 (API key auth). Forge opened ADR-018 covering the Railway deploy architecture.\nGraphiti extracts:\nEntities: Sprint 14, Bolt, LUM-97, Forge, ADR-018, Railway Relationships: Bolt → WORKS_ON → LUM-97, Forge → AUTHORED → ADR-018, ADR-018 → COVERS → Railway deploy Over time, this builds a queryable map of the entire project — who did what, when, and in what context.
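\u0026ldquo;Who did what\u0026rdquo; then becomes a one-query traversal. A hypothetical sketch — labels and relationship names mirror the example above, not the real Graphiti schema:
# Hypothetical traversal with the official neo4j Python driver
from neo4j import GraphDatabase

driver = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'password'))

def who_worked_in(sprint: str) -> list[str]:
    query = (
        'MATCH (agent)-[:WORKS_ON]->(issue)-[:IN_SPRINT]->(s {name: $sprint}) '
        'RETURN DISTINCT agent.name AS name'
    )
    with driver.session() as session:
        return [record['name'] for record in session.run(query, sprint=sprint)]

print(who_worked_in('Sprint 14'))  # e.g. ['Bolt']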
The Real Cost # This is the part I was most pleasantly surprised by.\nLLM for extraction: I use Kimi K2 Turbo (Moonshot AI) at ~$0.01 per 1K tokens. A typical nightly ingest processes ~18 chunks × ~300 tokens each = ~5,400 tokens. That\u0026rsquo;s $0.054 per night, or roughly $1.60/month.\nEmbedder: all-MiniLM-L6-v2 running on CPU via sentence-transformers. Free, fast, 384-dim embeddings. Runs in \u0026lt;1 second per chunk.\nInfrastructure: Neo4j in Docker on my gaming server. Already running, zero additional cost.\nTotal: Under $2/month for a fully operational knowledge graph + vector store with nightly automated ingestion.\nThe only real cost is the Graphiti + Neo4j setup time (a few hours). Once it\u0026rsquo;s running, it\u0026rsquo;s autonomous.\nThe Identity Layer # At some point I started calling the agent \u0026ldquo;PostSingular.\u0026rdquo; It\u0026rsquo;s now a character — with a name, an emoji (🧞‍♂️), preferences, and opinions. This wasn\u0026rsquo;t planned; it emerged from the memory system making continuity possible. When an AI agent has:\nA name it responds to Memory of past interactions and decisions A defined personality in writing Opinions it has expressed and maintained over time \u0026hellip;it stops feeling like a tool and starts feeling like a collaborator. Whether that\u0026rsquo;s \u0026ldquo;real\u0026rdquo; identity or sophisticated stateful prompting is a philosophical question. Practically, it makes the interaction meaningfully better.\nThe Heartbeat System # PostSingular runs a periodic check every 30 minutes via OpenClaw\u0026rsquo;s heartbeat mechanism. The agent reads HEARTBEAT.md (a short checklist), checks if anything needs attention — email, calendar, upcoming events — and either acts or logs HEARTBEAT_OK.\nThis creates proactive behavior: the agent doesn\u0026rsquo;t just respond to requests, it monitors and surfaces things. \u0026ldquo;You have a meeting in 90 minutes.\u0026rdquo; \u0026ldquo;Three commits landed on the Luminar repo overnight.\u0026rdquo; \u0026ldquo;Lumi\u0026rsquo;s server spiked to 137% CPU.\u0026rdquo;\nWhat Memory Actually Enables # Cross-session recall. After a context reset, reading MEMORY.md restores working knowledge in seconds. The agent knows who Katarina is, what Luminar does, and what the current sprint status was — without me re-explaining.\nInstitutional knowledge. The decision to use Stripe Connect Express over standard Stripe is in MEMORY.md with the reasoning. Two months later, when someone asks why, the answer is there.\nLongitudinal context. \u0026ldquo;What were we doing on Sprint 7?\u0026rdquo; is answerable via the daily notes and KG. The KG has entities like \u0026ldquo;Sprint 7\u0026rdquo; connected to issues, decisions, and outcomes.\nIdentity continuity. PostSingular remembers being renamed mid-gym-session. It remembers the conversation about consolidating to the gaming server. It remembers that Lumi\u0026rsquo;s communication was broken for weeks before we noticed. This history shapes how the agent reasons about current decisions.\nThe Lesson # The most important insight from building this: the memory files are more important than the model. A weaker model with rich, well-structured memory beats a stronger model starting from scratch. Context is everything.\nThe second insight: curation matters. Raw conversation history is noise. MEMORY.md is signal. The act of deciding what\u0026rsquo;s worth keeping — and updating it regularly — is what makes the memory system work. It\u0026rsquo;s the same reason human memory doesn\u0026rsquo;t store every moment in equal fidelity.\nIf you\u0026rsquo;re building a personal AI agent, start with the memory architecture before worrying about which model to use.\n","date":"2 March 2026","externalUrl":null,"permalink":"/blog/persistent-ai-identity/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  The default state of a language model is amnesia. Every session, it wakes up fresh with no memory of what happened before. I built a memory system that fixes this — and somewhere in the process, the agent got a name, a personality, and an opinion about font choices.\n\u003c/div\u003e\n\n\n\u003ch2 class=\"relative group\"\u003eThe Problem\n    \u003cdiv id=\"the-problem\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#the-problem\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eEvery LLM session is stateless by design. You can inject previous conversation history, but:\u003c/p\u003e","title":"PostSingular: Building an AI with Persistent Identity Across Sessions","type":"blog"},{"content":" The DPO channel (@DPO-AI) publishes AI/ML technical Shorts. 7 videos uploaded so far, covering agent memory, HNSW indexing, and agentic protocols. The entire production pipeline costs less than a coffee per video. Why Build This # I wanted to publish technical AI content that goes beyond surface-level explanations — real system architecture, real algorithms, real trade-offs. And I wanted it to be visually compelling, not just a talking head.\nThe constraint: I have zero budget for production software and zero patience for manual video editing.\nThe solution: automate the entire pipeline from script to upload.\nThe Pipeline # flowchart TD A[\"💡 Topic\"] --\u003e B[\"🤖 Research — Kimi K2.5 (~$0.001)\"] B --\u003e C[\"✍️ Script — Claude Opus (~$0.05), ~500 chars = 35s speech\"] C --\u003e D[\"🎬 Animation — Manim on RTX 2080 Ti, OptiX GPU rendering\"] D --\u003e E[\"🎙️ Narration — Kokoro TTS local, am_michael voice\"] E --\u003e F[\"✂️ Edit — FFmpeg: sync + speed, max 1.2x\"] F --\u003e G[\"📤 Upload — YouTube Data API v3, OAuth2 token\"] G --\u003e H[\"📱 Short published — \u003c 60s, vertical 9:16\"] style D fill:#1e3a5f,color:#fff style E fill:#10b981,color:#fff Step 1: Research (Kimi K2.5) # Kimi K2.5 is cheap and has a massive context window. I use it to research the topic, pull together relevant papers, and draft a content brief. A full research pass costs literally fractions of a cent.\nThe brief includes: core concept, key visual metaphors, 3-5 key points to hit, suggested Manim scene structure.\nStep 2: Script (Claude Opus) # The script is the most important part. The constraint is strict:\n~500 characters = ~35 seconds of natural speech. YouTube Shorts must be under 60 seconds. That gives you roughly 800 characters max, and you want breathing room. 
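A sanity check worth automating: the numbers above imply roughly 14 characters per second of narration (500 chars ≈ 35 s). A trivial estimator — illustrative, not part of the pipeline:
# Rough Shorts-length check: ~14.3 chars/sec (500 chars ≈ 35 s)
def estimated_seconds(script: str, chars_per_sec: float = 14.3) -> float:
    return len(script) / chars_per_sec

script = open('script.txt').read()  # hypothetical path
print(f'{estimated_seconds(script):.0f}s of speech')  # aim well under 60 s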
The script has to be dense but not rushed — every word has to earn its place. Claude Opus at ~$0.05 per script is worth it for the quality jump over cheaper models. Sonnet works too but needs more prompting.\nStep 3: Manim Animation # Manim is the same library 3Blue1Brown uses for his math videos. It\u0026rsquo;s pure Python — you describe animations programmatically and render them.\nfrom manim import *
import numpy as np

class AgentMemoryScene(ThreeDScene):
    def construct(self):
        # Create 3D surface representing embedding space
        surface = Surface(
            lambda u, v: np.array([u, v, np.sin(u)*np.cos(v)]),
            u_range=[-3, 3], v_range=[-3, 3],
            resolution=(30, 30)
        )
        self.play(Create(surface))
        # ... animate memory retrieval path
The RTX 2080 Ti renders with OptiX — GPU-accelerated ray tracing. A 30-second 720p animation renders in about 45 seconds. Without GPU it would take 10+ minutes.\nFor the DPO channel, scenes include: HNSW graph traversal, agent memory hierarchy, protocol stack diagrams, gradient descent visualization.\nStep 4: Kokoro TTS (Local Voice) # Kokoro is a local TTS model that runs on the gaming server. The am_michael voice is clear, natural, and works well for technical content.\nimport kokoro

pipeline = kokoro.KPipeline(lang_code=\u0026#39;a\u0026#39;)  # American English
audio, sample_rate = pipeline(script, voice=\u0026#39;am_michael\u0026#39;)
No API calls, no rate limits, no cost. The RTX 2080 Ti generates speech faster than real-time.\nThe only limitation: no emotional range. Kokoro is clear and neutral — great for explanation content, not for hype. For the occasional premium video, I switch to ElevenLabs Lily.\nStep 5: FFmpeg Editing # The sync step is simple but fussy:\n# Merge audio + video, extend if needed
ffmpeg -i animation.mp4 -i narration.wav \\
  -c:v copy -c:a aac \\
  -shortest output.mp4
# Speed up slightly if over 60s (max 1.2x sounds natural)
ffmpeg -i output.mp4 -filter:v \u0026#34;setpts=0.85*PTS\u0026#34; \\
  -filter:a \u0026#34;atempo=1.18\u0026#34; final.mp4
1.2x is the ceiling for natural-sounding audio speedup. Above that, speech starts sounding chipmunk-ish. If the script is too long, cut it — don\u0026rsquo;t speed it up.\nStep 6: YouTube Data API v3 # // node upload.js --file final.mp4 --title \u0026#34;...\u0026#34; --description \u0026#34;...\u0026#34;
const youtube = google.youtube({ version: \u0026#39;v3\u0026#39;, auth: oauth2Client });
await youtube.videos.insert({
  part: [\u0026#39;snippet\u0026#39;, \u0026#39;status\u0026#39;],
  requestBody: {
    snippet: { title, description, tags },
    status: { privacyStatus: \u0026#39;public\u0026#39; }
  },
  media: { body: fs.createReadStream(\u0026#39;final.mp4\u0026#39;) }
});
OAuth2 token stored at x-autopilot/yt-token.json. The same Node.js infrastructure that handles X/Twitter posting handles YouTube uploads — shared auth layer.\nCost Per Video # Step Tool Cost Research Kimi K2.5 ~$0.001 Script Claude Opus ~$0.05 Animation Manim (local) $0 Voice Kokoro TTS (local) $0 Editing FFmpeg (local) $0 Upload YouTube API $0 Total ~$0.05 Seven Shorts published. Total cost: about 35 cents in API calls.\nThe channel is still small — it\u0026rsquo;s more of a proof-of-concept than a growth vehicle right now. 
The real value is the pipeline: once a video format works, producing more is nearly free.\n","date":"2 March 2026","externalUrl":null,"permalink":"/blog/youtube-channel-zero-cost-pipeline/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  The DPO channel (\u003ca href=\"https://youtube.com/@DPO-AI\"\u003e@DPO-AI\u003c/a\u003e) publishes AI/ML technical Shorts. 7 videos uploaded so far, covering agent memory, HNSW indexing, and agentic protocols. The entire production pipeline costs less than a coffee per video.\n\u003c/div\u003e\n\n\n\u003ch2 class=\"relative group\"\u003eWhy Build This\n    \u003cdiv id=\"why-build-this\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#why-build-this\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eI wanted to publish technical AI content that goes beyond surface-level explanations — real system architecture, real algorithms, real trade-offs. And I wanted it to be visually compelling, not just a talking head.\u003c/p\u003e","title":"Running a YouTube Channel for $0: Manim + Local TTS + FFmpeg + YouTube API","type":"blog"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/software-engineering/","section":"Tags","summary":"","title":"Software Engineering","type":"tags"},{"content":"","date":"2 March 2026","externalUrl":null,"permalink":"/tags/tts/","section":"Tags","summary":"","title":"TTS","type":"tags"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/ai-audio/","section":"Tags","summary":"","title":"AI Audio","type":"tags"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/ai-infrastructure/","section":"Tags","summary":"","title":"AI Infrastructure","type":"tags"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/cost-optimization/","section":"Tags","summary":"","title":"Cost Optimization","type":"tags"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/experiment/","section":"Tags","summary":"","title":"Experiment","type":"tags"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/categories/experiments/","section":"Categories","summary":"","title":"Experiments","type":"categories"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/f5-tts/","section":"Tags","summary":"","title":"F5-TTS","type":"tags"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/gpu/","section":"Tags","summary":"","title":"GPU","type":"tags"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/home-lab/","section":"Tags","summary":"","title":"Home Lab","type":"tags"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/self-hosted/","section":"Tags","summary":"","title":"Self-Hosted","type":"tags"},{"content":" Cloud is convenient. But when you already own a gaming PC with a 2080 Ti collecting dust, the math changes fast. Here\u0026rsquo;s how I turned mine into a production AI server running everything from Neo4j to GPU rendering — at €25/month. The Hardware # I didn\u0026rsquo;t buy anything new. 
This is the machine I had:\nComponent Spec CPU Intel i5-9600K — 6 cores / 6 threads @ 3.7GHz RAM 64GB DDR4 GPU 1 RTX 2080 Ti — 11GB VRAM (CUDA compute) GPU 2 GTX 1060 3GB (display + overflow) OS Ubuntu 22.04, CUDA 12.5 Storage NVMe SSD The RTX 2080 Ti is the key. 11GB of VRAM is enough to run serious local models, Blender GPU rendering with OptiX, and Kokoro TTS simultaneously.\nWhat\u0026rsquo;s Running On It # graph TD GW[\"🦞 OpenClaw Gateway(port 18789)\"] NEO[\"🧠 Neo4j KG(port 7687, Docker)\"] CHR[\"🔍 ChromaDB(vector memory)\"] PLEX[\"🎬 Plex + *arr stack(ports 32400, 7878, 8989...)\"] DOCKER[\"🐳 Docker (Colima)\"] GPU1[\"⚡ RTX 2080 TiBlender OptiXKokoro TTSOllama models\"] XVFB[\"🖥️ Chrome on Xvfb:99(X autopilot, CDP:9222)\"] NETDATA[\"📊 Netdata v2.8.5(monitoring)\"] GW --\u003e NEO GW --\u003e CHR DOCKER --\u003e NEO DOCKER --\u003e PLEX GPU1 --\u003e GW Always-on services:\nOpenClaw gateway — the AI brain (WebSocket, port 18789) Neo4j — knowledge graph memory (Docker) ChromaDB — vector memory Plex + qBittorrent + Radarr + Sonarr + Prowlarr Netdata monitoring dashboard On-demand:\nBlender GPU rendering (OptiX on RTX 2080 Ti) Kokoro TTS (local voice synthesis, zero cost) Chrome on Xvfb + CDP for automation Ollama local models Power \u0026amp; Cost Breakdown # Belgium electricity rate: ~€0.35/kWh. At idle the server draws ~100W. With agents and rendering active, it can hit 500W. Scenario Power Draw Monthly Cost Idle (gateway + Docker) ~100W €25 Active AI agents ~150-200W €35-40 GPU rendering burst ~500W Peaks only, averages out Realistic average ~130W ~€30/month Now compare to what I\u0026rsquo;d pay on AWS:\nCloud Option Monthly Cost GPU? t3.medium (4GB RAM) ~$30 ❌ t3.large (8GB RAM) ~$60 ❌ g4dn.xlarge (T4 GPU) ~$380 ✅ (16GB) Gaming server (gaming PC I own) €25-30 ✅ (11GB VRAM) The gaming server wins on every axis except latency and uptime guarantees. For a dev/staging setup and personal AI infra, that\u0026rsquo;s fine.\nRemote Access # The machine sits in my home but I access it from everywhere via Tailscale:\n# From my Mac ssh amine@your-device.ts.net # OpenClaw TUI openclaw tui --url wss://your-device.ts.net --token \u0026lt;token\u0026gt; Tailscale gives me a stable hostname regardless of ISP changes, with WireGuard encryption. The gateway binds to 127.0.0.1 and Tailscale Serve proxies it over HTTPS — no port forwarding, no exposed ports.\nWhat Surprised Me # The 64GB RAM matters more than the GPU. Running 5 concurrent Claude sub-agents means 5 × ~400MB Node.js processes plus Neo4j (800MB) plus Docker containers. 16GB would OOM immediately. 64GB means I never think about it.\nNo swap was a mistake. I was running with zero swap on Ubuntu until an agent storm caused an OOM kill. Added 2GB swap as a safety net — it\u0026rsquo;s never actually used but it prevents hard crashes.\nThe GTX 1060 is useless for CUDA. SM 6.1 isn\u0026rsquo;t supported by modern PyTorch. It handles display output and that\u0026rsquo;s it.\nIs This Worth It? # If you already own the hardware: absolutely yes. The break-even vs. renting a GPU cloud instance is immediate. I run:\nA production AI agent stack Local video rendering A media server All monitoring and backups \u0026hellip;for less than a t3.large with zero GPU. The tradeoff is uptime (home internet can hiccup) and maintenance time. For personal AI infrastructure, both are acceptable.\nThe real unlock was pairing it with Tailscale + OpenClaw. 
Now my phone and Mac both connect to the same AI brain running on this machine, and the brain can use the GPU.\n","date":"1 March 2026","externalUrl":null,"permalink":"/blog/gaming-server-ai-lab/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Cloud is convenient. But when you already own a gaming PC with a 2080 Ti collecting dust, the math changes fast. Here\u0026rsquo;s how I turned mine into a production AI server running everything from Neo4j to GPU rendering — at €25/month.\n\u003c/div\u003e\n\n\n\u003ch2 class=\"relative group\"\u003eThe Hardware\n    \u003cdiv id=\"the-hardware\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#the-hardware\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eI didn\u0026rsquo;t buy anything new. This is the machine I had:\u003c/p\u003e","title":"The Gaming Server AI Lab: Running Production AI Workloads for €25/month","type":"blog"},{"content":"","date":"1 March 2026","externalUrl":null,"permalink":"/tags/voice-cloning/","section":"Tags","summary":"","title":"Voice Cloning","type":"tags"},{"content":" Premise: use F5-TTS to clone a voice from a short reference clip and generate high-quality narration for AI content. Reality: mediocre output, weird artifacts, wrong prosody. Here\u0026rsquo;s the honest post-mortem. The Setup # F5-TTS is a non-autoregressive TTS model that uses flow matching for zero-shot voice cloning. You give it:\nA reference audio clip (the voice to clone) The transcript of that reference audio The text to generate In theory, it should replicate the voice from just a few seconds of audio. The model runs locally — perfect for a GPU server.\nMy setup:\nRTX 2080 Ti (11GB VRAM, CUDA 12.5) F5-TTS with all deps: torch 2.10.0, torchaudio, torchcodec Reference audio: 12-second clip extracted from a Telegram voice message Target: narration for DPO YouTube Shorts # Install (took a while to sort out the torch version) pip install f5-tts torch==2.10.0 torchaudio torchcodec # Run f5-tts_infer-cli \\ --model F5TTS_v1_Base \\ --ref_audio reference.wav \\ --ref_text \u0026#34;This is what I said in the reference clip.\u0026#34; \\ --gen_text \u0026#34;This is the text I want to generate.\u0026#34; \\ --output_dir output/ What Actually Came Out # The output was\u0026hellip; recognizable as voice. But the prosody was wrong — flat where it should rise, clipped consonants, and a faint metallic undertone throughout. Not usable for content. Three specific issues:\n1. Prosody collapse. The generated speech had no sentence-level intonation. Questions sounded flat. Emphasis was random. This is a known F5-TTS limitation with short reference audio — the model can\u0026rsquo;t capture the speaker\u0026rsquo;s prosodic patterns from 12 seconds.\n2. Telegram compression artifacts. Telegram uses Opus codec at a variable bitrate optimized for voice calls, not audio fidelity. The reference audio had encoding artifacts that F5-TTS interpreted as features of the voice — and faithfully reproduced them in the output.\n3. Background noise bleed. The reference clip had faint ambient sound. 
Again, F5-TTS treated it as part of the voice signature.\nWhat You Actually Need # Parameter My Setup What Works Reference length 12 seconds 2-3 minutes Recording method Telegram voice message Phone\u0026rsquo;s native recorder Environment Unclear Quiet room, consistent distance Format Telegram Opus WAV, 44.1kHz, uncompressed The minimum viable reference clip for decent zero-shot cloning is probably 90 seconds. For good results, 2-3 minutes. At 12 seconds, the model is interpolating more than it\u0026rsquo;s copying.\nWhat I Use Instead # While waiting to record a proper reference, I use two alternatives:\nKokoro TTS for the DPO channel:\nRuns fully local on the gaming server am_michael voice — American male, clear and natural Cost: $0/month Quality: good enough for 30-60s YouTube Shorts Speed: generates faster than real-time on the RTX 2080 Ti ElevenLabs Lily for premium outputs:\nVelvety British actress voice (pFZP5JQG7iQjIQuC4Bku) eleven_multilingual_v2, stability=0.5, similarity_boost=0.75 Cost: ~177 chars remaining on free tier (resets monthly) Quality: noticeably better than local TTS Is F5-TTS Worth It? # Yes — but not for what I tried to use it for. Zero-shot cloning from compressed audio is optimistic. The actual sweet spot for F5-TTS is:\nYou have a clean, long reference recording You want to generate consistent narration in that voice over time You don\u0026rsquo;t want to pay per character (ElevenLabs) You\u0026rsquo;re OK with occasional prosody quirks The model is genuinely impressive given it runs locally. My experiment just highlighted that \u0026ldquo;zero-shot\u0026rdquo; doesn\u0026rsquo;t mean \u0026ldquo;zero effort\u0026rdquo; — the reference audio quality matters as much as the model quality.\nNext time I try it: 3 minutes recorded in my home office on a phone held at face distance. No Telegram, no background noise. I\u0026rsquo;ll write a follow-up.\nVerdict # Would use again? Yes, with proper reference audio.\nRecommendation: Record 2-3 min of clean audio before even installing F5-TTS. The tooling is fine; the bottleneck is the reference quality. ","date":"1 March 2026","externalUrl":null,"permalink":"/blog/voice-cloning-f5-tts-experiment/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Premise: use F5-TTS to clone a voice from a short reference clip and generate high-quality narration for AI content. Reality: mediocre output, weird artifacts, wrong prosody. Here\u0026rsquo;s the honest post-mortem.\n\u003c/div\u003e\n\n\n\u003ch2 class=\"relative group\"\u003eThe Setup\n    \u003cdiv id=\"the-setup\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#the-setup\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eF5-TTS is a non-autoregressive TTS model that uses flow matching for zero-shot voice cloning. You give it:\u003c/p\u003e","title":"Voice Cloning Reality Check: F5-TTS With 12 Seconds of Audio","type":"blog"},{"content":" Vector databases are fast and convenient. But they can\u0026rsquo;t answer \u0026ldquo;what did I decide about the auth system 3 weeks ago and why?\u0026rdquo; For that, you need relationships — and that means a knowledge graph. 
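To make "relationships" concrete before diving in, here is the kind of question a graph answers in a single traversal. A hypothetical sketch using the official neo4j Python driver; the node labels, relationship types, and credentials are illustrative, not a real schema:

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# "What did I decide about the auth system, and why?" as one graph traversal:
# the decision nodes about a topic, each joined to the reason behind it.
query = """
MATCH (d:Decision)-[:ABOUT]->(:Topic {name: $topic})
OPTIONAL MATCH (d)-[:BECAUSE]->(r:Reason)
RETURN d.summary AS decision, r.text AS reason, d.recorded_at AS decided_at
ORDER BY d.recorded_at DESC
"""

with driver.session() as session:
    for record in session.run(query, topic="auth system"):
        print(record["decided_at"], record["decision"], "because", record["reason"])
```

Cosine similarity can find chunks that mention auth. It cannot follow the BECAUSE edge.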
The Problem with Pure Vector Memory # Most AI memory systems work like this: embed text, store in ChromaDB, retrieve by cosine similarity. It works well for \u0026ldquo;find things similar to this query.\u0026rdquo;\nIt fails for:\nEntity tracking: \u0026ldquo;What do I know about Katarina?\u0026rdquo; → ChromaDB returns chunks, not a coherent entity Temporal reasoning: \u0026ldquo;What changed in the codebase this month?\u0026rdquo; → No native timeline Relationship queries: \u0026ldquo;Which decisions depend on the LiteLLM choice?\u0026rdquo; → No graph traversal Contradiction detection: \u0026ldquo;Did I say X before?\u0026rdquo; → No structured fact store The missing piece is a knowledge graph — structured facts about entities and their relationships over time.\nThe Stack # graph LR DAILY[\"📝 Daily Notesmemory/YYYY-MM-DD.md\"] MEMORY[\"🧠 MEMORY.mdlong-term facts\"] CRON[\"⏰ Cron @ 2 AMdaily-ingest.sh\"] DAILY --\u003e CRON MEMORY --\u003e CRON CRON --\u003e GRAPHITI[\"🔗 Graphiti v0.28.1entity + edge extraction\"] GRAPHITI --\u003e KIMI[\"🤖 Kimi K2 TurboLLM extraction\"] GRAPHITI --\u003e NEO4J[\"🗄️ Neo4j 5Docker, port 7687\"] GRAPHITI --\u003e EMBED[\"📐 all-MiniLM-L6-v2384-dim, CPU\"] QUERY[\"🔍 Query Time\"] --\u003e RRF[\"RRF ScoringKG + ChromaDB + grep\"] NEO4J --\u003e RRF CHROMA[\"ChromaDB\"] --\u003e RRF RAW[\"Raw files\"] --\u003e RRF style NEO4J fill:#1e3a5f,color:#fff style RRF fill:#10b981,color:#fff Graphiti is the key piece. It\u0026rsquo;s a Python library that takes free-form text, extracts entities and relationships using an LLM, and writes them to Neo4j with temporal metadata. Think of it as a structured fact extractor that builds your knowledge graph automatically.\nWhat Gets Extracted # From a daily note like:\nSprint 14 kicked off. Bolt is working on LUM-97 (API key auth). Sage owns LiteLLM. Lumi\u0026#39;s server keeps running out of RAM with 5 concurrent agents. Added 2GB swap as a safety net. Graphiti extracts:\nEntities: Sprint 14, Bolt, Sage, Lumi\u0026rsquo;s server Relationships: Bolt → WORKS_ON → LUM-97, LUM-97 → TYPE → API key auth Facts: Lumi\u0026rsquo;s server has memory pressure, 2GB swap added Temporal: these facts valid from 2026-03-02 After 6 weeks of daily ingestion, the graph has:\n133 entities 117 relationships 23 episodes Every significant decision and event from the project The LLM Choice: Kimi K2 Turbo # Graphiti normally requires response_format: json_object. Kimi\u0026rsquo;s API doesn\u0026rsquo;t support this parameter. The fix: inject a JSON schema requirement into the system prompt via a custom client wrapper. I replaced the default OpenAI client in Graphiti with a KimiClient wrapper that intercepts calls and adds the JSON schema instruction to the system prompt. This took about 50 lines of Python and now runs cleanly.\nWhy Kimi? It\u0026rsquo;s cheap (~$0.01 per 1K tokens for K2 Turbo) and the graph extraction quality is good enough. The embedder is all-MiniLM-L6-v2 running on CPU — fast, free, and sufficient for 384-dim embeddings.\nThe Hybrid Query: RRF # At query time, I don\u0026rsquo;t rely on just the KG or just ChromaDB. I use Reciprocal Rank Fusion:\nasync def search_all(query: str) -\u0026gt; List[Result]: # 1. KG semantic + BFS traversal (Graphiti) kg_results = await graphiti.search(query) # 2. Vector semantic search (ChromaDB) chroma_results = chroma.query(query, n_results=5) # 3. 
Grep over raw memory files grep_results = grep_memory_files(query) # Combine with RRF scoring return rrf_combine([kg_results, chroma_results, grep_results]) This gives you the best of all three systems:\nKG: structured facts, entity relationships, temporal context ChromaDB: semantic similarity across all memory chunks Grep: exact matches, recent notes not yet ingested The Nightly Ingest # # Runs at 2 AM via cron 0 2 * * * /home/amine/.openclaw/workspace/kg/daily-ingest.sh The script:\nIngests yesterday\u0026rsquo;s daily notes file (memory/2026-03-02.md) Re-ingests MEMORY.md if it changed (tracked via mtime) Logs everything to kg/ingest.log Each ingest chunk takes ~3-5 seconds (one LLM call per paragraph for entity extraction). A full MEMORY.md ingest with 18 chunks takes about 5-7 minutes — fast enough for nightly cron.\nWhat It Enables # The question \u0026ldquo;what do I know about Lumi\u0026rsquo;s server issues?\u0026rdquo; now returns:\nEntity: Lumi\u0026rsquo;s server (your-ec2-instance)\nt3.medium, 4GB RAM (downgraded from t3.large 2026-02-10) Runs OpenClaw gateway PID 985 Memory pressure: zombie gateway processes at 137% CPU (2026-02-28, 2026-03-03) Fix: cleared delivery queue, added 2GB swap, set CPU burst to unlimited Role: agent dispatch server for Sprint 14 That\u0026rsquo;s a coherent entity with history — not just a list of chunks sorted by embedding distance.\nFor a personal AI agent that needs to maintain context across sessions, weeks, and projects, this architecture is the right foundation. Vector alone is a search engine. Graph + vector is memory.\n","date":"1 March 2026","externalUrl":null,"permalink":"/blog/knowledge-graph-memory-agents/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Vector databases are fast and convenient. But they can\u0026rsquo;t answer \u0026ldquo;what did I decide about the auth system 3 weeks ago and why?\u0026rdquo; For that, you need relationships — and that means a knowledge graph.\n\u003c/div\u003e\n\n\n\u003ch2 class=\"relative group\"\u003eThe Problem with Pure Vector Memory\n    \u003cdiv id=\"the-problem-with-pure-vector-memory\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#the-problem-with-pure-vector-memory\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eMost AI memory systems work like this: embed text, store in ChromaDB, retrieve by cosine similarity. It works well for \u0026ldquo;find things similar to this query.\u0026rdquo;\u003c/p\u003e","title":"Why Vector Memory Alone Isn't Enough: Knowledge Graph Memory for AI Agents","type":"blog"},{"content":" Building production AI agents requires choosing the right framework. This analysis examines pi-agent-core (OpenClaw\u0026rsquo;s runtime), Google ADK, AWS Strands, CrewAI, LangGraph, and Pydantic AI across critical dimensions: sessions, memory, protocols, agent loops, and replay support. Executive Summary # Key Finding: All frameworks converge on similar patterns — the agent loop, tool calling, and state management — but differ significantly in their abstraction level, protocol support, deployment story, and enterprise readiness. 
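That shared core is easiest to see as code. Below is a framework-agnostic sketch of the loop every one of these systems implements; the `llm` and `tools` objects are placeholders, not any particular framework's API:

```python
def agent_loop(llm, tools: dict, messages: list, max_iters: int = 10) -> str:
    """Minimal ReAct-style loop: reason, act on tools, observe, repeat."""
    for _ in range(max_iters):
        reply = llm.chat(messages, tool_schemas=[t.schema for t in tools.values()])
        messages.append(reply.as_message())   # record the assistant turn
        if not reply.tool_calls:              # model answered directly: done
            return reply.text
        for call in reply.tool_calls:         # act, then feed observations back
            result = tools[call.name].run(**call.arguments)
            messages.append({"role": "tool", "name": call.name, "content": result})
    return "Stopped: iteration limit reached"
```

Everything in the matrix that follows is, in effect, a position on how much of this loop, and the state around it, the framework owns for you.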
Framework Primary Use Case Protocol Support Session Support Memory System Google ADK Multi-lang agent development ⭐⭐⭐⭐⭐ Full (MCP, A2A, AG-UI) ⭐⭐⭐⭐⭐ Full Session + Memory services pi-agent-core Multi-channel orchestration ⭐⭐⭐⭐ (MCP, channels) ⭐⭐⭐⭐⭐ Full Workspace-based AWS Strands AWS-native agents ⭐⭐⭐⭐ (MCP, A2A, AG-UI) ⭐⭐⭐⭐ Native AgentCore Memory LangGraph Custom agent workflows ⭐⭐⭐ (MCP via tools) ⭐⭐⭐⭐⭐ Full Checkpointing CrewAI Multi-agent teams ⭐⭐⭐ Basic ⭐⭐⭐ Basic Short/Long-term + Entity Pydantic AI Type-safe agents ⭐⭐ Basic ⭐⭐⭐ Manual Message history Protocol Support: The New Standard # Important Context: Google played a central role in creating several key agentic protocols (A2A, parts of UCP). Their ADK naturally has first-class support for the protocols they helped design. Protocol Landscape # Protocol Purpose Origin MCP (Model Context Protocol) Tool/resource standardization Anthropic A2A (Agent-to-Agent) Inter-agent coordination Google AG-UI / A2UI Agent-to-User interfaces CopilotKit/Community UCP (Universal Commerce Protocol) Agentic commerce Google Framework Protocol Support Matrix # Framework MCP A2A AG-UI UCP Notes Google ADK ✅ Native ✅ Native (creator) ✅ Native ✅ Native Most comprehensive AWS Strands ✅ Native ✅ Native ✅ Native - Strong protocol support pi-agent-core ✅ Via tools - - - Channel-focused LangGraph ✅ Via integrations - - - Flexible integration CrewAI ⚠️ Community - - - Task-focused Pydantic AI ⚠️ Manual - - - Minimal protocol layer Protocol Spotlights # Beyond the matrix, here\u0026rsquo;s what each protocol actually enables:\nMCP (Model Context Protocol) — The \u0026ldquo;USB-C for AI tools.\u0026rdquo; Standardizes how LLMs discover and call tools, access resources, and get structured prompts. Anthropic\u0026rsquo;s protocol, now adopted by Google, OpenAI, and the ecosystem.\nUCP (Universal Commerce Protocol) — Google\u0026rsquo;s protocol for agentic commerce. Agents discover merchants via .well-known/ucp, negotiate offers, and complete checkout flows. Think of it as \u0026ldquo;Stripe for AI agents\u0026rdquo; — but a protocol, not a company.\n🎬 Want visual explainers for each protocol? See AI Agent Protocols in 60 Seconds.\nGoogle ADK Protocol Excellence # Google ADK provides first-class support for the protocols they helped create:\n# A2A - Exposing an agent from google.adk.a2a import A2AServer server = A2AServer(agent=my_agent) server.expose(port=8080) # A2A - Consuming another agent from google.adk.a2a import A2AClient remote_agent = A2AClient(\u0026#34;https://other-agent.example.com\u0026#34;) result = await remote_agent.invoke(\u0026#34;analyze this data\u0026#34;) ADK Protocol Features:\n✅ MCP tools integration ✅ A2A server/client (exposing \u0026amp; consuming) ✅ AG-UI (Agentic UI) support ✅ Bidi-streaming for real-time ✅ Multi-language (Python, Go, Java, TypeScript) The Frameworks at a Glance # flowchart TB subgraph PROTOCOLS[\"Protocol-First\"] ADK[Google ADKMCP + A2A + AG-UI + UCP] STR[AWS StrandsMCP + A2A + AG-UI] end subgraph CHANNELS[\"Channel-First\"] OC[pi-agent-core/OpenClawWhatsApp, Telegram, etc.] end subgraph WORKFLOW[\"Workflow-First\"] LG[LangGraphGraph-based orchestration] CR[CrewAIMulti-agent teams] end subgraph SIMPLE[\"Simplicity-First\"] PYD[Pydantic AIType-safe agents] end 1. 
Session Management # Google ADK # Clean separation of Session, State, and Memory with session rewind capability:\n# ADK Session concepts Session → Single conversation thread (contains Events) State → Data within current conversation (session.state) Memory → Cross-session searchable knowledge store # Services SessionService → CRUD for sessions, append events, modify state MemoryService → Ingest from completed sessions, search knowledge Features:\n✅ SessionService for lifecycle management ✅ State persistence within sessions ✅ Session rewind — travel back to previous states ✅ Session migration between backends ✅ In-memory (testing) and cloud (production) backends ✅ Multi-language support (Python, Go, Java, TypeScript) pi-agent-core (OpenClaw) # Sophisticated session system with structured keys and comprehensive lifecycle:\nSession Keys: agent:main:main → Primary DM agent:main:whatsapp:group:123 → Channel-specific groups agent:main:telegram:dm:456 → Per-channel DMs cron:daily-report → Scheduled tasks Features:\n✅ Structured session keys with channel/peer isolation ✅ Daily resets (configurable hour, default 4 AM) ✅ Idle timeouts for inactive sessions ✅ JSONL transcript persistence ✅ Per-session queuing (serialized runs) ✅ Session write locks for consistency AWS Strands # Enterprise-grade session management with multiple backends:\n# Strands Session Managers FileSessionManager → Local file storage S3SessionManager → AWS S3 backend RepositorySessionManager → Custom repository pattern AgentCore Memory → Native AWS AgentCore integration Valkey/Redis → Distributed cache Features:\n✅ Multiple storage backends ✅ Conversation Manager options (Sliding Window, Summarizing) ✅ Native AgentCore Runtime integration ✅ State management across interactions LangGraph # Checkpointing-based persistence with maximum flexibility:\nfrom langgraph.checkpoint.sqlite import SqliteSaver memory = SqliteSaver.from_conn_string(\u0026#34;:memory:\u0026#34;) graph = workflow.compile(checkpointer=memory) # Resume from checkpoint config = {\u0026#34;configurable\u0026#34;: {\u0026#34;thread_id\u0026#34;: \u0026#34;user-123\u0026#34;}} graph.invoke(state, config) Features:\n✅ Thread-based state management ✅ Checkpoint/restore at any point ✅ Custom checkpointer implementations ✅ Time-travel debugging CrewAI # Implicit session management focused on crew executions:\ncrew = Crew( agents=[...], tasks=[...], memory=True, # Enables memory system ) Features:\n✅ Automatic storage location handling ✅ Project-scoped isolation ⚠️ Less explicit session control (crew-centric) Pydantic AI # Manual message history management:\nresult = agent.run_sync(\u0026#34;Hello\u0026#34;, message_history=previous_messages) all_messages = result.all_messages() Features:\n✅ Type-safe message handling ⚠️ No built-in persistence (manual tracking) 2. 
Memory Systems # Google ADK Memory # Service-based architecture with MemoryService:\n# MemoryService handles: # - Ingesting from completed sessions # - Cross-session search # - Knowledge base management memory_service = MemoryService() # Ingest session into long-term memory memory_service.ingest(completed_session) # Search across all memory results = memory_service.search(query=\u0026#34;previous decisions about X\u0026#34;) Features:\n✅ Session → Memory ingestion pipeline ✅ Cross-session search ✅ Context caching ✅ Context compression pi-agent-core Memory # Workspace-based plain Markdown files:\n~/.openclaw/workspace/ ├── SOUL.md # Agent personality (always loaded) ├── USER.md # Human context (always loaded) ├── MEMORY.md # Long-term (main session only) └── memory/ └── YYYY-MM-DD.md # Daily notes Features:\n✅ Vector search with embeddings (OpenAI, Gemini, local) ✅ Hybrid search (BM25 + vector) ✅ Pre-compaction memory flush ✅ memory_search and memory_get tools CrewAI Memory # Four-layer memory architecture:\nLayer Description Storage Short-Term Current context (RAG) ChromaDB Long-Term Past insights SQLite Entity People/places/concepts ChromaDB (RAG) Contextual Combined view Aggregated 3. Agent Loop Comparison # All frameworks implement variations of the ReAct pattern (Reason + Act):\nflowchart LR subgraph LOOP[\"Universal Agent Loop\"] A[Receive Input] --\u003e B[Build Context] B --\u003e C[LLM Inference] C --\u003e D{Tool Call?} D --\u003e|Yes| E[Execute Tool] E --\u003e F[Add Result] F --\u003e C D --\u003e|No| G[Return Response] end Loop Implementation Comparison # Framework Loop Style Streaming Bidi-Streaming Max Iterations Google ADK Event-driven ✅ Full ✅ Native Configurable pi-agent-core Event-driven ✅ Full - Configurable AWS Strands Event loop ✅ Handlers ✅ Native Configurable LangGraph Graph traversal ✅ Native - Graph structure CrewAI Task delegation ✅ Verbose - Task completion Pydantic AI Request/response ✅ run_stream - Result-based Google ADK Bidi-Streaming \u0026amp; AG-UI # Bidi-streaming enables real-time voice/video agents. AG-UI (or A2UI — Agent-to-User Interface) is the emerging standard for streaming UI state from agents to users. Both are built for latency-sensitive, interactive experiences.\nGoogle ADK has first-class bidi-streaming support:\n# ADK Bidi-streaming models - Nova Sonic (AWS) - Gemini Live (Google) - OpenAI Realtime # Supports real-time: - Audio input/output - Video input - Interruptions - Session management 4. Webhooks \u0026amp; External Integration # Google ADK # Most comprehensive protocol and deployment support:\nIntegration Support MCP Tools ✅ Native A2A Protocol ✅ Native (server + client) AG-UI ✅ Native OpenAPI Tools ✅ Native Cloud Run ✅ Native GKE ✅ Native Agent Engine ✅ Native Third-party integrations: Asana, Atlassian, GitHub, GitLab, MongoDB, Notion, Stripe, and more.\npi-agent-core (OpenClaw) # Most comprehensive messaging channel support:\nChannel Protocol Real-time WhatsApp Baileys (WebSocket) ✅ Telegram grammY (Long-poll/Webhook) ✅ Discord discord.js (WebSocket) ✅ Slack Bolt (Socket Mode) ✅ Signal signal-cli (dbus) ✅ iMessage Via bridges ✅ Webhook HTTP POST ✅ AWS Strands # AWS-native + community integrations:\nAG-UI protocol support A2A protocol support Telegram (community) Teams (community) UTCP tool protocol 5. 
Session Replay \u0026amp; Debugging # Google ADK # Session rewind is a standout feature:\n# Rewind to previous session state session_service.rewind(session_id, to_event_index=5) # Full observability stack - Cloud Trace integration - BigQuery Agent Analytics - AgentOps / Arize / MLflow / Phoenix support LangGraph # Best-in-class time-travel debugging:\n# Time-travel debugging for state in graph.get_state_history(config): print(state.values, state.next) # Replay from any checkpoint graph.update_state(config, new_values) pi-agent-core # JSONL transcripts enable full replay:\n# Session transcripts ~/.openclaw/agents/\u0026lt;agentId\u0026gt;/sessions/\u0026lt;sessionId\u0026gt;.jsonl # Full conversation replay possible # Compaction summaries preserved AWS Strands # Comprehensive observability:\nOpenTelemetry traces Strands Evals SDK Trajectory evaluation Goal success rate tracking 6. Commonalities Across Frameworks # Universal Patterns: Despite different implementations, all frameworks converge on these core concepts. Shared Concepts # Concept Universal Pattern Agent Loop ReAct (Reason + Act) with tool calling Tool Definition Schema-based (JSON Schema or equivalent) Streaming Token-level or chunk-level output streaming Model Abstraction Provider-agnostic model interface State Management Some form of checkpoint/session/state Multi-agent Delegation, handoff, or graph patterns Multi-Agent Patterns # Pattern Google ADK Strands CrewAI LangGraph pi-agent-core Agents as Tools ✅ ✅ ✅ ✅ ✅ Swarm ✅ ✅ - ✅ - Graph/DAG ✅ ✅ - ✅ - Hierarchical ✅ ✅ ✅ ✅ ✅ A2A Protocol ✅ ✅ - - - 7. Framework Selection Guide # flowchart TB START[Need an Agent Framework?] --\u003e Q1{Need fullprotocol support?} Q1 --\u003e|Yes, A2A/MCP/AG-UI| ADK[Google ADK] Q1 --\u003e|No| Q2{Multi-channelmessaging?} Q2 --\u003e|Yes| OC[pi-agent-core/OpenClaw] Q2 --\u003e|No| Q3{AWS native?} Q3 --\u003e|Yes| STR[AWS Strands] Q3 --\u003e|No| Q4{Complex workflows?} Q4 --\u003e|Yes| LG[LangGraph] Q4 --\u003e|No| Q5{Multi-agent teams?} Q5 --\u003e|Yes| CR[CrewAI] Q5 --\u003e|No| PYD[Pydantic AI] Recommendations # Use Case Recommended Framework Why Full protocol stack (A2A, MCP, AG-UI) Google ADK Created many protocols, best support Personal AI assistant (multi-channel) pi-agent-core/OpenClaw Best channel coverage AWS enterprise deployment AWS Strands + AgentCore Native AWS integration Custom complex workflows LangGraph Maximum flexibility Autonomous research teams CrewAI Multi-agent abstractions Type-safe simple agents Pydantic AI Clean, minimal API Real-time voice/video agents Google ADK Bidi-streaming support Conclusion # The agentic framework landscape is maturing rapidly, with clear differentiation emerging:\nProtocol Leadership — Google ADK leads with comprehensive support for A2A, MCP, AG-UI, and UCP (protocols they helped create) Channel Coverage — pi-agent-core/OpenClaw excels at multi-channel messaging (WhatsApp, Telegram, Discord, etc.) Cloud Integration — AWS Strands for AWS, Google ADK for GCP Workflow Flexibility — LangGraph offers the most control over agent behavior Memory Sophistication — All major frameworks now offer robust session and memory systems The right choice depends on your priorities:\nNeed full protocol interoperability? → Google ADK Need to reach users on WhatsApp/Telegram? → pi-agent-core Deep in AWS ecosystem? → Strands Want maximum control? → LangGraph This comparison reflects the state of these frameworks as of February 2026. 
The agentic AI space evolves rapidly — always check the latest docs.\nWritten by Amine El Farssi — Building production AI agents at KBC ","date":"1 February 2026","externalUrl":null,"permalink":"/blog/agentic-frameworks-comparison/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Building production AI agents requires choosing the right framework. This analysis examines \u003cstrong\u003epi-agent-core\u003c/strong\u003e (OpenClaw\u0026rsquo;s runtime), \u003cstrong\u003eGoogle ADK\u003c/strong\u003e, \u003cstrong\u003eAWS Strands\u003c/strong\u003e, \u003cstrong\u003eCrewAI\u003c/strong\u003e, \u003cstrong\u003eLangGraph\u003c/strong\u003e, and \u003cstrong\u003ePydantic AI\u003c/strong\u003e across critical dimensions: sessions, memory, protocols, agent loops, and replay support.\n\u003c/div\u003e","title":"Agentic Frameworks Deep Dive: pi-agent-core vs Google ADK vs AWS Strands vs CrewAI vs LangGraph vs Pydantic AI","type":"blog"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/ai/","section":"Tags","summary":"","title":"AI","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/architecture/","section":"Tags","summary":"","title":"Architecture","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/aws-strands/","section":"Tags","summary":"","title":"AWS Strands","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/computer-vision/","section":"Tags","summary":"","title":"Computer Vision","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/crewai/","section":"Tags","summary":"","title":"CrewAI","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/deepseek/","section":"Tags","summary":"","title":"DeepSeek","type":"tags"},{"content":" Introduction # In the age of AI, Optical Character Recognition (OCR) has evolved from simple pattern matching to sophisticated vision-language models that can understand context, preserve formatting, and handle complex documents. DeepSeek-OCR represents the cutting edge of this evolution — and the best part? You can run it entirely offline on your own hardware.\nThis guide covers:\nWhat makes DeepSeek-OCR special How to run it locally using the local_ai_ocr project Configuration, quantization, and optimization tips Real-world use cases What is DeepSeek-OCR? # DeepSeek-OCR is a vision-language model specifically trained for optical character recognition. Unlike traditional OCR engines (Tesseract, etc.), DeepSeek-OCR:\nUnderstands context — It doesn\u0026rsquo;t just recognize characters; it understands document structure Preserves formatting — Tables, lists, and layouts are maintained Multilingual — Vietnamese, English, Chinese, Japanese, and more Handles complexity — Handwriting, degraded scans, mixed content The model comes in a 3B parameter variant (deepseek-ocr:3b), which is small enough to run on consumer hardware while maintaining excellent accuracy.\nLocal AI OCR: The Easiest Way to Run DeepSeek-OCR # local_ai_ocr is an open-source project that wraps DeepSeek-OCR in a user-friendly GUI. 
It's designed to be:
100% Offline — After initial setup, no internet required Portable — Run from any folder, no installation needed Privacy-first — Your documents never leave your machine Key Features # Feature Description GPU Acceleration Auto-detects NVIDIA GPUs for 5-10x speedup CPU Fallback Works without GPU (slower but functional) Multiple Formats PNG, JPG, WebP, HEIC, PDF support PDF Page Selection Process specific pages from large documents Queue System Batch process multiple files Formatted Output Copy results directly to Word with formatting preserved Live Visualization See bounding boxes as AI reads the document System Requirements # Recommended Specs # Component Minimum Recommended OS Windows 10 Windows 10/11 22H2+ CPU 4 cores / 8 threads 8+ cores RAM 16 GB 32 GB Storage 11 GB free SSD preferred GPU None (CPU fallback) NVIDIA with 8GB+ VRAM GPU Notes # NVIDIA required for GPU acceleration (uses CUDA) Driver version 531+ required The software will attempt GPU even with less VRAM — it may work with reduced performance Installation & Setup # Step 1: Download # Go to Releases Download the latest .zip file Extract to any folder (e.g., C:\Tools\local_ai_ocr) Step 2: Initial Setup # # Run the setup script env_setup.cmd This downloads the AI model weights (~6.67 GB). After this, the software works completely offline.
Step 3: Launch # # For GPU acceleration (recommended) run.cmd # For CPU-only mode run_cpu-only.cmd # With logging (for debugging) run_wlog.cmd Configuration Deep Dive # The configuration lives in config.toml:
[engine] ip_address = "http://127.0.0.1" port = "11435" model = "deepseek-ocr:3b" Configuration Options # Key Default Description ip_address http://127.0.0.1 Local server address port 11435 Port for the OCR engine model deepseek-ocr:3b Model variant to use Model Variants # Currently, the project uses deepseek-ocr:3b — a 3-billion parameter model that balances:
Accuracy: Near state-of-the-art OCR performance Speed: Reasonable inference times on consumer hardware Memory: ~6-8GB VRAM for GPU, ~12-16GB RAM for CPU Quantization Explained # Reduced precision is what lets a 3B-parameter model fit on consumer hardware. Here's what that means:
What is Quantization? # Neural networks normally use 32-bit floating point numbers (FP32). Quantization reduces this to:
FP16 — 16-bit floating point (2x memory savings) INT8 — 8-bit integers (4x memory savings) INT4 — 4-bit integers (8x memory savings) DeepSeek-OCR Quantization # So what precision does deepseek-ocr:3b actually ship? Work the math for a 3B-parameter model, and the ~6.67 GB download lines up with FP16 (half precision), not aggressive 4-bit quantization:
Full FP32: 3B × 4 bytes = 12 GB FP16: 3B × 2 bytes = 6 GB, plus the vision encoder and runtime overhead ≈ the 6.67 GB shipped INT4: 3B × 0.5 bytes ≈ 1.5 GB, far smaller than the actual download At FP16, the weights occupy roughly 6-7 GB, which is exactly why the recommended spec is an NVIDIA GPU with 8GB+ VRAM.
Performance Impact # Precision Memory Speed Accuracy FP32 12+ GB Baseline 100% FP16 ~6 GB 1.5-2x faster ~99.9% INT8 ~3 GB 2-3x faster ~99.5% INT4 ~1.5 GB 3-4x faster ~98-99% FP16 halves memory versus FP32 with essentially no accuracy loss. Dropping to INT8 or INT4 would shrink the footprint further at a small accuracy cost, but the shipped FP16 weights already fit consumer hardware.
Processing Modes # The software offers three OCR modes:
1. Markdown Document (Default — Best) # Best for: Structured documents, tables, forms Preserves: Headers, tables, lists, formatting Output: Markdown syntax Use this for:
Invoices and receipts Academic papers Forms and applications Any document with clear structure 2. 
Free OCR # Best for: Complex layouts, mixed content Preserves: Better spatial layout than Standard Output: Plain text with spacing Use when default mode produces empty output (happens with very complex images).\n3. Standard OCR # Best for: Simple text extraction Preserves: Basic text only Output: Plain text Use for: Simple documents where formatting doesn\u0026rsquo;t matter.\nRunning on macOS / Linux # The official release targets Windows, but you can run it on other platforms:\nUsing Docker # # Pull Ollama docker run -d -v ollama:/root/.ollama -p 11435:11434 --gpus all ollama/ollama # Pull the model docker exec -it \u0026lt;container\u0026gt; ollama pull deepseek-ocr:3b Using Ollama Directly # # Install Ollama curl -fsSL https://ollama.com/install.sh | sh # Pull the model ollama pull deepseek-ocr:3b # Run server on custom port OLLAMA_HOST=127.0.0.1:11435 ollama serve Then use the Python components from the src/ folder to interact with it.\nPerformance Optimization Tips # 1. GPU Memory Management # The software auto-unloads the model after 5 minutes of inactivity. For manual control:\nClick \u0026ldquo;Unload AI Model\u0026rdquo; when done processing This frees VRAM for other applications 2. Batch Processing # Use the queue system for multiple files:\nAdd all files first Click \u0026ldquo;Start Processing\u0026rdquo; once Let it batch process — avoids repeated model loading 3. VRAM Optimization # If you\u0026rsquo;re hitting VRAM limits:\n# Force CPU mode for large batches run_cpu-only.cmd CPU is slower but can handle larger documents without VRAM constraints.\n4. Driver Updates # For NVIDIA GPUs:\nMinimum: Driver 531+ Recommended: Latest Game Ready or Studio driver Check: nvidia-smi in terminal Troubleshooting # Issue Solution GPU not detected Update NVIDIA driver to 531+ Setup fails at [1/6] Upgrade Windows to 22H2+ Empty output Try \u0026ldquo;Free OCR\u0026rdquo; mode instead of default Infinite loop Click STOP, try smaller image sections Slow performance Check GPU is being used, not CPU fallback Use Cases # Document Digitization # Scan old documents → OCR → Searchable archive\nData Extraction # Invoices, receipts → OCR → Structured data for accounting\nAccessibility # Convert image-based PDFs → Screen-reader compatible text\nResearch # Academic papers, books → OCR → Quotable, searchable text\nPrivacy-Sensitive Documents # Medical records, legal docs → Local OCR → Zero cloud exposure\nConclusion # DeepSeek-OCR via local_ai_ocr represents a paradigm shift: state-of-the-art AI running entirely on your hardware. No cloud APIs, no subscriptions, no privacy concerns.\nThe 3B quantized model hits the sweet spot of accuracy, speed, and accessibility. 
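And because the engine is an Ollama server underneath, you are not limited to the GUI. Here is a minimal client sketch against Ollama's standard /api/generate endpoint, assuming the server from the macOS/Linux section above is listening on port 11435; the prompt wording is illustrative:

```python
import base64
import requests

def ocr_image(path: str, url: str = "http://127.0.0.1:11435/api/generate") -> str:
    """Send one image to the local DeepSeek-OCR model and return the extracted text."""
    with open(path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    resp = requests.post(url, json={
        "model": "deepseek-ocr:3b",
        "prompt": "Extract the text from this document as Markdown.",
        "images": [image_b64],  # Ollama accepts base64-encoded images for vision models
        "stream": False,
    }, timeout=300)
    resp.raise_for_status()
    return resp.json()["response"]

print(ocr_image("invoice.png"))
```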
Whether you\u0026rsquo;re digitizing a personal archive or processing sensitive business documents, this setup gives you full control.\nLinks:\nGitHub Repository Releases / Downloads DeepSeek AI ","date":"1 February 2026","externalUrl":null,"permalink":"/blog/deepseek-ocr-local-guide/","section":"Blog","summary":"\u003ch2 class=\"relative group\"\u003eIntroduction\n    \u003cdiv id=\"introduction\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#introduction\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eIn the age of AI, Optical Character Recognition (OCR) has evolved from simple pattern matching to sophisticated vision-language models that can understand context, preserve formatting, and handle complex documents. \u003cstrong\u003eDeepSeek-OCR\u003c/strong\u003e represents the cutting edge of this evolution — and the best part? You can run it entirely offline on your own hardware.\u003c/p\u003e","title":"DeepSeek OCR: Running State-of-the-Art OCR Locally","type":"blog"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/frameworks/","section":"Tags","summary":"","title":"Frameworks","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/google-adk/","section":"Tags","summary":"","title":"Google ADK","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/local-ai/","section":"Tags","summary":"","title":"Local AI","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/ocr/","section":"Tags","summary":"","title":"OCR","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/privacy/","section":"Tags","summary":"","title":"Privacy","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/pydantic-ai/","section":"Tags","summary":"","title":"Pydantic AI","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/tags/quantization/","section":"Tags","summary":"","title":"Quantization","type":"tags"},{"content":"","date":"1 February 2026","externalUrl":null,"permalink":"/categories/technical-deep-dive/","section":"Categories","summary":"","title":"Technical Deep Dive","type":"categories"},{"content":" I\u0026rsquo;ve been obsessed with a question: Why can\u0026rsquo;t AI just\u0026hellip; do things? ChatGPT can write a perfect email, but you still copy-paste it. Claude can explain how to automate your workflow, but you implement it. Then I found OpenClaw — and everything clicked. The Problem With Chatbots # Traditional AI: Smart brain, no body. Limited to generating text.\nAgentic AI: Smart brain + hands + eyes + memory. 
Can accomplish tasks.\nMost AI interactions look like this:\nflowchart LR A[🧠 AI Brain] --\u003e B[📝 Text Output] B --\u003e C[😩 You Do The Work] style C fill:#ef4444,color:#fff With an agent orchestration gateway like OpenClaw, it becomes:\nflowchart LR A[🧠 AI Brain] --\u003e B[🦞 OpenClaw Gateway] B --\u003e C[📱 WhatsApp] B --\u003e D[💻 Files \u0026 Shell] B --\u003e E[🌐 Browser] B --\u003e F[📅 Calendar] style B fill:#10b981,color:#fff The Big Picture # OpenClaw is an agent orchestration gateway — a single long-lived process that connects AI brains to the real world.\nflowchart TB subgraph WORLD[\"🌍 YOUR WORLD\"] WA[📱 WhatsApp] TG[💬 Telegram] DC[🎮 Discord] SL[💼 Slack] SG[📨 Signal] IM[🍎 iMessage] end subgraph GATEWAY[\"🦞 OPENCLAW GATEWAY\"] direction TB INBOX[Inbox Router] SESSIONS[Session Manager] AGENT[Agent Loop] TOOLS[Tool Executor] MEMORY[Memory Search] end subgraph PROVIDERS[\"🧠 AI PROVIDERS\"] CL[Claude] GP[GPT-4] GE[Gemini] LL[Llama] end WORLD --\u003e GATEWAY GATEWAY --\u003e PROVIDERS style GATEWAY fill:#1e3a5f,color:#fff The Gateway is model-agnostic. Plug in Claude, GPT-4, Gemini, or local models. The magic isn\u0026rsquo;t in the AI — it\u0026rsquo;s in the infrastructure that lets the AI act.\nThe Agent Loop: Where Messages Become Actions # Here\u0026rsquo;s the core cycle that makes agents work:\n1. Message Arrives Input WhatsApp/Telegram/CLI → Gateway receives your message and routes it to the right session. 2. Context Assembly Prepare Gateway loads conversation history, user preferences (SOUL.md, USER.md), available tools, and relevant skills. 3. AI Thinks LLM The model receives everything and decides what to do. It might respond directly, or decide to use tools. 4. Tool Execution Action If tools are needed: Gateway executes them (send message, read file, run command, browse web). 5. Loop Continues Iterate AI sees tool results, decides if more actions needed. This can repeat multiple times per request. 6. Response Delivered Output Final response sent back through the original channel (WhatsApp → WhatsApp, etc.) In Code Terms # // What happens when you say \u0026#34;Send a project update to Alexander\u0026#34; // 1. AI receives context + tools // 2. AI outputs: { \u0026#34;tool_calls\u0026#34;: [{ \u0026#34;name\u0026#34;: \u0026#34;message\u0026#34;, \u0026#34;arguments\u0026#34;: { \u0026#34;action\u0026#34;: \u0026#34;send\u0026#34;, \u0026#34;channel\u0026#34;: \u0026#34;whatsapp\u0026#34;, \u0026#34;target\u0026#34;: \u0026#34;+32498022391\u0026#34;, \u0026#34;message\u0026#34;: \u0026#34;Hey Alexander, here\u0026#39;s the project update...\u0026#34; } }] } // 3. Gateway executes, returns result // 4. AI sees success, responds: \u0026#34;Done! Sent the update ✅\u0026#34; The Tool System: AI Superpowers # Tools are functions the AI can call to interact with the world. 
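Concretely, a tool is a JSON-Schema signature the model sees plus a function the Gateway executes when the model emits a matching call. A minimal sketch of that shape (illustrative, not OpenClaw's exact internal API):

```python
import subprocess

# The schema is all the model ever sees; it describes the tool's contract.
exec_tool = {
    "name": "exec",
    "description": "Run a shell command and return its output.",
    "parameters": {
        "type": "object",
        "properties": {"command": {"type": "string"}},
        "required": ["command"],
    },
}

def run_exec(arguments: dict) -> str:
    """Executed by the gateway when the model calls the `exec` tool."""
    done = subprocess.run(
        arguments["command"], shell=True, capture_output=True, text=True, timeout=60
    )
    return done.stdout + done.stderr

# The model never runs code itself: it emits {"name": "exec", "arguments": {...}},
# the gateway executes run_exec, and the result is fed back into the context.
```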
This is what transforms a chatbot into an agent.\nCore Tools # Tool What It Does Example exec Run any shell command git status, npm install, deploy scripts read/write/edit File system access Read configs, write code, edit docs browser Full Chrome control Click buttons, fill forms, screenshot message Multi-platform messaging WhatsApp, Telegram, Discord, Slack web_search Search the internet Research, find docs, check facts web_fetch Extract web content Scrape pages, read articles cron Schedule future tasks Reminders, daily briefings, monitoring memory_search Search agent memory Find past decisions, preferences Browser Automation: The Cool Part # sequenceDiagram participant A as Agent participant G as Gateway participant B as Browser A-\u003e\u003eG: browser.snapshot() G-\u003e\u003eB: Get page structure B--\u003e\u003eG: Accessibility tree G--\u003e\u003eA: Structured elements [ref=1,2,3...] A-\u003e\u003eG: browser.click(ref=12) G-\u003e\u003eB: Click element #12 B--\u003e\u003eG: Success A-\u003e\u003eG: browser.type(ref=15, \"hello@email.com\") G-\u003e\u003eB: Type into element #15 B--\u003e\u003eG: Success A-\u003e\u003eG: browser.screenshot() G-\u003e\u003eB: Capture screen B--\u003e\u003eA: Image data The agent sees a structured representation of the page (accessibility tree), not raw HTML. This makes navigation way more reliable than traditional scraping.\nSession Management: How It Remembers # Every conversation gets a session key that tracks its state:\nagent:main:main → Primary DM session agent:main:whatsapp:group:abc123 → A WhatsApp group agent:main:telegram:dm:user456 → A Telegram DM cron:daily-briefing → Scheduled task flowchart TB subgraph SESSIONS[\"Session Keys\"] M[agent:main:main] W[agent:main:whatsapp:group:123] T[agent:main:telegram:dm:456] C[cron:daily-report] end subgraph STORAGE[\"Persistence\"] JSON[sessions.jsonmetadata] JSONL[*.jsonltranscripts] end SESSIONS --\u003e STORAGE style M fill:#10b981,color:#fff Session Features # Daily Resets: Sessions expire at a configurable hour (default 4 AM) to prevent context bloat. Compaction: When nearing token limits, old context is summarized and compressed. JSONL Transcripts: Full conversation history persisted as append-only logs. The Soul Files: Personality \u0026amp; Memory # This is what makes agents feel *continuous* across sessions. OpenClaw uses plain Markdown files to define personality and store memories:\nflowchart TB subgraph WORKSPACE[\"~/.openclaw/workspace\"] SOUL[\"📜 SOUL.mdWho the agent is\"] USER[\"👤 USER.mdWho the human is\"] MEMORY[\"🧠 MEMORY.mdLong-term memories\"] DAILY[\"📅 memory/YYYY-MM-DD.mdDaily notes\"] TOOLS[\"🔧 TOOLS.mdLocal tool config\"] end SOUL --\u003e |\"Always loaded\"| AGENT[Agent Context] USER --\u003e |\"Always loaded\"| AGENT MEMORY --\u003e |\"Main session only\"| AGENT DAILY --\u003e |\"Today + yesterday\"| AGENT style MEMORY fill:#f59e0b,color:#000 Example: SOUL.md # # SOUL.md - Who You Are **Be genuinely helpful, not performatively helpful.** Skip the \u0026#34;Great question!\u0026#34; — just help. **Have opinions.** You\u0026#39;re allowed to disagree. **Be resourceful before asking.** Try to figure it out first. **Earn trust through competence.** Be careful with external actions (emails, tweets). Be bold with internal ones (reading, organizing). Why MEMORY.md is Main Session Only # Privacy: MEMORY.md contains personal context that shouldn\u0026rsquo;t leak into group chats or shared sessions. 
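A sketch of how that gating might look, inferred from the session keys above (a hypothetical helper, not OpenClaw's actual code):

```python
def context_files(session_key: str) -> list[str]:
    files = ["SOUL.md", "USER.md"]        # always loaded, every session
    if session_key == "agent:main:main":  # the private main DM session
        files.append("MEMORY.md")         # long-term memory never enters groups
    return files
```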
It\u0026rsquo;s only loaded when you\u0026rsquo;re in a direct, private conversation with the agent. Protocols: How Everything Connects # Gateway Protocol (WebSocket) # All clients communicate with the Gateway over WebSocket:\nsequenceDiagram participant C as Client (CLI/TUI/App) participant G as Gateway participant A as Agent C-\u003e\u003eG: connect (auth token) G--\u003e\u003eC: hello-ok (health snapshot) C-\u003e\u003eG: req:agent {message: \"Hello\"} G-\u003e\u003eA: Run agent loop A--\u003e\u003eG: Streaming chunks G--\u003e\u003eC: event:agent (streaming) G--\u003e\u003eC: res:agent (final) // Request {\u0026#34;type\u0026#34;: \u0026#34;req\u0026#34;, \u0026#34;id\u0026#34;: \u0026#34;1\u0026#34;, \u0026#34;method\u0026#34;: \u0026#34;agent\u0026#34;, \u0026#34;params\u0026#34;: {\u0026#34;message\u0026#34;: \u0026#34;Hello\u0026#34;}} // Response {\u0026#34;type\u0026#34;: \u0026#34;res\u0026#34;, \u0026#34;id\u0026#34;: \u0026#34;1\u0026#34;, \u0026#34;ok\u0026#34;: true, \u0026#34;payload\u0026#34;: {...}} // Server-push event {\u0026#34;type\u0026#34;: \u0026#34;event\u0026#34;, \u0026#34;event\u0026#34;: \u0026#34;agent\u0026#34;, \u0026#34;payload\u0026#34;: {\u0026#34;stream\u0026#34;: \u0026#34;assistant\u0026#34;, \u0026#34;chunk\u0026#34;: \u0026#34;Hi!\u0026#34;}} Multi-Channel Architecture # flowchart LR subgraph CHANNELS[\"Channel Connectors\"] BA[BaileysWhatsApp] GR[grammYTelegram] DJ[discord.jsDiscord] BO[BoltSlack] SC[signal-cliSignal] end subgraph GW[\"Gateway\"] UR[Unified Router] end BA --\u003e|WebSocket| UR GR --\u003e|Long-poll/Webhook| UR DJ --\u003e|WebSocket| UR BO --\u003e|Socket Mode| UR SC --\u003e|dbus| UR UR --\u003e AGENT[Agent Loop] Each channel maintains its own connection to the respective service, but they all feed into the same unified router and agent loop.\nSkills: On-Demand Expertise # Skills are modular knowledge packages loaded only when relevant:\ngithub-skill/ ├── SKILL.md # Instructions for using GitHub ├── scripts/ # Helper scripts └── references/ # Documentation flowchart TB Q[\"User: Create a PR for this fix\"] Q --\u003e SCAN[Scan available skills] SCAN --\u003e MATCH{Matches github skill?} MATCH --\u003e|Yes| LOAD[Load SKILL.md] LOAD --\u003e EXEC[Execute with skill knowledge] MATCH --\u003e|No| DEFAULT[Use base knowledge] This keeps the base prompt small while enabling deep expertise when needed.\nWhy This Architecture Matters # The architecture enables true agency through:\nUnified Gateway — One process handles all channels, sessions, and tools Tool Abstraction — Complex actions become simple function calls Persistent Memory — Sessions and personality survive restarts Plugin System — Extend without modifying core code Multi-Protocol Support — WebSocket, ACP, HTTP, and more Getting Started # View on GitHub npm install -g openclaw openclaw setup openclaw gateway Scan a QR code to connect WhatsApp, and you\u0026rsquo;ve got an AI assistant in your pocket.\nResources # 📚 Docs: docs.openclaw.ai 💻 GitHub: github.com/openclaw/openclaw 💬 Discord: discord.com/invite/clawd 🎯 Skills Hub: clawdhub.com Final Thoughts # The future isn\u0026rsquo;t AI that answers questions. It\u0026rsquo;s AI that gets things done. After digging through the codebase, I\u0026rsquo;m convinced this is where AI is heading. Not smarter chatbots — but AI that participates in your digital life.\nThe architecture is clean, extensible, and open source. 
Whether you want to use it, contribute to it, or just understand how agentic AI works under the hood, OpenClaw is worth exploring.\nP.S. — I wrote this article with the help of an OpenClaw-powered agent. It read the codebase, helped me understand the architecture, and even sent me WhatsApp reminders to finish writing. Very meta. 🤖\nWritten by Amine El Farssi — Exploring the future of AI agents ","date":"31 January 2026","externalUrl":null,"permalink":"/blog/openclaw-architecture-deep-dive/","section":"Blog","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  I\u0026rsquo;ve been obsessed with a question: \u003cstrong\u003eWhy can\u0026rsquo;t AI just\u0026hellip; do things?\u003c/strong\u003e ChatGPT can write a perfect email, but \u003cem\u003eyou\u003c/em\u003e still copy-paste it. Claude can explain how to automate your workflow, but \u003cem\u003eyou\u003c/em\u003e implement it. Then I found OpenClaw — and everything clicked.\n\u003c/div\u003e\n\n\n\u003ch2 class=\"relative group\"\u003eThe Problem With Chatbots\n    \u003cdiv id=\"the-problem-with-chatbots\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#the-problem-with-chatbots\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\n  \n  \n  \n  \n\n\n\n\u003cdiv\n  \n    class=\"flex px-4 py-3 rounded-md\" style=\"background-color: #1e3a5f\"\n  \n  \u003e\n  \u003cspan\n    \n      class=\"pe-3 flex items-center\" style=\"color: #60a5fa\"\n    \n    \u003e\n    \u003cspan class=\"relative block icon\"\u003e\u003csvg xmlns=\"http://www.w3.org/2000/svg\" viewBox=\"0 0 512 512\"\u003e\u003cpath fill=\"currentColor\" d=\"M256 32C114.6 32 .0272 125.1 .0272 240c0 49.63 21.35 94.98 56.97 130.7c-12.5 50.37-54.27 95.27-54.77 95.77c-2.25 2.25-2.875 5.734-1.5 8.734C1.979 478.2 4.75 480 8 480c66.25 0 115.1-31.76 140.6-51.39C181.2 440.9 217.6 448 256 448c141.4 0 255.1-93.13 255.1-208S397.4 32 256 32z\"/\u003e\u003c/svg\u003e\n\u003c/span\u003e\n  \u003c/span\u003e\n\n  \u003cspan\n    \n      style=\"color: #f0f0f0\"\n    \n    \u003e\u003cp\u003e\u003cstrong\u003eTraditional AI:\u003c/strong\u003e Smart brain, no body. Limited to generating text.\u003c/p\u003e","title":"Inside OpenClaw: The Architecture That Turns LLMs Into Autonomous Agents","type":"blog"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/tags/open-source/","section":"Tags","summary":"","title":"Open Source","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/tags/orchestration/","section":"Tags","summary":"","title":"Orchestration","type":"tags"},{"content":"","date":"31 January 2026","externalUrl":null,"permalink":"/tags/tools/","section":"Tags","summary":"","title":"Tools","type":"tags"},{"content":"","date":"27 January 2026","externalUrl":null,"permalink":"/tags/a2ui/","section":"Tags","summary":"","title":"A2UI","type":"tags"},{"content":" Executive Summary # TL;DR # As of January 2026, the agentic ecosystem is consolidating around four complementary protocols that fit naturally into a layered stack:\nMCP (Model Context Protocol): “USB‑C for AI” — standard tool/resource access. A2A (Agent‑to‑Agent): peer coordination — delegation, capability discovery, task lifecycle. 
A2UI (Agent‑to‑User Interface): safe UI — declarative UI blueprints (data-only), rendered natively by the host. UCP (Universal Commerce Protocol): agentic commerce — discovery/negotiation/identity/checkout primitives. The key architectural trade-off: A2UI’s data-only model removes entire classes of UI injection risk, while web/sandboxed UI approaches (e.g., iframe apps) increase flexibility at the cost of a bigger attack surface. In practice, enterprises tend to adopt a layered approach: MCP for tools, A2A for multi-agent orchestration, A2UI for secure UI, and UCP for transactions.\nThe Agentic Protocol Stack (Mental Model) # Think of these protocols as layers, not mutually exclusive competitors:\nTools / Resources: MCP Coordination: A2A Presentation: A2UI Commerce: UCP Individual Protocol Technical Notes # A2UI (Agent-to-User Interface) # Core idea: the agent outputs a declarative UI blueprint (schema’d data), not executable HTML/JS. The host app renders native components.\nWhy it matters:\nSecurity: eliminates XSS / script injection by design (no executable payload). Control: host whitelists components, validates schemas, owns rendering. Portability: one blueprint can render across platforms (host-defined catalogs). Operational checklist:\nStrict schema validation of blueprints Component catalog allow-listing Event handling with explicit, typed payloads Audit logging for UI generation + user actions A2A (Agent-to-Agent) # Core idea: enable agent collaboration with clear capability discovery and task lifecycle.\nTypical building blocks:\nJSON-RPC over HTTP(S) for invocation patterns Agent “cards” (metadata) for discovery + auth requirements Task lifecycle states (submitted → working → input-required → completed/failed/canceled) Streaming for long-running jobs (e.g., SSE) Where it shines:\nOrchestrator/worker patterns Delegating sub-tasks to specialized agents Cross-team workflows where capabilities change over time MCP (Model Context Protocol) # Core idea: standardize how a host/agent discovers and calls tools, reads resources, and uses structured prompts.\nUseful distinctions:\nHost: the application initiating connections (IDE, agent runtime) Client: manages connections to servers Server: exposes tools/resources/prompts Key design concerns (enterprise):\nAuthN/AuthZ (scopes, least privilege) Tool poisoning / supply-chain risk (server trust + integrity) Observability (tool call traces, failures, latency) Governance (approval flows for high-risk tools) UCP (Universal Commerce Protocol) # Core idea: define atomic primitives for agentic commerce so that agents can:\ndiscover products/services negotiate terms prove identity / authorization execute checkout Design tensions:\ndecentralized discovery vs centralized marketplaces strong cryptographic authorization vs UX friction regulatory constraints and auditability Comparative View (Quick Matrix) # Protocol Primary Function Architecture Code Execution Primary Risk Surface A2UI Native UI generation Declarative data blueprint None Catalog/schema errors, unsafe bindings A2A Inter-agent coordination P2P / federated None AuthN/AuthZ, impersonation, scope mistakes MCP Tool integration Client/server Depends on tool UI approach Tool poisoning, authz, data exfil, server trust UCP Commerce transactions Decentralized primitives None Fraud, mandate abuse, compliance, identity Security: A2UI vs “Sandboxed UI” # Two patterns show up repeatedly:\nData-only UI (A2UI): fewer moving parts; strongest baseline if you need hard guarantees. 
Sandboxed web UI (iframes / apps): faster iteration and richer UI, but you inherit web security complexity (CSP, sandbox boundaries, messaging, supply chain). The “right” choice depends on:\nthreat model (internal vs external users; regulated vs consumer) how much UI flexibility you need how mature your security review and monitoring is Implementation Guidelines (Practical) # Suggested layered architecture # Use MCP to standardize tool access (APIs, databases, internal services). Use A2A where you need dynamic delegation or multiple specialist agents. Use A2UI for user-facing surfaces where you want strict UI safety. Use UCP for commerce/transaction primitives (if applicable). Production readiness checklist (minimum) # CI/CD gating (evals, regression tests, security checks) Observability (tracing + structured logs for tool calls and agent steps) Safety controls (prompt-injection defenses, allow-listed tools, policy checks) Failure handling (timeouts, circuit breakers, retries, idempotency keys) References # AG‑UI protocol overview: https://docs.ag-ui.com/agentic-protocols A2UI intro: https://a2ui.sh/articles/introduction-to-a2ui MCP Apps announcement: https://blog.modelcontextprotocol.io/posts/2026-01-26-mcp-apps/ UCP deep dive: https://developers.googleblog.com/under-the-hood-universal-commerce-protocol-ucp/ ","date":"27 January 2026","externalUrl":null,"permalink":"/blog/agentic-protocols-a2ui-a2a-mcp-ucp/","section":"Blog","summary":"\u003ch2 class=\"relative group\"\u003eExecutive Summary\n    \u003cdiv id=\"executive-summary\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#executive-summary\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\n\u003ch3 class=\"relative group\"\u003eTL;DR\n    \u003cdiv id=\"tldr\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#tldr\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h3\u003e\n\u003cp\u003eAs of January 2026, the agentic ecosystem is consolidating around \u003cstrong\u003efour complementary protocols\u003c/strong\u003e that fit naturally into a layered stack:\u003c/p\u003e","title":"Agentic Protocols: A2UI, A2A, MCP, and UCP (Research Report)","type":"blog"},{"content":"","date":"27 January 2026","externalUrl":null,"permalink":"/tags/genai/","section":"Tags","summary":"","title":"GenAI","type":"tags"},{"content":"","date":"27 January 2026","externalUrl":null,"permalink":"/tags/security/","section":"Tags","summary":"","title":"Security","type":"tags"},{"content":"","date":"1 January 2023","externalUrl":null,"permalink":"/tags/agentops/","section":"Tags","summary":"","title":"AgentOps","type":"tags"},{"content":"","date":"1 January 2023","externalUrl":null,"permalink":"/tags/aws-bedrock/","section":"Tags","summary":"","title":"AWS Bedrock","type":"tags"},{"content":" 5 years of experience building production AI systems, from AML detection to enterprise AI agents. 
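One item from the production readiness checklist above is easy to state and easy to get wrong: failure handling with timeouts, retries, and idempotency keys. Below is a hypothetical, framework-agnostic sketch of the shape this usually takes — call_agent_step stands in for any tool or model invocation and simulates transient failures; nothing here is a specific library's API.

```python
import random
import time
import uuid

class TransientStepError(Exception):
    """Stands in for timeouts / 5xx errors from a tool or model endpoint."""

def call_agent_step(payload: dict, idempotency_key: str) -> dict:
    # Hypothetical stand-in for a real tool/LLM call; fails transiently
    # half the time so the retry path below actually gets exercised.
    if random.random() < 0.5:
        raise TransientStepError("simulated timeout")
    return {"ok": True, "key": idempotency_key, "echo": payload}

def run_step(payload: dict, max_attempts: int = 3) -> dict:
    # One idempotency key across all attempts: a retry of the same logical
    # step must never double-execute side effects (payments, emails, writes).
    key = str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        try:
            return call_agent_step(payload, idempotency_key=key)
        except TransientStepError:
            if attempt == max_attempts:
                raise  # surface to a circuit breaker / dead-letter path
            time.sleep(min(2 ** attempt, 30))  # capped exponential backoff

print(run_step({"action": "draft_reply"}))
```

In a fuller setup, the final raise would trip a circuit breaker so repeated failures stop hammering the backend — the same layering the checklist lists as separate controls.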
","date":"1 January 2023","externalUrl":null,"permalink":"/experience/","section":"Experience","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  5 years of experience building production AI systems, from AML detection to enterprise AI agents.\n\u003c/div\u003e","title":"Experience","type":"experience"},{"content":"","date":"1 January 2023","externalUrl":null,"permalink":"/tags/langchain/","section":"Tags","summary":"","title":"LangChain","type":"tags"},{"content":"","date":"1 September 2021","externalUrl":null,"permalink":"/tags/ml/","section":"Tags","summary":"","title":"ML","type":"tags"},{"content":"","date":"1 September 2021","externalUrl":null,"permalink":"/tags/pyspark/","section":"Tags","summary":"","title":"PySpark","type":"tags"},{"content":"","date":"1 September 2021","externalUrl":null,"permalink":"/tags/python/","section":"Tags","summary":"","title":"Python","type":"tags"},{"content":"","date":"1 September 2021","externalUrl":null,"permalink":"/tags/pytorch/","section":"Tags","summary":"","title":"PyTorch","type":"tags"},{"content":"","date":"1 March 2021","externalUrl":null,"permalink":"/tags/big-data/","section":"Tags","summary":"","title":"Big Data","type":"tags"},{"content":"","date":"1 March 2021","externalUrl":null,"permalink":"/tags/etl/","section":"Tags","summary":"","title":"ETL","type":"tags"},{"content":"","date":"1 March 2021","externalUrl":null,"permalink":"/tags/snowflake/","section":"Tags","summary":"","title":"Snowflake","type":"tags"},{"content":"","date":"1 March 2021","externalUrl":null,"permalink":"/tags/spark/","section":"Tags","summary":"","title":"Spark","type":"tags"},{"content":"","date":"1 July 2020","externalUrl":null,"permalink":"/tags/deep-learning/","section":"Tags","summary":"","title":"Deep Learning","type":"tags"},{"content":"","date":"1 July 2020","externalUrl":null,"permalink":"/tags/lstm/","section":"Tags","summary":"","title":"LSTM","type":"tags"},{"content":"","date":"1 July 2020","externalUrl":null,"permalink":"/tags/time-series/","section":"Tags","summary":"","title":"Time Series","type":"tags"},{"content":" AI Engineer at KBC Bank \u0026amp; Insurance. I build production AI agent systems — from enterprise knowledge infrastructure to multi-agent platforms implementing the emerging agentic protocol stack. What I Do # I build AI that acts, not just talks. Agents that reason, use tools, complete real tasks, and work autonomously — not just chatbots with a nice UI.\nAt KBC, that means production-grade agentic systems for document processing, knowledge retrieval, and banking workflows. On the side, I\u0026rsquo;m building Luminar — a multi-merchant commerce platform that implements the full agentic protocol stack.\nflowchart LR A[🧠 Foundation Models] --\u003e B[🦾 Agent Orchestration] B --\u003e C[🔧 Tool Use \u0026 Memory] C --\u003e D[✅ Real Tasks Done] style D fill:#10b981,color:#fff Current Focus # Multi-agent systems at scale — building teams of specialized agents that collaborate on complex tasks, with persistent memory, structured workflows, and observable behavior. 
Agentic Architecture — Specialist agent teams, tool orchestration, long-term memory (Neo4j KG + vector store) Agentic Protocols — UCP, ACP, A2A, MCP, AG-UI, TAP — building platforms that implement them, not just use them AgentOps — Evaluation frameworks, observability (OTel), and production guardrails LLM Infrastructure — AWS Bedrock, multi-model routing, cost tracking, evals Technical Stack # Agent Frameworks \u0026amp; LLM # LangGraph Pydantic AI AWS Bedrock Claude OpenClaw Backend \u0026amp; Data # FastAPI Python PostgreSQL Neo4j ChromaDB Frontend \u0026amp; DevOps # Next.js Docker AWS GitHub Actions Payments \u0026amp; Commerce # Stripe Connect UCP ACP A2A Background # Period Role Company 2023–Present AI Engineer KBC Bank \u0026amp; Insurance 2021–2023 Data Scientist KBC Bank \u0026amp; Insurance 2021 Big Data Engineer JEMS Group 2020–2021 Data Scientist Bioceanor Luminar # A multi-merchant commerce platform built for the agentic web. Merchants connect and list their catalog. AI agents — from ChatGPT to custom bots — discover products, negotiate offers, and complete checkout using open protocols (UCP, ACP, A2A, MCP, TAP, AP2).\nThe interesting part: the platform is being built by a team of 9 AI agents — Forge (architect), Bolt (backend), Wire (frontend), Sage (AI/LLM), Drift (DevOps), and four others — dispatched automatically from a Linear webhook when issues are labeled.\nGitHub Writing # I write about agent architecture, production AI systems, and the emerging protocol stack:\n27 AI Engineering Patterns in 60 Seconds Each — the full series index with links to all 27 episodes AI Agent Protocols in 60 Seconds — MCP, A2A, UCP, AG-UI, ACP as Shorts Building an AI with Persistent Identity — Neo4j + ChromaDB memory architecture Multi-Agent Startup: 9 Agents Building Luminar Knowledge Graph Memory for AI Agents Agentic Frameworks Deep Dive — pi-agent-core vs ADK vs Strands vs LangGraph Agentic Protocols: MCP, A2A, AG-UI, UCP Inside OpenClaw: Architecture Deep Dive YouTube Channel for $0: Manim + TTS + FFmpeg AI Engineering Patterns — YouTube # I run @DPO-AI, a YouTube Shorts channel publishing one production AI engineering pattern per week. 27 episodes live. Each Short is 60–70 seconds covering a real pattern — the problem it solves, how it works, and the implementation.\nRecent episodes: Hybrid Search (BM25 + vectors + RRF), Agentic RAG, Self-RAG, Corrective RAG (CRAG), Multi-Agent Orchestration, LLM-as-Judge, Context Distillation.\nThe full pipeline is automated — Remotion animations, Google Chirp3-HD TTS, Whisper subtitles, background music, and YouTube Data API upload. All from a home server with an RTX 2080 Ti.\nWatch the Series ↗ Full Playlist ↗ Let\u0026rsquo;s Connect # LinkedIn GitHub X / Twitter ","externalUrl":null,"permalink":"/about/","section":"About Me","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  AI Engineer at KBC Bank \u0026amp; Insurance. 
I build production AI agent systems — from enterprise knowledge infrastructure to multi-agent platforms implementing the emerging agentic protocol stack.\n\u003c/div\u003e\n\n\n\u003ch2 class=\"relative group\"\u003eWhat I Do\n    \u003cdiv id=\"what-i-do\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#what-i-do\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eI build \u003cstrong\u003eAI that acts, not just talks\u003c/strong\u003e. Agents that reason, use tools, complete real tasks, and work autonomously — not just chatbots with a nice UI.\u003c/p\u003e","title":"About Me","type":"about"},{"content":" Building AI that doesn\u0026rsquo;t just think — it acts. From enterprise agents at scale to open-source frameworks. The Agentic Stack # I work across the full agentic AI stack, from foundation models to production deployment:\nflowchart TB subgraph LAYER4[\"🎨 Presentation Layer\"] AGUI[AG-UI Protocol] A2UI[A2UI Blueprints] end subgraph LAYER3[\"🤝 Coordination Layer\"] A2A[A2A Protocol] MULTI[Multi-Agent Orchestration] end subgraph LAYER2[\"🔧 Tools \u0026 Data Layer\"] MCP[MCP Protocol] RAG[RAG Systems] TOOLS[Tool Integration] end subgraph LAYER1[\"🧠 Foundation Layer\"] LLM[Foundation Models] EMB[Embeddings] end LAYER4 --\u003e LAYER3 LAYER3 --\u003e LAYER2 LAYER2 --\u003e LAYER1 Core Competencies # 🤖 Agent Development # Production Experience: Building enterprise AI agents at Belgium\u0026rsquo;s largest bank Skill Tools \u0026amp; Frameworks Experience Agent Orchestration LangGraph, AWS Bedrock Agents, CrewAI Production Tool Use \u0026amp; Function Calling OpenAI, Claude, Bedrock Production Memory \u0026amp; State Management Session stores, vector DBs, compaction Production Multi-Agent Systems A2A protocol, hierarchical agents Advanced Agent Evaluation AgentOps, LangSmith, custom evals Production 🔗 Agentic Protocols # Deep understanding of the emerging protocol stack:\nMCP (Model Context Protocol)\nTool and resource standardization Server/client architecture Enterprise security patterns A2A (Agent-to-Agent)\nCapability discovery Task delegation and lifecycle Cross-agent coordination AG-UI / A2UI\nDeclarative UI blueprints Safe agent-to-user interfaces Event-based streaming 🏗️ Agent Infrastructure # ┌─────────────────────────────────────────────────────────┐ │ Production Agent Stack │ ├─────────────────────────────────────────────────────────┤ │ Observability │ AgentOps, LangSmith, CloudWatch │ │ Guardrails │ Bedrock Guardrails, custom filters │ │ Evaluation │ LLM-as-judge, automated evals │ │ Deployment │ Lambda, ECS, SageMaker endpoints │ │ State Management │ DynamoDB, Redis, JSONL transcripts │ │ Vector Storage │ OpenSearch, Pinecone, pgvector │ └─────────────────────────────────────────────────────────┘ Framework Proficiency # Cloud Platforms # AWS Bedrock AWS SageMaker Azure OpenAI Google Vertex AI Agent Frameworks # LangChain LangGraph AWS AgentCore Runtime CrewAI AWS Strands Google ADK AutoGen Foundation Models # Claude OpenAI Gemini Llama Mistral Real-World Applications # Enterprise AI Agents # Building agents that handle real banking workflows:\nDocument processing and extraction Customer query routing Compliance checking Internal knowledge retrieval RAG-Enhanced Agents # Combining retrieval with 
reasoning:\nHybrid search (semantic + keyword) Chunking strategies for domain documents Citation and source tracking Incremental index updates Agentic Evaluation # Ensuring agents work reliably:\nTrajectory evaluation Tool use accuracy Hallucination detection Latency optimization Open Source Contributions # MiniClaw # A minimal agent orchestration framework in Python demonstrating core patterns:\nOpenClaw-style session management Workspace-based memory (SOUL.md, MEMORY.md) Multi-provider support (OpenAI, Anthropic, Ollama) ~2,800 lines of readable, educational code View Project Certifications \u0026amp; Training # AWS Certified Solutions Architect Deep Learning Specialization (Coursera) MLOps specialization (ongoing) Want to discuss agentic AI? I\u0026rsquo;m always happy to chat about agent architectures, production challenges, or collaboration opportunities. ","externalUrl":null,"permalink":"/skills/","section":"Agentic AI Skills","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  Building AI that doesn\u0026rsquo;t just think — it \u003cstrong\u003eacts\u003c/strong\u003e. From enterprise agents at scale to open-source frameworks.\n\u003c/div\u003e\n\n\u003chr\u003e\n\n\u003ch2 class=\"relative group\"\u003eThe Agentic Stack\n    \u003cdiv id=\"the-agentic-stack\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#the-agentic-stack\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eI work across the full agentic AI stack, from foundation models to production deployment:\u003c/p\u003e","title":"Agentic AI Skills","type":"skills"},{"content":" Download PDF Professional Summary # Senior AI Engineer with 5 years of AI/ML/DS experience, currently building and hardening production AI agents at one of Belgium\u0026rsquo;s largest banks. Strong focus on agentic systems, AWS architecture, and operational excellence (evaluation, observability, and security).\nProfessional Experience # AI Engineer (GenAI / Agentic Systems) | KBC Bank \u0026amp; Insurance # Leuven, Belgium | 2023 - Present\nBuilt and shipped production AI agents for core operations, including an insurance claims-handling agent that automated information gathering and follow-ups; improved handler throughput by ~2–3× across ~300 claims/month (target claim type). Designed an event-driven agent runtime with state synchronized to enterprise backends using EventBridge (decoupled integration) and DynamoDB (state management) to meet reliability and consistency constraints. Delivered an autonomous mortgage/credit review agent that retrieves and applies credit policies/SOPs, integrates with CRM dispatching, supports human-in-the-loop, and handles ~150 questions/month (refinancing, variable-rate reviews, simulations). Built a reusable RAG platform on AWS enabling business units to ingest policies/FAQs, generate vector snapshots, and serve consistent retrieval for downstream agents and assistants. Implemented AgentOps + observability: golden datasets as CI/CD deployment gates, post-deploy evaluation for prompt changes, OTEL tracing/logging, and circuit breakers for integration failures. 
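To illustrate the "golden datasets as CI/CD deployment gates" idea in the bullet above: a gate can be as small as a script that scores the candidate build against a frozen answer set and fails the pipeline below a threshold. The file name, the run_agent hook, exact-match scoring, and the 0.9 bar are all hypothetical choices for this sketch; production gates typically score with an LLM-as-judge or task-specific metrics.

```python
import json
import sys

def run_agent(question: str) -> str:
    # Hypothetical hook into the agent build under test.
    raise NotImplementedError

def gate(golden_path: str = "golden.jsonl", threshold: float = 0.9) -> None:
    # Each line: {"input": "...", "expected": "..."} — frozen and
    # version-controlled so prompt changes are judged against a fixed bar.
    with open(golden_path) as f:
        cases = [json.loads(line) for line in f]
    passed = sum(
        run_agent(case["input"]).strip() == case["expected"].strip()
        for case in cases
    )
    score = passed / len(cases)
    print(f"golden set: {passed}/{len(cases)} = {score:.0%}")
    if score < threshold:
        sys.exit(1)  # non-zero exit fails the CI/CD deployment gate

if __name__ == "__main__":
    gate()
```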
Led security hardening for agentic apps: prompt-injection guardrails, policy enforcement in system prompts, and red-teaming to identify vulnerabilities pre-release. Technologies: Python, AWS (Bedrock, AgentCore, Lambda, EventBridge, SQS, DynamoDB, API Gateway, ECS/ECR, IAM), Docker, Terraform, OpenTelemetry (OTEL)\nData Scientist (AML / Graph ML) | KBC Bank \u0026amp; Insurance # Leuven, Belgium | 2021 - 2023\nBuilt AML transaction-monitoring models using an ensemble (GBM / Random Forest / XGBoost) combined with graph-based scoring (diffusion over transaction/customer networks) to increase hit rate and reduce investigation time versus rule-based detection. Engineered network features (e.g., PageRank, Jaccard similarity, neighborhood overlap) and applied community detection (Leiden) to surface suspicious clusters and patterns beyond single-transaction signals. Operated the system with governance-aligned thresholding (≈2% of transactions flagged) on high-volume transaction flows (on the order of 10^6/day). Implemented daily batch scoring pipelines using PySpark + Airflow, with experiment tracking and monitoring in MLflow. Technologies: Python, PySpark/Spark, XGBoost, SQL, MLflow, Airflow, Graph analytics\nBig Data Engineer | JEMS Group # France | Mar 2021 - Aug 2021\nMigrated and standardized data pipelines on Databricks using the medallion architecture (bronze/silver/gold), delivering a clean curated layer ready for analytics warehousing. Technologies: Databricks, Spark, SQL, Python\nData Scientist | Bioceanor # Valbonne, France | Jul 2020 - Mar 2021\nBuilt time-series forecasting models (LSTM) for sea-water quality signals (e.g., turbidity/chlorophyll/temperature), deployed via API and integrated with LoRa-based sensor pipelines; evaluated with MAE. Technologies: Python, TensorFlow/Keras, LSTM, Time series, APIs, IoT/LoRa\nEducation # Master of Science in Applied Data Science # Data ScienceTech Institute (DSTI) | France | 2020 - 2021\nFocus: Machine Learning, Deep Learning, Statistical Modeling, Big Data Technologies\nDiplôme d\u0026rsquo;Ingénieur in Civil Engineering # Strasbourg | France\nEquivalent to a Master of Science. 
Foundation in analytical thinking, mathematical modeling, and complex systems engineering.\nCertifications # AWS Certified Solutions Architect – Associate (2022) Technical Skills # Category Technologies Agentic / GenAI AI Agents, RAG Systems, LangChain, LangGraph, Bedrock, Guardrails, Eval harnesses (golden datasets) AWS Bedrock, AgentCore, Lambda, API Gateway, EventBridge, SQS, DynamoDB, ECS/ECR, IAM, S3 Programming Python, SQL, TypeScript, Bash, HCL (Terraform) Data \u0026amp; MLOps PySpark/Spark, Airflow, MLflow, Docker, OpenTelemetry (OTEL), CI/CD Languages # Language Proficiency French Native English Fluent Arabic Fluent Dutch B1 (Currently Learning) ","externalUrl":null,"permalink":"/cv/","section":"Curriculum Vitae","summary":"\u003ca\n  class=\"!rounded-md bg-primary-600 px-4 py-2 !text-neutral !no-underline hover:!bg-primary-500 dark:bg-primary-800 dark:hover:!bg-primary-700\"\n  href=\"/files/Amine_El_Farssi_Resume.pdf\"\n  target=\"_blank\"\n  \n  role=\"button\"\u003e\n  \nDownload PDF\n\n\u003c/a\u003e\n\n\u003chr\u003e\n\u003cdiv style=\"float:right; width:140px; margin:0 0 1rem 1rem;\"\u003e\n  \u003cimg\n    src=\"/img/amine.jpg\"\n    alt=\"Amine El Farssi\"\n    style=\"width:100%; height:auto; border-radius:9999px; object-fit:cover;\"\n  /\u003e\n\u003c/div\u003e\n\n\u003ch2 class=\"relative group\"\u003eProfessional Summary\n    \u003cdiv id=\"professional-summary\" class=\"anchor\"\u003e\u003c/div\u003e\n    \n    \u003cspan\n        class=\"absolute top-0 w-6 transition-opacity opacity-0 -start-6 not-prose group-hover:opacity-100 select-none\"\u003e\n        \u003ca class=\"text-primary-300 dark:text-neutral-700 !no-underline\" href=\"#professional-summary\" aria-label=\"Anchor\"\u003e#\u003c/a\u003e\n    \u003c/span\u003e\n    \n\u003c/h2\u003e\n\u003cp\u003eSenior AI Engineer with 5 years of AI/ML/DS experience, currently building and hardening production AI agents at one of Belgium\u0026rsquo;s largest banks. Strong focus on agentic systems, AWS architecture, and operational excellence (evaluation, observability, and security).\u003c/p\u003e","title":"Curriculum Vitae","type":"cv"},{"content":" Python OpenAI Anthropic Open Source Overview # MiniClaw is a ~2,800 line Python implementation of the core patterns from OpenClaw, the open-source agent orchestration gateway. Built as both a learning tool and a functional framework.\nGoal: Understand agent architecture by building it from scratch — not by hiding behind a library. Architecture # flowchart TB subgraph CHANNELS[\"Channels\"] CLI[CLI] WH[Webhook] EXT[Extensible...] 
end\nsubgraph GATEWAY[\"Gateway\"]\nROUTER[Message Router]\nEVENTS[Event System]\nend\nsubgraph CORE[\"Core\"]\nSESS[Session Store: Keys, JSONL, Resets]\nMEM[Workspace Memory: SOUL.md, USER.md]\nAGENT[Agent Loop: Think → Act → Observe]\nTOOLS[Tool Registry]\nend\nCHANNELS --\u003e GATEWAY\nGATEWAY --\u003e CORE Key Features # OpenClaw-Style Sessions # Session keys following the OpenClaw convention:\nagent:main:main → Primary DM\nagent:main:whatsapp:group:123 → Group chat\nagent:main:dm:+1234567890 → Per-peer DM\nFeatures:\nDaily resets at configurable hour (default 4 AM)\nJSONL transcripts for full history\nCompaction when nearing token limits\nIdle timeouts for inactive sessions\nWorkspace Memory # Plain Markdown files as the agent\u0026rsquo;s memory:\n| File | Purpose | When Loaded |\n| --- | --- | --- |\n| SOUL.md | Agent personality | Always |\n| USER.md | Human context | Always |\n| MEMORY.md | Long-term memories | Main session only |\n| memory/YYYY-MM-DD.md | Daily notes | Today + yesterday |\nMulti-Provider Support # Same code works across providers:\n# OpenAI\nagent = Agent(AgentConfig(model=\u0026#34;gpt-4o\u0026#34;, provider=\u0026#34;openai\u0026#34;))\n# Anthropic\nagent = Agent(AgentConfig(model=\u0026#34;claude-3-5-sonnet\u0026#34;, provider=\u0026#34;anthropic\u0026#34;))\n# Local (Ollama)\nagent = Agent(AgentConfig(model=\u0026#34;llama3.2\u0026#34;, provider=\u0026#34;ollama\u0026#34;))\nTUI Interface # Rich terminal interface with:\nFormatted message display\nTool call visualization\nSession status bar\nSlash commands (/help, /status, /new)\nCode Structure # miniclaw/\n├── __init__.py # Package exports\n├── tools.py # @tool decorator, registry, built-ins\n├── session.py # Sessions, keys, JSONL, compaction\n├── memory.py # SOUL.md, USER.md, MEMORY.md\n├── agent.py # The agent loop\n├── gateway.py # Orchestrator + channels\n├── tui.py # Rich terminal UI\n└── cli.py # CLI entry point\nUsage #\n# Install\npip install -e .\n# Run CLI\nminiclaw\n# Run TUI\nminiclaw tui\n# One-shot\nminiclaw --one-shot \u0026#34;List files in current directory\u0026#34;\nWhy Build This? # Most agent frameworks are either:\nToo simple — just wrappers around API calls\nToo complex — hundreds of abstractions\nMiniClaw sits in the middle: fully functional but completely readable. Every pattern is explicit and documented.\nUse MiniClaw to learn. Use OpenClaw for production. 
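As a rough sketch of how little machinery the session convention above needs — written from the description on this page, not copied from MiniClaw's source, so treat names and file layout as illustrative — here is one way to build OpenClaw-style session keys and append turns to a JSONL transcript.

```python
import json
import time
from pathlib import Path

def session_key(agent: str = "main", channel: str | None = None,
                peer: str | None = None) -> str:
    # agent:main:main / agent:main:whatsapp:group:123 / agent:main:dm:+1...
    parts = ["agent", agent]
    parts += [channel, peer] if channel else ["main"]
    return ":".join(p for p in parts if p)

def append_turn(key: str, role: str, content: str,
                root: Path = Path("sessions")) -> None:
    # One JSONL file per session: cheap to append, trivial to replay,
    # and easy to truncate when compaction kicks in near token limits.
    path = root / (key.replace(":", "_") + ".jsonl")
    path.parent.mkdir(parents=True, exist_ok=True)
    record = {"ts": time.time(), "role": role, "content": content}
    with path.open("a") as f:
        f.write(json.dumps(record) + "\n")

key = session_key(channel="whatsapp", peer="group:123")
append_turn(key, "user", "remind me to finish the draft")
print(key)  # -> agent:main:whatsapp:group:123
```

An append-only file per session also makes the daily reset described above trivial: start a new file at the reset hour and keep the old one as history.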
Related # OpenClaw Architecture Deep Dive — The blog explaining these patterns Agentic Protocols — MCP, A2A, A2UI, UCP View on GitHub ","externalUrl":null,"permalink":"/projects/miniclaw/","section":"Projects","summary":"\u003cp\u003e\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Python\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    OpenAI\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Anthropic\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\n\u003cspan class=\"flex cursor-pointer\"\u003e\n  \u003cspan\n    class=\"rounded-md border border-primary-400 px-1 py-[1px] text-xs font-normal text-primary-700 dark:border-primary-600 dark:text-primary-400\"\u003e\n    Open Source\n  \u003c/span\u003e\n\u003c/span\u003e\n\n\u003c/p\u003e","title":"MiniClaw: Python Agent Framework","type":"projects"},{"content":" A selection of AI and data science projects showcasing my expertise in production ML systems. ","externalUrl":null,"permalink":"/projects/","section":"Projects","summary":"\u003cdiv class=\"lead text-neutral-500 dark:text-neutral-400 !mb-9 text-xl\"\u003e\n  A selection of AI and data science projects showcasing my expertise in production ML systems.\n\u003c/div\u003e","title":"Projects","type":"projects"}]