22 May 2026

Your Documents, Your Agent: Using ThinkableSpace with OpenClaw

OpenClaw has quickly become one of the most exciting open-source projects of 2026. It turns any messaging app (Signal, Telegram, WhatsApp, Discord) into a personal AI agent that can actually do things on your computer. But there's a gap most people hit almost immediately: the agent is only as smart as the context you give it. Ask it something about your own notes, contracts, or research papers and it draws a blank.

OpenClaw connects to a large language model (Claude, GPT-4o, DeepSeek 4, your choice) and can execute tasks through that model. What it can't do out of the box is reach into your private document library. Your notes, PDFs, meeting transcripts, and research files live on your disk, and the agent has no way to search them semantically.

The naive fix, dumping documents into every prompt, doesn't scale. A 200-page contract or a year of meeting notes won't fit in a context window, and even if it did, you'd be sending sensitive material to a cloud API on every single message.

Token Costs: The Hidden Tax of Naive Document Access

This is where a lot of people run into trouble in practice.

If you try to give OpenClaw access to your documents without a retrieval layer, you're left with two bad options:

Option A: Attach files manually. You copy-paste or upload documents per conversation. Small files work, but you have to remember which file is relevant each time, and anything over a few thousand words eats a significant chunk of your context budget.

Option B: Stuff everything into the system prompt. Some setups try to pre-load entire document libraries into the agent's context. This sounds convenient until you see your API bill. With Claude Sonnet 4.6, input tokens run at $3 per million. A modest personal document library (say, 500 PDFs averaging 5 pages each) is roughly 10–15 million tokens. Sending that as context on every message would cost $30–45 per query. A week of daily use would run you hundreds of dollars.

What you actually need is retrieval: find the 3–5 most relevant passages from your entire library and send only those to the LLM. That's exactly what ThinkableSpace does.

The token cost stays flat regardless of how large your document library is, whether you have 10 documents or 10,000.

The difference is not marginal. Sending about 1,500 tokens of retrieved excerpts instead of 10–15 million is a gap of roughly four orders of magnitude.
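
To make the arithmetic concrete, here is a small back-of-the-envelope sketch. It uses only figures quoted in this post: the $3 per million input tokens, the 10–15 million token library, and the roughly 1,500 tokens of retrieved context mentioned further down. The function name is just for illustration, and output tokens are ignored.

```python
# Back-of-the-envelope comparison using the figures quoted in this post.
PRICE_PER_MILLION_INPUT_TOKENS = 3.00  # USD, the Claude Sonnet 4.6 input price quoted above


def cost_per_query(context_tokens: int) -> float:
    """Input-token cost in USD of sending `context_tokens` with one message."""
    return context_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_TOKENS


naive_low = cost_per_query(10_000_000)   # whole library, low estimate
naive_high = cost_per_query(15_000_000)  # whole library, high estimate
retrieval = cost_per_query(1_500)        # a handful of retrieved excerpts

print(f"Whole library per query: ${naive_low:.2f} to ${naive_high:.2f}")
print(f"Retrieved excerpts per query: ${retrieval:.4f}")
print(f"Ratio: {naive_low / retrieval:,.0f}x to {naive_high / retrieval:,.0f}x")
```

The ratio comes out between several thousand and ten thousand, which is where the "four orders of magnitude" figure comes from.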

The Solution: ThinkableSpace as a Local RAG Brain

ThinkableSpace is a privacy-first RAG (Retrieval-Augmented Generation) desktop app. It indexes your local documents (PDFs, Word files, Markdown, EPUBs, CSVs) using an embedding model that runs on your machine, with no cloud required, and makes them searchable via semantic vector search.

Crucially, it exposes everything through a local MCP server, the same Model Context Protocol standard that OpenClaw uses to connect to external tools.

The LLM never sees your full document library, only the excerpts that are actually relevant to the question. Token usage stays low. Costs stay predictable.
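
If the retrieval step feels abstract, the toy sketch below shows the general shape of it: turn the query and every passage into a vector, score each passage by cosine similarity, and keep the top few. This is not ThinkableSpace's implementation. The `embed` function here is a deliberately crude word-count stand-in for a real embedding model (which matches on meaning rather than shared words), and the sample passages are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, passages: list[str], k: int = 3) -> list[str]:
    """Return the k passages most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


# Made-up passages standing in for chunks of an indexed library.
passages = [
    "Either party may terminate this agreement with 30 days written notice.",
    "The auth service issues short-lived tokens to internal clients.",
    "Q3 revenue projections assume 12 percent growth over Q2.",
    "Meeting notes: the product team agreed to ship the beta in June.",
]

for hit in top_k("What are the Q3 revenue projections?", passages, k=2):
    print(hit)
```

Only the passages returned by `top_k` would be handed to the LLM, which is why the prompt size depends on `k` and not on how many passages are in the library.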

Setting It Up

1. Enable the MCP Server in ThinkableSpace

Open ThinkableSpace → Settings → MCP Server and enable the server.

You'll see your auth token in the settings panel. Copy it; you'll need it in the next step.

2. Point OpenClaw at the MCP Server

In your OpenClaw config, add ThinkableSpace as an MCP tool provider, using the auth token from step 1.

Restart OpenClaw. Your agent now has access to the ThinkableSpace search tools.
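
For the curious, this is roughly what happens once the connection is up. MCP clients invoke server tools with a JSON-RPC 2.0 `tools/call` request, and the sketch below builds one such message. Treat the details as assumptions: the tool name `search_documents`, the argument names `query` and `top_k`, and the bearer-token header are placeholders, not taken from ThinkableSpace's or OpenClaw's documentation. The real tool names are whatever the server advertises when the client lists its tools.

```python
import json

# Assumptions (not from ThinkableSpace's documentation): the tool name,
# its argument names, and the bearer-token header are placeholders.
TOOL_NAME = "search_documents"             # hypothetical tool name
AUTH_TOKEN = "paste-token-from-settings"   # the token copied in step 1


def build_search_request(query: str, top_k: int = 5, request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": TOOL_NAME,
            "arguments": {"query": query, "top_k": top_k},
        },
    }


headers = {"Authorization": f"Bearer {AUTH_TOKEN}"}  # assumed auth scheme
request = build_search_request("termination clauses in the Acme contract")
print(json.dumps(request, indent=2))
```

You never write these messages by hand; OpenClaw sends them on the agent's behalf. The point is only that a search is a small structured request, and the reply carries excerpts, not whole files.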

3. Index Your Documents

In ThinkableSpace, add the folders you want searchable. The app watches them automatically: drop a new PDF in and it's indexed within seconds. Embedding and chunking run locally using EmbeddingGemma-300m (a model with roughly 300M parameters), so indexing a typical document takes a few seconds and uses no cloud credits.
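
Chunking is the step that splits each document into passages small enough to embed and retrieve individually. As a rough illustration only, here is one common approach: fixed-size word windows that overlap slightly so a sentence cut at a boundary still appears whole in one chunk. The sizes and the function name are assumptions for this sketch, and ThinkableSpace's actual chunker may work quite differently.

```python
def chunk_words(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into chunks of up to `chunk_size` words.

    Each chunk after the first repeats the last `overlap` words of the previous one.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks


# Tiny demonstration with ten placeholder words and small window sizes.
sample = " ".join(f"w{i}" for i in range(10))
for chunk in chunk_words(sample, chunk_size=4, overlap=1):
    print(chunk)
```

Each chunk is then embedded once at indexing time, so a query only has to be compared against stored vectors.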

What You Can Do Now

Once connected, your OpenClaw agent can answer questions that previously required you to manually dig through files:

Research and writing:

"What did the contract with Acme say about termination clauses?"

Personal knowledge base:

"Summarize my notes from last month's product meetings."

Technical reference:

"What does my architecture doc say about the auth service?"

Cross-document synthesis:

"Find anything across my notes that relates to Q3 revenue projections."

The agent retrieves semantically relevant chunks (not just keyword matches), passes them as context to the LLM, and gives you a grounded answer, with the source documents cited. And because retrieval is surgical, you pay for maybe 1,500 tokens of context instead of millions.

Privacy by Design

Both projects share the same philosophy: your data belongs to you.

ThinkableSpace runs all embedding and search locally. The database is encrypted with SQLCipher. ThinkableSpace itself never sends document content to a cloud API.
OpenClaw runs locally and sends only what you explicitly ask it to, to whatever LLM you've configured. If you're using a local model (via llama.cpp or Ollama), retrieval and generation both stay on your machine.

With a cloud LLM, the only things that go to the cloud are the search query text and the retrieved document excerpts, and only because you explicitly prompted the agent. If full local operation matters to you, pair ThinkableSpace with a local model in OpenClaw and no document content leaves your machine at all.
