Building linkledger-cli: A Local-First Memory Layer for AI Agents

Where My Head Was At

When I cooked this idea up, I was trying to solve a very boring but expensive problem: great sources were scattered across chats, tabs, and notes, so every new draft started with the same research loop again.

Pocket was the reference point. If you never used it, Pocket was a “save this for later” app for links and articles: one place to collect what mattered so you could reuse it later.

I wanted that exact behavior for AI-agent workflows, where both humans and agents need the same source memory. Agent memory drift was part of it, but the bigger goal was a Pocket-for-agents system: capture fast, organize lightly, and retrieve compact, high-signal evidence on demand. The implementation bet was to start as developer infrastructure, with a CLI that’s fast, scriptable, and deterministic.

What I Actually Built

linkledger-cli is the CLI-first core of a Pocket-for-agents system:

The goal is retrieval that’s cheap on tokens and predictable enough for machines to consume without surprises.

Why This Shape Works for AI Agents

For me, these three constraints mattered:

  1. Retrieval needs to be cheap and composable.
  2. Evidence needs to carry metadata about where it came from and how confident you are in it.
  3. Tooling needs to run where agents and operators already work.

That drove the following architecture:

This gave me working memory without having to stand up a new service.

System Design in Practice

1. Save path is instant, ingest is async

save canonicalizes the URL, dedupes by canonical URL, creates the item record, and enqueues an ingest job. It can also attach an initial note and tags in the same transaction.

The important part: your intent is captured immediately, while parsing and enrichment happen in the background worker. This felt like the right split from the start — you don’t want saving to block on network calls or parsing.
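A minimal sketch of that split, assuming a SQLite store with hypothetical `items` and `jobs` tables. The canonicalization rules here (lowercase host, drop the fragment and trailing slash) are illustrative guesses, not the tool's actual rules:

```python
import sqlite3
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url: str) -> str:
    """Illustrative canonical form: lowercase scheme/host, no fragment, no trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def save(db: sqlite3.Connection, url: str) -> tuple[int, bool]:
    """Record the item and enqueue its ingest job together; return (item_id, created)."""
    canonical = canonicalize(url)
    with db:  # commits on success, rolls back if anything raises
        row = db.execute(
            "SELECT id FROM items WHERE canonical_url = ?", (canonical,)
        ).fetchone()
        if row:
            return row[0], False  # duplicate: nothing new to enqueue
        item_id = db.execute(
            "INSERT INTO items (canonical_url) VALUES (?)", (canonical,)
        ).lastrowid
        # No network call or parsing happens here; the worker picks this up later.
        db.execute("INSERT INTO jobs (item_id, attempts) VALUES (?, 0)", (item_id,))
        return item_id, True


# Hypothetical schema, kept in memory so the sketch runs anywhere.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE items (id INTEGER PRIMARY KEY, canonical_url TEXT UNIQUE);
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, item_id INTEGER, attempts INTEGER);
    """
)
```

Saving `HTTPS://Example.com/source/#intro` and then `https://example.com/source` would resolve to the same item under these rules, and only one ingest job would be queued.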

2. Explicit ingestion lifecycle

Items move through clear states:

There are first-class ops commands (status, retry, worker) so ingest failures are observable and recoverable, not silent. I’ve been burned enough by systems that fail quietly to know this matters early.

3. Adapter chain with pragmatic fallbacks

The worker picks adapters by source type:

If a source-specific parser fails, the chain continues. Retryable failures are requeued with exponential backoff.
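One way to picture the chain and the retry delay, as a sketch rather than the actual worker code. `RetryableError`, `run_chain`, and `backoff_ms` are hypothetical names. The default base delay mirrors the `--base-backoff-ms 2000` flag in the workflow commands, and the doubling schedule is an assumption about what "exponential" means here:

```python
from typing import Callable, Optional


class RetryableError(Exception):
    """Hypothetical marker for transient failures such as timeouts."""


Adapter = Callable[[str], Optional[str]]


def run_chain(url: str, adapters: list[Adapter]) -> str:
    """Try each adapter in order and return the first non-empty result."""
    last_error: Optional[Exception] = None
    for adapter in adapters:
        try:
            text = adapter(url)
            if text:
                return text
        except RetryableError:
            raise  # assumed: transient failures go back to the worker for requeueing
        except Exception as exc:
            last_error = exc  # this parser failed; fall through to the next one
    raise RuntimeError(f"no adapter could parse {url}") from last_error


def backoff_ms(attempt: int, base_ms: int = 2000) -> int:
    """Delay before retry number `attempt` (1-based), doubling each time."""
    return base_ms * 2 ** (attempt - 1)


def broken_specific_parser(url: str) -> Optional[str]:
    raise ValueError("unexpected page layout")


def generic_fallback(url: str) -> Optional[str]:
    return f"plain text extracted from {url}"


text = run_chain("https://example.com/source", [broken_specific_parser, generic_fallback])
delays = [backoff_ms(n) for n in (1, 2, 3)]
```

With a 2000 ms base and three attempts, this schedule waits 2, 4, and 8 seconds. A production version would probably add jitter as well.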

4. Ranking that rewards trust signals

Search runs on SQLite FTS5 with weighted fields. Ranking is:

ranking_score = bm25_score + pinned_boost - low_confidence_penalty

Where:

This is intentionally simple and interpretable. I’d rather be able to explain why something ranked high than chase marginal relevance gains I can’t debug.
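The formula is small enough to sketch directly. The constants and the confidence threshold below are placeholders, not the tool's real values. The sketch also assumes `bm25_score` is already oriented so that larger means more relevant. SQLite's built-in FTS5 `bm25()` returns numerically lower values for better matches, so a real implementation has to flip or rescale it first.

```python
from dataclasses import dataclass

# Placeholder weights, chosen only for illustration.
PINNED_BOOST = 1.0
LOW_CONFIDENCE_PENALTY = 0.5
LOW_CONFIDENCE_THRESHOLD = 0.4


@dataclass
class Hit:
    title: str
    bm25_score: float  # assumed: larger = more relevant
    pinned: bool
    confidence: float  # assumed to be in the range 0.0 to 1.0


def ranking_score(hit: Hit) -> float:
    """ranking_score = bm25_score + pinned_boost - low_confidence_penalty"""
    boost = PINNED_BOOST if hit.pinned else 0.0
    penalty = LOW_CONFIDENCE_PENALTY if hit.confidence < LOW_CONFIDENCE_THRESHOLD else 0.0
    return hit.bm25_score + boost - penalty


hits = [
    Hit("strong match, unpinned", 2.0, pinned=False, confidence=0.9),
    Hit("weaker match, pinned", 1.5, pinned=True, confidence=0.9),
    Hit("strongest match, low confidence", 2.2, pinned=False, confidence=0.2),
]
ranked = sorted(hits, key=ranking_score, reverse=True)
```

Every term is additive, so any result's position can be explained by printing its three components, which is the debuggability the formula is meant to preserve.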

5. Retrieval is highlight-first by default

find and brief return compact context first:

Full chunk expansion is opt-in, so the default output stays token-efficient.

6. Freshness without cron complexity

A stale-revalidation service can enqueue re-ingest jobs for older content (default threshold: 30 days) when that content is accessed via retrieval. This keeps memory useful over time without adding cron jobs or a separate scheduler to manage. It also avoids a cost I’ve noticed elsewhere: a scheduled job that wakes up, burns tokens, and finds nothing to act on.
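Revalidate-on-access can be sketched in a few lines. The 30-day threshold comes from the text above. The item shape (`id`, `ingested_at`) and the list used as a queue are assumptions made for illustration:

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(days=30)  # default threshold from the text


def is_stale(ingested_at: datetime, now: datetime, threshold: timedelta = STALE_AFTER) -> bool:
    """True when the last ingest is older than the threshold."""
    return now - ingested_at > threshold


def on_retrieve(items: list[dict], queue: list[int], now: datetime) -> list[dict]:
    """Read-path hook: return results immediately, queue stale ones for re-ingest."""
    for item in items:
        if is_stale(item["ingested_at"], now) and item["id"] not in queue:
            queue.append(item["id"])
    return items


now = datetime(2026, 2, 24, 17, 0, tzinfo=timezone.utc)
results = [
    {"id": 1, "ingested_at": now - timedelta(days=45)},  # past the threshold
    {"id": 2, "ingested_at": now - timedelta(days=3)},   # still fresh
]
reingest_queue: list[int] = []
on_retrieve(results, reingest_queue, now)
```

Work is only scheduled for content someone actually asked for, so nothing runs when nothing is read.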

Day-to-Day Workflow

The core drafting loop looks like this:

linkledger save "https://example.com/source" --note "why this matters" --tags ai-memory --json
linkledger worker --limit 20 --max-attempts 3 --base-backoff-ms 2000 --json
linkledger find "agent memory retrieval" --tags ai-memory --limit 10 --json
linkledger brief "Write a post on local-first agent memory systems" --max-items 8 --json

The output contract is stable:

{
  "ok": true,
  "data": {},
  "meta": {
    "timestamp": "2026-02-24T17:00:00.000Z",
    "version": "0.1.0"
  }
}

That one decision makes it easy to slot into agent pipelines and content workflows.
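For example, a consumer only needs one small helper to handle every command. This is a sketch under assumptions: the success envelope matches the shape shown above, but the failure shape is not shown in this post, so the `error` field read below is a guess, and `unwrap` and `run_linkledger` are hypothetical helper names.

```python
import json
import subprocess


class LinkledgerError(RuntimeError):
    """Raised when the envelope reports ok != true."""


def unwrap(raw: str) -> dict:
    """Parse the JSON envelope and return its `data` payload."""
    envelope = json.loads(raw)
    if envelope.get("ok") is not True:
        # Assumption: failures carry some `error` field; the post does not show it.
        raise LinkledgerError(f"command failed: {envelope.get('error', 'unknown error')}")
    return envelope["data"]


def run_linkledger(*args: str) -> dict:
    """Run a linkledger subcommand with --json and unwrap the result."""
    proc = subprocess.run(
        ["linkledger", *args, "--json"], capture_output=True, text=True, check=False
    )
    return unwrap(proc.stdout)


sample = '{"ok": true, "data": {}, "meta": {"timestamp": "2026-02-24T17:00:00.000Z", "version": "0.1.0"}}'
payload = unwrap(sample)
```

An agent wrapper could then call `run_linkledger("find", "agent memory retrieval", "--limit", "10")` and branch on a single exception type instead of parsing per-command output.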

Data Model Choices That Matter

A few choices carried most of the quality load:

This structure supports both human and agent contributions while keeping it easy to trace where any piece of evidence came from.

Tradeoffs I Chose

These are deliberate v1 tradeoffs — first-pass decisions based on experience, not battle-tested conclusions. I need to use this for a while before I’ll know what actually breaks:

For this stage, reliability and explainability beat sophistication.

What I Want to Build Next

Near-term extensions are straightforward:

  1. Add hybrid lexical + semantic retrieval behind the same command surface.
  2. Improve enrichment quality with stronger claim extraction and source-aware summarization.
  3. Add a minimal human curation UI only where CLI friction becomes real (inbox, highlight review, retry/status).
  4. Generalize the SKILL file. Right now it is specific to my own stack of tools, namely the content-board, a Kanban-style board where content moves through states of readiness from inbox to published. I’d like to make it generic enough that others can drop it in and have their agents start using it.

The main principle remains: keep the memory layer boring, deterministic, and cheap to consume.

Why I’m Excited About It

This started as “I’m tired of repeating the same research loop,” and it turned into a tool I want in every workflow where source memory and evidence quality actually matter.

The interesting part isn’t that it stores links. It’s that it turns saved sources into reusable context with ranking signals, provenance, and a stable contract agents can consume without custom glue code.

The first real test case is OpenClaw. I’m already using OpenClaw agents to help with research for posts and content, and the missing piece was persistent memory — a way for those agents to store interesting things I’ve read or consumed so I can come back to them for content ideas or as evidence in future writing. That’s the workflow linkledger-cli was built for.

If you’re building with agents, this is the bar I’d recommend:

Today, that naturally maps to CLI-native environments like Claude Code and Codex CLI. Over time, the same model can fit desktop agent apps too, as long as they expose tool hooks (CLI execution, MCP, or plugin interfaces) that can call into the same retrieval contract.

That’s what linkledger-cli is for me right now: a practical Pocket-for-agents core. I’m curious to see what other use cases people find for it.
