A long CLAUDE.md is a poor knowledge base. It is good for rules, constraints, and project-specific instructions that the agent must always obey. It is bad for runbooks, ADRs, vendor docs, PDFs, incident notes, migration guides, and old design decisions that may or may not matter in the current turn.
Lore is worth saving because it gives that second class of material a better home. It builds a local full-text knowledge base from mixed sources, lets a developer verify it from the terminal, and serves the same store to Claude Code, Codex-style workflows, Cursor, Cline, and other MCP clients through read-only tools. The article should not sell it as generic privacy magic. It should explain when local retrieval is the right boundary between prompt instructions and project memory.
The adoption boundary
Use Lore when a coding agent needs searchable project knowledge but not permanent instruction weight. Keep CLAUDE.md or AGENTS.md for rules: test commands, repo conventions, safety limits, release process, and the few constraints the agent should see every turn. Put reference material in Lore: design docs, architecture notes, API references, support runbooks, RFCs, PDFs, exported wiki pages, and vendor docs. If the repository has ten Markdown files, rg is simpler. If the useful context lives across formats, source types, and update cadences, Lore gives the retrieval job a real tool contract.
What the sources actually prove
The evidence base is specific enough to rescue the topic, but it also sets limits. The README proves the scope: local search, MCP serving, install paths, quick start commands, source types, six read-only MCP tools, and resources such as lore://info. The MCP guide proves the operational detail: Claude Code config locations, absolute LORE_CONFIG, stdio versus HTTP, watch mode, and multi-KB federation. The design philosophy proves the retrieval model: BM25 by default, Tantivy embedded in-process, one store behind CLI and MCP, and separate knowledge bases over a monolithic index. The security guide proves the risk boundary: no telemetry by default, explicit LLM provider transmission when enrichment is enabled, SSRF and archive limits, and HTTP exposure concerns. The sources do not prove that Lore should replace semantic search, access control, or human source selection. That is the editorial decision this article adds.
Why the source is stronger than the old title
The old seed framed Lore around local AI configuration privacy. That undersells the source. The README describes a local knowledge base that can search from the terminal or serve agents over MCP. It ingests local files, websites, git repos, feeds, S3, YouTube transcripts, maildir email, shell command output, upstream MCP servers, and more than 90 document formats. The design docs explain the architectural choice: BM25 is deterministic and model-free, Tantivy runs embedded, and the store sits next to the config. That is an operational story, not a slogan.
Start with one project store
The first setup should be boring and repeatable. Install with brew install timorunge/tap/lore-cli or the release installer, change into the project with cd /path/to/project, run lore init, and open .lore/lore.yaml before ingesting anything. Keep the first source narrow, for example path: ./docs, glob: "**/*.md", and topic: Internal. If the docs are mixed with code, add a second source with a different topic instead of dumping the whole repo into one bucket. Run lore preview ./docs --limit 5 when chunking or extraction is uncertain, then run lore ingest. Only after that should you test lore info, lore topics, lore docs, and lore search "authentication" --limit 5.
Use this first-run checklist
A practical acceptance path has eight checks. First, confirm the source list is intentional, not every file the project happens to contain. Second, check that .lore/lore.yaml uses base_dir: .. when it lives under .lore/. Third, keep store.path: ./store unless there is a reason to move the index. Fourth, run lore ingest and make sure document counts in lore info match expectations. Fifth, use lore topics to confirm topic names are understandable to an agent. Sixth, search for a known term and a known missing term. Seventh, read one result with lore read or inspect it through lore docs. Eighth, add the MCP config only after the CLI path proves the store contains the right knowledge.
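The CLI half of that checklist can be scripted so it runs the same way every time. The sketch below is a hypothetical helper, not part of Lore: it only assembles the commands named in this article and prints them, and it executes them only when asked and when lore is on PATH. The function names, the example search terms, and the dry-run default are assumptions made for illustration.

```python
import shlex
import shutil
import subprocess


def acceptance_commands(known_term, missing_term, limit=5):
    """Return the post-ingest CLI checks as argument lists (hypothetical helper)."""
    return [
        ["lore", "info"],    # document counts should match expectations
        ["lore", "topics"],  # topic names should make sense to an agent
        ["lore", "docs"],    # spot-check which documents were indexed
        ["lore", "search", known_term, "--limit", str(limit)],    # expected hit
        ["lore", "search", missing_term, "--limit", str(limit)],  # expected miss
    ]


def run_checks(commands, execute=False):
    """Print each command; run it only when requested and lore is installed."""
    for argv in commands:
        print("$", shlex.join(argv))
        if execute and shutil.which("lore"):
            subprocess.run(argv, check=False)


if __name__ == "__main__":
    # Dry run by default: nothing is executed, the plan is only printed.
    run_checks(acceptance_commands("authentication", "zzz-not-in-docs"))
```

Run it from the project directory so the CLI picks up the same config you just ingested. Reading the output is still a human job; the script only keeps the order stable.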
Expose retrieval, not mutation
Lore's MCP surface is deliberately narrow. The README and MCP guide list six read-only tools: lore_info, lore_list_topics, lore_search, lore_read_topic, lore_list_docs, and lore_read_doc. That matters for agent safety. A Claude Code session can discover the knowledge base, search by topic, read a document, and cite a retrieved chunk, but it cannot rewrite the store through those MCP calls. In practice, a good agent instruction is: call lore_info first, use lore_search for candidate chunks, call lore_read_doc only for the documents that justify the answer, then report the source names used.
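If an agent harness wants to enforce that boundary on its own side as well, a small allowlist is enough. This is a sketch of a client-side guard under stated assumptions: the six tool names come from the list above, while check_tool_call, SUGGESTED_ORDER, and the idea of wrapping tool dispatch this way are hypothetical and not something Lore ships.

```python
# The six read-only tool names listed in the README and MCP guide.
READ_ONLY_TOOLS = frozenset({
    "lore_info",
    "lore_list_topics",
    "lore_search",
    "lore_read_topic",
    "lore_list_docs",
    "lore_read_doc",
})

# The call order suggested in the agent instruction above.
SUGGESTED_ORDER = ("lore_info", "lore_search", "lore_read_doc")


def check_tool_call(tool_name):
    """Hypothetical guard: refuse any tool name outside the read-only set."""
    if tool_name not in READ_ONLY_TOOLS:
        raise PermissionError(f"{tool_name!r} is not a known read-only Lore tool")
    return tool_name


if __name__ == "__main__":
    for name in SUGGESTED_ORDER:
        print("allowed:", check_tool_call(name))
```

The guard adds nothing the server does not already guarantee; its value is that an unexpected tool name fails loudly in the harness instead of silently reaching some other MCP server.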
Use absolute config paths in MCP
The MCP guide calls out a small but important failure mode: clients launch subprocesses from unpredictable working directories, so LORE_CONFIG should be absolute. For Claude Code, the config can live in .claude/settings.local.json or .mcp.json in the project root. A minimal entry is { "mcpServers": { "project-docs": { "command": "lore", "args": ["serve"], "env": { "LORE_CONFIG": "/absolute/path/to/project/.lore/lore.yaml" } } } }. After adding it, restart or reload MCP servers, ask Claude Code to call lore_info, then ask for lore_search on a term you already verified from the CLI. That is the right default for a single developer. Start there before adding HTTP transport, multi-client access, or watchers.
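One way to avoid typing the absolute path by hand is to generate the entry. The sketch below assumes the config lives at .lore/lore.yaml under the project directory and reuses the JSON shape quoted above; the helper name mcp_entry and the default server name are illustrative, not part of Lore or Claude Code.

```python
import json
from pathlib import Path


def mcp_entry(project_dir, server_name="project-docs"):
    """Build an MCP server entry whose LORE_CONFIG is always absolute."""
    # resolve() turns a relative path into an absolute one, so the entry
    # keeps working whatever directory the client launches lore from.
    config = (Path(project_dir) / ".lore" / "lore.yaml").resolve()
    return {
        "mcpServers": {
            server_name: {
                "command": "lore",
                "args": ["serve"],
                "env": {"LORE_CONFIG": str(config)},
            }
        }
    }


if __name__ == "__main__":
    # Print the entry; paste it into the client config rather than
    # overwriting a file that may already define other servers.
    print(json.dumps(mcp_entry("."), indent=2))
```

Printing instead of writing is deliberate: an existing config file may already hold other servers, and merging is safer done by hand.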
Separate knowledge bases before they fight
Lore is most useful when it preserves boundaries. The design philosophy recommends separate knowledge bases per domain because update cadence and relevance are different. A product repo, an internal wiki checkout, and vendor API docs should not always share one monolithic score space. When an agent needs all three, Lore can federate them at query time with multiple -c flags or a colon-separated LORE_CONFIG. That lets Claude Code see one MCP server while the operator still controls which config produced each result.
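Both federation forms are easy to build from a list of config files. The following is a minimal sketch with hypothetical helper names; it assumes POSIX-style paths, since a colon-separated value would collide with Windows drive letters.

```python
from pathlib import Path


def federated_args(config_paths):
    """Arguments for lore serve with one -c flag per knowledge base."""
    args = ["serve"]
    for path in config_paths:
        args += ["-c", str(Path(path).resolve())]
    return args


def federated_env(config_paths):
    """The same federation expressed as a colon-separated LORE_CONFIG."""
    joined = ":".join(str(Path(path).resolve()) for path in config_paths)
    return {"LORE_CONFIG": joined}


if __name__ == "__main__":
    # Example locations only; substitute the real config files.
    configs = ["/srv/project/.lore/lore.yaml", "/srv/vendor/lore.yaml"]
    print(federated_args(configs))
    print(federated_env(configs))
```

Pick one form per MCP entry. Keeping the list of configs in one place makes it obvious which stores a given server exposes.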
Do not confuse local with safe
Lore's security notes are direct: it does not phone home or collect telemetry, and the default search path does not need external services. That is useful, but it is not the whole risk model. You are still responsible for the right to ingest and serve the content. If LLM enrichment is enabled, document content can be sent to Ollama, Bedrock, Anthropic, or OpenAI depending on configuration. If HTTP transport is used, anything that can connect to lore serve can query the indexed content unless you bind carefully and use authentication. Local retrieval reduces one class of data movement; it does not replace access design.
Treat HTTP mode as infrastructure
The stdio mode is a local subprocess. HTTP mode is a service. The MCP docs show lore serve --transport http --port 8080 and note that Lore requires --expose when binding to a non-loopback host. That is a deliberate tripwire because binding to all interfaces puts the server on the network. For a team or remote-agent setup, add a token, keep the bind address intentional, and document which stores the server exposes. If nobody owns auth, logs, and rotation, keep Lore on stdio.
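A deployment script can apply the same loopback test before it ever starts the server. This sketch uses only Python's standard ipaddress module and makes no claim about how Lore itself performs the check; is_loopback_host is a hypothetical helper, and treating unknown hostnames as non-loopback is a conservative assumption.

```python
import ipaddress


def is_loopback_host(host):
    """Return True only when the bind host clearly stays on this machine."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal: assume it may be reachable from the network.
        return False


if __name__ == "__main__":
    for host in ("127.0.0.1", "::1", "0.0.0.0", "192.168.1.10"):
        verdict = "local only" if is_loopback_host(host) else "network exposed"
        print(f"{host}: {verdict}")
```

Anything this function rejects is a bind address that deserves an owner, a token, and a line in the runbook.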
Choose sources like an editor
The YAML source model is broad enough to get messy. A practical project config should start with a few high-signal sources: path: ./docs for local architecture notes, git for a shared documentation repo, and a sitemap for official vendor docs. Add topics so search results explain their origin. Use headers with ${LORE_*} variables for authenticated endpoints instead of hardcoding tokens. Avoid indexing private mailboxes, customer dumps, or broad web sources just because the connector exists. The point is to give the agent better retrieval, not unlimited memory.
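The placeholder pattern is worth seeing in isolation. The sketch below illustrates the general idea of resolving ${LORE_*} variables from the environment at load time; it is not Lore's implementation, and the variable name LORE_DOCS_TOKEN, the helper expand_lore_vars, and the fail-on-missing behavior are all assumptions.

```python
import os
import re

# Only names with the LORE_ prefix are expanded; everything else is left alone.
PLACEHOLDER = re.compile(r"\$\{(LORE_[A-Z0-9_]+)\}")


def expand_lore_vars(value, env=None):
    """Replace ${LORE_*} placeholders with values from the environment."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            # Failing early beats sending an empty Authorization header.
            raise KeyError(f"{name} is not set")
        return env[name]

    return PLACEHOLDER.sub(replace, value)


if __name__ == "__main__":
    fake_env = {"LORE_DOCS_TOKEN": "example-token"}  # illustrative value only
    print(expand_lore_vars("Bearer ${LORE_DOCS_TOKEN}", fake_env))
```

The practical benefit is that the YAML file can be committed and reviewed while the secret stays in the shell or the secret manager.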
BM25 is a feature and a limit
Lore's default BM25 search is easy to reason about: no embedding model, no vector database, no cloud account, and predictable lexical failure modes. That makes it a good fit for API references, policies, runbooks, and docs where exact terms matter. It is weaker when the question needs semantic similarity across unfamiliar language. The right workflow is to use topic filters, source filters, and explicit search terms first. If the answer still depends on broad conceptual recall, ask the agent to say that Lore did not find enough evidence rather than pretending lexical search found everything.
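A toy scorer makes the lexical limit concrete. This is a textbook BM25 sketch with a whitespace tokenizer and no stemming; it is not Tantivy's implementation, whose tokenizer and language settings may behave differently, and the three sample documents are invented for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer; deliberately no stemming."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in a non-empty list against the query."""
    tokenized = [tokenize(doc) for doc in docs]
    n = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # no shared term, no contribution
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    docs = [
        "rotate the api token every ninety days",
        "login credentials expire after three months",
        "the deploy runbook covers rollback steps",
    ]
    print(bm25_scores("token rotation", docs))      # only the first doc scores
    print(bm25_scores("credential refresh", docs))  # nothing matches exactly
```

The second query is the failure mode in miniature: a human sees that it is about the second document, but no term matches exactly, so every score is zero. That is the situation where the agent should report weak evidence instead of improvising.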
Keep ingest observable
The performance guide is useful because it separates initial ingest from normal use. Initial ingest is the expensive part. Search is fast after the store exists, and incremental lore ingest skips unchanged documents. That should shape operations. Run lore status before rebuilding, use lore ingest --recreate only when settings such as max_chunk_chars, language, or phrase search need a real rebuild, and keep source configs small enough that failures are explainable. For most teams, the acceptance test is not a benchmark. It is whether the agent can retrieve the right document and name it in the answer.
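The rebuild decision can be written down as a small rule so nobody reaches for --recreate out of habit. The sketch below is hypothetical: it compares two settings dictionaries and returns the ingest command to run. The keys max_chunk_chars and language are named above; phrase_search is an assumed key name standing in for the phrase search setting, and the helper itself is not part of Lore.

```python
# Settings whose change calls for a full rebuild, per the guidance above.
# "phrase_search" is an assumed key name, not confirmed against Lore's schema.
REBUILD_KEYS = ("max_chunk_chars", "language", "phrase_search")


def ingest_command(old_settings, new_settings):
    """Return (argv, changed_keys) for the next ingest run."""
    changed = [
        key for key in REBUILD_KEYS
        if old_settings.get(key) != new_settings.get(key)
    ]
    if changed:
        return ["lore", "ingest", "--recreate"], changed
    # Nothing structural changed: incremental ingest skips unchanged documents.
    return ["lore", "ingest"], changed


if __name__ == "__main__":
    before = {"max_chunk_chars": 1000, "language": "en"}  # example values
    after = {"max_chunk_chars": 2000, "language": "en"}
    argv, changed = ingest_command(before, after)
    print(" ".join(argv), "because of", changed)
```

Run lore status first either way; the helper only encodes when a full rebuild is justified, not whether the store is healthy.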
Where it fits in a Codex workflow
A strong Codex setup has three layers. First, AGENTS.md or CLAUDE.md defines behavior and constraints. Second, the repository provides tests, scripts, and type checks. Third, Lore provides searchable reference context that should not bloat every turn. Before a large refactor, ask the agent to search Lore for architecture notes, migration history, and vendor docs. Before a support fix, search runbooks and incident notes. Before changing auth, search security decisions and API references. The final answer should mention which Lore documents influenced the change, just as it would mention tests that were run.
A setup I would ship first
For a real team, I would start with two stores, not one giant cache. The project store indexes ./docs, ADRs, and runbooks under topics such as Architecture, Operations, and Security. The vendor store indexes only official docs through a sitemap or a checked-out git mirror. Claude Code gets one MCP entry that runs lore serve -c /absolute/project/.lore/lore.yaml -c /absolute/vendor/lore.yaml. The agent instruction is equally narrow: search Lore before architecture or dependency decisions, name the documents used, and say when Lore returned no strong evidence. That is the point where Lore becomes magazine-worthy: it gives the agent a controlled retrieval habit, not just another search command.
Save Lore as a local knowledge retrieval article. It is valuable when Claude Code needs searchable project memory through read-only MCP tools, but it should not replace instruction files, source control, or access-control decisions.
Practical takeaway
Start with one project store: run lore init, review .lore/lore.yaml, run lore ingest, and verify with lore info, lore topics, lore docs, and lore search "authentication". Add a Claude Code MCP entry with command: "lore", args: ["serve"], and an absolute LORE_CONFIG. Keep rules in CLAUDE.md or AGENTS.md, put reference material in Lore, and require the agent to name the retrieved documents before using them in a decision.