# Semantic Memory

Persistent vector-based memory for agent learning and knowledge retrieval.
╔═══════════════════════════════════════════════════════════════╗
║ ║
║ 🧠 SEMANTIC MEMORY 🧠 ║
║ ║
║ ┌─────────────────────────────────────────────────────┐ ║
║ │ "The palest ink is better than the best memory." │ ║
║ │ — Chinese Proverb │ ║
║ └─────────────────────────────────────────────────────┘ ║
║ ║
║ Vector embeddings + PGLite = Persistent agent learning ║
║ ║
╚═══════════════════════════════════════════════════════════════╝

## Overview
Semantic Memory provides persistent, searchable storage for agent learnings. Built on pgvector (vector similarity search) with Ollama embeddings, it enables agents to:
- Store learnings - Capture debugging insights, architectural decisions, domain patterns
- Search by meaning - Find relevant memories via semantic similarity, not just keywords
- Decay over time - Old memories fade unless validated (90-day half-life)
- Graceful fallback - Full-text search when Ollama unavailable
## Why Semantic Memory?
Without persistent memory, agents solve the same problems repeatedly. Semantic memory breaks this cycle:
| Problem | Without Memory | With Memory |
|---|---|---|
| Debugging OAuth token refresh | 30min investigation | 30sec lookup |
| Architectural decisions | Re-debate every time | Reference past reasoning |
| Domain-specific patterns | Rediscover each session | Instant recall |
| Tool/library gotchas | Trial and error | Known workarounds |
Key insight: Store the WHY, not just the WHAT. Future agents need context.
## Architecture
Semantic Memory is embedded in swarm-mail's PGLite instance:
┌─────────────────────────────────────────────────────────────────┐
│ swarm-mail PGLite │
├─────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────────┐ │
│ │ events │ │ hive │ │ memories + embeddings │ │
│ │ (stream) │ │ (cells) │ │ (vector search) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ pgvector extension │ │
│ │ (1024-dim cosine similarity) │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────┐
│ Ollama │
│ mxbai-embed-lg │
│ (optional) │
└─────────────────┘

### Components
| Component | Purpose |
|---|---|
| `memories` table | Content, metadata, tags, collection, timestamps |
| `memory_embeddings` table | 1024-dim vectors for similarity search |
| pgvector extension | Cosine similarity via the `<=>` operator |
| Ollama integration | Embedding generation (`mxbai-embed-large`) |
| FTS fallback | Full-text search when Ollama unavailable |
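pgvector's `<=>` operator computes cosine distance (1 minus cosine similarity). For intuition, a minimal TypeScript sketch of the math it implements (illustrative only; the real computation happens inside the pgvector extension, over 1024-dim vectors):

```typescript
// Cosine distance, as computed by pgvector's `<=>` operator:
// distance = 1 - (a · b) / (|a| * |b|)
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical vectors → distance 0; orthogonal vectors → distance 1.
cosineDistance([1, 0], [1, 0]); // → 0
cosineDistance([1, 0], [0, 1]); // → 1
```

Lower distance means higher similarity, which is why the IVFFlat index below orders results by `embedding <=> query`.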
## Quick Start

### Installation
Semantic memory is included in opencode-swarm-plugin:
```shell
bun add opencode-swarm-plugin
```

### Ollama Setup (Recommended)
For vector search, install Ollama and pull the embedding model:
```shell
# Install Ollama (macOS)
brew install ollama

# Start Ollama server
ollama serve

# Pull embedding model (1024 dimensions)
ollama pull mxbai-embed-large
```

Without Ollama, memory falls back to full-text search: still functional, but keyword-based rather than semantic.
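The fallback decision amounts to a health check against Ollama's local HTTP API. A sketch, assuming the default port 11434 and the standard `/api/tags` endpoint (the plugin's internal check may differ; `semantic-memory_check` wraps this for you):

```typescript
// Probe the local Ollama server. If it is unreachable, callers take the
// full-text-search path instead of the vector path.
const OLLAMA_URL = "http://localhost:11434";

async function ollamaAvailable(): Promise<boolean> {
  try {
    // /api/tags lists installed models; any 2xx response means the server is up.
    const res = await fetch(`${OLLAMA_URL}/api/tags`, {
      signal: AbortSignal.timeout(1000),
    });
    return res.ok;
  } catch {
    return false;
  }
}
```

The short timeout matters: a hung probe would stall every search, so an unreachable server should degrade to FTS quickly.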
### Using the Tools
The plugin exposes 8 memory tools:
```typescript
// Store a learning
semantic-memory_store({
  information: "OAuth refresh tokens need 5min buffer before expiry...",
  tags: "auth,tokens,oauth",
  metadata: JSON.stringify({ priority: "high" })
})

// Search by meaning
semantic-memory_find({
  query: "token refresh race condition",
  limit: 5,
  expand: true // Full content, not truncated
})

// Get specific memory
semantic-memory_get({ id: "mem-abc123" })

// Validate memory (reset decay)
semantic-memory_validate({ id: "mem-abc123" })

// Remove outdated memory
semantic-memory_remove({ id: "mem-abc123" })

// List all memories
semantic-memory_list({ collection: "default" })

// Get statistics
semantic-memory_stats()

// Check Ollama health
semantic-memory_check()
```

## API Reference
### semantic-memory_store
Store a memory with automatic embedding generation.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `information` | string | ✅ | Memory content (the learning) |
| `collection` | string | | Collection name (default: `"default"`) |
| `tags` | string | | Comma-separated tags (e.g., `"auth,tokens"`) |
| `metadata` | string | | JSON string with additional metadata |
Returns: { id: "mem-abc123" }
Example:
```typescript
semantic-memory_store({
  information: `OAuth refresh tokens need 5min buffer before expiry to avoid
race conditions. Without buffer, token refresh can fail mid-request if expiry
happens between check and use. Implemented with:
if (expiresAt - Date.now() < 300000) refresh()`,
  tags: "auth,oauth,tokens,race-conditions",
  metadata: JSON.stringify({
    source: "debugging-session",
    files: ["src/auth/refresh.ts"]
  })
})
```

### semantic-memory_find
Search memories by semantic similarity or full-text.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `query` | string | ✅ | Search query |
| `limit` | number | | Max results (default: 10) |
| `collection` | string | | Filter by collection |
| `expand` | boolean | | Return full content (default: false = truncated) |
| `fts` | boolean | | Force full-text search (default: false = vector) |
Returns: Array of { memory, score } sorted by relevance.
Example:
```typescript
// Vector search (semantic similarity)
semantic-memory_find({
  query: "authentication token expiry",
  limit: 5,
  expand: true
})

// Full-text search (keyword matching)
semantic-memory_find({
  query: "OAuth refresh",
  fts: true
})
```

### semantic-memory_get
Retrieve a specific memory by ID.
| Parameter | Type | Required | Description |
|---|---|---|---|
| `id` | string | ✅ | Memory ID |
Returns: Memory object or "Memory not found"
### semantic-memory_validate

Validate that a memory is still accurate, resetting its decay timer.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `id` | string | ✅ | Memory ID |
Returns: { success: true, message: "Memory validated" }
When to use: After confirming a memory helped solve a problem correctly.
### semantic-memory_remove

Delete a memory permanently.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `id` | string | ✅ | Memory ID |
Returns: { success: true, message: "Memory removed" }
When to use: Memory is outdated, incorrect, or superseded.
### semantic-memory_list

List all stored memories.

| Parameter | Type | Required | Description |
|---|---|---|---|
| `collection` | string | | Filter by collection |
Returns: Array of memory objects.
### semantic-memory_stats
Get database statistics.
Returns: { memories: 42, embeddings: 42 }
### semantic-memory_check
Check if Ollama is available for embedding generation.
Returns: { ollama: true, model: "mxbai-embed-large" } or { ollama: false }
## Memory Decay
Memories decay over time using a 90-day half-life:
```
score = raw_score × 0.5^(age_days / 90)
```

| Age | Decay Factor | Effect |
|---|---|---|
| 0 days | 1.0 | Full relevance |
| 45 days | 0.71 | 71% relevance |
| 90 days | 0.5 | 50% relevance |
| 180 days | 0.25 | 25% relevance |
| 1 year | 0.06 | 6% relevance |
Validation resets decay: When you confirm a memory is still accurate via semantic-memory_validate, its timestamp resets to now.
Why decay? Stale knowledge is dangerous. Outdated patterns, deprecated APIs, and superseded decisions should fade unless actively maintained.
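The half-life formula above, sketched in TypeScript (function names are illustrative; the actual scoring lives inside the plugin):

```typescript
// Decay factor with a 90-day half-life, matching the table above.
const HALF_LIFE_DAYS = 90;

function decayFactor(ageDays: number): number {
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

function decayedScore(rawScore: number, ageDays: number): number {
  return rawScore * decayFactor(ageDays);
}

// A validated memory's timestamp resets to now, so its age drops back
// to 0 and decayFactor returns 1.0 again.
decayFactor(90);          // → 0.5
decayedScore(0.8, 180);   // → 0.2 (two half-lives quarter the score)
```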
## Best Practices

### What to Store
✅ Good memories:
- Root causes of tricky bugs (with context)
- Architectural decisions (with reasoning and tradeoffs)
- Domain-specific patterns (with examples)
- Tool/library gotchas (with workarounds)
- Failed approaches (to avoid repeating)
❌ Bad memories:
- Generic knowledge (already in docs)
- Implementation details that change frequently
- Vague descriptions ("fixed the thing")
- Duplicate information
### Memory Format
Include the problem, solution, and reasoning:
```typescript
// ❌ BAD: No context
semantic-memory_store({
  information: "Changed auth timeout to 5 minutes"
})

// ✅ GOOD: Full context
semantic-memory_store({
  information: `OAuth refresh tokens need 5min buffer before expiry to avoid
race conditions. Without buffer, token refresh can fail mid-request if expiry
happens between check and use.

Root cause: Token validity check happens at request start, but actual API call
happens after async operations. If token expires during those operations,
request fails with 401.

Solution: if (expiresAt - Date.now() < 300000) refresh()

Affects: All API clients using refresh tokens.`,
  tags: "auth,oauth,tokens,race-conditions,api-clients"
})
```

### When to Search
ALWAYS query memory BEFORE:
- Starting complex debugging
- Making architectural decisions
- Using unfamiliar tools/libraries
- Implementing cross-cutting features
```typescript
// Before debugging auth issues
semantic-memory_find({ query: "authentication error 401", limit: 5 })

// Before architectural decisions
semantic-memory_find({ query: "event sourcing tradeoffs", limit: 5 })
```

## Migration from Standalone MCP
If you were using the standalone semantic-memory MCP server, migrate your data:
```typescript
import { migrateLegacyMemories } from "swarm-mail/memory";

// Migrate from ~/.semantic-memory to swarm-mail
const result = await migrateLegacyMemories({
  legacyDbPath: "~/.semantic-memory/db",
  targetDb: swarmMailDb,
  dryRun: false // Set true to preview
});

console.log(`Migrated ${result.migrated} memories`);
console.log(`Skipped ${result.skipped} (already exist)`);
console.log(`Failed ${result.failed} (see errors)`);
```

### Deprecation Notice
⚠️ **The standalone semantic-memory MCP server is deprecated.** Use the embedded memory in `opencode-swarm-plugin` instead:

- Single PGLite instance (no duplicate databases)
- No separate MCP server process
- Integrated with swarm coordination
- Same tool interface (`semantic-memory_*`)

The standalone server will continue to work but won't receive updates.
## Database Schema

### memories Table
```sql
CREATE TABLE memories (
  id TEXT PRIMARY KEY,              -- e.g., "mem-abc123"
  content TEXT NOT NULL,            -- The memory content
  metadata JSONB DEFAULT '{}',      -- Tags, source, etc.
  collection TEXT DEFAULT 'default',
  created_at TIMESTAMP NOT NULL,
  updated_at TIMESTAMP NOT NULL
);

CREATE INDEX idx_memories_collection ON memories(collection);
CREATE INDEX idx_memories_created ON memories(created_at DESC);
```

### memory_embeddings Table
```sql
CREATE TABLE memory_embeddings (
  id SERIAL PRIMARY KEY,
  memory_id TEXT NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
  embedding vector(1024) NOT NULL,  -- pgvector type
  UNIQUE(memory_id)
);

CREATE INDEX idx_memory_embeddings_vector
  ON memory_embeddings USING ivfflat (embedding vector_cosine_ops);
```

### Full-Text Search Index
```sql
CREATE INDEX idx_memories_fts
  ON memories USING gin(to_tsvector('english', content));
```

## Troubleshooting
### "Failed to generate embedding"
Ollama isn't running or model isn't available:
```shell
# Check Ollama status
ollama list

# Start Ollama if not running
ollama serve

# Pull model if missing
ollama pull mxbai-embed-large
```

### Slow searches
Vector search requires the IVFFlat index. If searches are slow:
```sql
-- Check index exists
SELECT indexname FROM pg_indexes WHERE tablename = 'memory_embeddings';

-- Rebuild if needed (run via swarm-mail migration)
```

### Memory not found after store
Ensure you're using the same project path. Memory adapters are cached per project:
```typescript
// These use different databases!
const adapter1 = await getMemoryAdapter("/project/a");
const adapter2 = await getMemoryAdapter("/project/b");
```

## Credits
- Ollama - Local LLM inference for embeddings
- pgvector - Vector similarity for Postgres
- PGLite - Embedded Postgres in WASM
- mxbai-embed-large - 1024-dim embedding model
🧠 ═══════════════════════════════════════════════════════ 🧠
║ ║
║ "Those who cannot remember the past are condemned ║
║ to repeat it." — George Santayana ║
║ ║
║ Store your learnings. Search before solving. ║
║ ║
🧠 ═══════════════════════════════════════════════════════ 🧠