Knowledge Base Memory

Store facts, documents, and information for semantic search and retrieval-augmented generation (RAG).

Overview

Knowledge Base memory is designed for storing factual information, documents, and reference material that agents can search semantically. It's the foundation for RAG applications and fact-based question answering.

When to Use

RAG (Retrieval-Augmented Generation) applications
Storing documentation, articles, and reference material
Fact-based question answering
Grounding agent responses in source material
Building searchable knowledge repositories

Storage Strategy

Backend: Vector database (PostgreSQL with pgvector, SQLite with sqlite-vss) Why: Knowledge base requires:

Semantic search via embedding similarity
Efficient vector operations (cosine similarity, HNSW indexing)
Metadata filtering (by source, category, etc.)

Stored with vector embeddings for semantic search, indexed using HNSW (Hierarchical Navigable Small World) for fast approximate nearest neighbor search.

API Methods

add_knowledge

Add factual information or a document to the knowledge base.

async def add_knowledge(
    content: str,
    source: str | None = None,
    metadata: dict[str, Any] | None = None
) -> str

Parameters:

content: The knowledge content (fact, passage, document chunk)
source: Optional source identifier (URL, document name, etc.)
metadata: Optional metadata (category, tags, etc.)

Returns: Memory ID

Example:

from memharness import MemoryHarness

async with MemoryHarness("sqlite:///memory.db") as harness:
    # Add a fact
    await harness.add_knowledge(
        content="Python's Global Interpreter Lock (GIL) prevents true parallelism in CPU-bound threads.",
        source="Python Documentation",
        metadata={"category": "programming", "language": "python"}
    )

    # Add a document chunk
    await harness.add_knowledge(
        content="Kubernetes is a container orchestration platform that automates deployment, scaling, and management of containerized applications.",
        source="https://kubernetes.io/docs",
        metadata={"category": "devops", "topic": "kubernetes"}
    )

    # Add research findings
    await harness.add_knowledge(
        content="The study found that regular exercise improves cognitive function and reduces the risk of dementia by 30%.",
        source="Journal of Neuroscience, 2025",
        metadata={"category": "health", "type": "research"}
    )

search_knowledge

Search the knowledge base by semantic similarity.

async def search_knowledge(
    query: str,
    k: int = 5,
    filters: dict[str, Any] | None = None
) -> list[MemoryUnit]

Parameters:

query: The search query
k: Number of results to return (default: 5)
filters: Optional metadata filters (e.g., {"category": "programming"})

Returns: List of MemoryUnit objects, ordered by relevance (similarity score)

Example:

# Basic search
results = await harness.search_knowledge("Python concurrency", k=3)

for r in results:
    print(f"[{r.score:.2f}] {r.content[:100]}...")
    print(f"Source: {r.metadata.get('source')}\n")

# Filtered search
python_docs = await harness.search_knowledge(
    query="async programming",
    k=5,
    filters={"language": "python"}
)

# Multiple filters
kubernetes_docs = await harness.search_knowledge(
    query="deployment strategies",
    k=10,
    filters={
        "category": "devops",
        "topic": "kubernetes"
    }
)

Schema/Metadata Structure

Each knowledge base memory unit contains:

{
    "id": "uuid",
    "content": "The knowledge content",
    "memory_type": "knowledge",
    "namespace": ("knowledge",),
    "metadata": {
        "source": "Python Documentation",  # Optional
        # Custom metadata fields:
        "category": "programming",
        "language": "python",
        "tags": ["concurrency", "async"],
        "confidence": 0.95,
        "last_verified": "2026-03-23"
    },
    "embedding": [0.123, -0.456, ...],  # Vector embedding
    "created_at": "2026-03-23T10:00:00Z",
    "score": 0.87  # Similarity score (only in search results)
}

Best Practices

1. Chunk Documents Appropriately

Break large documents into searchable chunks:

def chunk_document(doc: str, chunk_size: int = 512) -> list[str]:
    """Split document into overlapping chunks."""
    chunks = []
    words = doc.split()
    for i in range(0, len(words), chunk_size):
        chunk = " ".join(words[i:i + chunk_size])
        chunks.append(chunk)
    return chunks

# Add each chunk
document = "Long document text..."
chunks = chunk_document(document)

for i, chunk in enumerate(chunks):
    await harness.add_knowledge(
        content=chunk,
        source="long-document.pdf",
        metadata={"chunk_id": i, "total_chunks": len(chunks)}
    )

2. Use Source Tracking

Always include source information for attribution:

await harness.add_knowledge(
    content="Fact from documentation",
    source="https://docs.example.com/page",
    metadata={
        "source_type": "documentation",
        "retrieved_at": datetime.utcnow().isoformat()
    }
)

3. Leverage Metadata Filters

Use metadata to organize and filter knowledge:

# Add with rich metadata
await harness.add_knowledge(
    content="Docker containers provide isolated environments...",
    source="Docker Docs",
    metadata={
        "category": "devops",
        "technology": "docker",
        "difficulty": "beginner",
        "language": "en"
    }
)

# Search with filters
beginner_docker = await harness.search_knowledge(
    query="container basics",
    filters={
        "category": "devops",
        "technology": "docker",
        "difficulty": "beginner"
    }
)

4. Update vs. Replace

Knowledge can become stale. Choose whether to update or add new versions:

# Option 1: Add new version (preserves history)
await harness.add_knowledge(
    content="Updated information about Python 3.14...",
    source="Python 3.14 Release Notes",
    metadata={"version": "3.14", "supersedes": old_id}
)

# Option 2: Update existing (if backend supports)
# await harness.update(old_id, new_content)

5. RAG Pattern with LangChain

Use knowledge base in a RAG pipeline:

from langchain.agents import create_agent
from memharness.middleware import MemoryContextMiddleware

# Knowledge is automatically injected into context
agent = create_agent(
    model="anthropic:claude-sonnet-4-6",
    tools=[...],
    middleware=[
        MemoryContextMiddleware(
            harness=harness,
            thread_id="chat-1",
            max_tokens=2000  # Includes knowledge search
        )
    ]
)

# Agent automatically searches knowledge base
response = await agent.ainvoke({
    "messages": [{"role": "user", "content": "Explain Docker containers"}]
})
# Response will be grounded in stored knowledge

Vector Storage Details

Knowledge base uses vector embeddings for semantic search:

CREATE TABLE memory_store (
    id UUID PRIMARY KEY,
    namespace TEXT[] NOT NULL,
    memory_type VARCHAR(50) NOT NULL,
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}',
    embedding vector(768) NOT NULL,  -- Dimension depends on model
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);

-- HNSW index for fast similarity search
CREATE INDEX idx_knowledge_embedding
ON memory_store USING hnsw(embedding vector_cosine_ops)
WHERE memory_type = 'knowledge';

-- GIN index for metadata filtering
CREATE INDEX idx_knowledge_metadata
ON memory_store USING gin(metadata)
WHERE memory_type = 'knowledge';

Embedding Models

memharness uses HuggingFace embeddings by default:

# Default: sentence-transformers/all-MiniLM-L6-v2 (384 dimensions)
harness = MemoryHarness("sqlite:///memory.db")

# Custom embedding model
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-mpnet-base-v2"
)
harness = MemoryHarness(
    "postgresql://localhost/memharness",
    embedding_fn=lambda text: embeddings.embed_query(text)
)

Performance Considerations

Search Speed

HNSW indexing: Near-instant search even with millions of vectors
Metadata filtering: Add GIN indexes on commonly filtered fields
Batch operations: Add multiple knowledge items in parallel

import asyncio

# Batch add for large datasets
async def add_knowledge_batch(items: list[dict]):
    tasks = [
        harness.add_knowledge(**item)
        for item in items
    ]
    return await asyncio.gather(*tasks)

await add_knowledge_batch([
    {"content": "fact 1", "source": "source1"},
    {"content": "fact 2", "source": "source2"},
    # ... thousands more
])

Storage Size

Each embedding: ~3KB (768 dimensions * 4 bytes)
Monitor database size and consider archiving old knowledge

Entity Memory — Store structured entities mentioned in knowledge
File Memory — Track source documents and files
Conversational Memory — Ground conversations in knowledge

Next Steps

Entity Memory — Structured named entities
RAG Guide — Building RAG applications
Embeddings — Choosing embedding models

Overview​

When to Use​

Storage Strategy​

API Methods​

add_knowledge​

search_knowledge​

Schema/Metadata Structure​

Best Practices​

1. Chunk Documents Appropriately​

2. Use Source Tracking​

3. Leverage Metadata Filters​

4. Update vs. Replace​

5. RAG Pattern with LangChain​

Vector Storage Details​

Embedding Models​

Performance Considerations​

Search Speed​

Storage Size​

Related Memory Types​

Next Steps​