PostgreSQL Backend

The PostgreSQL backend with pgvector is the recommended production setup for memharness. It provides efficient vector similarity search via HNSW indexes.

When to Use

| Scenario | Recommendation |
| --- | --- |
| Production systems | ✅ Recommended |
| Multi-user applications | ✅ Best choice |
| Large-scale vector search | ✅ HNSW indexing |
| CI/CD with real DB | ✅ Use Docker |
| Quick local dev | Use SQLite instead |

Installation

pip install memharness[postgres]
# Installs: asyncpg, pgvector

The easiest way to run PostgreSQL + pgvector locally:

# Start postgres with pgvector
docker compose up -d

# Or manually
docker run -d \
  --name memharness-postgres \
  -e POSTGRES_DB=memharness \
  -e POSTGRES_USER=memharness \
  -e POSTGRES_PASSWORD=memharness \
  -p 5432:5432 \
  pgvector/pgvector:pg17

A docker-compose.yml is included in the repository:

services:
  postgres:
    image: pgvector/pgvector:pg17
    ports:
      - "5432:5432"
    environment:
      POSTGRES_DB: memharness
      POSTGRES_USER: memharness
      POSTGRES_PASSWORD: memharness
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:

Basic Usage

from memharness import MemoryHarness

harness = MemoryHarness("postgresql://memharness:memharness@localhost/memharness")
await harness.connect()

# Same API as SQLite
await harness.add_knowledge(
    "pgvector enables vector similarity search in PostgreSQL",
    source="docs"
)
results = await harness.search_knowledge("vector database", k=5)

await harness.disconnect()

Schema Overview

PostgreSQL uses separate tables per memory type:

| Table | Type | Storage |
| --- | --- | --- |
| conversational_memory | Conversational | SQL + timeline |
| knowledge_base_memory | Knowledge Base | Vector (HNSW) |
| entity_memory | Entity | Vector (HNSW) |
| workflow_memory | Workflow | Vector (HNSW) |
| toolbox_memory | Toolbox | Vector (HNSW) |
| summary_memory | Summary | Vector (HNSW) |
| tool_log_memory | Tool Log | SQL + timeline |
| persona_memory | Persona | Vector (HNSW) |
| file_memory | File | Vector (HNSW) |

The full schema is at src/memharness/sql/postgres/schema.sql.

The backend uses pgvector's HNSW index for efficient approximate nearest-neighbor search:

CREATE INDEX ON knowledge_base_memory 
USING hnsw (embedding vector_cosine_ops);

This scales to millions of vectors while maintaining fast query times.
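What the HNSW index approximates is an exact nearest-neighbor scan under the index's distance metric; with `vector_cosine_ops`, that metric is cosine distance (1 minus cosine similarity). A minimal pure-Python sketch of the exact computation the index speeds up (the example vectors and labels are illustrative, not from memharness):

```python
import math

def cosine_distance(a, b):
    """Cosine distance as vector_cosine_ops defines it: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

# Exact (brute-force) nearest neighbor: scan every row.
# HNSW returns approximately this answer without the full scan.
corpus = {
    "pgvector": [0.9, 0.1, 0.0],
    "docker":   [0.1, 0.8, 0.1],
    "sqlite":   [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]
best = min(corpus, key=lambda k: cosine_distance(corpus[k], query))
# best == "pgvector"
```

The trade-off is the usual one for approximate indexes: HNSW gives sub-linear query time at the cost of occasionally missing the true nearest neighbor.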

Connection Pooling

The PostgreSQL backend uses asyncpg connection pooling automatically:

harness = MemoryHarness(
    "postgresql://user:pass@host/db",
    # Connection pool size configurable via config
)

Production Checklist

  • Use a dedicated database user with limited permissions
  • Enable connection pooling (PgBouncer or asyncpg pool)
  • Set up regular backups
  • Monitor pgvector index sizes
  • Use environment variables for credentials, not hardcoded strings
import os

DATABASE_URL = os.environ["DATABASE_URL"]
harness = MemoryHarness(DATABASE_URL)
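If credentials contain characters that are meaningful in a URL (`@`, `:`, `/`), they must be percent-encoded before being embedded in the connection string. A sketch using only the standard library; the environment variable names and helper are hypothetical, not part of memharness:

```python
import os
from urllib.parse import quote

def build_dsn(user: str, password: str, host: str, db: str) -> str:
    """Build a PostgreSQL DSN, percent-encoding credentials so
    characters like @ : / don't break URL parsing."""
    return f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@{host}/{db}"

# Hypothetical variable names; adapt to your deployment.
DATABASE_URL = build_dsn(
    os.environ.get("PGUSER", "memharness"),
    os.environ.get("PGPASSWORD", "memharness"),
    os.environ.get("PGHOST", "localhost"),
    os.environ.get("PGDATABASE", "memharness"),
)
```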