Embedded Agents Overview

Embedded agents are autonomous, AI-powered processes that manage memory operations within memharness. They handle complex tasks like entity extraction, summarization, consolidation, and context assembly — operations that require semantic understanding beyond simple heuristics.

What Are Memory Agents?

Memory Agents = Small, specialized LLM-powered processes embedded INSIDE your memory infrastructure that autonomously perform memory management tasks.

They're called "embedded" because they live within the memharness system itself, not in your application code. Think of them as automated janitors, librarians, and curators working behind the scenes.

The 5 Embedded Agents

memharness provides 5 specialized agents:

Agent	Purpose	Mode	Example
Summarizer	Compress old conversations	Heuristic + LLM	30 messages → 1 paragraph
Entity Extractor	Extract people, places, orgs	Regex + LLM	"Dr. Chen works at MIT" → 2 entities
Consolidator	Merge duplicate memories	Semantic + LLM	3 "Dr. Chen" entries → 1 merged entity
Garbage Collector	Clean expired memories	Policy-based	Delete memories older than 365d
Context Assembler	Build optimal LLM context	Retrieval + ranking	Assemble top-K from each memory type

Dual-Mode Design

All agents follow a dual-mode design pattern:

Without LLM — Uses heuristics, regex, or policy rules. Works immediately, no setup required.
With LLM — Uses LangChain for intelligent operations. Better quality, requires LLM configuration.

from memharness import MemoryHarness
from memharness.agents import EntityExtractorAgent
from langchain.chat_models import init_chat_model

# Mode 1: Without LLM (heuristic)
harness = MemoryHarness("sqlite:///memory.db")
agent = EntityExtractorAgent(harness)  # No LLM
entities = await agent.extract_entities("Dr. Chen works at MIT")
# Uses regex: capitalized words, @mentions, email addresses

# Mode 2: With LLM (intelligent)
llm = init_chat_model("gpt-4o")
agent = EntityExtractorAgent(harness, llm=llm)
entities = await agent.extract_entities("Dr. Chen works at MIT")
# Uses LangChain structured output for accurate NER

Agent Lifecycle: Triggers

Agents can be triggered in different ways:

Trigger Type	When It Fires	Example
ON_WRITE	When new memory is added	Entity extraction on every message
ON_READ	When memory is retrieved	Context assembly before LLM call
PRE_LLM	Before sending context to LLM	Context optimization
POST_LLM	After receiving LLM response	Extract entities from response
SCHEDULED	Time-based periodic execution	Daily consolidation at 3 AM
POLICY	Triggered by policy rules	GC when storage > 90%
ON_DEMAND	Manual invocation	User calls `agent.run()`

The BaseMemoryAgent Protocol

All agents implement the same interface:

from memharness.agents.base import BaseMemoryAgent

class MyAgent(BaseMemoryAgent):
    def __init__(self, harness: MemoryHarness, llm: BaseChatModel | None = None):
        self.harness = harness
        self.llm = llm

    async def run(self, **kwargs) -> dict[str, Any]:
        """Execute the agent's main logic."""
        if not self.llm:
            return await self._heuristic_mode(**kwargs)
        return await self._llm_mode(**kwargs)

This consistent interface makes agents composable, testable, and framework-agnostic.

BEFORE / INSIDE / AFTER the Loop

Agents operate at different points in the agent execution lifecycle:

Why Embedded Agents?

Without Agents	With Agents
Manual memory management	Automated memory operations
Write extraction logic in your app	Agents handle it internally
Context assembly is your problem	Context Assembler does it for you
No deduplication → messy memory	Consolidator keeps it clean
Memory grows forever → DB bloat	GC agent cleans expired memories
Inconsistent behavior across projects	Standardized, battle-tested agents

Configuration

Agents can be configured via YAML:

agents:
  entity_extractor:
    enabled: true
    trigger: on_write
    llm: gpt-4o-mini

  summarizer:
    enabled: true
    triggers:
      - condition: "age > 7d"
      - condition: "message_count > 50"
    llm: gpt-4o-mini

  consolidator:
    enabled: true
    schedule: "0 3 * * *"  # Daily at 3 AM
    similarity_threshold: 0.9
    llm: gpt-4o

  gc:
    enabled: true
    schedule: "0 4 * * 0"  # Weekly Sunday 4 AM
    archive_after: 90d
    delete_after: 365d

Next Steps

Summarizer — Conversation compression
Entity Extractor — Named entity recognition
Consolidator — Duplicate detection and merging
Garbage Collector — Memory cleanup policies
Context Assembler — Optimal context assembly (NEW in v0.5.1)

What Are Memory Agents?​

The 5 Embedded Agents​

Dual-Mode Design​

Agent Lifecycle: Triggers​

The BaseMemoryAgent Protocol​

BEFORE / INSIDE / AFTER the Loop​

Why Embedded Agents?​

Configuration​

Next Steps​