Overview
Give your AI agents the gift of memory. Persistent, structured, and compatible with every major framework.
💡
One-line install: pip install agentcortex. Works with LangChain, CrewAI, AutoGen, and raw SDKs.
The Problem
AI agents are stateless by default. Every conversation starts fresh: they forget everything. Building custom memory systems on top of raw LLM APIs is complex, error-prone, and not portable between frameworks.
agent-memory solves this with a production-ready three-tier memory system that mirrors how human memory actually works. One import. Three lines. Works with any agent framework.
🧠 Three-tier architecture
Working, episodic, and semantic memory, modeled on human cognition.
🌐 Universal compatibility
LangChain, CrewAI, AutoGen, Anthropic, OpenAI, and MCP are all supported.
⚡ 3 lines of code
Drop-in memory for any agent in under a minute. No config files.
💾 Local SQLite default
Data stays on your machine. Swap to Qdrant for semantic search at scale.
Installation
pip install agentcortex # core (SQLite backend)
pip install agentcortex[qdrant] # + Qdrant vector backend
pip install agentcortex[langchain] # + LangChain integration helpers
pip install agentcortex[all] # everything
Requirements
- Python 3.10 or newer
- Any LLM framework (optional; works with raw SDKs too)
Quick Start
from agentcortex import MemoryStore
memory = MemoryStore(agent_id="my-agent")
# Store a memory
memory.add("User prefers concise answers in bullet points")
# Retrieve relevant memories before responding
context = memory.recall("What are the user's preferences?")
print(context)  # → relevant memories ranked by similarity
# Use in your agent
def my_agent(user_input):
    past_context = memory.recall(user_input)
    response = call_llm(f"Context: {past_context}\n\nUser: {user_input}")
    memory.add(f"User asked: {user_input}. Agent responded: {response}")
    return response
Memory Architecture
agent-memory uses a three-tier system that mirrors human memory, so your agent remembers the right things at the right time.
Working memory. Short-term context for the current session. Fast, in-memory, auto-cleared when the session ends. Used for the active conversation window and immediate task state.
Episodic memory. A timestamped record of past interactions, persisted to SQLite. Used for conversation history, task outcomes, and user preference patterns.
Semantic memory. Vector-embedded factual knowledge, retrieved by similarity rather than recency. Used for long-term facts, user profiles, and domain knowledge. Backed by SQLite (default) or Qdrant.
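As a rough mental model (not the library's internals), the three tiers can be sketched in plain Python. The cap, field names, and the word-overlap ranking below are illustrative stand-ins for the library's real eviction and vector-similarity logic:

```python
from collections import deque
from datetime import datetime, timezone

WORKING_SIZE = 20  # illustrative cap, mirroring the working_size default

working = deque(maxlen=WORKING_SIZE)  # session scratchpad; oldest items fall off
episodic = []                         # timestamped interaction log
semantic = []                         # long-lived facts, retrieved by similarity

def remember_fact(text):
    semantic.append(text)

def log_event(text):
    episodic.append({"at": datetime.now(timezone.utc).isoformat(), "text": text})

def recall(query, top_k=5):
    # Stand-in for vector similarity: rank facts by word overlap with the query.
    q = set(query.lower().split())
    scored = sorted(semantic, key=lambda f: len(q & set(f.lower().split())), reverse=True)
    return scored[:top_k]

remember_fact("User prefers concise answers in bullet points")
remember_fact("User works in fintech")
log_event("User asked about memory tiers")
print(recall("user prefers concise answers"))
```

The point of the split: working memory is bounded and disposable, episodic memory is an append-only log, and semantic memory is queried by relevance rather than position.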
API Reference
MemoryStore
from agentcortex import MemoryStore
memory = MemoryStore(
    agent_id="my-agent",        # unique agent identifier (required)
    working_size=20,            # max items in working memory (default: 20)
    episodic_limit=1000,        # max episodic records to keep (default: 1000)
    semantic_backend="sqlite",  # "sqlite" | "qdrant"
)
# Write
memory.add("fact or observation") # auto-categorized
memory.working.add("current task state") # working memory only
memory.episodic.add("what just happened") # episodic only
memory.semantic.add("long-term fact") # semantic only
# Read
memory.recall(query, top_k=5) # semantic similarity search
memory.working.get_all() # all working memory items
memory.episodic.recent(n=10) # last N episodic records
memory.semantic.search(query, top_k=5) # explicit semantic search
# Manage
memory.clear(tier="working") # clear one tier
memory.clear() # clear all
memory.export_json("backup.json")
Async Support
from agentcortex import AsyncMemoryStore
memory = AsyncMemoryStore(agent_id="my-async-agent")
async def my_async_agent(user_input):
    context = await memory.recall(user_input)
    response = await async_call_llm(f"Context: {context}\n\n{user_input}")
    await memory.add(f"interaction: {user_input} → {response}")
    return response
Anthropic Integration
from agentcortex import MemoryStore
from anthropic import Anthropic
memory = MemoryStore(agent_id="claude-agent")
client = Anthropic()
def agent(user_message):
    context = memory.recall(user_message, top_k=5)
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        system=f"You are a helpful assistant. Relevant past context:\n{context}",
        messages=[{"role": "user", "content": user_message}],
    )
    result = response.content[0].text
    memory.add(f"User: {user_message}\nAssistant: {result}")
    return result
OpenAI Integration
from agentcortex import MemoryStore
from openai import OpenAI
memory = MemoryStore(agent_id="gpt-agent")
client = OpenAI()
def agent(user_message):
    context = memory.recall(user_message, top_k=5)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Relevant context:\n{context}"},
            {"role": "user", "content": user_message},
        ],
    )
    result = response.choices[0].message.content
    memory.add(f"User: {user_message}\nAssistant: {result}")
    return result
LangChain Integration
from agentcortex.integrations.langchain import AgentCortexMemory
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_functions_agent
# Drop in as a LangChain memory object
memory = AgentCortexMemory(agent_id="langchain-agent")
llm = ChatOpenAI(model="gpt-4o-mini")
agent = create_openai_functions_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, memory=memory)
# Memory is persisted automatically across runs
executor.invoke({"input": "Remember that I prefer metric units"})
executor.invoke({"input": "What unit system do I prefer?"})
# → "You prefer metric units"
CrewAI Integration
from agentcortex.integrations.crewai import AgentCortexMemory
from crewai import Agent, Task, Crew
memory = AgentCortexMemory(agent_id="crew-agent")
researcher = Agent(
    role="Researcher",
    goal="Research and summarize topics",
    backstory="Expert researcher with perfect memory",
    memory=memory,  # persistent memory across tasks
)
task = Task(description="Research quantum computing trends", agent=researcher)
crew = Crew(agents=[researcher], tasks=[task])
crew.kickoff()
AutoGen Integration
from agentcortex.integrations.autogen import MemoryAgent
import autogen
memory_agent = MemoryAgent(
    name="AssistantWithMemory",
    agent_id="autogen-agent",
    llm_config={"model": "gpt-4o-mini"},
)
user = autogen.UserProxyAgent("user")
# Memory is automatically stored and retrieved between conversations
user.initiate_chat(memory_agent, message="My name is Alex and I work in fintech")
MCP / Claude Code
agent-memory ships as an MCP (Model Context Protocol) server. Use it directly from Claude Code or any MCP-compatible client.
# Start the MCP server
agentmemory serve-mcp
# Or in claude_desktop_config.json:
{
  "mcpServers": {
    "agent-memory": {
      "command": "agentmemory",
      "args": ["serve-mcp", "--agent-id", "claude-code"]
    }
  }
}
ℹ️
Once connected, Claude Code can call remember, recall, and forget tools directly to maintain persistent context across sessions.
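For orientation, an MCP tool invocation is a JSON-RPC tools/call request. The tool name below comes from the list above, but the argument shape (a content field) is an assumption about this server's schema, not a documented contract:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "remember",
    "arguments": { "content": "User prefers metric units" }
  }
}
```

MCP clients such as Claude Code construct these requests for you; you only see the tool names.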
Qdrant Backend
Swap the default SQLite backend for Qdrant when you need high-performance semantic search at scale.
pip install agentcortex[qdrant]
from agentcortex import MemoryStore
memory = MemoryStore(
    agent_id="my-agent",
    semantic_backend="qdrant",
    qdrant_url="http://localhost:6333",
    # qdrant_api_key="...",  # for Qdrant Cloud
)
💡
Start Qdrant locally with Docker: docker run -p 6333:6333 qdrant/qdrant
Export / Import
# Export all memories to JSON
memory.export_json("backup.json")
# Import into a new agent
new_memory = MemoryStore(agent_id="new-agent")
new_memory.import_json("backup.json")
# Merge into existing memories
new_memory.import_json("backup.json", merge=True)
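The backup format isn't documented here, so as an illustration of what merge=True implies, suppose each exported record carries a unique id (an assumed field name). Merging then keeps existing records and appends only unseen ones:

```python
import json

def merge_backups(existing, incoming):
    """Merge two lists of memory records, keyed on an assumed 'id' field.
    Records already present keep their stored version; new ids are appended."""
    seen = {rec["id"] for rec in existing}
    return existing + [rec for rec in incoming if rec["id"] not in seen]

current = [{"id": "m1", "text": "prefers metric units"}]
backup = json.loads(
    '[{"id": "m1", "text": "prefers metric units"},'
    ' {"id": "m2", "text": "works in fintech"}]'
)
merged = merge_backups(current, backup)
print([rec["id"] for rec in merged])  # → ['m1', 'm2']
```

Without merge=True, by contrast, import replaces the target agent's memories wholesale.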
Memory CLI
# Inspect stored memories
agentmemory inspect --agent-id my-project
# Search memories
agentmemory search "user preferences" --agent-id my-project
# Export / Import
agentmemory export --agent-id my-project --output memories.json
agentmemory import memories.json --agent-id new-project --merge
# Clear memories
agentmemory clear --agent-id my-project --tier working
# Start MCP server
agentmemory serve-mcp --agent-id my-project