Overview
Give your AI agents the gift of memory. Persistent, structured, and compatible with every major framework.
💡
One-line install: pip install agentcortex. Works with LangChain, CrewAI, AutoGen, and raw SDKs.
The Problem
AI agents are stateless by default. Every conversation starts fresh: they forget everything. Building custom memory systems on top of raw LLM APIs is complex, error-prone, and not portable between frameworks.
agent-memory solves this with a production-ready three-tier memory system that mirrors how human memory actually works. One import. Three lines. Works with any agent framework.
🧠 Three-tier architecture
Working, episodic, and semantic memory, modeled on human cognition.
🌐 Universal compatibility
LangChain, CrewAI, AutoGen, Anthropic, OpenAI, and MCP are all supported.
⚡ 3 lines of code
Drop-in memory for any agent in under a minute. No config files.
💾 Local SQLite default
Data stays on your machine. Swap to Qdrant for semantic search at scale.
Installation
pip install agentcortex # core (SQLite backend)
pip install agentcortex[qdrant] # + Qdrant vector backend
pip install agentcortex[langchain] # + LangChain integration helpers
pip install agentcortex[all] # everything
Requirements
- Python 3.10 or newer
- Any LLM framework (optional; works with raw SDKs too)
Quick Start
from agentcortex import MemoryStore
memory = MemoryStore(agent_id="my-agent")
# Store a memory
memory.add("User prefers concise answers in bullet points")
# Retrieve relevant memories before responding
context = memory.recall("What are the user's preferences?")
print(context)  # → relevant memories ranked by similarity
# Use in your agent
def my_agent(user_input):
    past_context = memory.recall(user_input)
    response = call_llm(f"Context: {past_context}\n\nUser: {user_input}")
    memory.add(f"User asked: {user_input}. Agent responded: {response}")
    return response
Memory Architecture
agent-memory uses a three-tier system that mirrors human memory, so your agent remembers the right things at the right time.
Working memory. Short-term context for the current session. Fast, in-memory, auto-cleared when the session ends. Used for the active conversation window and immediate task state.
Episodic memory. A timestamped record of past interactions, persisted to SQLite. Used for conversation history, task outcomes, and user preference patterns.
Semantic memory. Vector-embedded factual knowledge, retrieved by similarity rather than recency. Used for long-term facts, user profiles, and domain knowledge. Backed by SQLite (default) or Qdrant.
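As a rough mental model (not the library's internals), the three tiers can be sketched in plain Python. The cap, field names, and the word-overlap ranking below are illustrative stand-ins for the library's real eviction and vector-similarity logic:

```python
from collections import deque
from datetime import datetime, timezone

WORKING_SIZE = 20  # illustrative cap, mirroring the working_size default

working = deque(maxlen=WORKING_SIZE)  # session scratchpad; oldest items fall off
episodic = []                         # timestamped interaction log
semantic = []                         # long-lived facts, retrieved by similarity

def remember_fact(text):
    semantic.append(text)

def log_event(text):
    episodic.append({"at": datetime.now(timezone.utc).isoformat(), "text": text})

def recall(query, top_k=5):
    # Stand-in for vector similarity: rank facts by word overlap with the query.
    q = set(query.lower().split())
    scored = sorted(semantic, key=lambda f: len(q & set(f.lower().split())), reverse=True)
    return scored[:top_k]

remember_fact("User prefers concise answers in bullet points")
remember_fact("User works in fintech")
log_event("User asked about memory tiers")
print(recall("user prefers concise answers"))
```

The point of the split: working memory is bounded and disposable, episodic memory is an append-only log, and semantic memory is queried by relevance rather than position.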
API Reference
MemoryStore
from agentcortex import MemoryStore
memory = MemoryStore(
    agent_id="my-agent",        # unique agent identifier (required)
    working_size=20,            # max items in working memory (default: 20)
    episodic_limit=1000,        # max episodic records to keep (default: 1000)
    semantic_backend="sqlite",  # "sqlite" | "qdrant"
)
# Write
memory.add("fact or observation") # auto-categorized
memory.working.add("current task state") # working memory only
memory.episodic.add("what just happened") # episodic only
memory.semantic.add("long-term fact") # semantic only
# Read
memory.recall(query, top_k=5) # semantic similarity search
memory.working.get_all() # all working memory items
memory.episodic.recent(n=10) # last N episodic records
memory.semantic.search(query, top_k=5) # explicit semantic search
# Manage
memory.clear(tier="working") # clear one tier
memory.clear() # clear all
memory.export_json("backup.json")
Async Support
from agentcortex import AsyncMemoryStore
memory = AsyncMemoryStore(agent_id="my-async-agent")
async def my_async_agent(user_input):
    context = await memory.recall(user_input)
    response = await async_call_llm(f"Context: {context}\n\n{user_input}")
    await memory.add(f"interaction: {user_input} → {response}")
    return response
Anthropic Integration
from agentcortex import MemoryStore
from anthropic import Anthropic
memory = MemoryStore(agent_id="claude-agent")
client = Anthropic()
def agent(user_message):
    context = memory.recall(user_message, top_k=5)
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        system=f"You are a helpful assistant. Relevant past context:\n{context}",
        messages=[{"role": "user", "content": user_message}],
    )
    result = response.content[0].text
    memory.add(f"User: {user_message}\nAssistant: {result}")
    return result
OpenAI Integration
from agentcortex import MemoryStore
from openai import OpenAI
memory = MemoryStore(agent_id="gpt-agent")
client = OpenAI()
def agent(user_message):
    context = memory.recall(user_message, top_k=5)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": f"Relevant context:\n{context}"},
            {"role": "user", "content": user_message},
        ],
    )
    result = response.choices[0].message.content
    memory.add(f"User: {user_message}\nAssistant: {result}")
    return result
LangChain Integration
from agentcortex.integrations.langchain import AgentCortexMemory
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_openai_functions_agent
# Drop in as a LangChain memory object
memory = AgentCortexMemory(agent_id="langchain-agent")
llm = ChatOpenAI(model="gpt-4o-mini")
agent = create_openai_functions_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, memory=memory)
# Memory is persisted automatically across runs
executor.invoke({"input": "Remember that I prefer metric units"})
executor.invoke({"input": "What unit system do I prefer?"})
# → "You prefer metric units"
CrewAI Integration
from agentcortex.integrations.crewai import AgentCortexMemory
from crewai import Agent, Task, Crew
memory = AgentCortexMemory(agent_id="crew-agent")
researcher = Agent(
    role="Researcher",
    goal="Research and summarize topics",
    backstory="Expert researcher with perfect memory",
    memory=memory,  # persistent memory across tasks
)
task = Task(description="Research quantum computing trends", agent=researcher)
crew = Crew(agents=[researcher], tasks=[task])
crew.kickoff()
AutoGen Integration
from agentcortex.integrations.autogen import MemoryAgent
import autogen
memory_agent = MemoryAgent(
    name="AssistantWithMemory",
    agent_id="autogen-agent",
    llm_config={"model": "gpt-4o-mini"},
)
user = autogen.UserProxyAgent("user")
# Memory is automatically stored and retrieved between conversations
user.initiate_chat(memory_agent, message="My name is Alex and I work in fintech")
MCP / Claude Code
agent-memory ships as an MCP (Model Context Protocol) server. Use it directly from Claude Code or any MCP-compatible client.
# Start the MCP server
agentmemory serve-mcp
# Or in claude_desktop_config.json:
{
  "mcpServers": {
    "agent-memory": {
      "command": "agentmemory",
      "args": ["serve-mcp", "--agent-id", "claude-code"]
    }
  }
}
ℹ️
Once connected, Claude Code can call remember, recall, and forget tools directly to maintain persistent context across sessions.
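For orientation, an MCP tool invocation is a JSON-RPC tools/call request. The tool name below comes from the list above, but the argument shape (a content field) is an assumption about this server's schema, not a documented contract:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "remember",
    "arguments": { "content": "User prefers metric units" }
  }
}
```

MCP clients such as Claude Code construct these requests for you; you only see the tool names.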
Qdrant Backend
Swap the default SQLite backend for Qdrant when you need high-performance semantic search at scale.
pip install agentcortex[qdrant]
from agentcortex import MemoryStore
memory = MemoryStore(
    agent_id="my-agent",
    semantic_backend="qdrant",
    qdrant_url="http://localhost:6333",
    # qdrant_api_key="...",  # for Qdrant Cloud
)
💡
Start Qdrant locally with Docker: docker run -p 6333:6333 qdrant/qdrant
Export / Import
# Export all memories to JSON
memory.export_json("backup.json")
# Import into a new agent
new_memory = MemoryStore(agent_id="new-agent")
new_memory.import_json("backup.json")
# Merge into existing memories
new_memory.import_json("backup.json", merge=True)
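The backup format isn't documented here, so as an illustration of what merge=True implies, suppose each exported record carries a unique id (an assumed field name). Merging then keeps existing records and appends only unseen ones:

```python
import json

def merge_backups(existing, incoming):
    """Merge two lists of memory records, keyed on an assumed 'id' field.
    Records already present keep their stored version; new ids are appended."""
    seen = {rec["id"] for rec in existing}
    return existing + [rec for rec in incoming if rec["id"] not in seen]

current = [{"id": "m1", "text": "prefers metric units"}]
backup = json.loads(
    '[{"id": "m1", "text": "prefers metric units"},'
    ' {"id": "m2", "text": "works in fintech"}]'
)
merged = merge_backups(current, backup)
print([rec["id"] for rec in merged])  # → ['m1', 'm2']
```

Without merge=True, by contrast, import replaces the target agent's memories wholesale.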
Memory CLI
# Inspect stored memories
agentmemory inspect --agent-id my-project
# Search memories
agentmemory search "user preferences" --agent-id my-project
# Export / Import
agentmemory export --agent-id my-project --output memories.json
agentmemory import memories.json --agent-id new-project --merge
# Clear memories
agentmemory clear --agent-id my-project --tier working
# Start MCP server
agentmemory serve-mcp --agent-id my-project