Memory Systems and Persistence

Overview

Memory systems enable AI agents to retain and utilize information across interactions, improving their ability to maintain context and learn from experience [19].

Memory Types

Type       | Duration       | Purpose             | Implementation
-----------|----------------|---------------------|--------------------------
Short-term | Single session | Working context     | In-memory buffer
Long-term  | Persistent     | Knowledge retention | Vector stores, databases
Episodic   | Persistent     | Experience recall   | Structured logs
Semantic   | Persistent     | Factual knowledge   | Knowledge graphs
Procedural | Persistent     | How-to knowledge    | Tool definitions

Short-term Memory

Short-term memory maintains context within a single conversation or task.

Implementation Strategies

  • Buffer Memory: Store the full message history (or the N most recent messages)
  • Window Memory: Keep a sliding window over the most recent context
  • Summary Memory: Compress older context into running summaries
  • Token Buffer: Limit by token count rather than message count
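The buffer, window, and token-limit strategies above can be combined in a few lines of plain Python. The class name and the characters-per-token heuristic below are illustrative, not taken from any particular library:

```python
class WindowMemory:
    """Keeps only the most recent messages, bounded by count and a rough token budget."""

    def __init__(self, max_messages=10, max_tokens=1000):
        self.messages = []
        self.max_messages = max_messages
        self.max_tokens = max_tokens

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        self._trim()

    def _trim(self):
        # Drop the oldest messages until both limits are satisfied.
        while (len(self.messages) > self.max_messages
               or self._token_count() > self.max_tokens):
            self.messages.pop(0)

    def _token_count(self):
        # Crude approximation: roughly 4 characters per token.
        return sum(len(m["content"]) // 4 for m in self.messages)
```

A summary-memory variant would, instead of discarding evicted messages in `_trim`, pass them to an LLM summarizer and keep the summary as a synthetic first message.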

Example

# LangChain Buffer Memory
from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Add to conversation
memory.save_context(
    {"input": "Hello, I'm Alice"},
    {"output": "Hello Alice! How can I help you today?"}
)

# Retrieve history
history = memory.load_memory_variables({})
print(history["chat_history"])

Long-term Memory

Long-term memory persists information across sessions, enabling agents to remember user preferences, past interactions, and accumulated knowledge [20].

Storage Options

Storage          | Best For        | Retrieval Method
-----------------|-----------------|-------------------
Vector Stores    | Semantic search | Similarity search
Key-Value Stores | Fast lookups    | Exact match
Relational DBs   | Structured data | SQL queries
Graph DBs        | Relationships   | Graph traversal
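The similarity-search retrieval used by vector stores can be sketched with a toy in-memory index using exact cosine similarity. Production stores use learned embeddings and approximate nearest-neighbor indexes; the class and method names here are illustrative:

```python
import math

class TinyVectorStore:
    """Toy in-memory vector store: exact cosine-similarity search over (vector, text) pairs."""

    def __init__(self):
        self.items = []  # list of (vector, text) pairs

    def add(self, vector, text):
        self.items.append((vector, text))

    def search(self, query, k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = (math.sqrt(sum(x * x for x in a))
                    * math.sqrt(sum(y * y for y in b)))
            return dot / norm if norm else 0.0

        # Rank all stored items by similarity to the query vector.
        ranked = sorted(self.items, key=lambda it: cosine(query, it[0]), reverse=True)
        return [text for _, text in ranked[:k]]
```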

MemGPT Architecture

MemGPT (now Letta) introduces a hierarchical memory system inspired by operating systems [21].

Memory Hierarchy

Level           | Analogy | Characteristics
----------------|---------|---------------------------------------
Main Context    | RAM     | Limited, fast access, in LLM context
Archival Memory | Disk    | Unlimited, slower access, searchable
Recall Memory   | Cache   | Recent interactions, quick retrieval

Key Concepts

  • Self-Editing Memory: Agent can modify its own memory
  • Memory Paging: Swap information in/out of context
  • Memory Search: Query archival storage as needed
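A minimal sketch of the paging idea: a fixed-size main context that evicts the oldest entry to archival storage, which can be searched to bring facts back on demand. The names below are illustrative, not Letta's actual API:

```python
class PagedMemory:
    """Fixed-capacity main context; evicted entries move to searchable archival storage."""

    def __init__(self, context_size=3):
        self.main_context = []   # "RAM": what the LLM sees each turn
        self.archival = []       # "disk": unlimited, searched on demand
        self.context_size = context_size

    def remember(self, fact):
        self.main_context.append(fact)
        if len(self.main_context) > self.context_size:
            # Page the oldest fact out of context into archival storage.
            self.archival.append(self.main_context.pop(0))

    def search_archival(self, keyword):
        # Pull matching facts back from archival storage on demand.
        return [f for f in self.archival if keyword in f]
```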

LangGraph Memory and Persistence

LangGraph provides built-in support for memory and state persistence [22].

Checkpointing

Checkpointing saves graph state at each step, keyed by a thread ID, so conversations can be paused and resumed:

from langgraph.graph import StateGraph
from langgraph.checkpoint.sqlite import SqliteSaver

# Create checkpointer (note: in recent langgraph releases,
# from_conn_string returns a context manager to use with "with")
checkpointer = SqliteSaver.from_conn_string(":memory:")

# Create graph with checkpointing
graph = StateGraph(State)
# ... add nodes and edges ...
app = graph.compile(checkpointer=checkpointer)

# Run with a thread ID so state persists across invocations
config = {"configurable": {"thread_id": "user_123"}}
result = app.invoke({"messages": [...]}, config)

Memory Backends

  • SQLite: Local file-based persistence
  • PostgreSQL: Production-grade persistence
  • Redis: Fast in-memory with optional persistence
  • Custom: Implement your own checkpoint saver

Memory Management Strategies

Compression

  • Summarize older conversations
  • Extract key facts and discard details
  • Use hierarchical summarization
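Hierarchical summarization can be sketched as recursive chunk-and-summarize: summarize chunks, then summarize the summaries until one remains. The `summarize` function below stands in for an LLM call (here a trivial truncation so the example runs):

```python
def summarize(texts):
    # Placeholder for an LLM summarization call: keep the first
    # sentence fragment of each text, joined together.
    return " / ".join(t.split(".")[0] for t in texts)

def hierarchical_summary(messages, chunk_size=4):
    """Summarize in chunks, then summarize the summaries until one remains."""
    level = list(messages)
    while len(level) > 1:
        level = [
            summarize(level[i:i + chunk_size])
            for i in range(0, len(level), chunk_size)
        ]
    return level[0]
```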

Prioritization

  • Score memories by relevance and recency
  • Implement forgetting curves
  • Prioritize frequently accessed information
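A relevance-and-recency score with an exponential forgetting curve might look like the following. The weights, half-life, and frequency bonus are arbitrary choices for illustration, not a standard formula:

```python
import math

def memory_score(relevance, age_hours, access_count, half_life_hours=24.0):
    """Score a memory: relevance decayed by age, boosted by access frequency.

    relevance:    similarity to the current query, in [0, 1]
    age_hours:    time since the memory was last touched
    access_count: how often the memory has been retrieved
    """
    # Exponential decay: score halves every half_life_hours.
    recency = math.exp(-math.log(2) * age_hours / half_life_hours)
    # Diminishing-returns bonus for frequently accessed memories.
    frequency = math.log1p(access_count)
    return relevance * recency + 0.1 * frequency
```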

Organization

  • Categorize memories by type
  • Use metadata for filtering
  • Build knowledge graphs for relationships
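Metadata filtering, as a sketch: memory records carry a metadata dictionary, and retrieval narrows by exact field matches before any semantic ranking. The record shape here is an assumption for illustration:

```python
def filter_memories(memories, **criteria):
    """Keep only memory records whose metadata matches every criterion."""
    return [
        m for m in memories
        if all(m.get("meta", {}).get(k) == v for k, v in criteria.items())
    ]
```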

Implementation Example

# Hybrid Memory System
class HybridMemory:
    def __init__(self):
        self.short_term = []  # Recent messages
        self.long_term = VectorStore()  # Persistent knowledge (any store with add/search)
        self.max_short_term = 10

    def add_message(self, message: str, role: str):
        # Add to short-term
        self.short_term.append({"role": role, "content": message})

        # When the buffer overflows, summarize the oldest messages into long-term
        if len(self.short_term) > self.max_short_term:
            summary = self.summarize(self.short_term[:5])  # assumed to call an LLM
            self.long_term.add(summary)
            self.short_term = self.short_term[5:]

    def get_context(self, query: str) -> str:
        # Retrieve the most relevant long-term summaries (returned as strings)
        relevant = self.long_term.search(query, k=3)

        # Combine with the short-term transcript
        context = "\n".join(relevant)
        context += "\n" + "\n".join(
            f"{m['role']}: {m['content']}"
            for m in self.short_term
        )
        return context

Best Practices

  1. Separate Concerns: Use different stores for different memory types
  2. Implement Forgetting: Not all information needs to be retained
  3. Index Appropriately: Use the right retrieval method for each use case
  4. Handle Conflicts: Decide how to handle contradictory memories
  5. Privacy Considerations: Be mindful of what information is stored
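Handling conflicts (practice 4) needs an explicit policy. One simple choice is latest-wins: when two memories disagree about the same fact, keep the most recently written value. The record shape below is an assumption for illustration:

```python
def resolve(memories):
    """Latest-wins conflict resolution: keep the newest value for each key."""
    resolved = {}
    # Sort oldest-first so later writes overwrite earlier ones.
    for m in sorted(memories, key=lambda m: m["timestamp"]):
        resolved[m["key"]] = m["value"]
    return resolved
```

Alternatives include keeping both values with provenance, or asking the user to disambiguate when the conflict is surfaced.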

References

  19. IBM - AI Agent Memory
  20. Redis - AI Agent Memory
  21. MemGPT Documentation
  22. LangGraph Memory and Persistence