
Memory Tiers

SimpleContext organizes memory into three tiers, each with a specific role and lifetime.


The Three Tiers

Working   →  Episodic  →  Semantic
(active)     (sessions)   (permanent)
  2h TTL      30d TTL      No TTL

Working Tier

What it holds: Active messages, current task state, recent context.

TTL: 2 hours (configurable)

Node kinds: message, fact, task_state

When used: Every conversation. Always included in context retrieval.

ctx = sc.context(user_id)
ctx.working.add("debug this error", NodeKind.MESSAGE)
ctx.working.add("task: fix login bug", NodeKind.TASK_STATE)

Episodic Tier

What it holds: Session summaries, compressed interaction history, notable events.

TTL: 30 days (configurable)

Node kinds: summary, fact

When used: For personal queries, task continuity, and coding help.

ctx.episodic.add("Session 2024-01: fixed auth bug, user prefers FastAPI", NodeKind.SUMMARY)

Working memory can be compressed into episodic summaries:

sc.memory(user_id).compress(keep_last=10)
# Compresses messages older than the last 10 into an episodic summary

Semantic Tier

What it holds: Long-term facts, user knowledge, persistent preferences.

TTL: None (permanent)

Node kinds: fact, resource

When used: For knowledge queries, personal questions, any retrieval needing long-term context.

ctx.semantic.add("user uses Proxmox for homelab", NodeKind.FACT, importance=0.8)
ctx.semantic.add("user project: Mangafork", NodeKind.FACT, importance=0.7)

Facts are also extracted automatically from conversations:

User: "I'm working on a React project"
→ Auto-extracted: "user uses React" (semantic tier)
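
The docs don't show how the extractor works internally; as a rough illustration only, a naive pattern-based extractor could map first-person statements to third-person facts like this (the `extract_facts` helper is hypothetical, not part of the SimpleContext API — real extraction is likely model-driven):

```python
import re

def extract_facts(message: str) -> list[str]:
    """Naive sketch of fact extraction: turn first-person statements
    into third-person facts destined for the semantic tier."""
    facts = []
    # "I'm working on a React project" -> "user uses React"
    m = re.search(r"I'?m working (?:on|with) (?:an? )?(\w+)", message, re.IGNORECASE)
    if m:
        facts.append(f"user uses {m.group(1)}")
    # "I prefer FastAPI" -> "user prefers FastAPI"
    m = re.search(r"I prefer (\w+)", message, re.IGNORECASE)
    if m:
        facts.append(f"user prefers {m.group(1)}")
    return facts

print(extract_facts("I'm working on a React project"))
# → ['user uses React']
```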

Memory API

v3 API (simple, backward compatible)

mem = sc.memory(user_id)

# Store messages
mem.add_user("hello!")
mem.add_assistant("hi there!")

# Retrieve for LLM
history = mem.get_for_llm(limit=10)

# Persistent facts
mem.remember("name", "Alice")
mem.remember("stack", "Python + FastAPI")
mem.recall("name")   # → "Alice"

# Compress old messages
mem.compress(keep_last=10)

# Clear conversation (keeps profile)
mem.clear()

# Stats
mem.count()   # → 42

v4 API (tiered, full control)

ctx = sc.context(user_id)

# Add to specific tiers
ctx.working.add("debug this error", NodeKind.MESSAGE)
ctx.episodic.add("session summary", NodeKind.SUMMARY)
ctx.semantic.add("user uses Proxmox", NodeKind.FACT, importance=0.8)

# Stats
ctx.stats()
# → {"working": 5, "episodic": 1, "semantic": 12}

# Prune expired and deleted nodes
ctx.prune()

# Get all active nodes across all tiers
ctx.get_all_active(limit=50)

Node Lifecycle

ACTIVE → SUPERSEDED  (replaced by newer conflicting fact)
       → EXPIRED     (TTL expired, soft deleted)
       → DELETED     (manually deleted)

Nodes in SUPERSEDED, EXPIRED, or DELETED state are excluded from retrieval but kept in storage for audit. Call ctx.prune() to hard-delete them.
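
The lifecycle can be modeled as a small state enum. This sketch (the `NodeState` name and dict-shaped nodes are assumed from the diagram above, not a real SimpleContext import) shows how retrieval might filter out non-active nodes:

```python
from enum import Enum

class NodeState(Enum):
    ACTIVE = "active"
    SUPERSEDED = "superseded"   # replaced by a newer conflicting fact
    EXPIRED = "expired"         # TTL elapsed, soft-deleted
    DELETED = "deleted"         # manually deleted

def active_only(nodes):
    """Retrieval considers only ACTIVE nodes; the rest stay in
    storage for audit until prune() hard-deletes them."""
    return [n for n in nodes if n["state"] is NodeState.ACTIVE]

nodes = [
    {"text": "user uses Proxmox", "state": NodeState.ACTIVE},
    {"text": "user uses ESXi", "state": NodeState.SUPERSEDED},
]
print([n["text"] for n in active_only(nodes)])
# → ['user uses Proxmox']
```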


Importance Scores

Every node has an importance score between 0.0 and 1.0.

Event                       Delta
New fact from user          +0.10
Used in retrieval           +0.02
Daily decay                 -0.005
Superseded by newer fact    -0.10

Nodes with higher importance are prioritized during scoring and selected first when the context budget is tight.

# Apply decay periodically (call once per day or per session)
sc.apply_decay(user_id)   # single user
sc.apply_decay()          # all users
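
Putting the deltas together: importance evolves additively and stays clamped to [0.0, 1.0]. A worked sketch (the `apply_event` helper and the 0.50 starting value are illustrative, not part of the API):

```python
# Deltas from the table above
DELTAS = {
    "new_fact": +0.10,
    "retrieved": +0.02,
    "daily_decay": -0.005,
    "superseded": -0.10,
}

def apply_event(importance: float, event: str) -> float:
    """Add the event's delta and clamp to the valid [0.0, 1.0] range."""
    return min(1.0, max(0.0, importance + DELTAS[event]))

score = 0.50                                # assumed starting importance
score = apply_event(score, "new_fact")      # 0.60
score = apply_event(score, "retrieved")     # 0.62
for _ in range(4):                          # four days of decay
    score = apply_event(score, "daily_decay")
print(round(score, 3))
# → 0.6
```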

Deduplication

When a new fact is extracted, SimpleContext checks for similar existing facts using Jaccard similarity:

similarity = |tokens_A ∩ tokens_B| / |tokens_A ∪ tokens_B|

If similarity ≥ 0.65, the new fact supersedes the old one instead of creating a duplicate.
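
The similarity check can be reproduced in a few lines. This sketch assumes simple lowercase whitespace tokenization (SimpleContext's actual tokenizer may differ) and applies the 0.65 threshold from the rule above:

```python
DEDUP_THRESHOLD = 0.65

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty strings are identical
    return len(ta & tb) / len(ta | tb)

def should_supersede(new_fact: str, old_fact: str) -> bool:
    """True if the new fact should replace the old one instead of
    being stored as a near-duplicate."""
    return jaccard(new_fact, old_fact) >= DEDUP_THRESHOLD

old = "user uses Proxmox for homelab"
new = "user uses Proxmox for their homelab"
print(round(jaccard(old, new), 2), should_supersede(new, old))
# → 0.83 True
```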


Configuration

memory:
  default_limit: 20         # max messages returned by get_for_llm()
  ttl_hours:
    working: 2              # working nodes expire after 2h
    episodic: 720           # episodic nodes expire after 30 days
    # semantic: null        # semantic nodes never expire
  compression:
    enabled: false          # auto-compress when threshold is reached
    threshold: 50           # compress when message count exceeds this
    keep_last: 10           # keep this many recent messages intact