Context Engine

The Context Engine is the retrieval brain of SimpleContext. It takes a user message and returns a ranked list of relevant memory nodes within budget.


How It Works

# 1. Plan retrieval based on intent
plan = sc.planner.plan(message, user_id, profile)

# 2. Retrieve, score, and select nodes
nodes = sc.engine.retrieve(plan)

# 3. Build messages for LLM
messages = sc.builder.build(system_prompt, nodes, message, history, profile)

RetrievalPlan

The planner analyzes the message and produces a plan:

from simplecontext import SimpleContext
from simplecontext.enums import Intent

sc = SimpleContext()
plan = sc.planner.plan("debug my Python error", user_id)

plan.intent          # → "coding"
plan.query           # → "debug my Python error"
plan.working         # → True
plan.episodic        # → True
plan.semantic        # → True
plan.include_skills  # → True
plan.budget          # → {"working": 5, "episodic": 2, "semantic": 4, "skills": 3}
plan.focus_tags      # → ["debug", "error"]

Intent → Retrieval Strategy

Each intent maps to a retrieval strategy over the four tiers (working, episodic, semantic, skills). The recognized intents are conversation, personal, coding, knowledge, and task; coding, for example, retrieves from all four tiers, as the plan above shows.

Scoring Formula

score = relevance × 0.55
      + importance × 0.25
      + recency × 0.10
      + path_priority × 0.10
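The weighted sum can be written as a plain function (a sketch for illustration; the engine's actual signature may differ):

```python
def composite_score(relevance: float, importance: float,
                    recency: float, path_priority: float) -> float:
    """Weighted sum used to rank candidate nodes (weights from the formula above)."""
    return (relevance * 0.55
            + importance * 0.25
            + recency * 0.10
            + path_priority * 0.10)

# A highly relevant, fresh node outranks an important but stale one.
hot = composite_score(relevance=0.9, importance=0.3, recency=1.0, path_priority=0.6)
stale = composite_score(relevance=0.3, importance=0.9, recency=0.1, path_priority=1.0)
```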

Relevance (0.55 weight)

Computed from token overlap between the query and node content:

relevance = text_similarity × 0.60
          + tag_similarity × 0.25
          + path_similarity × 0.15

If the exact-match score falls below 0.1, a fuzzy match with a 0.75 similarity threshold is attempted and counted at 70% weight.

Intent boost: +0.08 for nodes that match the detected intent type.
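Putting the relevance formula, the fuzzy fallback, and the intent boost together, a minimal sketch might look like this (using Python's `difflib` for the fuzzy step; the real engine's similarity measures are not specified here):

```python
from difflib import SequenceMatcher

def _token_overlap(a: str, b: str) -> float:
    """Jaccard overlap between the token sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def relevance(query: str, node_text: str, tag_sim: float, path_sim: float,
              intent_match: bool = False) -> float:
    text_sim = _token_overlap(query, node_text)
    # Fuzzy fallback: if exact overlap is low (< 0.1), try character-level
    # similarity at a 0.75 threshold, counted at 70% weight.
    if text_sim < 0.1:
        fuzzy = SequenceMatcher(None, query.lower(), node_text.lower()).ratio()
        if fuzzy >= 0.75:
            text_sim = fuzzy * 0.70
    score = text_sim * 0.60 + tag_sim * 0.25 + path_sim * 0.15
    if intent_match:
        score += 0.08  # intent boost
    return score
```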

Recency (0.10 weight)

Exponential decay based on node age:

recency = 2^(-age_hours / half_life)

Tier      Half-life
Working   1 hour
Episodic  72 hours
Semantic  720 hours
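The decay translates directly to code; the tier keys below mirror the half-life table:

```python
# Half-lives per memory tier, in hours (from the table above).
HALF_LIFE_HOURS = {"working": 1.0, "episodic": 72.0, "semantic": 720.0}

def recency(age_hours: float, tier: str) -> float:
    """Exponential half-life decay: a node exactly one half-life old scores 0.5."""
    return 2.0 ** (-age_hours / HALF_LIFE_HOURS[tier])
```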

Path Priority (0.10 weight)

Path prefix        Priority
/memory/semantic   1.0
/skills            0.8
/memory/episodic   0.6
/memory/working    0.3
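A longest-prefix lookup over that table can be sketched as follows (the default of 0.0 for unmatched paths is an assumption):

```python
PATH_PRIORITY = {
    "/memory/semantic": 1.0,
    "/skills": 0.8,
    "/memory/episodic": 0.6,
    "/memory/working": 0.3,
}

def path_priority(path: str, default: float = 0.0) -> float:
    # Longest matching prefix wins, so "/memory/semantic/python" maps to 1.0.
    best, best_len = default, -1
    for prefix, prio in PATH_PRIORITY.items():
        if path.startswith(prefix) and len(prefix) > best_len:
            best, best_len = prio, len(prefix)
    return best
```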

Budget Enforcement

The selector enforces limits after scoring:

# Default limits
DEFAULT_BUDGET = {
    "working":  5,
    "episodic": 2,
    "semantic": 4,
    "skills":   2,
}
DEFAULT_MAX_TOTAL_NODES = 12
DEFAULT_MAX_TOTAL_CHARS = 8000

Nodes are selected in descending score order until a per-tier budget, the total-node cap, or the character limit is reached.
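The selection loop can be sketched like this; the node shape (`tier`, `score`, `content` dicts) is illustrative, not the engine's real data model:

```python
DEFAULT_BUDGET = {"working": 5, "episodic": 2, "semantic": 4, "skills": 2}
DEFAULT_MAX_TOTAL_NODES = 12
DEFAULT_MAX_TOTAL_CHARS = 8000

def select(nodes, budget=DEFAULT_BUDGET,
           max_nodes=DEFAULT_MAX_TOTAL_NODES, max_chars=DEFAULT_MAX_TOTAL_CHARS):
    """Greedy selection: take nodes in descending score order while limits hold."""
    selected, used, chars = [], {}, 0
    for node in sorted(nodes, key=lambda n: n["score"], reverse=True):
        tier = node["tier"]
        if used.get(tier, 0) >= budget.get(tier, 0):
            continue  # this tier's budget is exhausted
        if len(selected) >= max_nodes or chars + len(node["content"]) > max_chars:
            break  # a global limit is reached
        selected.append(node)
        used[tier] = used.get(tier, 0) + 1
        chars += len(node["content"])
    return selected
```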


Debug Mode

Enable debug logging to see the full retrieval pipeline:

sc.enable_debug(True)

This logs:

- Intent detected
- Candidates collected per tier
- Scores for each node
- Final selection

# Get stats for a retrieval
stats = sc.engine.get_stats(plan)
# → {"candidates": 37, "active": 28, "selected": 11, "total_chars": 3200}

LRU Cache

The engine caches retrieval results for 30 seconds to avoid redundant DB queries on repeated or similar messages.

The cache is invalidated automatically when new messages are saved; it can also be invalidated manually:

sc.engine.invalidate_cache(user_id)
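The caching behavior described above can be sketched with a small TTL cache; this is an illustration of the contract, not the engine's actual implementation:

```python
import time

class TTLCache:
    """Minimal 30-second result cache keyed by (user_id, query)."""

    def __init__(self, ttl: float = 30.0):
        self.ttl = ttl
        self._store = {}  # key -> (expires_at, value)

    def get(self, user_id, query):
        entry = self._store.get((user_id, query))
        if entry and entry[0] > time.monotonic():
            return entry[1]
        return None  # missing or expired

    def put(self, user_id, query, value):
        self._store[(user_id, query)] = (time.monotonic() + self.ttl, value)

    def invalidate(self, user_id):
        # Drop every cached result for this user (e.g. when a message is saved).
        self._store = {k: v for k, v in self._store.items() if k[0] != user_id}
```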

Advanced: Adaptive Scoring

The adaptive scorer learns from feedback to improve future retrievals:

# After a successful turn, provide positive feedback
sc.feedback(user_id, selected_nodes=nodes, score=+1.0)

# After a bad retrieval
sc.feedback(user_id, selected_nodes=nodes, score=-1.0)
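One common way such feedback is folded into scoring is an exponential moving average of a per-node boost; the learning rate and blending weight below are assumptions, not SimpleContext's documented values:

```python
class AdaptiveWeights:
    """Illustrative sketch: learn a per-node boost from +/-1 feedback scores."""

    def __init__(self, lr: float = 0.2):
        self.lr = lr
        self.boost = {}  # node_id -> learned boost in [-1, 1]

    def feedback(self, node_ids, score: float):
        # Move each node's boost toward the feedback score.
        for node_id in node_ids:
            old = self.boost.get(node_id, 0.0)
            self.boost[node_id] = old + self.lr * (score - old)

    def adjusted(self, node_id, base_score: float) -> float:
        # Blend the learned boost into the base retrieval score.
        return base_score + 0.1 * self.boost.get(node_id, 0.0)
```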

Advanced: Fuzzy Retrieval

For typo-tolerant search:

results = sc.fuzzy.search(user_id, "pyhton errror", threshold=0.75)
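A typo-tolerant search of this kind can be approximated with Python's `difflib`; this sketch compares query words against node words and is not necessarily `sc.fuzzy`'s real algorithm:

```python
from difflib import SequenceMatcher

def fuzzy_search(nodes, query: str, threshold: float = 0.75):
    """Keep nodes whose best word-level similarity to any query word
    meets the threshold, ranked by that similarity."""
    results = []
    for node in nodes:
        best = max(
            (SequenceMatcher(None, qw, nw).ratio()
             for qw in query.lower().split()
             for nw in node["content"].lower().split()),
            default=0.0,
        )
        if best >= threshold:
            results.append((best, node))
    return [n for _, n in sorted(results, key=lambda r: r[0], reverse=True)]
```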

Advanced: Pattern Detection

Detect patterns in user interaction history:

patterns = sc.detect_patterns(user_id, time_window_days=7)
# → {"most_used_agent": "coding", "peak_hour": 14, "avg_session_length": 8}
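The statistics in that result can be computed from raw interaction records with simple counting; the event field names (`agent`, `hour`, `session_length`) are illustrative, not the library's schema:

```python
from collections import Counter

def detect_patterns(events):
    """Summarize interaction history: most-used agent, peak hour of day,
    and average session length."""
    agents = Counter(e["agent"] for e in events)
    hours = Counter(e["hour"] for e in events)
    avg_len = sum(e["session_length"] for e in events) / len(events)
    return {
        "most_used_agent": agents.most_common(1)[0][0],
        "peak_hour": hours.most_common(1)[0][0],
        "avg_session_length": avg_len,
    }
```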