Chatbot with persistent memory

This sample shows how to build an AI chatbot that remembers what a user told it days, weeks, or months ago. It uses the same primitive that powers the /dashboard/chat agent on mnueron.com itself.

The shape

                  ┌──────────────────────┐
  user message ─► │ 1. Recall context    │ ─► top-5 relevant memories + runbooks
                  ├──────────────────────┤
                  │ 2. Call LLM with     │
                  │    system prompt +   │
                  │    recall + history  │
                  ├──────────────────────┤
                  │ 3. Save the user     │
                  │    turn as a memory  │
                  ├──────────────────────┤
                  │ 4. Return reply      │
                  └──────────────────────┘

Three round-trips: one to mnueron for recall, one to the LLM, and one back to mnueron to save. Optionally, add a fourth round-trip to mnueron that saves the assistant turn, if you want the full conversation history to be searchable later.

Working code — TypeScript

import { Mnueron } from "@mnueron/sdk";
import OpenAI from "openai";

const mnueron = new Mnueron({
  apiUrl: "https://www.mnueron.com",
  apiToken: process.env.MNUERON_API_TOKEN!,
});
const openai = new OpenAI();

async function chat(userId: string, message: string): Promise<string> {
  // 1. Recall
  const hits = await mnueron.search({
    query: message,
    namespace: `user:${userId}`,
    k: 5,
  });

  const context = hits
    .map((h, i) => `[mem ${i + 1}] ${h.content}`)
    .join("\n");

  // 2. LLM
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content:
          "You are a personal assistant with persistent memory. " +
          "Below are facts you previously learned about this user. " +
          "Use them when answering. If something contradicts a memory, " +
          "tell the user and ask which is correct.\n\n" +
          (context || "(no prior memories for this user yet)"),
      },
      { role: "user", content: message },
    ],
  });

  const reply = completion.choices[0]?.message?.content ?? "";

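  // 3. Save the user's message as a memory (fire-and-forget: the promise
  //    is not awaited, so the reply is not delayed; failures are logged)
  mnueron
    .save({
      content: `User said: ${message}`,
      namespace: `user:${userId}`,
      source: "chatbot",
    })
    .catch((err) => console.warn("mnueron save failed:", err));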

  return reply;
}

Working code — Python

import os  # needed for os.environ below

from mnueron import Mnueron
import openai

m = Mnueron(api_url="https://www.mnueron.com",
            api_token=os.environ["MNUERON_API_TOKEN"])
client = openai.OpenAI()

def chat(user_id: str, message: str) -> str:
    # 1. Recall
    hits = m.search(query=message,
                    namespace=f"user:{user_id}",
                    k=5)
    context = "\n".join(f"[mem {i+1}] {h['content']}"
                        for i, h in enumerate(hits))

    # 2. LLM
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a personal assistant with persistent memory.\n"
                    "Below are facts you previously learned about this user.\n\n"
                    + (context or "(no prior memories for this user yet)")
                ),
            },
            {"role": "user", "content": message},
        ],
    )
    reply = completion.choices[0].message.content or ""

    # 3. Save the user's message (synchronous here; errors are swallowed
    #    so a failed save never breaks the chat)
    try:
        m.save(content=f"User said: {message}",
               namespace=f"user:{user_id}",
               source="chatbot")
    except Exception:
        pass

    return reply
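
The optional fourth round-trip can be limited to assistant turns that are worth retrieving later. The sketch below is one way to do that in Python, under stated assumptions: `DECISION_MARKERS` is a hypothetical keyword heuristic and not part of mnueron, and `save` stands for any callable that accepts the same keyword arguments as the `m.save` call in the sample above.

```python
from typing import Callable

# Hypothetical heuristic (an assumption, not a mnueron feature):
# phrases that suggest the assistant committed to a decision or action.
DECISION_MARKERS = ("i will", "i'll", "scheduled", "booked", "confirmed", "agreed")


def should_save_assistant_turn(reply: str) -> bool:
    """Return True if the reply looks like it contains a decision or action."""
    lowered = reply.lower()
    return any(marker in lowered for marker in DECISION_MARKERS)


def save_assistant_turn(save: Callable[..., object], user_id: str, reply: str) -> bool:
    """Save the assistant turn only when it looks worth recalling later.

    `save` is expected to take content/namespace/source keyword arguments,
    mirroring the m.save call in the sample. Returns True if a save was
    attempted and succeeded, False otherwise.
    """
    if not should_save_assistant_turn(reply):
        return False
    try:
        save(
            content=f"Assistant said: {reply}",
            namespace=f"user:{user_id}",
            source="chatbot",
        )
    except Exception:
        # Same policy as the user-turn save: losing a memory must not break chat.
        return False
    return True
```

In the `chat` function this would be called as `save_assistant_turn(m.save, user_id, reply)` just before `return reply`. A keyword list is crude; a small classifier or an LLM check could replace it without changing the call site.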

Why this works

Per-user namespace (user:{userId}) keeps each user's memories isolated even though they share one mnueron tenant.
Recall before the LLM call puts what the user actually said earlier into the prompt, so replies are grounded in real memories rather than a hallucinated "as I mentioned earlier."
Save the user turn, not the assistant turn, by default. The assistant's words aren't usually facts you want to retrieve later; the user's are. (Save assistant turns when they contain decisions or actions the user agreed to.)
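Fire-and-forget save doesn't block the reply in the TypeScript version, where the save promise is not awaited. The Python version saves synchronously, so it adds one round-trip of latency before returning. In both, a failed save leaves the chat working; only that one memory is lost.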
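
To get the same non-blocking save in Python, one option is to hand the call to a background thread. This is a minimal sketch using only the standard library; `save_in_background` is a hypothetical helper name, and `save` stands for any callable with the keyword arguments used by `m.save` in the sample.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

log = logging.getLogger("chatbot.memory")

# A small shared pool; saves are short network calls, so two workers
# is an arbitrary starting point rather than a tuned value.
_executor = ThreadPoolExecutor(max_workers=2)


def save_in_background(save: Callable[..., object], **kwargs) -> Future:
    """Submit save(**kwargs) to a worker thread and return immediately.

    The returned Future resolves to True on success and False on failure.
    Callers can ignore it; failures are logged instead of raised.
    """

    def _run() -> bool:
        try:
            save(**kwargs)
            return True
        except Exception:
            log.warning("memory save failed", exc_info=True)
            return False

    return _executor.submit(_run)
```

In the Python `chat` function, the try/except around `m.save` would become `save_in_background(m.save, content=f"User said: {message}", namespace=f"user:{user_id}", source="chatbot")`. This assumes the client object is safe to call from another thread, which the sample does not confirm.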

Going further

Auto-summarize long turns before saving — the auto-synopsis gate already does this on the server side when content > 800 chars.
Surface runbooks by switching search to /api/recall/unified — when the user asks "how do I deploy?" they get a saved runbook back, not just facts.
Track conversation sessions by appending each turn to a chat_sessions row, the same shape the dashboard chat agent uses.
Detect entities by enabling entity extraction on save (paid tier) — then you can answer "what do we know about Maya?" with a structured entity card.
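
As an illustration of the session-tracking idea, the sketch below appends turns to a chat_sessions row in a local SQLite database. The schema (session_id, user_id, and a JSON-encoded turns column) is an assumption made for this example; the actual shape used by the dashboard chat agent is not shown in this sample.

```python
import json
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a store with a hypothetical chat_sessions table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS chat_sessions ("
        "session_id TEXT PRIMARY KEY, "
        "user_id TEXT NOT NULL, "
        "turns TEXT NOT NULL)"  # JSON list of {"role", "content"} dicts
    )
    return conn


def load_turns(conn: sqlite3.Connection, session_id: str) -> list:
    """Return the session's turns in order, or [] for an unknown session."""
    row = conn.execute(
        "SELECT turns FROM chat_sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else []


def append_turn(conn: sqlite3.Connection, session_id: str, user_id: str,
                role: str, content: str) -> int:
    """Append one turn to the session row and return the new turn count."""
    turns = load_turns(conn, session_id)
    turns.append({"role": role, "content": content})
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO chat_sessions (session_id, user_id, turns) "
            "VALUES (?, ?, ?)",
            (session_id, user_id, json.dumps(turns)),
        )
    return len(turns)
```

The list returned by `load_turns` already has the role/content shape that the `messages` argument expects, so it could be spliced in between the system prompt and the new user message to supply the "history" part of step 2 in the diagram.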