
Agent with Persistent Memory

This tutorial shows how to give an agent a long-term memory that persists across sessions. You will:

  1. Create a Memory container and tag it for filtering.
  2. Write memory entries and observe the three deduplication outcomes: created, skipped, and updated.
  3. Upload a Document with structured reference information.
  4. Create an agent that retrieves from both memories and the document via knowledge_config, with write_memory_id enabled so the agent can persist new facts it learns.
  5. Run a generation and observe the model answering accurately from injected context — with no RAG logic in the prompt.
  6. Observe the agent writing a new fact to memory with source: "agent".
  7. Query the knowledge layer directly to see memory entries and document chunks side by side.

By the end you will understand how Memories, Documents, and the Knowledge search layer compose with agents to build stateful, context-aware AI assistants.

Prerequisites

  • SOAT running locally. Follow the Quick Start guide to bring the stack up with Docker Compose.
  • New to SOAT? Read Key Concepts to understand projects, agents, and the IAM model before diving in.
  • CLI installed and configured, or SDK set up. See CLI or SDK.
  • For production hardening (secrets, env vars), see Advanced Configuration.
  • Server is at http://localhost:5047.
  • Ollama running locally with a chat model available.
Point the CLI at your local server:

export SOAT_BASE_URL=http://localhost:5047

Step 1 — Log in as admin

Admin is the built-in superuser role. It bypasses policy evaluation entirely. See Users for full authentication details.

ADMIN_TOKEN=$(soat login-user --username admin --password Admin1234! | jq -r '.token')
export SOAT_TOKEN=$ADMIN_TOKEN

Step 2 — Create a project

Every resource in SOAT lives inside a project. Create one to hold the memory and agent.

PROJECT_ID=$(soat create-project --name "Support Demo" | jq -r '.id')
echo "PROJECT_ID: $PROJECT_ID"

Step 3 — Create an AI provider

Set up an AI provider backed by Ollama. This tutorial uses a local Ollama provider so it can run without external credentials. To connect xAI, OpenAI, Anthropic, or Amazon Bedrock instead, see Connect Third-Party LLMs.

AI_PROVIDER_ID=$(soat create-ai-provider \
--project-id "$PROJECT_ID" \
--name "Local Ollama" \
--provider "ollama" \
--default-model "qwen2.5:0.5b" | jq -r '.id')
echo "AI_PROVIDER_ID: $AI_PROVIDER_ID"

Step 4 — Create a memory

A Memory is a named container that holds a collection of text entries. You can attach tags to a memory for later filtering — useful when an agent should search only a subset of all memories in a project.

MEMORY_ID=$(soat create-memory \
--project-id "$PROJECT_ID" \
--name "Alice Profile" \
--description "Facts about customer Alice gathered during support interactions" \
--tags '["alice","customer"]' | jq -r '.id')
echo "MEMORY_ID: $MEMORY_ID"

Step 5 — Write memory entries

Memory entries are the individual facts stored inside a memory. Every write request goes through a semantic deduplication algorithm that compares the new content against existing entries:

  • created (HTTP 201) — no similar entry exists; the new fact is stored.
  • skipped (HTTP 200) — a near-identical entry already exists (similarity ≥ duplicate_threshold, default 0.95); the new content is discarded.
  • updated (HTTP 200) — an entry is similar but not identical (similarity ≥ update_threshold, default 0.75 and < duplicate_threshold); the existing entry is replaced with the richer version.
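The three outcomes reduce to a simple decision over the similarity score. The sketch below is illustrative only — SOAT's real implementation compares embeddings, while this shows just the threshold logic with the documented defaults:

```python
# Illustrative sketch of the deduplication decision, assuming a
# similarity score in [0, 1] against the closest existing entry.
# Not SOAT's actual implementation; only the threshold logic is shown,
# using the documented default thresholds.

def dedup_action(similarity: float,
                 duplicate_threshold: float = 0.95,
                 update_threshold: float = 0.75) -> str:
    if similarity >= duplicate_threshold:
        return "skipped"   # near-identical entry already exists
    if similarity >= update_threshold:
        return "updated"   # similar but richer; replace the old entry
    return "created"       # genuinely new fact

print(dedup_action(0.98))  # skipped
print(dedup_action(0.85))  # updated
print(dedup_action(0.30))  # created
```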

5a — First entry (action: created)

A genuinely new fact. No similar entry exists, so it is stored.

soat create-memory-entry \
--memory-id "$MEMORY_ID" \
--content "Alice prefers email over phone calls for all support communication"
# → { "action": "created", ... }

5b — Near-duplicate (action: skipped)

The content is almost identical to 5a. The similarity score exceeds duplicate_threshold (0.95), so the write is silently ignored and the existing entry is unchanged.

soat create-memory-entry \
--memory-id "$MEMORY_ID" \
--content "Alice prefers email over phone calls"
# → { "action": "skipped", ... }

5c — Improved version (action: updated)

The content is related but adds new detail (similarity between 0.75 and 0.95). The existing entry is replaced with the richer version, keeping memory clean and up to date.

soat create-memory-entry \
--memory-id "$MEMORY_ID" \
--content "Alice prefers email, especially for billing inquiries; she checks it twice a day"
# → { "action": "updated", ... }

5d — Second distinct fact (action: created)

An unrelated fact is added. No existing entry is similar, so it is stored as a new entry.

soat create-memory-entry \
--memory-id "$MEMORY_ID" \
--content "Alice's fiscal year ends in March; she starts renewal discussions in January"
# → { "action": "created", ... }

Step 6 — List entries to verify

After the four writes, the memory holds exactly two entries — the skipped near-duplicate was discarded and the improved version replaced the original.

soat list-memory-entries --memory-id "$MEMORY_ID" | jq '[.[] | .content]'
# [
# "Alice prefers email, especially for billing inquiries; she checks it twice a day",
# "Alice's fiscal year ends in March; she starts renewal discussions in January"
# ]

Step 7 — Upload a support-policy document

A Document is a text file indexed for semantic search. Here we store Alice's account support policy — structured reference material that the agent should consult alongside the memory entries written in Step 5.

The path field gives the document a logical location inside the project (similar to a file path). We will use /alice/support-policy.txt so we can later filter the entire /alice/ subtree with a single document_paths prefix.

DOC_ID=$(soat create-document \
--project-id "$PROJECT_ID" \
--path "/alice/support-policy.txt" \
--content "Alice Corp Support Policy: All priority-1 incidents must receive an initial response within 2 hours. Priority-2 incidents within 8 hours. Refunds are approved automatically for outages exceeding 4 hours. Alice Corp is entitled to a dedicated support engineer during business hours (9 AM–6 PM EST)." \
| jq -r '.id')
echo "DOC_ID: $DOC_ID"

Step 8 — Create an agent with knowledge_config

The knowledge_config field on an agent tells SOAT which memories and documents to search before every generation. The search query is automatically derived from the last user message — no explicit RAG logic needed in the prompt.

The fields you can set in knowledge_config:

  Field             Description
  memory_ids        Search specific memories by ID
  memory_tags       Search memories whose tags match (supports glob patterns)
  document_paths    Include chunks from documents whose path starts with the given prefix
  document_ids      Include chunks from specific documents by ID
  min_score         Minimum cosine similarity (0–1) for a result to be injected
  limit             Maximum number of results to inject
  write_memory_id   ID of a memory the agent can write to; automatically enables the write_memory tool during generation
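To build intuition for how min_score and limit interact, here is a hypothetical post-filtering step over scored search results. The function and field names are illustrative, not SOAT internals:

```python
# Hypothetical sketch: apply min_score and limit to scored results,
# as knowledge_config does conceptually. Names are illustrative only.

def select_context(results, min_score=0.0, limit=5):
    # Drop results below the similarity floor, then keep the top `limit`.
    kept = [r for r in results if r["score"] >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:limit]

results = [
    {"score": 0.69, "content": "policy chunk"},
    {"score": 0.62, "content": "email preference"},
    {"score": 0.41, "content": "fiscal year"},
]
print(select_context(results, min_score=0.5, limit=5))
# keeps only the two results scoring >= 0.5, highest first
```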

Here we combine the memory from Step 4 with the document uploaded in Step 7 so the agent can draw on both personal customer facts and the structured support policy.

AGENT_ID=$(soat create-agent \
--project-id "$PROJECT_ID" \
--ai-provider-id "$AI_PROVIDER_ID" \
--name "Support Agent" \
--instructions "You are a helpful customer support assistant. Use the provided knowledge context to answer questions accurately and concisely. When you learn new facts about a customer, use the write_memory tool to persist them." \
--knowledge-config '{"memory_ids":["'"$MEMORY_ID"'"],"document_paths":["/alice/"],"limit":5,"write_memory_id":"'"$MEMORY_ID"'"}' \
| jq -r '.id')
echo "AGENT_ID: $AGENT_ID"

Step 9 — Run a generation

Send a user message that requires combining personal customer facts (from memory) with the support policy (from the document). Before calling the model, SOAT searches both sources using the user message as the query and injects all matching results as a system message.
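Conceptually, the injection step gathers the matching chunks into a system message prepended to the conversation. This is a hypothetical sketch of that shape — the exact wording and format SOAT uses internally may differ:

```python
# Hypothetical sketch of context injection: matched chunks become a
# system message prepended to the conversation. SOAT's actual message
# format may differ; this only illustrates the mechanism.

def inject_context(chunks, messages):
    context = "\n".join(f"- {c}" for c in chunks)
    system = {"role": "system",
              "content": "Relevant knowledge:\n" + context}
    return [system] + messages

msgs = inject_context(
    ["Alice prefers email", "P1 response within 2 hours"],
    [{"role": "user", "content": "Alice has a P1 outage..."}],
)
print(msgs[0]["role"])  # system
```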

soat create-agent-generation \
--agent-id "$AGENT_ID" \
--messages '[{"role":"user","content":"Alice has a P1 outage since 3 hours ago. How should we handle it and how do we best reach her?"}]' \
| jq '{status: .status, output: .output.content}'

Expected shape:

{
"status": "completed",
"output": "Since Alice has a P1 outage, an initial response should have been sent within 2 hours per the support policy ... Contact her by email, which she checks twice a day and prefers for all support communication ..."
}

The model combines two distinct knowledge sources:

  • From memory — Alice prefers email; she checks it twice a day.
  • From the document — P1 incidents require a response within 2 hours; outages over 4 hours trigger automatic refunds.

Neither fact appeared in the user message.


Step 10 — Observe the agent writing to memory

When write_memory_id is set on the agent's knowledge_config, SOAT automatically makes a write_memory tool available during generation. If the model decides to call it (for example, because the user reveals new information), the fact is persisted via the same deduplication algorithm used for manual writes.

Send a message that introduces a new fact not yet in memory:

soat create-agent-generation \
--agent-id "$AGENT_ID" \
--messages '[{"role":"user","content":"Just so you know, Alice moved to the West Coast and is now in the PT timezone."}]' \
| jq '{status: .status, output: .output.content}'

After the generation completes, list the memory entries and look for any with source == "agent":

soat list-memory-entries --memory-id "$MEMORY_ID" \
| jq '[.[] | select(.source == "agent") | {content: .content, source: .source}]'

If the model called write_memory, you will see an entry with "source": "agent" containing the timezone fact. The write goes through the same deduplication algorithm — subsequent mentions of Alice's timezone will be deduplicated automatically.


Step 11 — Query the knowledge layer directly

The Knowledge endpoint is the same search layer the agent uses internally. Pass both memory_ids and document_paths to see exactly which chunks — from both sources — would be injected for a given question.

soat search-knowledge \
--project-id "$PROJECT_ID" \
--query "P1 outage response and how to reach Alice" \
--memory-ids '["'"$MEMORY_ID"'"]' \
--document-paths '["/alice/"]' \
| jq '.results[] | {score: .score, source_type: .source_type, content: .content}'

Expected output — note the two different source_type values:

{ "score": 0.69, "source_type": "document", "content": "Alice Corp Support Policy: All priority-1 incidents must receive an initial response within 2 hours ..." }
{ "score": 0.62, "source_type": "memory", "content": "Alice prefers email, especially for billing inquiries; she checks it twice a day" }
{ "score": 0.50, "source_type": "memory", "content": "Alice's fiscal year ends in March; she starts renewal discussions in January" }

Each result shows a score (cosine similarity) so you can tune min_score and limit on knowledge_config with confidence.
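For reference, cosine similarity is the dot product of the query and chunk embeddings divided by the product of their magnitudes. A minimal pure-Python version:

```python
import math

# Cosine similarity between two embedding vectors: dot product divided
# by the product of their magnitudes. Scores near 1.0 mean the vectors
# point in nearly the same direction.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(round(cosine_similarity([1.0, 0.0], [1.0, 0.0]), 2))  # 1.0
print(round(cosine_similarity([1.0, 0.0], [0.0, 1.0]), 2))  # 0.0
```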


What's next

  • Tag-based filtering — create separate memories per customer (e.g. tags: ["bob"]) and set memory_tags: ["alice"] on the agent to ensure each agent only retrieves the right customer's facts.
  • Agent-sourced entries — set source: "agent" when writing entries programmatically from an agent's output to distinguish automated facts from manually curated ones.
  • Document subtrees — use document_paths prefixes like /alice/ to scope retrieval to one customer's documents, keeping context focused and token-efficient.
  • Adjust dedup thresholds — lower update_threshold to be more aggressive about replacing stale facts, or raise duplicate_threshold to allow more near-duplicate entries to coexist.
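The glob patterns supported by memory_tags behave like shell filename matching. A sketch using Python's fnmatch shows the idea — illustrative only, as SOAT's matcher may differ in edge cases:

```python
from fnmatch import fnmatch

# Illustrative glob matching for memory_tags: a memory is selected if
# any of its tags matches any requested pattern. SOAT's exact matching
# semantics may differ; this shows the shell-style glob idea.

def memory_matches(memory_tags, patterns):
    return any(fnmatch(tag, pat)
               for tag in memory_tags
               for pat in patterns)

print(memory_matches(["alice", "customer"], ["alice"]))   # True
print(memory_matches(["bob", "customer"], ["alice"]))     # False
print(memory_matches(["customer-vip"], ["customer-*"]))   # True
```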