Sessions
Overview
Sessions provide a simplified 1 user ↔ 1 agent conversational interface. They are a sub-resource of Agents, nested under /agents/:agent_id/sessions, and hide the underlying Conversation, Actor, and generation plumbing.
By default, interacting with an agent requires three API calls:
1. Create a session — POST /agents/:agent_id/sessions
2. Save a user message — POST /agents/:agent_id/sessions/:session_id/messages (returns 201, does not trigger generation)
3. Generate a response — POST /agents/:agent_id/sessions/:session_id/generate (triggers the LLM, returns the assistant reply)
When auto_generate is enabled on the session, step 3 is handled automatically — POST .../messages saves the message and returns the assistant reply in one call, reducing the flow to two API calls.
The session automatically creates and manages the underlying conversation. An optional actor_id can be supplied to associate an existing Actor as the session owner; if omitted, the session is created without an actor.
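For example, a session owned by an existing actor can be created by passing actor_id in the request body, assuming the create endpoint accepts it alongside name (IDs below are placeholders):
curl -X POST https://api.example.com/api/v1/agents/agt_01/sessions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"name": "Support chat", "actor_id": "actr_01"}'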
See the Permissions Reference for the IAM action strings for this module.
Key Concepts
How Sessions Relate to Other Concepts
| Concept | Relationship |
|---|---|
| Chats | Raw LLM completions — no agents, no tools, caller manages history |
| Sessions | 1 user ↔ 1 agent — full tool support, automatic history, nested under agents |
| Conversations | Multi-party dialogue engine — powers sessions internally, available as escape hatch |
Lifecycle
A session starts in open status. It can be updated to closed when the interaction is complete. Deleting a session cascades to the underlying conversation.
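For example, a completed interaction can be closed with the session update endpoint, assuming status is accepted as an updatable field (IDs below are placeholders):
curl -X PATCH https://api.example.com/api/v1/agents/agt_01/sessions/sess_01 \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"status": "closed"}'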
Actor ID
The optional actor_id field associates an existing Actor as the owner of the session. When omitted, actor_id is null and no actor is created automatically. Sessions can be filtered by this field.
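A minimal sketch of filtering, assuming the session list endpoint accepts an actor_id query parameter (the parameter name and IDs are placeholders):
curl "https://api.example.com/api/v1/agents/agt_01/sessions?actor_id=actr_01" \
  -H "Authorization: Bearer <token>"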
Tags
Sessions support arbitrary key-value metadata via the tags JSONB field. Tags can be fully replaced (PUT .../tags) or merged (PATCH .../tags).
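For example, assuming both tags endpoints accept a flat JSON object of tag values (the keys below are placeholders):
# Replace all tags
curl -X PUT https://api.example.com/api/v1/agents/agt_01/sessions/sess_01/tags \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"env": "prod", "team": "support"}'
# Merge additional tags into the existing set
curl -X PATCH https://api.example.com/api/v1/agents/agt_01/sessions/sess_01/tags \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"priority": "high"}'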
Escape Hatch
Each session exposes its conversation_id, allowing advanced users to drop into the full Conversations API when multi-party or lower-level control is needed.
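A sketch of the escape hatch, assuming a standard GET endpoint for reading a single session; the returned conversation_id can then be used with the Conversations API:
curl https://api.example.com/api/v1/agents/agt_01/sessions/sess_01 \
  -H "Authorization: Bearer <token>"
# The response includes conversation_id, identifying the underlying conversation.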
Auto-Generate
When auto_generate is set to true on a session, POST .../messages saves the user message and automatically triggers LLM generation in the same request. The response body contains the assistant reply instead of just the saved user message.
This collapses the three-call flow into two calls: create a session, then send messages.
auto_generate defaults to false. It can be set at session creation or toggled at any time:
PATCH /agents/:agent_id/sessions/:session_id
Content-Type: application/json
{ "auto_generate": false }
The explicit POST .../generate endpoint continues to work regardless of this setting. Async generation (?async=true) is also supported on POST .../messages when auto_generate is enabled — the request returns 202 Accepted immediately and generation proceeds in the background.
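For example, assuming auto_generate is accepted at creation time and the message field matches the examples later on this page (IDs are placeholders), the two-call flow and its async variant look like:
# 1. Create the session with auto_generate enabled
curl -X POST https://api.example.com/api/v1/agents/agt_01/sessions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"name": "My Session", "auto_generate": true}'
# 2. Saving a message now also returns the assistant reply
curl -X POST https://api.example.com/api/v1/agents/agt_01/sessions/sess_01/messages \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'
# Or asynchronously: returns 202 Accepted and generates in the background
curl -X POST "https://api.example.com/api/v1/agents/agt_01/sessions/sess_01/messages?async=true" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello!"}'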
Tool Context
Sessions support the same tool_context mechanism as direct agent generations — see Tool Context in the Agents module for the full specification.
Auto-Populated Headers
When a generation is triggered through a session (either via POST .../generate or auto-generate), the server automatically injects the following keys into tool_context before forwarding to tool calls:
| Header | Value |
|---|---|
| X-Soat-Context-actor_id | Public ID of the session's actor (actr_...), if set; omitted otherwise |
| X-Soat-Context-actor_external_id | External ID of the session's actor, if set; omitted otherwise |
| X-Soat-Context-session_id | Public ID of the session (sess_...) |
Any values provided by the caller in tool_context are merged on top and take precedence over the auto-populated values.
Example
Adding a caller-supplied tenant_id alongside the automatically injected session fields:
{
"tool_context": {
"tenant_id": "tenant_xyz"
}
}
The tool will receive all four headers: X-Soat-Context-actor_id, X-Soat-Context-actor_external_id, X-Soat-Context-session_id, and X-Soat-Context-tenant_id.
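For example, assuming tool_context is passed in the generate request body, the call that produces the headers above might look like (IDs are placeholders):
curl -X POST https://api.example.com/api/v1/agents/agt_01/sessions/sess_01/generate \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{"tool_context": {"tenant_id": "tenant_xyz"}}'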
Data Model
Session
| Field | Type | Description |
|---|---|---|
| id | string | Public identifier prefixed with sess_ |
| agent_id | string | Public ID of the agent this session belongs to |
| conversation_id | string | Public ID of the underlying conversation |
| status | string | open (default) or closed |
| name | string | Optional display name |
| actor_id | string \| null | Optional public ID of the Actor associated with this session (actr_ prefix); null when no actor is set |
| tags | object | Free-form key-value metadata |
| auto_generate | boolean | When true, saving a message via POST .../messages automatically triggers LLM generation (default: false) |
| created_at | string | ISO 8601 creation timestamp |
| updated_at | string | ISO 8601 last-updated timestamp |
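A representative session object with placeholder values, assembled from the fields above (the conversation ID shown is illustrative only):
{
  "id": "sess_01",
  "agent_id": "agt_01",
  "conversation_id": "conv_01",
  "status": "open",
  "name": "My Session",
  "actor_id": null,
  "tags": {},
  "auto_generate": false,
  "created_at": "2024-01-01T12:00:00Z",
  "updated_at": "2024-01-01T12:00:00Z"
}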
Message (within a session)
Messages are returned with simplified roles:
| Field | Type | Description |
|---|---|---|
| role | string | user or assistant — stored on the message record itself |
| content | string | Message text |
| model | string | Model used for assistant messages |
| created_at | string | ISO 8601 timestamp |
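Two representative message records with placeholder values (the model name is illustrative only):
[
  { "role": "user", "content": "Hello!", "created_at": "2024-01-01T12:00:00Z" },
  { "role": "assistant", "content": "Hi, how can I help?", "model": "example-model", "created_at": "2024-01-01T12:00:05Z" }
]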
Examples
Basic session flow
- CLI
- SDK
- curl
soat create-agent-session --agent-id agt_01 --name "My Session"
soat add-session-message --agent-id agt_01 --session-id sess_01 --message "Hello!"
soat generate-session-response --agent-id agt_01 --session-id sess_01
// SDK
import { SoatClient } from '@soat/sdk';
const soat = new SoatClient({
baseUrl: 'https://api.example.com',
token: 'sk_...',
});
const { data: session } = await soat.sessions.createAgentSession({
path: { agent_id: 'agt_01' },
body: { name: 'My Session' },
});
await soat.sessions.addSessionMessage({
path: { agent_id: 'agt_01', session_id: session.id },
body: { message: 'Hello!' },
});
const { data: reply } = await soat.sessions.generateSessionResponse({
path: { agent_id: 'agt_01', session_id: session.id },
});
curl -X POST https://api.example.com/api/v1/agents/agt_01/sessions \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"name": "My Session"}'
curl -X POST https://api.example.com/api/v1/agents/agt_01/sessions/sess_01/messages \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"message": "Hello!"}'
curl -X POST https://api.example.com/api/v1/agents/agt_01/sessions/sess_01/generate \
-H "Authorization: Bearer <token>"
Async Generation
By default POST .../generate waits for the LLM to finish and returns the result synchronously. Pass ?async=true to return immediately with a 202 Accepted response:
{ "status": "accepted", "session_id": "sess_..." }
Concurrency and cancel-previous
Both sync and async calls go through the same concurrency handling. When a new generation request arrives while a previous one is still in-flight, the server cancels the previous generation and starts a fresh one. This ensures the model always sees the complete, up-to-date message history:
pos 0 user "Hello"
pos 1 user "What is 2+2?"
pos 2 user "Are you sure?" ← arrived while first generation was in-flight
pos 3 assistant "Yes, 2+2 is definitely 4." ← model saw all three messages
The cancel-previous mechanism uses an in-memory AbortController per session. Each process tracks active generations; the abort signal is threaded through to the underlying LLM call so that in-flight streaming or text generation is cancelled as soon as possible.
Trade-off: Aborted generations still consume LLM tokens for the portion already processed before cancellation. For cost-sensitive workloads, consider rate-limiting generation requests.
Multi-replica deployments: The in-memory abort map is per-process. In a multi-replica setup, a new generation request reaching a different replica will not cancel a generation running on another replica. The snapshot-position safety net still applies in that case.
If generating_at is set but no in-memory controller exists for the session (e.g., stale state after a process restart) and less than 5 minutes have elapsed since generating_at was set, the generation is rejected as already in progress:
- Sync: returns 409 Conflict to the caller.
- Async: the duplicate generation is silently dropped (the 202 response is still returned, but no LLM call is made).
Message ordering with concurrent writes
Each conversation message is assigned a monotonically increasing position. When the assistant reply is written, it is inserted at the position that corresponds to the last message the model actually saw — not the position at write time. Any user messages that arrived while generation was in-flight are shifted up by one so that causal order is preserved:
pos 0 user "Hello"
pos 1 user "What is 2+2?"
pos 2 assistant "4" ← inserted at snapshot position + 1
pos 3 user "Are you sure?" ← shifted up from 2 → 3 (arrived mid-generation)
A subsequent POST .../generate call therefore sees the latest user message at the end of the history and responds to it correctly.
Webhook Events
The following events are dispatched to project webhooks as sessions change state:
| Event type | Trigger |
|---|---|
| sessions.created | A new session is created |
| sessions.updated | A session's name, status, or tags are changed |
| sessions.deleted | A session is deleted |
| sessions.generation.completed | LLM generation finished successfully |
| sessions.generation.requires_action | LLM returned a client-tool call requiring tool outputs |
| sessions.generation.started | LLM generation has started for a session |
All events include session_id. Generation events additionally include generation_id and trace_id in the data payload.
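The exact envelope shape is not specified here, but a generation event payload could look roughly like the following sketch (the type field name and the ID prefixes other than sess_ are assumptions):
{
  "type": "sessions.generation.completed",
  "data": {
    "session_id": "sess_01",
    "generation_id": "gen_01",
    "trace_id": "trace_01"
  }
}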
Permissions are namespaced under the agents: prefix, since sessions are an agent sub-resource.