Sessions

Overview

Sessions provide a simplified 1 user ↔ 1 agent conversational interface. They are a sub-resource of Agents, nested under /agents/:agent_id/sessions, and hide the underlying Conversation, Actor, and generation plumbing.

By default, interacting with an agent requires three API calls:

  1. Create a session: POST /agents/:agent_id/sessions
  2. Save a user message: POST /agents/:agent_id/sessions/:session_id/messages (returns 201, does not trigger generation)
  3. Generate a response: POST /agents/:agent_id/sessions/:session_id/generate (triggers the LLM, returns the assistant reply)
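Assuming a thin JSON-over-HTTP client, the three-call flow can be sketched by injecting the transport as a callable (`run_basic_flow`, `post`, and the request bodies here are illustrative, not a published SDK):

```python
def run_basic_flow(post, agent_id, user_text):
    """Drive the three-call session flow through an injected `post` callable.

    `post(path, body)` is assumed to send a POST request and return the
    parsed JSON response.
    """
    # 1. Create a session
    session = post(f"/agents/{agent_id}/sessions", {})
    sid = session["id"]

    # 2. Save the user message (returns the saved message, no generation)
    post(f"/agents/{agent_id}/sessions/{sid}/messages", {"message": user_text})

    # 3. Trigger generation and return the assistant reply
    return post(f"/agents/{agent_id}/sessions/{sid}/generate", {})
```

Injecting the transport keeps the call sequencing testable without a live server.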

When auto_generate is enabled on the session, step 3 is handled automatically — POST .../messages saves the message and returns the assistant reply in one call, reducing the flow to two API calls.

The session automatically creates and manages the underlying conversation. An optional actor_id can be supplied to associate an existing Actor as the session owner; if omitted the session is created with no actor.

See the Permissions Reference for the IAM action strings for this module.

Key Concepts

How Sessions Relate to Other Concepts

Concept         Relationship
Chats           Raw LLM completions — no agents, no tools, caller manages history
Sessions        1 user ↔ 1 agent — full tool support, automatic history, nested under agents
Conversations   Multi-party dialogue engine — powers sessions internally, available as escape hatch

Lifecycle

A session starts in open status. It can be updated to closed when the interaction is complete. Deleting a session cascades to the underlying conversation.

Actor ID

The optional actor_id field associates an existing Actor as the owner of the session. When omitted, actor_id is null and no actor is created automatically. Sessions can be filtered by this field.

Tags

Sessions support arbitrary key-value metadata via the tags JSONB field. Tags can be fully replaced (PUT .../tags) or merged (PATCH .../tags).
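The two semantics can be illustrated with plain dict operations standing in for the server-side JSONB update (the function names are illustrative; this is a local sketch, not the API client):

```python
def replace_tags(session, new_tags):
    # PUT .../tags semantics: the entire tags object is swapped out
    session["tags"] = dict(new_tags)
    return session["tags"]

def merge_tags(session, patch):
    # PATCH .../tags semantics: keys in the patch are upserted,
    # existing keys not mentioned are left untouched
    session["tags"].update(patch)
    return session["tags"]
```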

Escape Hatch

Each session exposes its conversation_id, allowing advanced users to drop into the full Conversations API when multi-party or lower-level control is needed.

Auto-Generate

When auto_generate is set to true on a session, POST .../messages saves the user message and automatically triggers LLM generation in the same request. The response body contains the assistant reply instead of just the saved user message.

This collapses the three-call flow into two calls: create a session, then send messages.

auto_generate defaults to false. It can be set at session creation or toggled at any time:

PATCH /agents/:agent_id/sessions/:session_id
Content-Type: application/json

{ "auto_generate": true }

The explicit POST .../generate endpoint continues to work regardless of this setting. Async generation (?async=true) is also supported on POST .../messages when auto_generate is enabled — the request returns 202 Accepted immediately and generation proceeds in the background.

Tool Context

Sessions support the same tool_context mechanism as direct agent generations — see Tool Context in the Agents module for the full specification.

Auto-Populated Headers

When a generation is triggered through a session (either via POST .../generate or auto-generate), the server automatically injects the following keys into tool_context before forwarding to tool calls:

Header                             Value
X-Soat-Context-actor_id            Public ID of the session's actor (actr_...), if set; omitted otherwise
X-Soat-Context-actor_external_id   External ID of the session's actor, if set; omitted otherwise
X-Soat-Context-session_id          Public ID of the session (sess_...)

Any values provided by the caller in tool_context are merged on top and take precedence over the auto-populated values.

Example

Adding a caller-supplied tenant_id alongside the automatically injected session fields:

{
  "tool_context": {
    "tenant_id": "tenant_xyz"
  }
}

The tool will receive all four headers: X-Soat-Context-actor_id, X-Soat-Context-actor_external_id, X-Soat-Context-session_id, and X-Soat-Context-tenant_id.
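A local sketch of the merge, assuming the server builds the header set roughly like this (`build_tool_context` and the field names on `session` are illustrative):

```python
def build_tool_context(session, caller_context):
    """Merge auto-populated session keys with caller-supplied tool_context."""
    auto = {"session_id": session["id"]}
    if session.get("actor_id"):
        auto["actor_id"] = session["actor_id"]
    if session.get("actor_external_id"):
        auto["actor_external_id"] = session["actor_external_id"]
    # Caller values are merged on top and take precedence
    merged = {**auto, **caller_context}
    # Each key is forwarded as an X-Soat-Context-<key> header
    return {f"X-Soat-Context-{k}": str(v) for k, v in merged.items()}
```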

Data Model

Session

Field             Type            Description
id                string          Public identifier prefixed with sess_
agent_id          string          Public ID of the agent this session belongs to
conversation_id   string          Public ID of the underlying conversation
status            string          open (default) or closed
name              string          Optional display name
actor_id          string | null   Public ID of the Actor associated with this session (actr_ prefix); null when no actor is set
tags              object          Free-form key-value metadata
auto_generate     boolean         When true, saving a message via POST .../messages automatically triggers LLM generation (default: false)
created_at        string          ISO 8601 creation timestamp
updated_at        string          ISO 8601 last-updated timestamp

Message (within a session)

Messages are returned with simplified roles:

Field        Type     Description
role         string   user or assistant — stored on the message record itself
content      string   Message text
model        string   Model used for assistant messages
created_at   string   ISO 8601 timestamp

Examples

Basic session flow

soat create-agent-session --agent-id agt_01 --name "My Session"
soat add-session-message --agent-id agt_01 --session-id sess_01 --message "Hello!"
soat generate-session-response --agent-id agt_01 --session-id sess_01

Async Generation

By default POST .../generate waits for the LLM to finish and returns the result synchronously. Pass ?async=true to return immediately with a 202 Accepted response:

{ "status": "accepted", "session_id": "sess_..." }

Concurrency and cancel-previous

Both sync and async calls go through the same concurrency handling. When a new generation request arrives while a previous one is still in-flight, the server cancels the previous generation and starts a fresh one. This ensures the model always sees the complete, up-to-date message history:

pos 0 user "Hello"
pos 1 user "What is 2+2?"
pos 2 user "Are you sure?" ← arrived while first generation was in-flight
pos 3 assistant "Yes, 2+2 is definitely 4." ← model saw all three messages

The cancel-previous mechanism uses an in-memory AbortController per session. Each process tracks active generations; the abort signal is threaded through to the underlying LLM call so that in-flight streaming or text generation is cancelled as soon as possible.
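As an analogue in Python (`threading.Event` standing in for the AbortController; this sketches the described mechanism, not the server's actual code):

```python
import threading

class GenerationRegistry:
    """Tracks at most one in-flight generation per session, per process."""

    def __init__(self):
        self._active = {}       # session_id -> threading.Event (abort signal)
        self._lock = threading.Lock()

    def start(self, session_id):
        """Begin a new generation, aborting any previous one for the session."""
        abort = threading.Event()
        with self._lock:
            previous = self._active.get(session_id)
            if previous is not None:
                previous.set()  # signal the in-flight LLM call to stop
            self._active[session_id] = abort
        return abort            # thread this signal into the LLM call

    def finish(self, session_id, abort):
        """Clear the entry, but only if it still belongs to this generation."""
        with self._lock:
            if self._active.get(session_id) is abort:
                del self._active[session_id]
```

Because the registry is per-process, this is also where the multi-replica caveat below comes from.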

Trade-off: Aborted generations still consume LLM tokens for the portion already processed before cancellation. For cost-sensitive workloads, consider rate-limiting generation requests.

Multi-replica deployments: The in-memory abort map is per-process. In a multi-replica setup, a new generation request reaching a different replica will not cancel a generation running on another replica. The snapshot-position safety net still applies in that case.

If generating_at is set but no in-memory controller exists for the session (e.g., stale state after a process restart), and less than 5 minutes have elapsed since generating_at, the generation is rejected as already in progress:

  • Sync: returns 409 Conflict to the caller.
  • Async: the duplicate generation is silently dropped (the 202 response is still returned, but no LLM call is made).
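The admission logic can be sketched as a pure function (the function name and returned action strings are illustrative; the 5-minute window and the sync/async outcomes follow the rules above):

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=5)

def admit_generation(generating_at, has_controller, is_async, now=None):
    """Decide what to do with a new generation request for a session."""
    now = now or datetime.now(timezone.utc)
    if generating_at is None or has_controller:
        # Nothing in flight, or in flight in this process:
        # cancel-previous applies and the new generation starts
        return "start"
    if now - generating_at < STALE_AFTER:
        # Marked in-progress but no controller here (other replica or
        # stale state): reject sync calls, silently drop async ones
        return "drop_silently" if is_async else "reject_409"
    # The marker is older than the window, treated as stale: proceed
    return "start"
```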

Message ordering with concurrent writes

Each conversation message is assigned a monotonically increasing position. When the assistant reply is written, it is inserted at the position that corresponds to the last message the model actually saw — not the position at write time. Any user messages that arrived while generation was in-flight are shifted up by one so that causal order is preserved:

pos 0 user "Hello"
pos 1 user "What is 2+2?"
pos 2 assistant "4" ← inserted at snapshot position + 1
pos 3 user "Are you sure?" ← shifted up from 2 → 3 (arrived mid-generation)

A subsequent POST .../generate call therefore sees the latest user message at the end of the history and responds to it correctly.
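The reordering can be sketched over a list of (position, role, text) tuples (an illustrative in-memory model, not the storage schema), where snapshot_pos is the position of the last message the model actually saw:

```python
def insert_assistant_reply(messages, snapshot_pos, reply):
    """Insert the reply at snapshot_pos + 1, shifting later messages up by one.

    `messages` is a list of (position, role, text) tuples sorted by position.
    """
    shifted = []
    for pos, role, text in messages:
        if pos > snapshot_pos:
            pos += 1               # arrived mid-generation: make room
        shifted.append((pos, role, text))
    shifted.append((snapshot_pos + 1, "assistant", reply))
    return sorted(shifted)
```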

Webhook Events

The following events are dispatched to project webhooks as sessions change state:

Event type                            Trigger
sessions.created                      A new session is created
sessions.updated                      A session's name, status, or tags are changed
sessions.deleted                      A session is deleted
sessions.generation.started           LLM generation has started for a session
sessions.generation.completed         LLM generation finished successfully
sessions.generation.requires_action   LLM returned a client-tool call requiring tool outputs

All events include session_id. Generation events additionally include generation_id and trace_id in the data payload.

Permissions are namespaced under agents: since sessions are an agent sub-resource.