Chat with an LLM

This tutorial walks through the full flow of having a back-and-forth conversation with an LLM. You will:

  1. Log in as admin.
  2. Create a project.
  3. Create a local AI provider backed by Ollama.
  4. Create an agent.
  5. Open a session.
  6. Send messages and receive replies from the model.
  7. View the conversation history.
  8. Run async generation.
  9. Capture generation lifecycle events via webhook.

By the end you will understand how AI Providers, Agents, Sessions, and Webhooks compose together to drive both sync and async LLM conversations.

Prerequisites

  • SOAT running locally. Follow the Quick Start guide to bring the stack up with Docker Compose.
  • New to SOAT? Read Key Concepts to understand projects, agents, and sessions before diving in.
  • CLI installed and configured, or SDK set up. See CLI or SDK.
  • For production hardening (secrets, env vars), see Advanced Configuration.
  • Server is at http://localhost:5047.
  • Ollama running locally with a chat model available.
  • This repo's tutorial test stack already provisions Ollama with qwen2.5:0.5b, so this tutorial runs in automated tests without external credentials.
export SOAT_BASE_URL=http://localhost:5047
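
Before starting, you can confirm Ollama is reachable and the chat model is present by querying Ollama's tags endpoint (it listens on port 11434 by default):

```shell
# Ollama's REST API lists installed models at /api/tags.
curl -s http://localhost:11434/api/tags | jq -r '.models[].name'
```

If the stack is provisioned as described above, `qwen2.5:0.5b` should appear in the output.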

CLI path flags in this tutorial are resource-specific and kebab-cased, for example --agent-id, --session-id, and --webhook-id.


Step 1 — Log in as admin

Admin is the built-in superuser role. It bypasses policy evaluation entirely. See Users for full authentication and user management details.

ADMIN_TOKEN=$(soat login-user --username admin --password Admin1234! | jq -r '.token')
export SOAT_TOKEN=$ADMIN_TOKEN

Step 2 — Create a project

Every resource in SOAT lives inside a project. Create one to hold the agent and its supporting configuration.

PROJECT_ID=$(soat create-project --name "LLM Chat Demo" | jq -r '.id')
echo "PROJECT_ID: $PROJECT_ID"
# PROJECT_ID: proj_vh9qHLINTdsrAqwK

Step 3 — Create a local AI provider

For local development and tutorial tests, the simplest setup is an AI provider backed by Ollama. It uses the server's OLLAMA_BASE_URL, so no secret is required and the tutorial runs without external credentials. To connect xAI, OpenAI, Anthropic, or Amazon Bedrock instead, see Connect Third-Party LLMs.

AI_PROVIDER_ID=$(soat create-ai-provider \
--project-id "$PROJECT_ID" \
--name "Local Ollama" \
--provider "ollama" \
--default-model "qwen2.5:0.5b" | jq -r '.id')
echo "AI_PROVIDER_ID: $AI_PROVIDER_ID"
# AI_PROVIDER_ID: aip_8BTcGUvXnehCCQKs

Step 4 — Create an agent

An agent is bound to an AI provider and carries a system prompt (instructions). It is the entity that generates responses.

AGENT_ID=$(soat create-agent \
--project-id "$PROJECT_ID" \
--ai-provider-id "$AI_PROVIDER_ID" \
--name "Local Assistant" \
--instructions "You are a concise assistant running on a local Ollama model. Keep answers short (max 20 words), clear, and practical." \
| jq -r '.id')
echo "AGENT_ID: $AGENT_ID"
# AGENT_ID: agt_KO5nAMmsSOVBWLlN

Step 5 — Create a session

A session is a single conversation thread tied to an agent. Setting auto_generate to true means the agent generates a reply automatically every time you send a user message.

SESSION_ID=$(soat create-agent-session \
--agent-id "$AGENT_ID" \
--name "My first chat" \
--auto-generate true | jq -r '.id')
echo "SESSION_ID: $SESSION_ID"
# SESSION_ID: sess_N0oEzsx3ayvgKwy3

Step 6 — Send messages and receive replies

Because auto_generate is enabled, every call to add-session-message triggers generation immediately and returns the assistant reply inline. The conversation context is maintained across calls — the model sees all previous messages. See Sessions for the full message and generation API.

6a — First message

soat add-session-message \
--agent-id "$AGENT_ID" \
--session-id "$SESSION_ID" \
--message "What is the capital of France?"

Example output:

{
"status": "completed",
"message": {
"role": "assistant",
"content": "The capital of France is Paris.",
"model": "qwen2.5:0.5b"
},
"generation_id": "agt_gen_mznGfHSV4YAGiBXy",
"trace_id": "agt_trace_8rcvif0n29WE37NL"
}

6b — Queue a follow-up message for async generation

Now disable auto_generate and add a follow-up user message. We will generate the assistant reply in Step 10 using async mode.

soat update-session \
--agent-id "$AGENT_ID" \
--session-id "$SESSION_ID" \
--auto-generate false

soat add-session-message \
--agent-id "$AGENT_ID" \
--session-id "$SESSION_ID" \
--message "In one short sentence, what is the population of Paris?"

Step 7 — View the conversation history

Fetch all messages in the session to review the full exchange. Messages are persisted on the underlying Conversation model; the session provides a scoped view into it.

soat list-agent-session-messages \
--agent-id "$AGENT_ID" \
--session-id "$SESSION_ID" | jq '.data[] | {role, content}'

Example output:

{ "role": "user", "content": "What is the capital of France?" }
{ "role": "assistant", "content": "The capital of France is Paris." }
{ "role": "user", "content": "In one short sentence, what is the population of Paris?" }

The follow-up question has no assistant reply yet because auto_generate was disabled before it was sent; its reply is generated asynchronously in Step 10.

Step 8 — Start a local webhook listener

Start the CLI listener before creating the webhook. It opens a local HTTP endpoint and prints each matching delivery. In the automated tutorial tests, SOAT_WEBHOOK_BASE_URL is injected so the server container can reach this listener. See CLI Commands for all soat listen flags and Webhooks for the full delivery and signing model.

WEBHOOK_BASE_URL=${SOAT_WEBHOOK_BASE_URL:-http://localhost:8787}
soat listen --port 8787 --path /webhook --filter 'sessions.generation.*' --json > session-webhooks.log 2>&1 &
LISTENER_PID=$!
sleep 2

Optional: pass --secret <webhook-secret> to validate X-Soat-Signature.
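
As a sketch of what that validation involves, assuming the signature is a hex-encoded HMAC-SHA256 of the raw delivery body keyed with the webhook secret (an assumption about the signing scheme; see Webhooks for the authoritative details), you can recompute it with openssl:

```shell
# Hypothetical secret and body, for illustration only.
SECRET="whsec_example"
BODY='{"event":"sessions.generation.completed"}'

# Recompute the HMAC-SHA256 of the raw body; compare the result against
# the X-Soat-Signature header on the delivery.
EXPECTED=$(printf '%s' "$BODY" | openssl dgst -sha256 -hmac "$SECRET" | awk '{print $NF}')
echo "$EXPECTED"
```

Always compute the HMAC over the exact raw request body — re-serializing the JSON first will change the bytes and break the comparison.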


Step 9 — Create a session webhook subscription

Subscribe to session generation events so you can observe the async lifecycle. See Webhooks for the full list of event types, retry rules, and HMAC signing.

WEBHOOK_ID=$(soat create-webhook \
--project-id "$PROJECT_ID" \
--name "session-events" \
--url "$WEBHOOK_BASE_URL/webhook" \
--events '["sessions.generation.*"]' | jq -r '.id')
echo "WEBHOOK_ID: $WEBHOOK_ID"

Step 10 — Trigger async generation

Disable auto_generate, add a user message, then trigger generation with async=true. See Sessions — Async Generation for status codes and how to poll for completion.

soat update-session \
--agent-id "$AGENT_ID" \
--session-id "$SESSION_ID" \
--auto-generate false

soat add-session-message \
--agent-id "$AGENT_ID" \
--session-id "$SESSION_ID" \
--message "Give me 1 concise fact about Sao Paulo."

soat generate-session-response \
--agent-id "$AGENT_ID" \
--session-id "$SESSION_ID" \
--async true

Expected immediate response (accepted):

{
"status": "accepted",
"session_id": "sess_..."
}

When generation runs, your soat listen terminal should log events such as:

  • sessions.generation.started
  • sessions.generation.completed
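
If you prefer polling over webhooks, you can wait for the async reply using only commands already shown in this tutorial, by checking whether the newest session message is an assistant message:

```shell
# Poll until the last message in the session is an assistant reply,
# giving up after ~30 seconds. jq supports negative array indices,
# so .data[-1] selects the newest message.
for _ in $(seq 1 30); do
  LAST_ROLE=$(soat list-agent-session-messages \
    --agent-id "$AGENT_ID" \
    --session-id "$SESSION_ID" | jq -r '.data[-1].role')
  [ "$LAST_ROLE" = "assistant" ] && break
  sleep 1
done
```

This assumes `.data` is ordered oldest-to-newest, as in the Step 7 output.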

Step 11 — Verify delivery and final assistant message

Wait for the async delivery, inspect the webhook listener output, then fetch session messages again. Delivery records are queryable via the Webhooks module.

for _ in $(seq 1 20); do
  soat list-webhook-deliveries --project-id "$PROJECT_ID" --webhook-id "$WEBHOOK_ID" \
    | jq -e '[.data[] | select(.status == "completed")] | length > 0' && break || sleep 1
done

soat list-webhook-deliveries \
--project-id "$PROJECT_ID" \
--webhook-id "$WEBHOOK_ID" | jq '.data[] | {event_type, status, status_code}'

cat session-webhooks.log
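
Assuming `soat listen --json` writes one JSON object per line with an `event` field (an assumption about the log format), you can reduce the log to just the event names:

```shell
# Print only the event type of each logged delivery, skipping any
# non-delivery lines that lack a string "event" field.
jq -r 'select(.event? | type == "string") | .event' session-webhooks.log
```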

soat list-agent-session-messages \
--agent-id "$AGENT_ID" \
--session-id "$SESSION_ID" | jq '.data[] | {role, content}'

kill "$LISTENER_PID"
wait "$LISTENER_PID" 2>/dev/null || true

What's next

  • Manual generation: Create a session without auto_generate and call generate-session-response (soat generate-session-response --agent-id … --session-id …) explicitly for full control over when the model responds.
  • Session tags: Use replace-session-tags / merge-session-tags to attach metadata (e.g. user ID, conversation topic) to a session for filtering.
  • Agents with tools: Attach SOAT tools or HTTP tools to the agent so the model can take actions. See the Agents module.