Knowledge

Overview

The Knowledge module provides unified semantic search across all knowledge sources in a project — documents and memory entries. A single endpoint searches across these sources simultaneously, ranks results by vector similarity, and returns an interleaved list tagged by source type.

Each result carries a source_type discriminant ("document" or "memory") so callers know where each piece of knowledge came from.
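To make the interleaving concrete, here is a minimal client-side sketch of how per-source hits could be merged into one list ordered by score and tagged with source_type. The function name and the shape of the input hits are assumptions for illustration; the server's actual ranking pipeline is not described here.

```python
from typing import Any


def interleave_by_score(
    document_hits: list[dict[str, Any]],
    memory_hits: list[dict[str, Any]],
    limit: int = 10,
) -> list[dict[str, Any]]:
    """Merge per-source hits into one list ordered by descending score.

    Hypothetical helper: each hit is assumed to be a dict with a "score" key.
    """
    # Tag every hit with the source it came from.
    tagged = [{**hit, "source_type": "document"} for hit in document_hits]
    tagged += [{**hit, "source_type": "memory"} for hit in memory_hits]
    # list.sort() is stable, so hits with equal scores keep their input order.
    tagged.sort(key=lambda hit: hit["score"], reverse=True)
    return tagged[:limit]
```

Callers can then branch on each item's source_type without caring which source produced the highest-scoring hit.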

See the Permissions Reference for the IAM action strings for this module.

Data Model

KnowledgeResult

A KnowledgeResult is a discriminated union on source_type. All results share common fields; source-specific fields are only present for the matching type.

Common fields (all source types)

| Field | Type | Description |
| --- | --- | --- |
| source_type | "document" \| "memory" | Discriminant for the knowledge source type |
| content | string \| null | Text content of the result |
| score | number | Relevance score (0–1); only present when query is used |
| created_at | string | ISO 8601 creation timestamp |
| updated_at | string | ISO 8601 last-updated timestamp |

Document result (source_type: "document")

| Field | Type | Description |
| --- | --- | --- |
| document_id | string | Public document ID (doc_ prefix) |
| file_id | string | ID of the underlying File record |
| project_id | string | ID of the owning project |
| path | string \| null | Logical path within the project (e.g. /reports/q1.txt) |
| filename | string | Original filename |
| size | number | File size in bytes |
| title | string \| null | Document title (if set) |
| metadata | object \| null | Arbitrary JSON metadata |
| tags | object | Key-value tags associated with the document |

Memory result (source_type: "memory")

| Field | Type | Description |
| --- | --- | --- |
| entry_id | string | Public memory entry ID (me_ prefix) |
| memory_id | string | Public ID of the parent memory (mem_ prefix) |
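The discriminated union above can be mirrored on the client with typed dictionaries. The sketch below is one possible Python rendering of the documented fields; the class names and the describe helper are illustrative assumptions, not part of any published SDK.

```python
from typing import Any, Literal, Optional, TypedDict, Union


class _CommonFields(TypedDict, total=False):
    content: Optional[str]
    score: float  # only present when `query` was used
    created_at: str
    updated_at: str


class DocumentResult(_CommonFields):
    source_type: Literal["document"]
    document_id: str
    file_id: str
    project_id: str
    path: Optional[str]
    filename: str
    size: int
    title: Optional[str]
    metadata: Optional[dict[str, Any]]
    tags: dict[str, Any]


class MemoryResult(_CommonFields):
    source_type: Literal["memory"]
    entry_id: str
    memory_id: str


KnowledgeResult = Union[DocumentResult, MemoryResult]


def describe(result: KnowledgeResult) -> str:
    """Return a one-line label, branching on the source_type discriminant."""
    if result["source_type"] == "document":
        return f"document {result['document_id']} ({result['filename']})"
    if result["source_type"] == "memory":
        return f"memory entry {result['entry_id']} in {result['memory_id']}"
    raise ValueError(f"unknown source_type: {result['source_type']!r}")
```

Checking source_type before reading source-specific keys avoids a KeyError on fields that only exist for the other variant.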

Key Concepts

Search Modes

The POST /knowledge/search endpoint accepts the following filters. At least one must be provided.

| Parameter | Type | Description |
| --- | --- | --- |
| query | string | Semantic search query; ranks results by vector similarity |
| memory_ids | string[] | Search entries within these specific memories |
| memory_tags | string[] | Search entries in memories whose tags match any of these patterns (supports glob: user*) |
| document_paths | string[] | Filter document results to paths starting with these prefixes |
| document_ids | string[] | Filter document results to specific document IDs |

When query is set, results include a score field and are ordered by descending relevance. min_score and limit apply additional controls.
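As a rough illustration of the "at least one filter" rule, the sketch below builds a request body and rejects calls that supply no filter. It assumes the JSON body uses the same field names as the parameters listed above; build_search_payload is a hypothetical helper, and treating empty strings and empty lists as absent is a choice made for this example.

```python
from typing import Any, Optional, Sequence

# The five filters listed in the table; at least one must be present.
FILTER_KEYS = ("query", "memory_ids", "memory_tags", "document_paths", "document_ids")


def build_search_payload(
    *,
    query: Optional[str] = None,
    memory_ids: Optional[Sequence[str]] = None,
    memory_tags: Optional[Sequence[str]] = None,
    document_paths: Optional[Sequence[str]] = None,
    document_ids: Optional[Sequence[str]] = None,
    project_id: Optional[str] = None,
    min_score: Optional[float] = None,
    limit: Optional[int] = None,
) -> dict[str, Any]:
    """Build a body for POST /knowledge/search (field names are assumed)."""
    candidates: dict[str, Any] = {
        "query": query or None,
        "memory_ids": list(memory_ids) if memory_ids else None,
        "memory_tags": list(memory_tags) if memory_tags else None,
        "document_paths": list(document_paths) if document_paths else None,
        "document_ids": list(document_ids) if document_ids else None,
        "project_id": project_id,
        "min_score": min_score,
        "limit": limit,
    }
    # Drop everything the caller did not set so the body stays minimal.
    payload = {key: value for key, value in candidates.items() if value is not None}
    if not any(key in payload for key in FILTER_KEYS):
        raise ValueError("provide at least one of: " + ", ".join(FILTER_KEYS))
    return payload
```

Validating on the client only saves a round trip; the server remains the authority on which combinations are accepted.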

memory_ids and memory_tags can be combined — the search includes entries from memories matching either (union semantics).
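The union semantics can be pictured with a small local sketch. It assumes each memory is a dict with an "id" and a list of string "tags", and it uses Python's fnmatch as a stand-in for the glob matching; the server's exact pattern rules are not specified here, so treat this as an illustration only.

```python
from fnmatch import fnmatchcase
from typing import Iterable, Sequence


def select_memories(
    memories: Iterable[dict],
    memory_ids: Sequence[str] = (),
    memory_tags: Sequence[str] = (),
) -> list[str]:
    """Return IDs of memories matched by ID or by any tag pattern (union)."""
    selected = []
    for memory in memories:
        by_id = memory["id"] in memory_ids
        # A memory matches by tag if any of its tags fits any pattern.
        by_tag = any(
            fnmatchcase(tag, pattern)
            for tag in memory.get("tags", [])
            for pattern in memory_tags
        )
        if by_id or by_tag:
            selected.append(memory["id"])
    return selected
```

With memory_ids=["mem_b"] and memory_tags=["user*"], a memory is selected if it is mem_b or if it carries a tag beginning with "user"; it does not need to satisfy both.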

If neither memory_ids nor memory_tags is provided, memory entries are excluded and only documents are searched. Conversely, if neither document_paths nor document_ids is provided and query is not the only filter given, only memories are searched. This lets callers control exactly which sources to include.

Project Scoping

project_id is optional. When omitted, the server resolves accessible projects from the caller's identity (API key project scope, admin wildcard, or explicit project memberships).

Configuration

| Environment Variable | Required | Description |
| --- | --- | --- |
| FILES_STORAGE_DIR | Yes | Directory where .txt files are stored (shared with Files) |
| EMBEDDING_PROVIDER | Yes | Embedding backend; only ollama is supported |
| EMBEDDING_MODEL | Yes | Model name, e.g. qwen3-embedding:0.6b |
| EMBEDDING_DIMENSIONS | Yes | Vector dimensions; must match the model output, e.g. 1024 |
| OLLAMA_BASE_URL | No | Ollama server URL, defaults to http://localhost:11434 |
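A deployment script might check these variables before starting the service. The following sketch validates the table above against a mapping such as os.environ; the EmbeddingConfig class and load_config function are hypothetical names, and the checks reflect only what the table states.

```python
from dataclasses import dataclass
from typing import Mapping

REQUIRED = (
    "FILES_STORAGE_DIR",
    "EMBEDDING_PROVIDER",
    "EMBEDDING_MODEL",
    "EMBEDDING_DIMENSIONS",
)


@dataclass(frozen=True)
class EmbeddingConfig:
    files_storage_dir: str
    provider: str
    model: str
    dimensions: int
    ollama_base_url: str


def load_config(env: Mapping[str, str]) -> EmbeddingConfig:
    """Validate the documented variables and apply the documented default."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    if env["EMBEDDING_PROVIDER"] != "ollama":
        raise ValueError("EMBEDDING_PROVIDER must be 'ollama'")
    # int() raises ValueError if the value is not an integer.
    dimensions = int(env["EMBEDDING_DIMENSIONS"])
    if dimensions <= 0:
        raise ValueError("EMBEDDING_DIMENSIONS must be a positive integer")
    return EmbeddingConfig(
        files_storage_dir=env["FILES_STORAGE_DIR"],
        provider=env["EMBEDDING_PROVIDER"],
        model=env["EMBEDDING_MODEL"],
        dimensions=dimensions,
        ollama_base_url=env.get("OLLAMA_BASE_URL", "http://localhost:11434"),
    )
```

Calling load_config(os.environ) at startup surfaces a missing or mistyped variable immediately. Note that this sketch cannot confirm that the dimension count matches the model's real output; that still has to be checked against the model itself.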

Examples

Semantic search across documents and memories

soat search-knowledge \
  --project-id proj_ABC \
  --query "quarterly revenue" \
  --memory-ids mem_xyz \
  --limit 5

Memory-only search by tag

soat search-knowledge \
  --project-id proj_ABC \
  --query "customer communication" \
  --memory-tags "customer*"

Document-scoped retrieval

soat search-knowledge \
  --project-id proj_ABC \
  --query "quarterly revenue" \
  --limit 5

Path-scoped document retrieval

soat search-knowledge \
  --project-id proj_ABC \
  --document-paths /docs/products/
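Calling the HTTP endpoint directly

For callers who prefer the POST /knowledge/search endpoint over the CLI, the sketch below prepares an equivalent request with Python's standard library. Several details are assumptions: the base URL is a placeholder, the JSON field names are taken to mirror the parameter names (so --document-paths becomes document_paths), and authentication headers are left to the caller because they are not described in this section.

```python
import json
import urllib.request
from typing import Mapping, Optional


def make_search_request(
    base_url: str,
    payload: dict,
    headers: Optional[Mapping[str, str]] = None,
) -> urllib.request.Request:
    """Prepare (but do not send) a POST /knowledge/search request."""
    merged_headers = {"Content-Type": "application/json"}
    # Caller-supplied headers (e.g. credentials) are added as-is.
    merged_headers.update(headers or {})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/knowledge/search",
        data=json.dumps(payload).encode("utf-8"),
        headers=merged_headers,
        method="POST",
    )


# Mirrors the path-scoped CLI example; the host and port are hypothetical.
request = make_search_request(
    "http://localhost:3000",
    {"project_id": "proj_ABC", "document_paths": ["/docs/products/"]},
)
```

Passing the prepared request to urllib.request.urlopen sends it; the response body can then be decoded with json.loads and each item handled according to its source_type.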