
Documents Module

The Documents module stores plain-text documents along with an embedding vector in PostgreSQL, enabling semantic (vector) search across project content. Under the hood, each document is backed by a File record whose content is stored on disk.

Overview

A Document is a File: it always uses the .txt format and is associated with a project. When a document is created, its text content is passed to the configured embedding provider (currently Ollama only), and the resulting vector is stored alongside the text. This allows cosine-similarity search at query time without an external vector database.
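To make the ranking idea concrete, here is a minimal pure-Python sketch of cosine similarity over embedding vectors. It is an illustration only: the module performs the comparison inside PostgreSQL, and the function names `cosine_similarity` and `rank_by_similarity` are hypothetical, not part of the module.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], documents: Dict[str, Sequence[float]]
) -> List[str]:
    """Return document ids ordered from most to least similar to the query."""
    return sorted(
        documents,
        key=lambda doc_id: cosine_similarity(query, documents[doc_id]),
        reverse=True,
    )
```

A score near 1 means the two texts point in almost the same direction in embedding space; a score near 0 means they are unrelated.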

Documents are identified by an id prefixed with doc_. The internal database primary key is never returned.

Configuration

| Environment Variable | Required | Description |
| --- | --- | --- |
| FILES_STORAGE_DIR | Yes | Directory where .txt files are written (shared with Files) |
| EMBEDDING_PROVIDER | Yes | Embedding backend; only ollama is supported |
| EMBEDDING_MODEL | Yes | Model name, e.g. qwen3-embedding:0.6b |
| EMBEDDING_DIMENSIONS | Yes | Vector dimensions; must match the model output, e.g. 1024 |
| OLLAMA_BASE_URL | No | Ollama server URL; defaults to http://localhost:11434 |

Ollama setup example

# Pull the embedding model
ollama pull qwen3-embedding:0.6b

# Verify the model is available locally
ollama list

Set the server environment variables:

EMBEDDING_PROVIDER=ollama
EMBEDDING_MODEL=qwen3-embedding:0.6b
EMBEDDING_DIMENSIONS=1024
OLLAMA_BASE_URL=http://localhost:11434
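The rules in the configuration table can be checked at startup. The following is a sketch of such a check, not the server's actual loader; `EmbeddingConfig` and `load_embedding_config` are hypothetical names introduced for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping

DEFAULT_OLLAMA_BASE_URL = "http://localhost:11434"
REQUIRED = (
    "FILES_STORAGE_DIR",
    "EMBEDDING_PROVIDER",
    "EMBEDDING_MODEL",
    "EMBEDDING_DIMENSIONS",
)


@dataclass(frozen=True)
class EmbeddingConfig:
    storage_dir: str
    provider: str
    model: str
    dimensions: int
    ollama_base_url: str


def load_embedding_config(env: Mapping[str, str] = os.environ) -> EmbeddingConfig:
    """Validate the documented variables and apply the documented default."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    if env["EMBEDDING_PROVIDER"] != "ollama":
        raise ValueError("EMBEDDING_PROVIDER must be 'ollama'")
    try:
        dimensions = int(env["EMBEDDING_DIMENSIONS"])
    except ValueError:
        raise ValueError("EMBEDDING_DIMENSIONS must be an integer") from None
    if dimensions <= 0:
        raise ValueError("EMBEDDING_DIMENSIONS must be positive")
    return EmbeddingConfig(
        storage_dir=env["FILES_STORAGE_DIR"],
        provider=env["EMBEDDING_PROVIDER"],
        model=env["EMBEDDING_MODEL"],
        dimensions=dimensions,
        # OLLAMA_BASE_URL is optional and falls back to the documented default.
        ollama_base_url=env.get("OLLAMA_BASE_URL") or DEFAULT_OLLAMA_BASE_URL,
    )
```

This check cannot tell whether EMBEDDING_DIMENSIONS actually matches the model's output; that still has to be confirmed against the model itself.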

Data Model

| Field | Type | Description |
| --- | --- | --- |
| id | string | Public identifier prefixed with doc_ |
| fileId | string | ID of the underlying File record |
| projectId | string | ID of the owning project |
| filename | string | Original filename (.txt extension) |
| size | number | File size in bytes |
| content | string | Text content; only present in GET /documents/:id responses |
| createdAt | string | ISO 8601 creation timestamp |
| updatedAt | string | ISO 8601 last-updated timestamp |

The embedding column (pgvector vector(N)) is stored in the database but never returned via the API.
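On the client side, the fields above could be modeled as follows. This is a sketch assuming the JSON keys match the field names in the table exactly; the `Document` class, `parse_document` helper, and the sample values used to exercise it are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class Document:
    id: str
    file_id: str
    project_id: str
    filename: str
    size: int
    created_at: datetime
    updated_at: datetime
    # Only present when a single document is fetched by id.
    content: Optional[str] = None


def _parse_timestamp(value: str) -> datetime:
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so normalize it to an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_document(payload: Mapping[str, Any]) -> Document:
    """Convert a decoded JSON object into a Document."""
    doc_id = payload["id"]
    if not doc_id.startswith("doc_"):
        raise ValueError("document ids are expected to start with 'doc_'")
    return Document(
        id=doc_id,
        file_id=payload["fileId"],
        project_id=payload["projectId"],
        filename=payload["filename"],
        size=int(payload["size"]),
        created_at=_parse_timestamp(payload["createdAt"]),
        updated_at=_parse_timestamp(payload["updatedAt"]),
        content=payload.get("content"),
    )
```

There is deliberately no field for the embedding, since it is never returned via the API.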

Permissions

Document operations are governed by per-project policies. Grant the following permissions:

| Action | Permission | REST Endpoint | MCP Tool |
| --- | --- | --- | --- |
| List documents | documents:ListDocuments | GET /api/v1/documents | list-documents |
| Get a document | documents:GetDocument | GET /api/v1/documents/:id | get-document |
| Create a document | documents:CreateDocument | POST /api/v1/documents | create-document |
| Delete a document | documents:DeleteDocument | DELETE /api/v1/documents/:id | delete-document |
| Update a document | documents:UpdateDocument | PATCH /api/v1/documents/:id | update-document |
| Semantic search | documents:SearchDocuments | POST /api/v1/documents/search | search-documents |

See the API Reference for full endpoint details, request/response schemas, and status codes.
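The permission table can also be read as a lookup from a REST request to the permission it requires. The sketch below encodes that mapping for illustration; `required_permission` is a hypothetical helper and says nothing about how the server routes requests internally.

```python
import re
from typing import Optional

# (HTTP method, path pattern, permission), taken from the table above.
ROUTES = [
    ("GET", r"/api/v1/documents", "documents:ListDocuments"),
    ("POST", r"/api/v1/documents", "documents:CreateDocument"),
    ("POST", r"/api/v1/documents/search", "documents:SearchDocuments"),
    ("GET", r"/api/v1/documents/[^/]+", "documents:GetDocument"),
    ("PATCH", r"/api/v1/documents/[^/]+", "documents:UpdateDocument"),
    ("DELETE", r"/api/v1/documents/[^/]+", "documents:DeleteDocument"),
]


def required_permission(method: str, path: str) -> Optional[str]:
    """Return the permission for a request, or None if no route matches."""
    for route_method, pattern, permission in ROUTES:
        if method.upper() == route_method and re.fullmatch(pattern, path):
            return permission
    return None
```

For example, `required_permission("POST", "/api/v1/documents/search")` yields `documents:SearchDocuments`, which is the permission to grant a caller that only needs semantic search.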

Project ID Resolution

For endpoints that accept projectId, the field is optional. When omitted, the server resolves accessible projects based on the caller's identity:

| Caller type | Behavior when projectId is omitted |
| --- | --- |
| project key | Infers the project from the key's own scope (single project) |
| JWT admin | No project filter; returns results across all projects |
| JWT user | Enumerates all projects the user is a member of with the required permission |

If projectId is supplied but the caller lacks permission for that project, the request returns 403 Forbidden.
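The resolution rules above can be sketched as a single function. This is an illustrative model, not the server's code. It assumes that a JWT admin is permitted on every project and that a project key is permitted only on its own project; neither is stated explicitly in this section. All names (`resolve_project_filter`, `Forbidden`, the caller-type strings) are hypothetical.

```python
from typing import Dict, List, Optional, Set

CALLER_TYPES = ("project_key", "jwt_admin", "jwt_user")


class Forbidden(Exception):
    """Stands in for an HTTP 403 Forbidden response."""


def resolve_project_filter(
    caller_type: str,
    permission: str,
    project_id: Optional[str] = None,
    *,
    key_project: Optional[str] = None,
    memberships: Optional[Dict[str, Set[str]]] = None,
) -> Optional[List[str]]:
    """Return the project ids to filter by, or None for "no filter"."""
    if caller_type not in CALLER_TYPES:
        raise ValueError("unknown caller type: " + caller_type)
    memberships = memberships or {}

    def allowed(pid: str) -> bool:
        if caller_type == "jwt_admin":
            return True  # assumption: admins may act on any project
        if caller_type == "project_key":
            return pid == key_project  # assumption: key is scoped to one project
        return permission in memberships.get(pid, set())

    if project_id is not None:
        if not allowed(project_id):
            raise Forbidden("caller lacks " + permission + " on " + project_id)
        return [project_id]
    if caller_type == "project_key":
        if key_project is None:
            raise ValueError("a project key must carry its own project")
        return [key_project]
    if caller_type == "jwt_admin":
        return None  # no project filter: results span all projects
    # JWT user: every project where the user holds the required permission.
    return sorted(pid for pid in memberships if allowed(pid))
```

Returning None for the admin case keeps "no filter" distinct from an empty list, which would mean "no accessible projects".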

MCP Tools

The following MCP tools are available for AI assistants:

| Tool name | Description |
| --- | --- |
| list-documents | List documents; omit projectId to retrieve all accessible documents |
| get-document | Retrieve a document including its text content |
| create-document | Create a new text document with automatic embedding |
| delete-document | Delete a document and its underlying file |
| update-document | Update document content, title, metadata, or tags |
| search-documents | Semantic search; omit projectId to search across all accessible projects |
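MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for one of the tools listed above. The tool names come from this page, but the argument names are not documented here, so any argument (such as a `query` field for search-documents) is an assumption to check against the API Reference. `build_tool_call` is a hypothetical helper.

```python
import json
from typing import Any, Dict

DOCUMENT_TOOLS = {
    "list-documents",
    "get-document",
    "create-document",
    "delete-document",
    "update-document",
    "search-documents",
}


def build_tool_call(request_id: int, tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request for one of the document tools."""
    if tool_name not in DOCUMENT_TOOLS:
        raise ValueError("unknown document tool: " + tool_name)
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)
```

Calling `build_tool_call(1, "search-documents", {"query": "onboarding checklist"})` leaves out projectId, which per the table searches across all accessible projects; the `query` key is the assumed argument name.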