Model Configuration

Model configuration is managed through the Settings panel in the browser UI, opened with the gear icon available in every view. Settings are persisted locally to .a-society/settings.json; API keys are stored separately in .a-society/secrets.json. Both files are written with mode 0600 (readable and writable by the owner only).

On first launch, the Settings panel opens automatically and blocks navigation until at least one model is configured and activated.


Model fields

Each configured model has the following fields:

| Field | Description |
| --- | --- |
| displayName | A label shown in the UI. Does not affect API calls. |
| providerType | anthropic or openai-compatible. |
| providerBaseUrl | Required for openai-compatible. The base URL of the API endpoint (e.g. https://api.openai.com/v1). |
| modelId | The model identifier passed to the API (e.g. claude-opus-4-5, gpt-4o). |
| contextWindow | The model's context window in tokens. Used to calculate the auto-compaction threshold (80% of this value). Set it to match your model's actual context window. |
| maxOutputTokens | Maximum tokens per response. Defaults: 4096 for Anthropic, 8192 for OpenAI-compatible. For manual Anthropic thinking, the thinking budget must be less than this value. |
| reasoning | Reasoning configuration. See below. |
| supportedInputTypes | Optional. Declare which input modalities the model supports: image, audio, video. Stored with the model config but not yet acted on by the runtime. |
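As a rough illustration of the fields above, the following TypeScript sketch models one configuration entry and derives the two documented numbers: the default maxOutputTokens and the 80% auto-compaction threshold. The interface and function names are assumptions made for this example, not the runtime's actual types, and rounding the threshold down is also an assumption.

```typescript
// Hypothetical shape inferred from the field table; the runtime's real types may differ.
type ProviderType = "anthropic" | "openai-compatible";
type InputType = "image" | "audio" | "video";

interface ModelConfig {
  displayName: string;
  providerType: ProviderType;
  providerBaseUrl?: string; // required when providerType is "openai-compatible"
  modelId: string;
  contextWindow: number; // tokens
  maxOutputTokens?: number;
  reasoning?: unknown; // described in the reasoning configuration section
  supportedInputTypes?: InputType[];
}

// Documented defaults: 4096 for Anthropic, 8192 for OpenAI-compatible.
function resolveMaxOutputTokens(model: ModelConfig): number {
  if (model.maxOutputTokens !== undefined) return model.maxOutputTokens;
  return model.providerType === "anthropic" ? 4096 : 8192;
}

// Auto-compaction threshold: 80% of the context window.
// Integer arithmetic avoids floating-point drift; rounding down is an assumption.
function compactionThreshold(model: ModelConfig): number {
  return Math.floor((model.contextWindow * 8) / 10);
}

const exampleModel: ModelConfig = {
  displayName: "GPT-4o",
  providerType: "openai-compatible",
  providerBaseUrl: "https://api.openai.com/v1",
  modelId: "gpt-4o",
  contextWindow: 128000,
};

const exampleThreshold = compactionThreshold(exampleModel);
const exampleMaxOutput = resolveMaxOutputTokens(exampleModel);
```

With a 128000-token context window, compaction would start at 102400 tokens under these assumptions.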

Active model

Only one model can be active at a time. The active model is used for all role turns and for compaction LLM calls. The first model you add is activated automatically. You can switch the active model at any time from the Settings panel; the change takes effect on the next turn.
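The single-active-model rule can be pictured with a small in-memory registry. This is only a sketch: the ModelRegistry class, its methods, and the id field are hypothetical names introduced for illustration and do not describe how the runtime stores its settings.

```typescript
// Hypothetical in-memory store; names are illustrative, not the runtime's API.
interface ModelEntry {
  id: string; // assumed unique key, not a documented field
  displayName: string;
}

class ModelRegistry {
  private models: ModelEntry[] = [];
  private activeId: string | null = null;

  // The first model added becomes active automatically.
  add(model: ModelEntry): void {
    this.models.push(model);
    if (this.activeId === null) this.activeId = model.id;
  }

  // Switching replaces the single active model.
  activate(id: string): void {
    if (!this.models.some((m) => m.id === id)) {
      throw new Error(`Unknown model: ${id}`);
    }
    this.activeId = id;
  }

  // Read at the start of each turn, so a switch applies from the next turn.
  getActive(): ModelEntry | null {
    for (const m of this.models) {
      if (m.id === this.activeId) return m;
    }
    return null;
  }
}

const registry = new ModelRegistry();
registry.add({ id: "a", displayName: "Claude" });
registry.add({ id: "b", displayName: "GPT-4o" });
```

Adding a second model leaves the first one active until activate is called explicitly.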


Reasoning configuration

Reasoning is configured per model. The mode must be compatible with the provider:

  • anthropic provider: disabled, anthropic-adaptive, anthropic-manual
  • openai-compatible provider: disabled, openai-chat, custom-openai-compatible
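The compatibility rule above amounts to a lookup from provider type to allowed modes. A minimal sketch, assuming a helper named isModeCompatible (a hypothetical name, not part of the product):

```typescript
type ProviderKind = "anthropic" | "openai-compatible";
type ReasoningMode =
  | "disabled"
  | "anthropic-adaptive"
  | "anthropic-manual"
  | "openai-chat"
  | "custom-openai-compatible";

// Allowed modes per provider, copied from the list above.
const ALLOWED_MODES: Record<ProviderKind, ReasoningMode[]> = {
  anthropic: ["disabled", "anthropic-adaptive", "anthropic-manual"],
  "openai-compatible": ["disabled", "openai-chat", "custom-openai-compatible"],
};

// Hypothetical validation helper: true when the mode may be used with the provider.
function isModeCompatible(provider: ProviderKind, mode: ReasoningMode): boolean {
  return ALLOWED_MODES[provider].indexOf(mode) !== -1;
}
```

Note that disabled is the only mode valid for both provider types.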

disabled

No reasoning. The model responds with standard output only.

openai-chat

OpenAI reasoning models (e.g. o3, o4-mini). Sends reasoning_effort and uses max_completion_tokens instead of max_tokens.

| Field | Values | Description |
| --- | --- | --- |
| effort | none, minimal, low, medium, high, xhigh | Controls how much reasoning the model performs. |
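To make the request difference concrete, here is a sketch of how a Chat Completions request body might look in this mode. The helper name buildOpenAIChatBody and the message type are assumptions for illustration; only reasoning_effort and max_completion_tokens come from the description above.

```typescript
type OpenAIEffort = "none" | "minimal" | "low" | "medium" | "high" | "xhigh";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: assembles the body for the openai-chat reasoning mode.
function buildOpenAIChatBody(
  modelId: string,
  messages: ChatMessage[],
  maxOutputTokens: number,
  effort: OpenAIEffort
): Record<string, unknown> {
  return {
    model: modelId,
    messages: messages,
    reasoning_effort: effort,
    // In this mode the output limit is sent as max_completion_tokens, not max_tokens.
    max_completion_tokens: maxOutputTokens,
  };
}

const openAIBody = buildOpenAIChatBody(
  "o4-mini",
  [{ role: "user", content: "Summarize the design." }],
  8192,
  "medium"
);
```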

anthropic-adaptive

Anthropic extended thinking in adaptive mode. The model decides how much thinking to use based on the task. Sends thinking.type: "adaptive" and output_config.effort.

| Field | Values | Description |
| --- | --- | --- |
| effort | low, medium, high, xhigh, max | Guides the model's thinking depth. |
| display | omitted, summarized | Whether thinking content is omitted or summarized in the feed. |
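The two request fields named above could be assembled as follows. This is a sketch under assumptions: the function name is hypothetical, and the display setting is left out because the description does not say how it is transmitted.

```typescript
type AnthropicEffort = "low" | "medium" | "high" | "xhigh" | "max";

interface AdaptiveThinkingFields {
  thinking: { type: "adaptive" };
  output_config: { effort: AnthropicEffort };
}

// Hypothetical helper: returns only the fields this mode adds to the request.
function buildAdaptiveThinkingFields(effort: AnthropicEffort): AdaptiveThinkingFields {
  return {
    thinking: { type: "adaptive" },
    output_config: { effort: effort },
  };
}

const adaptiveFields = buildAdaptiveThinkingFields("high");
```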

anthropic-manual

Anthropic extended thinking with an explicit token budget. You control exactly how many tokens are allocated for thinking. Sends thinking.type: "enabled" with budget_tokens.

| Field | Values | Description |
| --- | --- | --- |
| effort | low, medium, high, xhigh, max | Guides the model's thinking depth. |
| display | omitted, summarized | Whether thinking content is omitted or summarized in the feed. |
| budgetTokens | positive integer | Thinking token budget. Must be less than maxOutputTokens. |
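The budget constraint can be checked before a request is built. The sketch below assumes a hypothetical helper, buildManualThinking, and that violations are reported by throwing; the runtime may surface them differently (for example, as a validation message in the Settings panel). The effort and display fields are omitted because the description does not say how they are sent in this mode.

```typescript
interface ManualThinkingFields {
  max_tokens: number;
  thinking: { type: "enabled"; budget_tokens: number };
}

// Hypothetical helper: validates the budget, then returns the related request fields.
function buildManualThinking(budgetTokens: number, maxOutputTokens: number): ManualThinkingFields {
  // budgetTokens must be a positive integer.
  if (Math.floor(budgetTokens) !== budgetTokens || budgetTokens <= 0) {
    throw new Error("budgetTokens must be a positive integer");
  }
  // The thinking budget must be strictly less than maxOutputTokens.
  if (budgetTokens >= maxOutputTokens) {
    throw new Error("budgetTokens must be less than maxOutputTokens");
  }
  return {
    max_tokens: maxOutputTokens,
    thinking: { type: "enabled", budget_tokens: budgetTokens },
  };
}

const manualFields = buildManualThinking(2048, 4096);
```

With the Anthropic default of 4096 for maxOutputTokens, a budget of 4096 or more would be rejected, so raising maxOutputTokens is usually the first step when a larger budget is needed.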

custom-openai-compatible

For providers that expose reasoning through non-standard API fields (e.g. DeepSeek, local models). Lets you inject arbitrary request body fields and optionally configure how the reasoning trace is rendered in the UI.

Request configuration (request):

| Field | Values | Description |
| --- | --- | --- |
| tokenLimitParam | max_tokens, max_completion_tokens | Which request field to use for the output token limit. |
| extraBody | Record<string, unknown> | Additional fields merged into the API request body. Cannot override reserved keys: model, messages, stream, stream_options, tools, max_tokens, max_completion_tokens. |
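One way to picture the merge is shown below. This sketch assumes reserved keys in extraBody are silently dropped; the runtime might instead reject them when the settings are saved. The function name applyCustomRequest and the enable_reasoning field are hypothetical and used only for illustration.

```typescript
// Reserved keys copied from the table above.
const RESERVED_KEYS: string[] = [
  "model",
  "messages",
  "stream",
  "stream_options",
  "tools",
  "max_tokens",
  "max_completion_tokens",
];

type TokenLimitParam = "max_tokens" | "max_completion_tokens";

// Hypothetical helper: merges extraBody into a base request body.
function applyCustomRequest(
  base: Record<string, unknown>,
  tokenLimitParam: TokenLimitParam,
  maxOutputTokens: number,
  extraBody: Record<string, unknown>
): Record<string, unknown> {
  const body: Record<string, unknown> = {};
  // Copy extra fields first, skipping reserved keys (assumed behavior: drop them).
  for (const key of Object.keys(extraBody)) {
    if (RESERVED_KEYS.indexOf(key) === -1) body[key] = extraBody[key];
  }
  // Base fields are written afterwards, so they cannot be overridden.
  for (const key of Object.keys(base)) {
    body[key] = base[key];
  }
  // The output token limit goes into whichever field the config selects.
  body[tokenLimitParam] = maxOutputTokens;
  return body;
}

const customBody = applyCustomRequest(
  { model: "my-model", messages: [] },
  "max_tokens",
  4096,
  { enable_reasoning: true, model: "ignored" } // enable_reasoning is a made-up field
);
```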

Trace configuration (trace, optional):

If the provider streams a reasoning trace in the response, you can configure how it is captured and displayed.

| Field | Values | Description |
| --- | --- | --- |
| responseDeltaField | string | The field name in each streaming delta that carries the reasoning content. |
| requestMessageField | string | The field name used to replay reasoning content back to the model in subsequent messages. |
| replay | never, tool-calls-only, always | When to replay reasoning traces back to the model. tool-calls-only is the default and replays only on turns that involved tool calls. |
| display | hidden, collapsed, expanded | How reasoning traces appear in the UI feed. |
| label | string | Display label for the reasoning trace in the feed (e.g. "Thinking"). |

Both responseDeltaField and requestMessageField must be set for the trace configuration to take effect.
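The sketch below illustrates how these fields might interact: reasoning text is read from a streaming delta by field name, and the replay setting decides whether it is sent back. All function names are hypothetical. The value reasoning_content is only an example field name; check your provider's documentation for the actual one.

```typescript
interface TraceConfig {
  responseDeltaField?: string;
  requestMessageField?: string;
  replay?: "never" | "tool-calls-only" | "always";
  display?: "hidden" | "collapsed" | "expanded";
  label?: string;
}

// The trace configuration only takes effect when both field names are set.
function traceIsActive(trace: TraceConfig): boolean {
  return !!trace.responseDeltaField && !!trace.requestMessageField;
}

// Pulls the reasoning fragment out of one streaming delta, or "" if none applies.
function extractReasoning(delta: Record<string, unknown>, trace: TraceConfig): string {
  const field = trace.responseDeltaField;
  if (!field || !trace.requestMessageField) return "";
  const value = delta[field];
  return typeof value === "string" ? value : "";
}

// Decides whether the captured trace is replayed to the model on a later request.
function shouldReplay(trace: TraceConfig, turnHadToolCalls: boolean): boolean {
  if (!traceIsActive(trace)) return false;
  const replay = trace.replay === undefined ? "tool-calls-only" : trace.replay;
  if (replay === "always") return true;
  if (replay === "never") return false;
  return turnHadToolCalls;
}

const exampleTrace: TraceConfig = {
  responseDeltaField: "reasoning_content", // example value
  requestMessageField: "reasoning_content", // example value
  label: "Thinking",
};
```

Because replay is left unset in exampleTrace, the default tool-calls-only applies: the trace is replayed only for turns that involved tool calls.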


Compaction model

Context compaction uses the same active model through a separate LLMGateway instance in system mode, with no tools and no consent gate. The same reasoning configuration, maxOutputTokens, and API key apply. Compaction runs as a standard text turn, and its token usage is not tracked against the role session.