Model Configuration
Model configuration is managed through the Settings panel in the browser UI (gear icon, available from any view). Settings are persisted locally to .a-society/settings.json; API keys are stored separately in .a-society/secrets.json. Both files are written with mode 0600 (readable and writable by the owner only).
On first launch, the Settings panel opens automatically and blocks navigation until at least one model is configured and activated.
Model fields
Each configured model has the following fields:
| Field | Description |
|---|---|
| displayName | A label shown in the UI. Does not affect API calls. |
| providerType | anthropic or openai-compatible. |
| providerBaseUrl | Required for openai-compatible. The base URL of the API endpoint (e.g. https://api.openai.com/v1). |
| modelId | The model identifier passed to the API (e.g. claude-opus-4-5, gpt-4o). |
| contextWindow | The model's context window in tokens. Used to calculate the auto-compaction threshold (80% of this value). Set it to match your model's actual context window. |
| maxOutputTokens | Maximum tokens per response. Defaults: 4096 for Anthropic, 8192 for OpenAI-compatible. For manual Anthropic thinking, the thinking budget must be less than this value. |
| reasoning | Reasoning configuration. See below. |
| supportedInputTypes | Optional. Declare which input modalities the model supports: image, audio, video. Stored with the model config but not yet acted on by the runtime. |
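To make the field list concrete, here is a minimal sketch of one model entry and of the 80% compaction threshold derived from contextWindow. The names ModelConfig and compactionThreshold are hypothetical, and so is rounding down. The actual schema of settings.json is not shown in this document, and the sample values are illustrative only.

```typescript
// Hypothetical shape for one configured model. Field names follow the table
// above; the interface itself is an assumption, not the runtime's schema.
type ProviderType = "anthropic" | "openai-compatible";

interface ModelConfig {
  displayName: string;
  providerType: ProviderType;
  providerBaseUrl?: string; // required when providerType is "openai-compatible"
  modelId: string;
  contextWindow: number; // tokens
  maxOutputTokens: number; // tokens
  supportedInputTypes?: Array<"image" | "audio" | "video">;
}

// Auto-compaction threshold: 80% of the context window.
// Integer arithmetic (x * 4 / 5) avoids floating-point drift; rounding down
// is an assumption.
function compactionThreshold(model: ModelConfig): number {
  return Math.floor((model.contextWindow * 4) / 5);
}

// Illustrative values only.
const example: ModelConfig = {
  displayName: "GPT-4o",
  providerType: "openai-compatible",
  providerBaseUrl: "https://api.openai.com/v1",
  modelId: "gpt-4o",
  contextWindow: 128000,
  maxOutputTokens: 8192,
};

console.log(compactionThreshold(example)); // 102400
```

If contextWindow is set higher than the model's real limit, a threshold computed this way would trigger too late, which is why the field should match the model's actual context window.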
Active model
Only one model can be active at a time. The active model is used for all role turns and for compaction LLM calls. The first model you add is automatically activated. You can switch the active model at any time from the Settings panel — the change takes effect on the next turn.
Reasoning configuration
Reasoning is configured per model. The mode must be compatible with the provider:
- anthropic provider: disabled, anthropic-adaptive, anthropic-manual
- openai-compatible provider: disabled, openai-chat, custom-openai-compatible
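The provider and mode compatibility rule can be written as a small lookup table. This is a sketch only. ALLOWED_MODES and isModeCompatible are hypothetical names, and the document does not say how the runtime actually validates the combination.

```typescript
type ProviderType = "anthropic" | "openai-compatible";

type ReasoningMode =
  | "disabled"
  | "anthropic-adaptive"
  | "anthropic-manual"
  | "openai-chat"
  | "custom-openai-compatible";

// Modes permitted for each provider type, as listed above.
const ALLOWED_MODES: Record<ProviderType, readonly ReasoningMode[]> = {
  anthropic: ["disabled", "anthropic-adaptive", "anthropic-manual"],
  "openai-compatible": ["disabled", "openai-chat", "custom-openai-compatible"],
};

// Returns true when the reasoning mode may be used with the provider type.
function isModeCompatible(provider: ProviderType, mode: ReasoningMode): boolean {
  return ALLOWED_MODES[provider].includes(mode);
}
```

For example, isModeCompatible("anthropic", "openai-chat") is false, while "disabled" is accepted for both provider types.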
disabled
No reasoning. The model responds with standard output only.
openai-chat
For OpenAI reasoning models (e.g. o3, o4-mini). Sends reasoning_effort and uses max_completion_tokens instead of max_tokens.
| Field | Values | Description |
|---|---|---|
| effort | none, minimal, low, medium, high, xhigh | Controls how much reasoning the model performs. |
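As an illustration of what openai-chat mode changes in the outgoing request, the sketch below builds a chat completion body that carries reasoning_effort and puts the output limit in max_completion_tokens. The function buildOpenAiChatBody and the ChatMessage type are hypothetical. Only the two request field names come from the description above.

```typescript
type OpenAiEffort = "none" | "minimal" | "low" | "medium" | "high" | "xhigh";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Builds a request body for openai-chat reasoning mode (illustrative only).
// The token limit is sent as max_completion_tokens; max_tokens is not set.
function buildOpenAiChatBody(
  modelId: string,
  messages: ChatMessage[],
  maxOutputTokens: number,
  effort: OpenAiEffort,
): Record<string, unknown> {
  return {
    model: modelId,
    messages,
    reasoning_effort: effort,
    max_completion_tokens: maxOutputTokens,
  };
}

const body = buildOpenAiChatBody(
  "o4-mini",
  [{ role: "user", content: "Summarize the release notes." }],
  8192,
  "medium",
);
console.log(JSON.stringify(body, null, 2));
```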
anthropic-adaptive
Anthropic extended thinking in adaptive mode. The model decides how much thinking to use based on the task. Sends thinking.type: "adaptive" and output_config.effort.
| Field | Values | Description |
|---|---|---|
| effort | low, medium, high, xhigh, max | Guides the model's thinking depth. |
| display | omitted, summarized | Whether thinking content is omitted or summarized in the feed. |
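A minimal sketch of the request fields that anthropic-adaptive mode contributes, based on the description above. The AdaptiveReasoning type and adaptiveRequestFields function are hypothetical. How display is applied (in the request or only in the UI) is not stated here, so the sketch leaves it out of the request.

```typescript
type AnthropicEffort = "low" | "medium" | "high" | "xhigh" | "max";

interface AdaptiveReasoning {
  mode: "anthropic-adaptive";
  effort: AnthropicEffort;
  display: "omitted" | "summarized";
}

// Returns the fields merged into the request for adaptive thinking:
// thinking.type = "adaptive" and output_config.effort.
function adaptiveRequestFields(reasoning: AdaptiveReasoning): Record<string, unknown> {
  return {
    thinking: { type: "adaptive" },
    output_config: { effort: reasoning.effort },
  };
}

const adaptiveFields = adaptiveRequestFields({
  mode: "anthropic-adaptive",
  effort: "high",
  display: "summarized",
});
console.log(JSON.stringify(adaptiveFields));
```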
anthropic-manual
Anthropic extended thinking with an explicit token budget. You control exactly how many tokens are allocated for thinking. Sends thinking.type: "enabled" with budget_tokens.
| Field | Values | Description |
|---|---|---|
| effort | low, medium, high, xhigh, max | Guides the model's thinking depth. |
| display | omitted, summarized | Whether thinking content is omitted or summarized in the feed. |
| budgetTokens | positive integer | Thinking token budget. Must be less than maxOutputTokens. |
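The budgetTokens constraint is easy to get wrong, so here is a sketch of a check that could run before a request is built. ManualReasoning and manualThinkingFields are hypothetical names, and throwing on invalid input is an assumption about behavior. Only thinking.type: "enabled" and budget_tokens come from the description above.

```typescript
interface ManualReasoning {
  mode: "anthropic-manual";
  budgetTokens: number;
}

// Validates the budget against maxOutputTokens and returns the thinking
// fields for the request. Throwing on invalid input is an assumption.
function manualThinkingFields(
  reasoning: ManualReasoning,
  maxOutputTokens: number,
): Record<string, unknown> {
  const { budgetTokens } = reasoning;
  if (!Number.isInteger(budgetTokens) || budgetTokens <= 0) {
    throw new Error("budgetTokens must be a positive integer");
  }
  if (budgetTokens >= maxOutputTokens) {
    throw new Error(
      `budgetTokens (${budgetTokens}) must be less than maxOutputTokens (${maxOutputTokens})`,
    );
  }
  return { thinking: { type: "enabled", budget_tokens: budgetTokens } };
}

// A 2048-token budget fits under a 4096-token output limit.
const manualFields = manualThinkingFields({ mode: "anthropic-manual", budgetTokens: 2048 }, 4096);
console.log(JSON.stringify(manualFields));
```

With a maxOutputTokens of 4096, any budget of 4096 or more would be rejected by this check, so raising maxOutputTokens first is the usual fix.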
custom-openai-compatible
For providers that expose reasoning through non-standard API fields (e.g. DeepSeek, local models). Lets you inject arbitrary request body fields and optionally configure how the reasoning trace is rendered in the UI.
Request configuration (request):
| Field | Values | Description |
|---|---|---|
| tokenLimitParam | max_tokens, max_completion_tokens | Which request field to use for the output token limit. |
| extraBody | Record<string, unknown> | Additional fields merged into the API request body. Cannot override reserved keys: model, messages, stream, stream_options, tools, max_tokens, max_completion_tokens. |
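The interaction between tokenLimitParam, extraBody, and the reserved keys can be sketched as follows. buildCustomBody and CustomRequestConfig are hypothetical names. The document only says reserved keys cannot be overridden, so rejecting them with an error rather than silently dropping them is an assumption. The enable_thinking field in the example is a made-up provider option.

```typescript
// Keys that extraBody is not allowed to override, as listed above.
const RESERVED_KEYS = new Set<string>([
  "model",
  "messages",
  "stream",
  "stream_options",
  "tools",
  "max_tokens",
  "max_completion_tokens",
]);

interface CustomRequestConfig {
  tokenLimitParam: "max_tokens" | "max_completion_tokens";
  extraBody?: Record<string, unknown>;
}

// Merges extraBody into the request and places the output limit under the
// configured field name. Rejecting reserved keys is an assumption.
function buildCustomBody(
  base: { model: string; messages: unknown[] },
  maxOutputTokens: number,
  request: CustomRequestConfig,
): Record<string, unknown> {
  const extra = request.extraBody ?? {};
  const clashes = Object.keys(extra).filter((key) => RESERVED_KEYS.has(key));
  if (clashes.length > 0) {
    throw new Error(`extraBody cannot override reserved keys: ${clashes.join(", ")}`);
  }
  return {
    ...extra,
    ...base,
    [request.tokenLimitParam]: maxOutputTokens,
  };
}

// "enable_thinking" is a hypothetical provider-specific field.
const customBody = buildCustomBody(
  { model: "local-reasoner", messages: [{ role: "user", content: "hi" }] },
  2048,
  { tokenLimitParam: "max_tokens", extraBody: { enable_thinking: true } },
);
console.log(JSON.stringify(customBody));
```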
Trace configuration (trace, optional):
If the provider streams a reasoning trace in the response, you can configure how it is captured and displayed.
| Field | Values | Description |
|---|---|---|
| responseDeltaField | string | The field name in each streaming delta that carries the reasoning content. |
| requestMessageField | string | The field name used to replay reasoning content back to the model in subsequent messages. |
| replay | never, tool-calls-only, always | When to replay reasoning traces back to the model. The default is tool-calls-only, which replays only on turns that involved tool calls. |
| display | hidden, collapsed, expanded | How reasoning traces appear in the UI feed. |
| label | string | Display label for the reasoning trace in the feed (e.g. "Thinking"). |
Both responseDeltaField and requestMessageField must be set for the trace configuration to take effect.
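To show how the trace fields might work together, the sketch below gathers reasoning text from streaming deltas and decides whether to replay it. TraceConfig, isTraceActive, collectReasoning, and shouldReplay are all hypothetical names, and the delta shape is simplified to a flat object. The field name reasoning_content in the example is the one DeepSeek's API uses; other providers may differ.

```typescript
type ReplayPolicy = "never" | "tool-calls-only" | "always";

interface TraceConfig {
  responseDeltaField?: string;
  requestMessageField?: string;
  replay?: ReplayPolicy;
  display?: "hidden" | "collapsed" | "expanded";
  label?: string;
}

// The trace configuration only takes effect when both field names are set.
function isTraceActive(trace?: TraceConfig): boolean {
  return Boolean(trace?.responseDeltaField && trace?.requestMessageField);
}

// Concatenates reasoning fragments found under responseDeltaField.
function collectReasoning(
  deltas: ReadonlyArray<Record<string, unknown>>,
  trace?: TraceConfig,
): string {
  const field = trace?.responseDeltaField;
  if (!field || !trace?.requestMessageField) return "";
  let reasoning = "";
  for (const delta of deltas) {
    const piece = delta[field];
    if (typeof piece === "string") reasoning += piece;
  }
  return reasoning;
}

// Applies the replay policy; tool-calls-only is the documented default.
function shouldReplay(turnHadToolCalls: boolean, trace?: TraceConfig): boolean {
  if (!isTraceActive(trace)) return false;
  const policy: ReplayPolicy = trace?.replay ?? "tool-calls-only";
  if (policy === "always") return true;
  if (policy === "never") return false;
  return turnHadToolCalls;
}

const trace: TraceConfig = {
  responseDeltaField: "reasoning_content",
  requestMessageField: "reasoning_content",
  label: "Thinking",
};
const deltas = [{ reasoning_content: "Check " }, { content: "Hi" }, { reasoning_content: "units." }];
console.log(collectReasoning(deltas, trace)); // "Check units."
```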
Compaction model
Context compaction uses the same active model via a separate LLMGateway instance in system mode — no tools, no consent gate. The same reasoning configuration, maxOutputTokens, and API key apply. Compaction runs as a standard text turn and its token usage is not tracked against the role session.