Kodesage MCP Quickstart

Kodesage exposes an MCP server that lets MCP-aware clients ask questions about your project's knowledge base using the same agentic reasoning the Kodesage web UI uses.

The endpoint is:

https://<your-kodesage-host>/mcp/

It uses streamable HTTP transport and is protected by a bearer token. The token is bound to a single project — the project is derived from the token on every call, so you do not pass a project ID.

Creating an Access Token

  1. Open Settings → Access Tokens for the project you want to query.

  2. Click New token.

  3. Fill in:

    • Token name — a label so you remember which app uses it.

    • Valid for (days) — token lifetime. Default 365, max 3650.

    • Conversation lifetime (days) — how long an idle conversation created via this token stays reachable. Default 90, max 3650.

  4. Copy the token immediately. It is shown only once and cannot be retrieved later. If you lose it, revoke it and create a new one.

Each token is tied to one user and one project. It carries the permissions of the user who created it, and those permissions are re-checked on every call, so revoking the user's project access disables the token immediately. The MCP tools operate read-only against your code and knowledge base.

Revoke a token from the same Access Tokens screen. Revoked, expired, and inactive-user tokens are rejected with HTTP 401.

Service Users for Shared Integrations

Every conversation a token creates lands in the Ask Kodesage history of the user who owns the token. If you wire MCP into any integration used by more than one person, use a dedicated service user rather than a personal account:

  1. Create a new Kodesage user for the integration.

  2. Grant it access to the project(s) it should query.

  3. Sign in as that user and create the access token from the user's Settings → Access Tokens page.

  4. Use that token in the shared integration.

This keeps the shared conversation history isolated from anyone's personal history, makes it obvious in audits which traffic came from the integration, and lets you rotate or revoke the token without affecting a real person's account.

Authentication

Every MCP request must carry the bearer token in an Authorization header of the form Authorization: Bearer <your-token>.

Store the token in an environment variable, never in source control:
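In a Python client, for instance, the token can be read from the environment at startup. This is a minimal sketch: the variable name KODESAGE_MCP_TOKEN and the helper auth_headers are assumptions made for illustration, not names defined by Kodesage.

```python
import os

# Hypothetical variable name; use whatever name your deployment standardises on.
TOKEN_ENV_VAR = "KODESAGE_MCP_TOKEN"


def auth_headers(env=os.environ):
    """Build the bearer header from the environment, failing loudly if unset."""
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"{TOKEN_ENV_VAR} is not set")
    return {"Authorization": f"Bearer {token}"}
```

Failing at startup when the variable is missing tends to be easier to diagnose than an HTTP 401 on the first request.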

MCP Client Configuration

Most MCP clients take a JSON config that points at an HTTP MCP server and lets you set custom headers. The generic shape is:

Notes:

  • The URL must end with /.

  • Field names vary between clients, but every client needs the URL and an Authorization: Bearer … header.

  • If your client only supports stdio MCP servers, put a small HTTP→stdio bridge in front of the endpoint.
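The notes above can be sketched as a small Python helper that emits a client config. The key names (mcpServers, type, url, headers), the server label kodesage, the host kodesage.example.com, and the ${KODESAGE_MCP_TOKEN} placeholder are all illustrative assumptions. Real clients differ in their field names and in whether they expand environment placeholders, so treat the output as a starting point only.

```python
import json


def normalize_endpoint(url):
    """Return the endpoint with exactly one trailing slash, as the notes require."""
    return url.rstrip("/") + "/"


def client_config(host_url, token_placeholder="${KODESAGE_MCP_TOKEN}"):
    # "mcpServers", "type", "url" and "headers" are illustrative key names only;
    # check your client's documentation for the names it actually expects.
    return {
        "mcpServers": {
            "kodesage": {
                "type": "http",
                "url": normalize_endpoint(host_url),
                "headers": {"Authorization": f"Bearer {token_placeholder}"},
            }
        }
    }


print(json.dumps(client_config("https://kodesage.example.com/mcp"), indent=2))
```

Normalising the URL in one place guards against the most common misconfiguration, a missing trailing slash.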

Available Tools

ask_kodesage

Ask Kodesage a question and get a final answer. This is the main tool.

Arguments

| Name | Type | Required | Description |
|---|---|---|---|
| prompt | string | yes | The question. Maximum 32,000 characters by default. |
| conversation_id | string | no | Continue an existing conversation. Omit to start a new one. |

Returns

The first call returns a new conversation_id. Pass it back on subsequent calls to keep the conversation going — the agent will see prior turns when answering. Each successful call resets the conversation's idle timer.
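As a rough illustration of how a client might thread conversation_id through successive calls, the sketch below builds the JSON-RPC 2.0 tools/call messages that MCP uses for tool invocation. The helper name ask_payload, the sample prompts, and the value conv-123 are hypothetical. The 32,000-character check mirrors the documented default and may differ on your deployment.

```python
import itertools
import json

MAX_PROMPT_CHARS = 32_000  # documented default; deployments may configure another value
_request_ids = itertools.count(1)


def ask_payload(prompt, conversation_id=None):
    """Build a JSON-RPC 2.0 tools/call message for ask_kodesage."""
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    arguments = {"prompt": prompt}
    if conversation_id is not None:
        # Only sent when continuing; omitting it starts a new conversation.
        arguments["conversation_id"] = conversation_id
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": "ask_kodesage", "arguments": arguments},
    }


first = ask_payload("Where is the retry policy for payments configured?")
# "conv-123" stands in for the conversation_id returned by the first call.
follow_up = ask_payload("Which services override it?", conversation_id="conv-123")
print(json.dumps(follow_up, indent=2))
```

Checking the prompt length on the client avoids spending a round trip on a request the server would reject as too long.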

get_kodesage_conversation

Read the message history of an existing conversation.

Arguments

| Name | Type | Required | Description |
|---|---|---|---|
| conversation_id | string | yes | A conversation_id from ask_kodesage. |

Returns

Only completed turns are returned — one user message and the final assistant answer per turn. Intermediate reasoning steps and in-progress or failed turns are omitted.
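How the history is encoded inside the tool result is not specified here. Under the general MCP convention that a tools/call result carries a list of content blocks and an optional isError flag, a client could pull out the text as sketched below. The helper tool_text and the sample response are made up for illustration, and the sample does not reflect real Kodesage output.

```python
import json


def tool_text(response):
    """Join the text blocks of a tools/call response, raising on reported errors."""
    if "error" in response:
        # Protocol-level failure (JSON-RPC error object).
        raise RuntimeError(response["error"].get("message", "unknown JSON-RPC error"))
    result = response["result"]
    text = "\n".join(
        block["text"] for block in result.get("content", []) if block.get("type") == "text"
    )
    if result.get("isError"):
        # Tool-level failure: the text carries the error message.
        raise RuntimeError(text or "tool call failed")
    return text


# Made-up response for illustration only; not real Kodesage output.
sample = json.loads(
    '{"jsonrpc": "2.0", "id": 2, "result": {"content": '
    '[{"type": "text", "text": "user: hi"}, {"type": "text", "text": "assistant: hello"}]}}'
)
print(tool_text(sample))
```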

Example

Minimal ask_kodesage invocation with curl:
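Where a script is preferred over curl, a comparable request can be sketched in Python with only the standard library. The host kodesage.example.com, the KODESAGE_MCP_TOKEN variable, the sample prompt, and the 300-second timeout are assumptions. Depending on how the server is deployed, an MCP initialize handshake may be needed before tools/call; a full MCP client library would handle that for you.

```python
import json
import os
import urllib.request


def build_request(endpoint, token, prompt):
    """Build (but do not send) an HTTP POST carrying a tools/call message."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "ask_kodesage", "arguments": {"prompt": prompt}},
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
    )


def send(request, timeout=300):
    """Send the request and return (content_type, raw_body). Performs network I/O."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.headers.get("Content-Type", ""), resp.read().decode("utf-8")


# Placeholder host and hypothetical environment variable name.
req = build_request(
    "https://kodesage.example.com/mcp/",
    os.environ.get("KODESAGE_MCP_TOKEN", "<token>"),
    "How does the billing module calculate VAT?",
)
print(req.get_method(), req.full_url)
# content_type, raw = send(req)  # uncomment to actually call the server
```

The generous timeout reflects that each call runs a full agent loop and may take far longer than a typical API request.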

Viewing MCP Conversations in the Web UI

Conversations started over MCP are added to the same Ask Kodesage history as conversations started in the web UI, scoped to the user the token belongs to. Sign in as that user and open the project to:

  • See the full agent trace, including every tool call and intermediate step (MCP clients only see the final answer).

  • Follow up on an MCP conversation from the web UI.

  • Audit which questions a token has been asking.

Limits & Errors

The MCP server enforces two limits, both configurable per deployment:

| Limit | Default | Environment variable |
|---|---|---|
| Maximum prompt length (characters) | 32000 | MCP_MAX_PROMPT_CHARS |
| Concurrent ask_kodesage per token | 3 | MCP_MAX_CONCURRENT_PER_TOKEN |

Note on concurrency. Each ask_kodesage call runs a full agentic loop that drives the Kodesage LLM. Raising MCP_MAX_CONCURRENT_PER_TOKEN too high lets a single token fan out many parallel agent runs and can saturate the GPU pool, slowing every user on the deployment. Keep it low unless you know your hardware has headroom.
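On the client side, one way to stay under the per-token limit is to cap in-flight calls with a bounded worker pool. The sketch below uses a stand-in function, fake_ask, in place of a real ask_kodesage call, and assumes this process is the only user of the token. If several processes share the token, their combined concurrency is what counts against the limit.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_IN_FLIGHT = 3  # documented default for MCP_MAX_CONCURRENT_PER_TOKEN

_lock = threading.Lock()
_in_flight = 0
peak_in_flight = 0


def fake_ask(prompt):
    """Stand-in for a real ask_kodesage call; records how many run at once."""
    global _in_flight, peak_in_flight
    with _lock:
        _in_flight += 1
        peak_in_flight = max(peak_in_flight, _in_flight)
    time.sleep(0.01)  # simulate the agent working
    with _lock:
        _in_flight -= 1
    return f"answer to {prompt!r}"


def ask_all(prompts, ask=fake_ask):
    # The pool size caps how many calls this process has in flight at once.
    with ThreadPoolExecutor(max_workers=MAX_IN_FLIGHT) as pool:
        return list(pool.map(ask, prompts))


answers = ask_all([f"question {i}" for i in range(10)])
print(len(answers), "answers, peak concurrency", peak_in_flight)
```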

Common errors:

  • HTTP 401 — missing, expired, or revoked token, or the user lost access to the project.

  • Conversation not found — the conversation_id is unknown, expired, or belongs to a different token/user/project.

  • Too many concurrent requests for this token — wait for an in-flight call to finish, then retry.

  • Prompt too long — shorten the prompt to within the configured limit.

  • Agent reached step budget without producing a final answer — the agent could not converge; try a more focused prompt.
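A client can map these messages to actions and retry only the error that is transient. The sketch below matches on substrings of the messages listed above. How the server delivers them (as tool result text or as a JSON-RPC error) is not stated here, so the code assumes the caller has already turned a failure into a RuntimeError carrying the message. The names classify and call_with_retry are hypothetical.

```python
import time

# Substrings taken from the error list above; how they are delivered
# (tool result text vs. JSON-RPC error) is an assumption left to the caller.
ACTIONS = [
    ("Too many concurrent requests", "retry"),
    ("Conversation not found", "new_conversation"),
    ("Prompt too long", "shorten_prompt"),
    ("step budget", "refocus_prompt"),
]


def classify(message):
    """Map an error message to a suggested client action."""
    for needle, action in ACTIONS:
        if needle.lower() in message.lower():
            return action
    return "fail"


def call_with_retry(call, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(); retry with exponential backoff only on the concurrency error."""
    for attempt in range(attempts):
        try:
            return call()
        except RuntimeError as exc:
            if classify(str(exc)) != "retry" or attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

HTTP 401 is deliberately not retried: it points to a token or access problem that a retry cannot fix.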
