Kodesage MCP Quickstart
Kodesage exposes an MCP server that lets MCP-aware clients ask questions about your project's knowledge base using the same agentic reasoning the Kodesage web UI uses.
The endpoint is:
https://<your-kodesage-host>/mcp/
It uses streamable HTTP transport and is protected by a bearer token. The token is bound to a single project: the project is derived from the token on every call, so you do not pass a project ID.
Creating an Access Token
Open Settings → Access Tokens for the project you want to query.
Click New token.
Fill in:
Token name: a label so you remember which app uses it.
Valid for (days): token lifetime. Default 365, max 3650.
Conversation lifetime (days): how long an idle conversation created via this token stays reachable. Default 90, max 3650.
Copy the token immediately. It is shown only once and cannot be retrieved later. If you lose it, revoke it and create a new one.
Each token is tied to one user and one project. It carries the permissions of the user who created it, and those permissions are re-checked on every call, so revoking the user's project access disables the token immediately. The MCP tools operate read-only against your code and knowledge base.
Revoke a token from the same Access Tokens screen. Revoked, expired, and inactive-user tokens are rejected with HTTP 401.
Service Users for Shared Integrations
Every conversation a token creates lands in the Ask Kodesage history of the user who owns the token. If you wire MCP into any integration used by more than one person, use a dedicated service user rather than a personal account:
Create a new Kodesage user for the integration.
Grant it access to the project(s) it should query.
Sign in as that user and create the access token from the user's Settings → Access Tokens page.
Use that token in the shared integration.
This keeps the shared conversation history isolated from anyone's personal history, makes it obvious in audits which traffic came from the integration, and lets you rotate or revoke the token without affecting a real person's account.
Authentication
Every MCP request must carry the bearer header:
Authorization: Bearer <your-access-token>
Store the token in an environment variable, never in source control:
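One way to do that from Python, as a minimal sketch. The environment variable name KODESAGE_MCP_TOKEN and the helper name auth_headers are assumptions made for illustration; Kodesage does not prescribe either.

```python
import os


def auth_headers(env_var: str = "KODESAGE_MCP_TOKEN") -> dict:
    """Build the bearer header from an environment variable.

    The variable name is a hypothetical choice, not a Kodesage convention.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        # Fail early instead of sending an unauthenticated request.
        raise RuntimeError(f"{env_var} is not set; export the token first")
    return {"Authorization": f"Bearer {token}"}
```

Reading the token at call time, rather than hard-coding it, keeps it out of source control and makes rotation a matter of changing the environment.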
MCP Client Configuration
Most MCP clients take a JSON config that points at an HTTP MCP server and lets you set custom headers. The generic shape is:
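The exact snippet depends on the client. As a hedged sketch, the Python below builds one plausible JSON shape. The key names mcpServers, url, and headers, the server label kodesage, and the host kodesage.example.com are all assumptions that follow a common client convention; check your client's documentation for the names it actually expects.

```python
import json


def build_client_config(host: str, token: str) -> dict:
    """Return a generic HTTP MCP server entry (key names are assumed)."""
    return {
        "mcpServers": {
            "kodesage": {
                # Keep the trailing slash on the endpoint URL.
                "url": f"https://{host}/mcp/",
                "headers": {"Authorization": f"Bearer {token}"},
            }
        }
    }


if __name__ == "__main__":
    # Placeholder values; in practice read the token from the environment.
    config = build_client_config("kodesage.example.com", "<your-access-token>")
    print(json.dumps(config, indent=2))
```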
Notes:
The URL must end with /.
Field names vary between clients, but every client needs the URL and an Authorization: Bearer … header.
If your client only supports stdio MCP servers, put a small HTTP→stdio bridge in front of the endpoint.
Available Tools
ask_kodesage
Ask Kodesage a question and get a final answer. This is the main tool.
Arguments
prompt (string, required): The question. Maximum 32,000 characters by default.
conversation_id (string, optional): Continue an existing conversation. Omit to start a new one.
Returns
The first call returns a new conversation_id. Pass it back on subsequent calls to keep the conversation going — the agent will see prior turns when answering. Each successful call resets the conversation's idle timer.
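The call pattern can be sketched as JSON-RPC payloads. MCP clients invoke tools with a tools/call request; the helper name ask_payload and the request ids are illustrative. How conversation_id is laid out inside the tool result is not specified here, so the sketch only shows how it is passed back.

```python
from typing import Optional


def ask_payload(prompt: str,
                conversation_id: Optional[str] = None,
                request_id: int = 1) -> dict:
    """Build an MCP tools/call request body for ask_kodesage."""
    arguments = {"prompt": prompt}
    if conversation_id is not None:
        # Include the id only when continuing an existing conversation.
        arguments["conversation_id"] = conversation_id
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ask_kodesage", "arguments": arguments},
    }


# First turn: no conversation_id, so a new conversation is started.
first = ask_payload("Where is invoice rounding implemented?")

# Follow-up turn: pass back the id returned by the first call.
# "conv-123" is a made-up placeholder value.
follow_up = ask_payload("Which tests cover it?", "conv-123", request_id=2)
```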
get_kodesage_conversation
Read the message history of an existing conversation.
Arguments
conversation_id (string, required): A conversation_id from ask_kodesage.
Returns
Only completed turns are returned — one user message and the final assistant answer per turn. Intermediate reasoning steps and in-progress or failed turns are omitted.
Example
Minimal ask_kodesage invocation with curl:
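The same kind of request can also be sketched in Python with only the standard library. This is an illustration under assumptions rather than a verified recipe: the host and token are placeholders, the body follows the generic MCP tools/call shape, and some servers require an initialize handshake (and a session header) before they accept tool calls, which this sketch omits.

```python
import json
import urllib.request


def build_ask_request(host: str, token: str, prompt: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to the MCP endpoint."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "ask_kodesage", "arguments": {"prompt": prompt}},
    }
    return urllib.request.Request(
        url=f"https://{host}/mcp/",  # trailing slash kept
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
    )


def send(request: urllib.request.Request, timeout: float = 300.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

The generous timeout is a guess; agentic answers can take a while, so tune it to your deployment.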
Viewing MCP Conversations in the Web UI
Conversations started over MCP are added to the same Ask Kodesage history as conversations started in the web UI, scoped to the user the token belongs to. Sign in as that user and open the project to:
See the full agent trace, including every tool call and intermediate step (MCP clients only see the final answer).
Follow up on an MCP conversation from the web UI.
Audit which questions a token has been asking.
Limits & Errors
The MCP server enforces two limits, both configurable per deployment:
Maximum prompt length (characters): default 32000, set by MCP_MAX_PROMPT_CHARS.
Concurrent ask_kodesage calls per token: default 3, set by MCP_MAX_CONCURRENT_PER_TOKEN.
Note on concurrency. Each ask_kodesage call runs a full agentic loop that drives the Kodesage LLM. Raising MCP_MAX_CONCURRENT_PER_TOKEN too high lets a single token fan out many parallel agent runs and can saturate the GPU pool, slowing every user on the deployment. Keep it low unless you know your hardware has headroom.
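On the client side, one way to stay within the per-token limit is to gate calls behind a semaphore. The sketch below assumes the default limit of 3; call_ask_kodesage is a hypothetical callable standing in for whatever function performs one MCP call in your integration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Mirrors the default MCP_MAX_CONCURRENT_PER_TOKEN; adjust if your deployment differs.
MAX_CONCURRENT = 3
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)


def limited(call, *args, **kwargs):
    """Run call(*args, **kwargs) while holding one of the MAX_CONCURRENT slots."""
    with _slots:
        return call(*args, **kwargs)


def ask_many(call_ask_kodesage, prompts, workers=8):
    """Fan prompts out over a thread pool without exceeding the cap.

    call_ask_kodesage is a hypothetical callable that performs one MCP call.
    Results come back in the same order as the prompts.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: limited(call_ask_kodesage, p), prompts))
```

This only limits calls made from one process; several processes sharing a token would still need to coordinate or handle the server's rejection.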
Common errors:
HTTP 401: missing, expired, or revoked token, or the user lost access to the project.
Conversation not found: the conversation_id is unknown, expired, or belongs to a different token/user/project.
Too many concurrent requests for this token: wait for an in-flight call to finish, then retry.
Prompt too long: shorten the prompt to within the configured limit.
Agent reached step budget without producing a final answer: the agent could not converge; try a more focused prompt.
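The "wait, then retry" advice for the concurrency error can be sketched as a small backoff loop. How your MCP client surfaces tool errors is not specified here, so this sketch assumes they arrive as a RuntimeError whose message contains the error text; ask is a hypothetical callable that performs one ask_kodesage call, and the attempt count and delays are arbitrary.

```python
import time

RETRYABLE = "Too many concurrent requests for this token"


def ask_with_retry(ask, prompt, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Retry only the concurrency error, with exponential backoff.

    Assumes errors surface as RuntimeError carrying the server's message;
    adapt the except clause to your client's actual error type.
    """
    for attempt in range(attempts):
        try:
            return ask(prompt)
        except RuntimeError as exc:
            out_of_attempts = attempt == attempts - 1
            if RETRYABLE not in str(exc) or out_of_attempts:
                # Other errors (for example "Prompt too long") need a
                # changed request, so retrying them as-is would not help.
                raise
            sleep(base_delay * 2 ** attempt)
```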