# Kodesage MCP Quickstart

Kodesage exposes an [MCP](https://modelcontextprotocol.io) server that lets MCP-aware clients ask questions about your project's knowledge base using the same agentic reasoning the Kodesage web UI uses.

The endpoint is:

```
https://<your-kodesage-host>/mcp/
```

It uses **streamable HTTP** transport and is protected by a bearer token. The token is bound to a single project — the project is derived from the token on every call, so you do not pass a project ID.

### Creating an Access Token

1. Open **Settings → Access Tokens** for the project you want to query.
2. Click **New token**.
3. Fill in:
   * **Token name** — a label so you remember which app uses it.
   * **Valid for (days)** — token lifetime. Default `365`, max `3650`.
   * **Conversation lifetime (days)** — how long an idle conversation created via this token stays reachable. Default `90`, max `3650`.
4. Copy the token immediately. **It is shown only once and cannot be retrieved later.** If you lose it, revoke it and create a new one.

Each token is tied to **one user and one project**. It carries the permissions of the user who created it, and those permissions are re-checked on every call, so revoking the user's project access disables the token immediately. The MCP tools operate **read-only** against your code and knowledge base.

Revoke a token from the same **Access Tokens** screen. Revoked, expired, and inactive-user tokens are rejected with HTTP `401`.

### Service Users for Shared Integrations

Every conversation a token creates lands in the **Ask Kodesage** history of the user who owns the token. If you wire MCP into any integration used by more than one person, use a dedicated **service user** rather than a personal account:

1. Create a new Kodesage user for the integration.
2. Grant it access to the project(s) it should query.
3. Sign in as that user and create the access token from the user's **Settings → Access Tokens** page.
4. Use that token in the shared integration.

This keeps the shared conversation history isolated from anyone's personal history, makes it obvious in audits which traffic came from the integration, and lets you rotate or revoke the token without affecting a real person's account.

### Authentication

Every MCP request must carry the bearer header:

```
Authorization: Bearer your-token-here
```

Store the token in an environment variable, never in source control:

```
KODESAGE_ACCESS_TOKEN=your-token-here
```
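
As a minimal sketch, a Python client could read the token from the environment and build the header like this. The helper name `auth_headers` is illustrative and not part of any Kodesage SDK; only the variable name `KODESAGE_ACCESS_TOKEN` and the header format come from this page.

```python
import os
from typing import Mapping, Optional


def auth_headers(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build the bearer header from KODESAGE_ACCESS_TOKEN.

    `env` defaults to the process environment; passing a plain dict
    makes the helper easy to test. (Hypothetical helper.)
    """
    env = os.environ if env is None else env
    token = env.get("KODESAGE_ACCESS_TOKEN", "").strip()
    if not token:
        # Fail early rather than sending a request that will get HTTP 401.
        raise RuntimeError("KODESAGE_ACCESS_TOKEN is not set")
    return {"Authorization": f"Bearer {token}"}
```

Failing fast on a missing variable keeps a misconfigured deployment from producing a stream of `401` responses.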

### MCP Client Configuration

Most MCP clients take a JSON config that points at an HTTP MCP server and lets you set custom headers. The generic shape is:

```json
{
  "mcpServers": {
    "kodesage": {
      "type": "http",
      "url": "https://<your-kodesage-host>/mcp/",
      "headers": {
        "Authorization": "Bearer ${KODESAGE_ACCESS_TOKEN}"
      }
    }
  }
}
```

Notes:

* The URL **must end with `/`**.
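* Field names vary between clients, but every client needs the URL and an `Authorization: Bearer …` header. Not every client expands `${KODESAGE_ACCESS_TOKEN}` inside its config file; if yours does not, supply the token through whatever secret mechanism that client provides.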
* If your client only supports stdio MCP servers, put a small HTTP→stdio bridge in front of the endpoint.
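
If you generate client configs from a script, a sketch like the following keeps the trailing slash from being dropped. The function names and the host `kodesage.example.com` are placeholders chosen for illustration; the JSON shape mirrors the generic config above and may need different field names for your client.

```python
import json


def mcp_endpoint(host: str) -> str:
    """Return the MCP endpoint URL for a host, always ending in '/'."""
    host = host.strip().rstrip("/")
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    if not host.endswith("/mcp"):
        host += "/mcp"
    return host + "/"


def client_config(host: str, token_ref: str = "${KODESAGE_ACCESS_TOKEN}") -> str:
    """Render the generic JSON config shown above (hypothetical helper).

    `token_ref` is left as a placeholder so the real token never lands
    in the generated file.
    """
    config = {
        "mcpServers": {
            "kodesage": {
                "type": "http",
                "url": mcp_endpoint(host),
                "headers": {"Authorization": f"Bearer {token_ref}"},
            }
        }
    }
    return json.dumps(config, indent=2)
```

For example, `print(client_config("kodesage.example.com"))` emits a config whose `url` is `https://kodesage.example.com/mcp/`.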

### Available Tools

#### `ask_kodesage`

Ask Kodesage a question and get a final answer. This is the main tool.

**Arguments**

| Name              | Type     | Required | Description                                                 |
| ----------------- | -------- | -------- | ----------------------------------------------------------- |
| `prompt`          | `string` | yes      | The question. Maximum 32,000 characters by default.         |
| `conversation_id` | `string` | no       | Continue an existing conversation. Omit to start a new one. |

**Returns**

```json
{
  "answer": "…the agent's final answer…",
  "conversation_id": "0d6f5e3a-…-…"
}
```

The first call returns a new `conversation_id`. Pass it back on subsequent calls to keep the conversation going — the agent will see prior turns when answering. Each successful call resets the conversation's idle timer.
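
The continuation flow can be sketched in Python as below. This only builds the JSON-RPC `tools/call` payloads and tracks the `conversation_id`; it does not send anything. The `Conversation` class and `ask_request` function are hypothetical names, and `record` assumes you have already unwrapped the tool result into the `{"answer", "conversation_id"}` object shown above.

```python
import itertools
from typing import Optional

_request_ids = itertools.count(1)  # JSON-RPC ids must be unique per request


def ask_request(prompt: str, conversation_id: Optional[str] = None) -> dict:
    """Build a tools/call payload for ask_kodesage."""
    arguments = {"prompt": prompt}
    if conversation_id is not None:
        # Omitting the field starts a new conversation.
        arguments["conversation_id"] = conversation_id
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": "ask_kodesage", "arguments": arguments},
    }


class Conversation:
    """Remembers the conversation_id between turns (hypothetical helper)."""

    def __init__(self) -> None:
        self.conversation_id: Optional[str] = None

    def next_request(self, prompt: str) -> dict:
        return ask_request(prompt, self.conversation_id)

    def record(self, result: dict) -> str:
        """Store the id from a tool result and return the answer text."""
        self.conversation_id = result["conversation_id"]
        return result["answer"]
```

The first `next_request` carries no `conversation_id`; every request built after `record` has been called reuses the stored one.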

#### `get_kodesage_conversation`

Read the message history of an existing conversation.

**Arguments**

| Name              | Type     | Required | Description                              |
| ----------------- | -------- | -------- | ---------------------------------------- |
| `conversation_id` | `string` | yes      | A `conversation_id` from `ask_kodesage`. |

**Returns**

```json
{
  "entries": [
    {"role": "user",      "content": "How does authentication work?"},
    {"role": "assistant", "content": "Authentication uses…"}
  ]
}
```

Only completed turns are returned — one user message and the final assistant answer per turn. Intermediate reasoning steps and in-progress or failed turns are omitted.
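
Because each completed turn is one user message followed by one assistant answer, the history can be folded into question/answer pairs. The sketch below assumes the result has already been parsed into the `{"entries": [...]}` object shown above; `turns` is an illustrative name.

```python
from typing import List, Tuple


def turns(history: dict) -> List[Tuple[str, str]]:
    """Pair each user message with the assistant answer that follows it."""
    pairs: List[Tuple[str, str]] = []
    pending_question = None
    for entry in history["entries"]:
        if entry["role"] == "user":
            pending_question = entry["content"]
        elif entry["role"] == "assistant" and pending_question is not None:
            pairs.append((pending_question, entry["content"]))
            pending_question = None
    return pairs
```

An unanswered trailing user message, should one ever appear, is simply dropped rather than raising.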

### Example

Minimal `ask_kodesage` invocation with `curl`:

```bash
curl -sS https://<your-kodesage-host>/mcp/ \
  -H "Authorization: Bearer $KODESAGE_ACCESS_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "ask_kodesage",
      "arguments": {
        "prompt": "How does authentication work in this project?"
      }
    }
  }'
```
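
A rough Python equivalent using only the standard library is sketched below. It builds the request without sending it, so you can inspect it first. The function name is illustrative. The `Content-Type` and `Accept` headers follow the MCP streamable HTTP transport, which expects a JSON body and a client that accepts both JSON and event-stream responses.

```python
import json
import urllib.request


def build_ask_request(endpoint: str, token: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a tools/call request for ask_kodesage."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": "ask_kodesage", "arguments": {"prompt": prompt}},
    }
    return urllib.request.Request(
        endpoint,  # must end with '/', e.g. https://<your-kodesage-host>/mcp/
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
    )
```

Send it with `urllib.request.urlopen(request)`. Depending on how the server is deployed, the reply may arrive as plain JSON or as an event stream, and some MCP servers expect an `initialize` handshake before `tools/call`; a full MCP client library handles both cases for you.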

### Viewing MCP Conversations in the Web UI

Conversations started over MCP are added to the same **Ask Kodesage** history as conversations started in the web UI, scoped to the user the token belongs to. Sign in as that user and open the project to:

* See the full agent trace, including every tool call and intermediate step (MCP clients only see the final answer).
* Follow up on an MCP conversation from the web UI.
* Audit which questions a token has been asking.

### Limits & Errors

The MCP server enforces two limits, both configurable per deployment:

| Limit                               | Default | Environment variable           |
| ----------------------------------- | ------- | ------------------------------ |
| Maximum prompt length (characters)  | `32000` | `MCP_MAX_PROMPT_CHARS`         |
| Concurrent `ask_kodesage` per token | `3`     | `MCP_MAX_CONCURRENT_PER_TOKEN` |

> **Note on concurrency.** Each `ask_kodesage` call runs a full agentic loop that drives the Kodesage LLM. Raising `MCP_MAX_CONCURRENT_PER_TOKEN` too high lets a single token fan out many parallel agent runs and can saturate the GPU pool, slowing every user on the deployment. Keep it low unless you know your hardware has headroom.
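
A client can stay inside both limits on its own side instead of discovering them through errors. The sketch below assumes the documented defaults and a caller-supplied `send` function that performs the actual `ask_kodesage` call; `LimitedAsker` is a hypothetical name. Whether the server counts characters exactly as Python's `len` does is an assumption.

```python
import threading
from typing import Callable

MAX_PROMPT_CHARS = 32000  # documented default of MCP_MAX_PROMPT_CHARS
MAX_CONCURRENT = 3        # documented default of MCP_MAX_CONCURRENT_PER_TOKEN


class LimitedAsker:
    """Wrap a send function with client-side limit checks (hypothetical)."""

    def __init__(
        self,
        send: Callable[[str], dict],
        max_concurrent: int = MAX_CONCURRENT,
        max_prompt_chars: int = MAX_PROMPT_CHARS,
    ) -> None:
        self._send = send
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._max_prompt_chars = max_prompt_chars

    def ask(self, prompt: str) -> dict:
        if len(prompt) > self._max_prompt_chars:
            raise ValueError(
                f"prompt is {len(prompt)} characters; "
                f"limit is {self._max_prompt_chars}"
            )
        # Block until one of the slots is free, then release it on exit.
        with self._slots:
            return self._send(prompt)
```

The semaphore only guards one process. If several processes or machines share a token, they still compete for the same server-side slots.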

Common errors:

* **HTTP 401** — missing, expired, or revoked token, or the user lost access to the project.
* **`Conversation not found`** — the `conversation_id` is unknown, expired, or belongs to a different token/user/project.
* **`Too many concurrent requests for this token`** — wait for an in-flight call to finish, then retry.
* **`Prompt too long`** — shorten the prompt to within the configured limit.
* **`Agent reached step budget without producing a final answer`** — the agent could not converge; try a more focused prompt.
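
Of these errors, only the concurrency one is worth retrying automatically. A minimal retry sketch follows. It assumes your client surfaces tool errors as an exception whose message contains the server's error text; `KodesageToolError` and `send` are hypothetical stand-ins for whatever your MCP client actually provides.

```python
import time
from typing import Callable

BUSY_MESSAGE = "Too many concurrent requests for this token"


class KodesageToolError(Exception):
    """Hypothetical exception carrying the error text from the server."""


def ask_with_retry(
    send: Callable[[str], dict],
    prompt: str,
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send(prompt), backing off exponentially on the busy error."""
    for attempt in range(attempts):
        try:
            return send(prompt)
        except KodesageToolError as err:
            retryable = BUSY_MESSAGE in str(err)
            if not retryable or attempt == attempts - 1:
                raise  # other errors need a changed request, not a retry
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")
```

Errors such as `Prompt too long` or `Conversation not found` are re-raised immediately, since repeating the same request cannot succeed.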

