MCP Server Setup Guide

Deploy the DevOps AI Toolkit MCP Server to Kubernetes using a Helm chart - a production-ready deployment with HTTP transport.

For the easiest setup, we recommend installing the complete dot-ai stack which includes all components pre-configured. See the Stack Installation Guide.

Continue below if you want to install components individually (for granular control over configuration).

Overview

The DevOps AI Toolkit provides the following capabilities through MCP (Model Context Protocol):

  1. Kubernetes Deployment Recommendations - AI-powered application deployment assistance with enhanced semantic understanding
  2. Cluster Query - Natural language interface for querying cluster resources, status, and health
  3. Capability Management - Discover and store semantic resource capabilities for intelligent recommendation matching
  4. Pattern Management - Organizational deployment patterns that enhance AI recommendations
  5. Policy Management - Governance policies that guide users toward compliant configurations with optional Kyverno enforcement
  6. Kubernetes Issue Remediation - AI-powered root cause analysis and automated remediation
  7. Shared Prompts Library - Centralized prompt sharing via native slash commands
  8. REST API Gateway - HTTP endpoints for all toolkit capabilities

What You Get

  • HTTP Transport MCP Server - Direct HTTP/SSE access for MCP clients
  • Production Kubernetes Deployment - Scalable deployment with proper resource management
  • Integrated Qdrant Database - Vector database for capability and pattern management
  • External Access - Ingress configuration for team collaboration
  • Resource Management - Proper CPU/memory limits and requests
  • Security - RBAC and ServiceAccount configuration

Prerequisites

  • Kubernetes cluster (1.19+) with kubectl access
  • Helm 3.x installed
  • AI model API key (default: Anthropic). See AI Model Configuration for available model options.
  • OpenAI API key (required for vector embeddings with the default embedding provider; see Embedding Provider Configuration for alternatives)
  • Ingress controller (any standard controller)

Quick Start (5 Minutes)

Step 1: Set Environment Variables

Export your API keys and auth token:

# Required
export ANTHROPIC_API_KEY="sk-ant-api03-..."
export OPENAI_API_KEY="sk-proj-..."
export DOT_AI_AUTH_TOKEN=$(openssl rand -base64 32)

# Ingress class - change to match your ingress controller (traefik, haproxy, etc.)
export INGRESS_CLASS_NAME="nginx"
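
To confirm which ingress controllers are available in your cluster, list the installed ingress classes:

kubectl get ingressclass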

Step 2: Install the Controller

Install the dot-ai-controller to enable autonomous cluster operations:

# Set the controller version from https://github.com/vfarcic/dot-ai-controller/pkgs/container/dot-ai-controller%2Fcharts%2Fdot-ai-controller
export DOT_AI_CONTROLLER_VERSION="..."

# Install controller (includes CRDs for Solution and RemediationPolicy)
helm install dot-ai-controller \
  oci://ghcr.io/vfarcic/dot-ai-controller/charts/dot-ai-controller:$DOT_AI_CONTROLLER_VERSION \
  --namespace dot-ai \
  --create-namespace \
  --wait

The controller provides CRDs for autonomous cluster operations. Create Custom Resources like CapabilityScanConfig, Solution, RemediationPolicy, or ResourceSyncConfig to enable features such as capability scanning, solution tracking, and more. See the Controller Setup Guide for complete details.
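
To verify the installation, check that the controller pod is running and its CRDs are registered (the grep pattern below is an assumption; adjust it to match the CRD group used by your controller version):

kubectl get pods --namespace dot-ai
kubectl get crds | grep dot-ai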

Step 3: Install the MCP Server

Install the MCP server using the published Helm chart:

# Set the version from https://github.com/vfarcic/dot-ai/pkgs/container/dot-ai%2Fcharts%2Fdot-ai
export DOT_AI_VERSION="..."

helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
  --set secrets.openai.apiKey="$OPENAI_API_KEY" \
  --set secrets.auth.token="$DOT_AI_AUTH_TOKEN" \
  --set ingress.enabled=true \
  --set ingress.className="$INGRESS_CLASS_NAME" \
  --set ingress.host="dot-ai.127.0.0.1.nip.io" \
  --set controller.enabled=true \
  --namespace dot-ai \
  --wait

Notes:

  • Replace dot-ai.127.0.0.1.nip.io with your desired hostname for external access.
  • For enhanced security, create a secret named dot-ai-secrets with keys anthropic-api-key, openai-api-key, and auth-token instead of using --set arguments (see the example after this list).
  • For all available configuration options, see the Helm values file.
  • Global annotations: Add annotations to all Kubernetes resources using annotations in your values file (e.g., for Reloader integration: reloader.stakater.com/auto: "true").
  • Custom endpoints (OpenRouter, self-hosted): See Custom Endpoint Configuration for environment variables, then use --set or values file with ai.customEndpoint.enabled=true and ai.customEndpoint.baseURL.
  • Observability/Tracing: Add tracing environment variables via extraEnv in your values file. See Observability Guide for complete configuration.
  • User-Defined Prompts: Load custom prompts from your git repository via extraEnv. See User-Defined Prompts for configuration.
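
For the secret-based approach mentioned in the notes above, a minimal example of creating that secret from the variables exported in Step 1 (you would then omit the three --set secrets.* arguments):

kubectl create secret generic dot-ai-secrets \
  --namespace dot-ai \
  --from-literal=anthropic-api-key="$ANTHROPIC_API_KEY" \
  --from-literal=openai-api-key="$OPENAI_API_KEY" \
  --from-literal=auth-token="$DOT_AI_AUTH_TOKEN"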

Step 4: Configure MCP Client

Create a .mcp.json file in your project root:

{
  "mcpServers": {
    "dot-ai": {
      "type": "http",
      "url": "http://dot-ai.127.0.0.1.nip.io",
      "headers": {
        "Authorization": "Bearer <your-auth-token>"
      }
    }
  }
}

Replace <your-auth-token> with the token from Step 1 (run echo $DOT_AI_AUTH_TOKEN to view it).
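
If you prefer not to paste the token by hand, a small shell snippet can generate the file from the variable exported in Step 1 (adjust the url if you changed ingress.host):

cat > .mcp.json <<EOF
{
  "mcpServers": {
    "dot-ai": {
      "type": "http",
      "url": "http://dot-ai.127.0.0.1.nip.io",
      "headers": {
        "Authorization": "Bearer $DOT_AI_AUTH_TOKEN"
      }
    }
  }
}
EOF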

Save this configuration:

  • Claude Code: Save as .mcp.json in your project directory
  • Other clients: See MCP Client Compatibility for filename and location

Notes:

  • Replace the URL with your actual hostname if you changed ingress.host.
  • For production deployments with TLS, see TLS Configuration below.

Step 5: Start Your MCP Client

Start your MCP client (e.g., claude for Claude Code). The client will automatically connect to your Kubernetes-deployed MCP server.

Step 6: Verify Everything Works

In your MCP client, ask:

Show dot-ai status

You should see comprehensive system status including Kubernetes connectivity, vector database, and all available features.
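
If the status check fails, inspect the deployment directly (the deployment name below is an assumption based on the release name; adjust it if your release is named differently):

kubectl get pods --namespace dot-ai
kubectl logs --namespace dot-ai deployment/dot-ai-mcp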

Capability Scanning for AI Recommendations

Many MCP tools depend on capability data to function:

  • recommend: Uses capabilities to find resources matching your deployment intent
  • manageOrgData (patterns): References capabilities when applying organizational patterns
  • manageOrgData (policies): Validates resources against stored capability metadata

Without capability data, these tools may not work or will produce poor results.

Enabling Capability Scanning

Create a CapabilityScanConfig CR to enable autonomous capability discovery. The controller watches for CRD changes and automatically scans new resources. See the Capability Scan Guide for setup instructions.

AI Model Configuration

The DevOps AI Toolkit supports multiple AI models. Choose your model by setting the AI_PROVIDER environment variable.

Model Requirements

All AI models must meet these minimum requirements:

  • Context window: 200K+ tokens (some tools like capability scanning use large context)
  • Output tokens: 8K+ tokens (for YAML generation and policy creation)
  • Function calling: Required for MCP tool interactions

Available Models

| Provider | Model | AI_PROVIDER | API Key Required | Recommended |
|----------|-------|-------------|------------------|-------------|
| Anthropic | Claude Haiku 4.5 | anthropic_haiku | ANTHROPIC_API_KEY | Yes |
| Anthropic | Claude Opus 4.5 | anthropic_opus | ANTHROPIC_API_KEY | Yes |
| Anthropic | Claude Sonnet 4.5 | anthropic | ANTHROPIC_API_KEY | Yes |
| AWS | Amazon Bedrock | amazon_bedrock | AWS credentials (see setup) | Yes |
| Google | Gemini 3 Pro | google | GOOGLE_GENERATIVE_AI_API_KEY | Yes (might be slow) |
| Google | Gemini 3 Flash | google_flash | GOOGLE_GENERATIVE_AI_API_KEY | Yes (preview) |
| Host | Host Environment LLM | host | None (uses host's AI) | Yes (if supported) |
| Moonshot AI | Kimi K2 | kimi | MOONSHOT_API_KEY | Yes |
| Moonshot AI | Kimi K2 Thinking | kimi_thinking | MOONSHOT_API_KEY | Yes (might be slow) |
| OpenAI | GPT-5.1 Codex | openai | OPENAI_API_KEY | No * |
| xAI | Grok-4 | xai | XAI_API_KEY | No * |

* Note: These models may not perform as well as other providers for complex DevOps reasoning tasks.

Models Not Supported

| Provider | Model | Reason |
|----------|-------|--------|
| DeepSeek | DeepSeek V3.2 (deepseek-chat) | 128K context limit insufficient for heavy workflows |
| DeepSeek | DeepSeek R1 (deepseek-reasoner) | 64K context limit insufficient for most workflows |

Why DeepSeek is not supported: Integration testing revealed that DeepSeek's context window limitations (128K for V3.2, 64K for R1) cause failures in context-heavy operations like Kyverno policy generation, which can exceed 130K tokens. The toolkit requires 200K+ context for reliable operation across all features.

Helm Configuration

Set AI provider in your Helm values:

ai:
  provider: anthropic_haiku # or anthropic, anthropic_opus, google, etc.

secrets:
  anthropic:
    apiKey: "your-api-key"

Or via --set:

helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --set ai.provider=anthropic_haiku \
  --set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
  # ... other settings

AI Keys Are Optional: The MCP server starts successfully without AI API keys. Tools like Shared Prompts Library and REST API Gateway work without AI. AI-powered tools (deployment recommendations, remediation, pattern/policy management, capability scanning) require AI keys (unless using the host provider) and will show helpful error messages when accessed without configuration.

Embedding Provider Configuration

The DevOps AI Toolkit supports multiple embedding providers for semantic search capabilities in pattern management, capability discovery, and policy matching.

Available Embedding Providers

| Provider | EMBEDDINGS_PROVIDER | Model | Dimensions | API Key Required |
|----------|---------------------|-------|------------|------------------|
| Amazon Bedrock | amazon_bedrock | amazon.titan-embed-text-v2:0 | 1024 | AWS credentials |
| Google | google | text-embedding-004 (deprecated) | 768 | GOOGLE_API_KEY |
| Google | google | gemini-embedding-001 | 768 | GOOGLE_API_KEY |
| OpenAI | openai (default) | text-embedding-3-small | 1536 | OPENAI_API_KEY |

Helm Configuration

Set embedding provider via extraEnv in your values file:

extraEnv:
  - name: EMBEDDINGS_PROVIDER
    value: "google"
  - name: GOOGLE_API_KEY
    valueFrom:
      secretKeyRef:
        name: dot-ai-secrets
        key: google-api-key
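
The snippet above reads GOOGLE_API_KEY from the dot-ai-secrets secret. If you manage that secret yourself (see the Quick Start notes), add the entry alongside the other keys when creating it, for example:

kubectl create secret generic dot-ai-secrets \
  --namespace dot-ai \
  --from-literal=google-api-key="$GOOGLE_API_KEY" \
  --from-literal=anthropic-api-key="$ANTHROPIC_API_KEY" \
  --from-literal=openai-api-key="$OPENAI_API_KEY" \
  --from-literal=auth-token="$DOT_AI_AUTH_TOKEN"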

Notes:

  • Same Provider: If using the same provider for both AI models and embeddings (e.g., AI_PROVIDER=google and EMBEDDINGS_PROVIDER=google), you only need to set one API key
  • Mixed Providers: You can use different providers for AI models and embeddings (e.g., AI_PROVIDER=anthropic with EMBEDDINGS_PROVIDER=google)
  • Embedding Support: Not all AI model providers support embeddings. Anthropic does not provide embeddings; use OpenAI, Google, or Amazon Bedrock for embeddings
  • Google Deprecation: text-embedding-004 will be discontinued on January 14, 2026. Use gemini-embedding-001 for new deployments. When switching models, you must delete and recreate all embeddings (patterns, capabilities, policies) as vectors from different models are not compatible

Custom Endpoint Configuration

You can configure custom OpenAI-compatible endpoints for AI models. This enables using alternative providers like OpenRouter, self-hosted models, or air-gapped deployments.

In-Cluster Ollama Example

Deploy with a self-hosted Ollama service running in the same Kubernetes cluster:

Create a values.yaml file:

ai:
  provider: openai
  model: "llama3.3:70b" # Your self-hosted model
  customEndpoint:
    enabled: true
    baseURL: "http://ollama-service.default.svc.cluster.local:11434/v1"

secrets:
  customLlm:
    apiKey: "ollama" # Ollama doesn't require authentication
  openai:
    apiKey: "your-openai-key" # Still needed for vector embeddings

Install with custom values:

helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --values values.yaml \
  --create-namespace \
  --namespace dot-ai \
  --wait
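
Before installing, you can sanity-check that the in-cluster Ollama service answers OpenAI-compatible requests (the URL matches the baseURL above; /v1/models is part of Ollama's OpenAI-compatible API):

kubectl run curl-test --rm -it --restart=Never --image=curlimages/curl -- \
  curl -s http://ollama-service.default.svc.cluster.local:11434/v1/models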

Other Self-Hosted Options

vLLM (Self-Hosted):

ai:
  provider: openai
  model: "meta-llama/Llama-3.1-70B-Instruct"
  customEndpoint:
    enabled: true
    baseURL: "http://vllm-service:8000/v1"

secrets:
  customLlm:
    apiKey: "dummy" # vLLM may not require authentication
  openai:
    apiKey: "your-openai-key"

LocalAI (Self-Hosted):

ai:
  provider: openai
  model: "your-model-name"
  customEndpoint:
    enabled: true
    baseURL: "http://localai-service:8080/v1"

secrets:
  customLlm:
    apiKey: "dummy"
  openai:
    apiKey: "your-openai-key"

OpenRouter Example

OpenRouter provides access to 100+ LLM models from multiple providers:

ai:
  provider: openai
  model: "anthropic/claude-3.5-sonnet"
  customEndpoint:
    enabled: true
    baseURL: "https://openrouter.ai/api/v1"

secrets:
  customLlm:
    apiKey: "sk-or-v1-your-key-here"
  openai:
    apiKey: "your-openai-key" # Still needed for embeddings

Note: OpenRouter does not support embedding models. Use OpenAI, Google, or Amazon Bedrock for embeddings.

Get your OpenRouter API key at https://openrouter.ai/

Important Notes

  • Context window: 200K+ tokens recommended
  • Output tokens: 8K+ tokens minimum
  • Function calling: Must support OpenAI-compatible function calling

Testing Status:

  • Validated with OpenRouter (alternative SaaS provider)
  • Not yet tested with self-hosted Ollama, vLLM, or LocalAI
  • We need your help testing! Report results in issue #193

Notes:

  • OpenAI API key is still required for vector embeddings (Qdrant operations)
  • If model requirements are too high for your setup, please open an issue
  • Configuration examples are based on common patterns but not yet validated

TLS Configuration

To enable HTTPS, add these values (requires cert-manager with a ClusterIssuer):

ingress:
  tls:
    enabled: true
    clusterIssuer: letsencrypt # Your ClusterIssuer name

Then update your .mcp.json URL to use https://.
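
With TLS enabled, cert-manager requests a certificate for the ingress host; you can watch its progress with:

kubectl get certificate --namespace dot-ai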

Web UI Visualization

Enable rich visualizations of query results by connecting to a DevOps AI Web UI instance.

When configured, the query tool includes a visualizationUrl field in responses that opens interactive visualizations (resource topology, relationships, health status) in your browser.

Configuration

Add the Web UI base URL to your Helm values:

webUI:
  baseUrl: "https://dot-ai-ui.example.com" # Your Web UI instance URL

Or via --set:

helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --set webUI.baseUrl="https://dot-ai-ui.example.com" \
  # ... other settings

Feature Toggle Behavior

  • Not configured (default): Query responses contain only text summaries. No visualizationUrl field is included.
  • Configured: Query responses include a visualizationUrl field (format: {baseUrl}/v/{sessionId}) that opens the visualization in the Web UI.

Example Query Response

When webUI.baseUrl is configured, query responses include:

**View visualization**: https://dot-ai-ui.example.com/v/abc123-session-id

This URL opens an interactive visualization of the query results in the Web UI.

Gateway API (Alternative to Ingress)

For Kubernetes 1.26+, you can use Gateway API v1 for advanced traffic management with role-oriented design (platform teams manage Gateways, app teams create routes).

When to Use

Use Gateway API when:

  • Running Kubernetes 1.26+ with Gateway API support
  • Need advanced routing (weighted traffic, header-based routing)
  • Prefer separation of infrastructure and application concerns

Use Ingress when:

  • Running Kubernetes < 1.26
  • Simpler requirements met by Ingress features

Prerequisites

  • Kubernetes 1.26+ cluster
  • Gateway API CRDs installed: kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.1/standard-install.yaml
  • Gateway controller running (Istio, Envoy Gateway, Kong, etc.)
  • Existing Gateway resource created by platform team (reference pattern)

Reference an existing platform-managed Gateway:

helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
  --set secrets.openai.apiKey="$OPENAI_API_KEY" \
  --set secrets.auth.token="$DOT_AI_AUTH_TOKEN" \
  --set ingress.enabled=false \
  --set gateway.name="cluster-gateway" \
  --set gateway.namespace="gateway-system" \
  --namespace dot-ai \
  --wait
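
Assuming the chart renders an HTTPRoute when a Gateway is referenced, you can confirm the route was created and accepted by the Gateway controller:

kubectl get httproute --namespace dot-ai
kubectl describe httproute --namespace dot-ai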

Configuration Reference

# Reference pattern (RECOMMENDED)
gateway:
  name: "cluster-gateway"       # Existing Gateway name
  namespace: "gateway-system"   # Gateway namespace (optional)
  timeouts:
    request: "3600s"            # SSE streaming timeout
    backendRequest: "3600s"

# Creation pattern (development/testing only)
gateway:
  create: true                  # Create Gateway (NOT for production)
  className: "istio"            # GatewayClass name

Complete Guide

See Gateway API Deployment Guide for:

  • Platform team Gateway setup (HTTP and HTTPS)
  • Application team deployment steps
  • Cross-namespace access (ReferenceGrant)
  • Development/testing creation pattern
  • Troubleshooting and verification
  • Migration from Ingress

MCP Client Compatibility

The DevOps AI Toolkit works with any MCP-compatible coding agent or development tool.

Claude Code

  • Create .mcp.json in your project root with the configuration from Step 4
  • Start with claude - MCP tools automatically available

Cursor

  • Settings -> "MCP Servers" -> Add configuration -> Restart

Cline (VS Code Extension)

  • Configure in VS Code settings or extension preferences

VS Code (with MCP Extension)

  • Add configuration to settings.json under mcp.servers

Other MCP Clients

  • Any client supporting the Model Context Protocol standard
  • Use the HTTP configuration pattern shown in Step 4

Next Steps

Once your MCP server is running:

1. Explore Available Tools and Features

2. Enable Observability (Optional)

  • Observability Guide - Distributed tracing with OpenTelemetry for debugging workflows, measuring AI performance, and monitoring Kubernetes operations

3. Production Considerations

  • Consider backup strategies for vector database content (organizational patterns and capabilities)
  • Review TLS Configuration for HTTPS

Support