AI Engine Deployment
Deploy the DevOps AI Toolkit Engine to Kubernetes using the Helm chart for a production-ready deployment.
For the easiest setup, we recommend installing the complete dot-ai stack which includes all components pre-configured. See the Stack Installation Guide.
Continue below if you want to install components individually (for granular control over configuration).
Overview
The DevOps AI Toolkit Engine provides:
- Kubernetes Deployment Recommendations — AI-powered application deployment assistance with enhanced semantic understanding
- Cluster Query — Natural language interface for querying cluster resources, status, and health
- Capability Management — Discover and store semantic resource capabilities for intelligent recommendation matching
- Pattern Management — Organizational deployment patterns that enhance AI recommendations
- Policy Management — Governance policies that guide users toward compliant configurations with optional Kyverno enforcement
- Kubernetes Issue Remediation — AI-powered root cause analysis and automated remediation
- Shared Prompts Library — Centralized prompt sharing via native slash commands
- REST API Gateway — HTTP endpoints for all toolkit capabilities
Access these tools through MCP clients or the CLI.
What You Get
- Production Kubernetes Deployment — Scalable deployment with proper resource management
- Integrated Qdrant Database — Vector database for capability and pattern management
- External Access — Ingress configuration for team collaboration
- Resource Management — Proper CPU/memory limits and requests
- Authentication — Static token (default) and OAuth (opt-in, requires HTTPS)
- Security — RBAC and ServiceAccount configuration
Prerequisites
- Kubernetes cluster (1.19+) with kubectl access
- Helm 3.x installed
- AI model API key (default: Anthropic). See AI Model Configuration for available model options.
- Ingress controller (any standard controller)
Quick Start (5 Minutes)
Step 1: Set Environment Variables
Export your API key and configure authentication:
# Required
export ANTHROPIC_API_KEY="sk-ant-api03-..."
# Required — static token for REST API, CI/CD, and MCP clients without OAuth
export DOT_AI_AUTH_TOKEN=$(openssl rand -base64 32)
# Ingress class - change to match your ingress controller (traefik, haproxy, etc.)
export INGRESS_CLASS_NAME="nginx"
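If you want to confirm the token was generated correctly, note that base64-encoding 32 random bytes always yields exactly 44 characters:

```shell
# 32 random bytes base64-encode to exactly 44 characters (including padding)
DOT_AI_AUTH_TOKEN=$(openssl rand -base64 32)
echo "${#DOT_AI_AUTH_TOKEN}"
```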
Step 2: Install the Controller
Install the dot-ai-controller to enable autonomous cluster operations:
# Set the controller version from https://github.com/vfarcic/dot-ai-controller/pkgs/container/dot-ai-controller%2Fcharts%2Fdot-ai-controller
export DOT_AI_CONTROLLER_VERSION="..."
# Install controller (includes CRDs for Solution and RemediationPolicy)
helm install dot-ai-controller \
oci://ghcr.io/vfarcic/dot-ai-controller/charts/dot-ai-controller:$DOT_AI_CONTROLLER_VERSION \
--namespace dot-ai \
--create-namespace \
--wait
The controller provides CRDs for autonomous cluster operations. Create Custom Resources like CapabilityScanConfig, Solution, RemediationPolicy, or ResourceSyncConfig to enable features such as capability scanning, solution tracking, and more. See the Controller Setup Guide for complete details.
Step 3: Install the Server
Install the server using the published Helm chart:
# Set the version from https://github.com/vfarcic/dot-ai/pkgs/container/dot-ai%2Fcharts%2Fdot-ai
export DOT_AI_VERSION="..."
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
--set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
--set secrets.auth.token="$DOT_AI_AUTH_TOKEN" \
--set localEmbeddings.enabled=true \
--set ingress.enabled=true \
--set ingress.className="$INGRESS_CLASS_NAME" \
--set ingress.host="dot-ai.127.0.0.1.nip.io" \
--set controller.enabled=true \
--namespace dot-ai \
--wait
Notes:
- Authentication: `DOT_AI_AUTH_TOKEN` is required and provides shared token auth for REST API, CI/CD, and MCP clients without OAuth. To enable OAuth with individual user identity, set `dex.enabled: true` (requires HTTPS). See Authentication for details.
- Local embeddings: `localEmbeddings.enabled=true` deploys an in-cluster embedding service (HuggingFace TEI) so semantic search works without any embedding API keys. See Local Embeddings for details.
- Replace `dot-ai.127.0.0.1.nip.io` with your desired hostname for external access.
- For enhanced security, create a secret named `dot-ai-secrets` with keys `anthropic-api-key` and `auth-token` instead of using `--set` arguments.
- For all available configuration options, see the Helm values file.
- Global annotations: Add annotations to all Kubernetes resources using `annotations` in your values file (e.g., for Reloader integration: `reloader.stakater.com/auto: "true"`).
- Custom endpoints (OpenRouter, self-hosted): See Custom Endpoint Configuration for environment variables, then use `--set` or a values file with `ai.customEndpoint.enabled=true` and `ai.customEndpoint.baseURL`.
- Observability/Tracing: Add tracing environment variables via `extraEnv` in your values file. See Observability Guide for complete configuration.
- User-Defined Prompts: Load custom prompts from your git repository via `extraEnv`. See User-Defined Prompts for configuration.
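For the secret-based alternative mentioned above, the manifest might look like the sketch below. The key names come from the note; the values are placeholders you must fill in yourself:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: dot-ai-secrets
  namespace: dot-ai
type: Opaque
stringData:
  anthropic-api-key: <your-anthropic-api-key>  # placeholder
  auth-token: <your-generated-token>           # placeholder
```

With this secret in place, omit the corresponding `--set secrets.*` arguments from the install command.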
Step 4: Connect a Client
With the server running, connect using your preferred access method:
- MCP Client Setup — Connect via MCP protocol from Claude Code, Cursor, or other MCP clients. MCP clients with OAuth support authenticate via browser automatically; others use the static token.
- CLI — Use the command-line interface for terminal and CI/CD pipelines
Capability Scanning for AI Recommendations
Many MCP tools depend on capability data to function:
- recommend: Uses capabilities to find resources matching your deployment intent
- manageOrgData (patterns): References capabilities when applying organizational patterns
- manageOrgData (policies): Validates resources against stored capability metadata
Without capability data, these tools may fail or produce poor results.
Enabling Capability Scanning
Create a CapabilityScanConfig CR to enable autonomous capability discovery. The controller watches for CRD changes and automatically scans new resources. See the Capability Scan Guide for setup instructions.
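As a rough sketch only — the actual schema is defined by the controller's CRD and documented in the Capability Scan Guide — a minimal CR might look like this (the apiVersion group/version and spec fields below are hypothetical):

```yaml
# Hypothetical sketch: group/version and spec fields are illustrative,
# not the real schema -- consult the Capability Scan Guide.
apiVersion: devopstoolkit.live/v1alpha1  # hypothetical group/version
kind: CapabilityScanConfig
metadata:
  name: capability-scan
  namespace: dot-ai
spec: {}  # real spec fields are documented in the Capability Scan Guide
```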
AI Model Configuration
The DevOps AI Toolkit supports multiple AI models. Choose your model by setting the AI_PROVIDER environment variable.
Model Requirements
All AI models must meet these minimum requirements:
- Context window: 200K+ tokens (some tools like capability scanning use large context)
- Output tokens: 8K+ tokens (for YAML generation and policy creation)
- Function calling: Required for MCP tool interactions
Available Models
| Provider | Model | AI_PROVIDER | API Key Required | Recommended |
|---|---|---|---|---|
| Anthropic | Claude Haiku 4.5 | anthropic_haiku | ANTHROPIC_API_KEY | Yes |
| Anthropic | Claude Opus 4.6 | anthropic_opus | ANTHROPIC_API_KEY | Yes |
| Anthropic | Claude Sonnet 4.6 | anthropic | ANTHROPIC_API_KEY | Yes |
| AWS | Amazon Bedrock | amazon_bedrock | AWS credentials (see setup) | Yes |
| Google | Gemini 3.1 Pro | google | GOOGLE_GENERATIVE_AI_API_KEY | Yes (might be slow) |
| Google | Gemini 3 Flash | google_flash | GOOGLE_GENERATIVE_AI_API_KEY | Yes (preview) |
| Host | Host Environment LLM | host | None (uses host's AI) | Yes (if supported) |
| Moonshot AI | Kimi K2.5 | kimi | MOONSHOT_API_KEY | Yes |
| Alibaba | Qwen 3.5 Plus | alibaba | ALIBABA_API_KEY | Yes |
| OpenAI | GPT-5.4 | openai | OPENAI_API_KEY | No * |
| xAI | Grok-4 | xai | XAI_API_KEY | No * |
Migration note:
`AI_PROVIDER=kimi_thinking` was removed. If you were using that value, switch to `AI_PROVIDER=kimi` — Kimi K2.5 includes thinking mode by default.
* Note: These models may not perform as well as other providers for complex DevOps reasoning tasks.
Models Not Supported
| Provider | Model | Reason |
|---|---|---|
| DeepSeek | DeepSeek V3.2 (deepseek-chat) | 128K context limit insufficient for heavy workflows |
| DeepSeek | DeepSeek R1 (deepseek-reasoner) | 64K context limit insufficient for most workflows |
Why DeepSeek is not supported: Integration testing revealed that DeepSeek's context window limitations (128K for V3.2, 64K for R1) cause failures in context-heavy operations like Kyverno policy generation, which can exceed 130K tokens. The toolkit requires 200K+ context for reliable operation across all features.
Helm Configuration
Set AI provider in your Helm values:
ai:
provider: anthropic_haiku # or anthropic, anthropic_opus, google, etc.
secrets:
anthropic:
apiKey: "your-api-key"
Or via --set:
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
--set ai.provider=anthropic_haiku \
--set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
# ... other settings
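The same settings can live in a values file instead of `--set` flags. A sketch using only keys already shown in this guide:

```yaml
ai:
  provider: anthropic_haiku
secrets:
  anthropic:
    apiKey: "your-api-key"
  auth:
    token: "your-auth-token"
localEmbeddings:
  enabled: true
ingress:
  enabled: true
  className: "nginx"
  host: "dot-ai.127.0.0.1.nip.io"
controller:
  enabled: true
```

Install with `helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION --values values.yaml --namespace dot-ai --wait`.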
AI Keys Are Optional: The MCP server starts successfully without AI API keys. Tools like Shared Prompts Library and REST API Gateway work without AI. AI-powered tools (deployment recommendations, remediation, pattern/policy management, capability scanning) require AI keys (unless using the host provider) and will show helpful error messages when accessed without configuration.
Embedding Provider Configuration
The DevOps AI Toolkit supports multiple embedding providers for semantic search in pattern management, capability discovery, and policy matching.
Local Embeddings (Zero-Config)
The recommended setup for new deployments. An in-cluster HuggingFace Text Embeddings Inference (TEI) service provides embeddings without any API keys.
localEmbeddings:
enabled: true # Deploys TEI with all-MiniLM-L6-v2 (384 dimensions)
This is already included in the Quick Start above. No additional configuration needed.
| Property | Value |
|---|---|
| Model | all-MiniLM-L6-v2 (384 dimensions) |
| Resource footprint | ~256 MB RAM, 250m CPU (request) |
| GPU | Not required |
| Architecture | amd64 only (no ARM64/Apple Silicon — see TEI issue #769) |
To customize the model or resources:
localEmbeddings:
enabled: true
model: "sentence-transformers/all-MiniLM-L6-v2" # Any TEI-compatible model
dimensions: 384 # Must match model output dimensions
resources:
requests:
cpu: "250m"
memory: "256Mi"
limits:
cpu: "1"
memory: "512Mi"
To disable local embeddings (e.g., if using a cloud provider instead):
localEmbeddings:
enabled: false
Cloud Embedding Providers
Use a cloud provider if you need higher-quality embeddings or are already paying for an API key.
| Provider | EMBEDDINGS_PROVIDER | Model | Dimensions | API Key Required |
|---|---|---|---|---|
| Amazon Bedrock | amazon_bedrock | amazon.titan-embed-text-v2:0 | 1024 | AWS credentials |
| Google | google | gemini-embedding-001 | 768 | GOOGLE_API_KEY |
| OpenAI | openai | text-embedding-3-small | 1536 | OPENAI_API_KEY |
Set the cloud embedding provider via extraEnv in your values file:
localEmbeddings:
enabled: false # Disable local embeddings when using a cloud provider
secrets:
openai:
apiKey: "your-openai-key"
# Only needed if using a non-OpenAI embedding provider:
extraEnv:
- name: EMBEDDINGS_PROVIDER
value: "google"
- name: GOOGLE_API_KEY
valueFrom:
secretKeyRef:
name: dot-ai-secrets
key: google-api-key
Notes:
- Same Provider: If using the same provider for both AI models and embeddings (e.g., `AI_PROVIDER=google` and `EMBEDDINGS_PROVIDER=google`), you only need to set one API key.
- Mixed Providers: You can use different providers for AI models and embeddings (e.g., `AI_PROVIDER=anthropic` with `EMBEDDINGS_PROVIDER=google`).
- Embedding Support: Not all AI model providers support embeddings. Anthropic does not provide embeddings; use OpenAI, Google, Amazon Bedrock, or local embeddings.
Switching Embedding Providers (Migration)
Switching between embedding providers (e.g., from OpenAI 1536-dim to local 384-dim) requires re-embedding all stored data because vector dimensions differ between models. A REST API endpoint handles this automatically.
Migrate all collections:
curl -X POST https://your-dot-ai-host/api/v1/embeddings/migrate
Migrate a single collection:
curl -X POST https://your-dot-ai-host/api/v1/embeddings/migrate \
-H "Content-Type: application/json" \
-d '{"collection": "patterns"}'
The endpoint re-embeds all points using the currently configured provider. Collections where vector dimensions already match the target are skipped. The response reports per-collection results:
{
"success": true,
"data": {
"collections": [
{
"collection": "patterns",
"status": "migrated",
"previousDimensions": 1536,
"newDimensions": 384,
"total": 42,
"processed": 42,
"failed": 0
}
],
"summary": {
"totalCollections": 1,
"migrated": 1,
"skipped": 0,
"failed": 0
}
}
}
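To script against this response — for example in CI after switching providers — you can assert on the summary fields. The snippet below uses the sample payload above; in practice, pipe the `curl` output into the same Python one-liner:

```shell
# Fail if any collection failed to migrate (sample payload from above;
# replace the echo with the real curl call when running against a server).
response='{"success": true, "data": {"summary": {"totalCollections": 1, "migrated": 1, "skipped": 0, "failed": 0}}}'
echo "$response" | python3 -c '
import json, sys
summary = json.load(sys.stdin)["data"]["summary"]
assert summary["failed"] == 0, "embedding migration reported failures"
print("migrated %d, skipped %d of %d collections"
      % (summary["migrated"], summary["skipped"], summary["totalCollections"]))
'
```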
The migration endpoint is also available via the auto-generated CLI (dot-ai embeddings migrate).
Custom Endpoint Configuration
You can configure custom OpenAI-compatible endpoints for AI models. This enables using alternative providers like OpenRouter, self-hosted models, or air-gapped deployments.
In-Cluster Ollama Example
Deploy with a self-hosted Ollama service running in the same Kubernetes cluster:
Create a values.yaml file:
ai:
provider: openai
model: "llama3.3:70b" # Your self-hosted model
customEndpoint:
enabled: true
baseURL: "http://ollama-service.default.svc.cluster.local:11434/v1"
localEmbeddings:
enabled: true # Use local embeddings instead of OpenAI
secrets:
customLlm:
apiKey: "ollama" # Ollama doesn't require authentication
Install with custom values:
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
--values values.yaml \
--create-namespace \
--namespace dot-ai \
--wait
Other Self-Hosted Options
vLLM (Self-Hosted):
ai:
provider: openai
model: "meta-llama/Llama-3.1-70B-Instruct"
customEndpoint:
enabled: true
baseURL: "http://vllm-service:8000/v1"
localEmbeddings:
enabled: true
secrets:
customLlm:
apiKey: "dummy" # vLLM may not require authentication
LocalAI (Self-Hosted):
ai:
provider: openai
model: "your-model-name"
customEndpoint:
enabled: true
baseURL: "http://localai-service:8080/v1"
localEmbeddings:
enabled: true
secrets:
customLlm:
apiKey: "dummy"
OpenRouter Example
OpenRouter provides access to 100+ LLM models from multiple providers:
ai:
provider: openai
model: "anthropic/claude-3.5-sonnet"
customEndpoint:
enabled: true
baseURL: "https://openrouter.ai/api/v1"
localEmbeddings:
enabled: true # OpenRouter doesn't support embeddings; use local instead
secrets:
customLlm:
apiKey: "sk-or-v1-your-key-here"
Get your OpenRouter API key at https://openrouter.ai/
Custom Headers
Pass custom HTTP headers to AI provider APIs using ai.customEndpoint.headers. Headers are specified as a JSON string and sent with every LLM request. Custom headers are merged with provider-specific defaults (e.g., the Anthropic beta header), with your custom headers taking precedence on conflicts.
This is useful for enterprise LLM gateways and proxies that require authentication, versioning, or routing headers.
Example — corporate proxy with authentication headers:
ai:
provider: anthropic
customEndpoint:
enabled: true
baseURL: "https://proxy.corp.example.com/anthropic"
headers: '{"x-api-version": "2026-02-20", "x-proxy-auth": "token123"}'
secrets:
anthropic:
apiKey: "your-anthropic-key"
Notes:
- `headers` must be valid JSON (e.g., `'{"key": "value"}'`). Invalid JSON is ignored with a warning.
- Headers apply to LLM requests only, not embeddings.
- Custom headers override provider defaults when the same header key is used.
Native Provider with Custom Base URL
By default, setting a custom base URL without specifying a provider routes requests through an OpenAI-compatible endpoint. If your proxy fronts a non-OpenAI provider (e.g., Anthropic), you can preserve native provider features — such as cache control, extended context, and native tool calling — by explicitly setting the provider.
Example — Anthropic proxy with native features preserved:
ai:
provider: anthropic # Preserves Anthropic-specific features (cache control, extended context)
customEndpoint:
enabled: true
baseURL: "https://proxy.corp.example.com/anthropic"
secrets:
anthropic:
apiKey: "your-anthropic-key"
Without provider: anthropic, this would fall back to OpenAI-compatible mode and lose Anthropic-specific capabilities. The same pattern works for other providers (openai, google, xai).
Important Notes
- Context window: 200K+ tokens recommended
- Output tokens: 8K+ tokens minimum
- Function calling: Must support OpenAI-compatible function calling
Testing Status:
- Validated with OpenRouter (alternative SaaS provider)
- Not yet tested with self-hosted Ollama, vLLM, or LocalAI
- We need your help testing! Report results in issue #193
Notes:
- For embeddings, use `localEmbeddings.enabled=true` (recommended) or set an OpenAI/Google/Bedrock API key. See Embedding Provider Configuration.
- If model requirements are too high for your setup, please open an issue.
- Configuration examples are based on common patterns but not yet validated.
MCP Server Integration
Extend dot-ai tools with capabilities from external MCP servers running in your cluster. Instead of building custom integrations for each observability or infrastructure platform, dot-ai connects as an MCP client to discover and use tools from any compatible MCP server.
Any MCP server that supports HTTP transport can be connected — Prometheus, Jaeger, Grafana, Datadog, or any other server from the MCP ecosystem. The examples below use Prometheus, but the configuration pattern is the same for any server.
How It Works
┌──────────────────────────────────────────────────────────────┐
│ Your Cluster │
│ │
│ ┌─────────────┐ MCP Protocol ┌──────────────────┐ │
│ │ dot-ai │◄────────────────────►│ Prometheus MCP │ │
│ │ (MCP Client)│ │ Server │ │
│ │ │ └──────────────────┘ │
│ │ remediate │ MCP Protocol ┌──────────────────┐ │
│ │ operate │◄────────────────────►│ Jaeger MCP │ │
│ │ query │ │ Server │ │
│ └─────────────┘ └──────────────────┘ │
│ │
└──────────────────────────────────────────────────────────────┘
- You deploy MCP servers in your cluster (dot-ai does not manage their lifecycle)
- You configure `mcpServers` in Helm values with each server's endpoint
- dot-ai connects to each server at startup and discovers available tools
- The `attachTo` field controls which dot-ai tools can use each server's tools
- During AI analysis, tools from MCP servers are used alongside existing tools automatically
Configuration
Add MCP servers to your Helm values:
mcpServers:
prometheus:
enabled: true
endpoint: "http://prometheus-mcp.monitoring.svc:8080/mcp"
attachTo:
- remediate
- query
jaeger:
enabled: true
endpoint: "http://jaeger-mcp.tracing.svc:3000/mcp"
attachTo:
- remediate
Configuration Reference
Each entry under mcpServers has the following fields:
| Field | Type | Required | Description |
|---|---|---|---|
| enabled | boolean | Yes | Whether to connect to this MCP server |
| endpoint | string | Yes | Full URL of the MCP server endpoint (must be reachable from the dot-ai pod) |
| attachTo | string[] | Yes | Which dot-ai tools can use this server's tools. Valid values: remediate, operate, query |
Tool Namespacing
Tools from MCP servers are automatically namespaced as {server}__{tool} to avoid collisions. For example, a Prometheus MCP server configured as prometheus with a tool named execute_query becomes prometheus__execute_query in dot-ai.
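The naming scheme is a plain string join — the server key from `mcpServers`, a double underscore, then the tool name — so collisions are avoided as long as server keys are unique:

```shell
# {server}__{tool}: server key from mcpServers, double underscore, tool name
server="prometheus"
tool="execute_query"
echo "${server}__${tool}"   # prometheus__execute_query
```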
Prometheus Example
This example shows how to integrate a Prometheus MCP server so that remediate and query can use Prometheus metrics during analysis.
Prerequisites:
- Prometheus running in your cluster (e.g., via the `prometheus-community/prometheus` Helm chart)
- A Prometheus MCP server deployed and accessible (e.g., pab1it0/prometheus-mcp-server)
Deploy the Prometheus MCP server (example using a Kubernetes Deployment):
apiVersion: apps/v1
kind: Deployment
metadata:
name: prometheus-mcp
namespace: monitoring
spec:
replicas: 1
selector:
matchLabels:
app: prometheus-mcp
template:
metadata:
labels:
app: prometheus-mcp
spec:
containers:
- name: prometheus-mcp
image: ghcr.io/pab1it0/prometheus-mcp-server:latest
ports:
- containerPort: 8080
env:
- name: PROMETHEUS_URL
value: "http://prometheus-server.monitoring.svc:80"
- name: MCP_TRANSPORT
value: "http"
- name: PORT
value: "8080"
---
apiVersion: v1
kind: Service
metadata:
name: prometheus-mcp
namespace: monitoring
spec:
selector:
app: prometheus-mcp
ports:
- port: 8080
targetPort: 8080
Configure dot-ai to connect:
mcpServers:
prometheus:
enabled: true
endpoint: "http://prometheus-mcp.monitoring.svc:8080/mcp"
attachTo:
- remediate
- query
Once configured, the remediate tool can correlate cluster events with Prometheus metrics (CPU/memory trends, error rates) for more accurate root cause analysis. The query tool can answer questions like "what's the memory usage trend for my-api?" using live Prometheus data.
Verifying MCP Server Connections
After deployment, verify MCP server connections using the version/status tool:
Show dot-ai status
The response includes an mcpServers section showing connected servers, their endpoints, attached operations, and discovered tool count. Use this to confirm servers connected successfully.
Startup Behavior
- No MCP servers configured (default): dot-ai starts normally without MCP server augmentation.
- MCP servers configured: dot-ai connects to each enabled server at startup, discovers tools, and makes them available to the configured operations.
- MCP server unreachable: Startup fails fast with a clear error message. Configured MCP servers must be reachable — there is no background retry. Fix the endpoint or disable the server entry to proceed.
TLS Configuration
HTTPS is required for OAuth authentication (dex.enabled: true). If you only use static token authentication, HTTPS is optional.
Ingress-Level TLS (cert-manager)
To terminate TLS at the Ingress (requires cert-manager with a ClusterIssuer):
ingress:
tls:
enabled: true
clusterIssuer: letsencrypt # Your ClusterIssuer name
Then update your .mcp.json URL to use https://.
Reverse Proxy / External TLS Termination
If TLS terminates upstream (Traefik, nginx, cloud load balancer), leave ingress.tls.enabled: false and set the external URLs explicitly:
externalUrl: "https://dot-ai.example.com"
dex:
enabled: true
externalUrl: "https://dex.dot-ai.example.com"
Web UI Visualization
Enable rich visualizations of query results by connecting to a DevOps AI Web UI instance.
When configured, the query tool includes a visualizationUrl field in responses that opens interactive visualizations (resource topology, relationships, health status) in your browser.
Configuration
Add the Web UI base URL to your Helm values:
webUI:
baseUrl: "https://dot-ai-ui.example.com" # Your Web UI instance URL
Or via --set:
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
--set webUI.baseUrl="https://dot-ai-ui.example.com" \
# ... other settings
Feature Toggle Behavior
- Not configured (default): Query responses contain only text summaries. No `visualizationUrl` field is included.
- Configured: Query responses include a `visualizationUrl` field (format: `{baseUrl}/v/{sessionId}`) that opens the visualization in the Web UI.
Example Query Response
When webUI.baseUrl is configured, query responses include:
**View visualization**: https://dot-ai-ui.example.com/v/abc123-session-id
This URL opens an interactive visualization of the query results in the Web UI.
Gateway API (Alternative to Ingress)
For Kubernetes 1.26+, you can use Gateway API v1 for advanced traffic management with role-oriented design (platform teams manage Gateways, app teams create routes).
When to Use
Use Gateway API when:
- Running Kubernetes 1.26+ with Gateway API support
- Need advanced routing (weighted traffic, header-based routing)
- Prefer separation of infrastructure and application concerns
Use Ingress when:
- Running Kubernetes < 1.26
- Simpler requirements met by Ingress features
Prerequisites
- Kubernetes 1.26+ cluster
- Gateway API CRDs installed: `kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.1/standard-install.yaml`
- Gateway controller running (Istio, Envoy Gateway, Kong, etc.)
- Existing Gateway resource created by platform team (reference pattern)
Quick Start (Reference Pattern - RECOMMENDED)
Reference an existing platform-managed Gateway:
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
--set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
--set secrets.auth.token="$DOT_AI_AUTH_TOKEN" \
--set localEmbeddings.enabled=true \
--set ingress.enabled=false \
--set gateway.name="cluster-gateway" \
--set gateway.namespace="gateway-system" \
--namespace dot-ai \
--wait
Configuration Reference
# Reference pattern (RECOMMENDED)
gateway:
name: "cluster-gateway" # Existing Gateway name
namespace: "gateway-system" # Gateway namespace (optional)
timeouts:
request: "3600s" # SSE streaming timeout
backendRequest: "3600s"
# Creation pattern (development/testing only)
gateway:
create: true # Create Gateway (NOT for production)
className: "istio" # GatewayClass name
Complete Guide
See Gateway API Deployment Guide for:
- Platform team Gateway setup (HTTP and HTTPS)
- Application team deployment steps
- Cross-namespace access (ReferenceGrant)
- Development/testing creation pattern
- Troubleshooting and verification
- Migration from Ingress
Next Steps
Once the server is running:
1. Configure Authentication & Authorization
- Authentication — Understand static token vs OAuth, enable Dex for per-user identity, or connect your identity provider
- Authorization (RBAC) — Control what each user can do with per-user and per-group permissions (requires OAuth)
2. Explore Tools
- Tools Overview — Complete guide to all available tools, how they work together, and recommended usage flow
3. Enable Observability (Optional)
- Observability Guide — Distributed tracing with OpenTelemetry for debugging workflows, measuring AI performance, and monitoring Kubernetes operations
4. Connect MCP Servers (Optional)
- MCP Server Integration — Augment dot-ai tools with capabilities from external MCP servers (Prometheus, Jaeger, etc.)
5. Production Considerations
- Consider backup strategies for vector database content (organizational patterns and capabilities)
- Review TLS Configuration for HTTPS
Support
- Bug Reports: GitHub Issues