GitLab Duo Self-Hosted Support Engineer Playbook

Tier: Premium, Ultimate
Add-on: GitLab Duo Enterprise
Offering: GitLab Self-Managed

Version history

Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.
Enabled on GitLab Self-Managed in GitLab 17.6.
Changed to require GitLab Duo add-on in GitLab 17.6 and later.
Feature flag ai_custom_model removed in GitLab 17.8.
Generally available in GitLab 17.9.
Changed to include Premium in GitLab 18.0.

Support Engineer Playbook and Common Issues

This section provides Support Engineers with essential commands and troubleshooting steps for debugging GitLab Duo Self-Hosted issues.

Essential Debugging Commands

Display AI Gateway Environment Variables

Check all AI Gateway environment variables to verify configuration:

docker exec -it <ai-gateway-container> env | grep AIGW

Key variables to verify:

AIGW_CUSTOM_MODELS__ENABLED - must be true
AIGW_GITLAB_URL - should match your GitLab instance URL
AIGW_GITLAB_API_URL - should be accessible from the container
AIGW_AUTH__BYPASS_EXTERNAL - should only be true during troubleshooting
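
The checks in the list above can be scripted. The following is a minimal sketch, not part of GitLab or the AI Gateway: `aigw_var` and `check_aigw_env` are hypothetical helper names, and the sketch only confirms that the URL variables are set, not that they point at the right instance.

```shell
# Print the value of variable $2 found in env-style text $1 (empty if absent).
aigw_var() {
  printf '%s\n' "$1" | sed -n "s/^$2=//p" | head -n 1
}

# Read `env` output on stdin and report problems with the key variables.
# Returns 0 when no FAIL line was printed, 1 otherwise.
check_aigw_env() {
  env_dump=$(cat)
  rc=0

  if [ "$(aigw_var "$env_dump" AIGW_CUSTOM_MODELS__ENABLED)" != "true" ]; then
    echo "FAIL: AIGW_CUSTOM_MODELS__ENABLED must be true"
    rc=1
  fi

  for name in AIGW_GITLAB_URL AIGW_GITLAB_API_URL; do
    if [ -z "$(aigw_var "$env_dump" "$name")" ]; then
      echo "FAIL: $name is not set"
      rc=1
    fi
  done

  # Bypass is acceptable only while troubleshooting, so warn instead of failing.
  if [ "$(aigw_var "$env_dump" AIGW_AUTH__BYPASS_EXTERNAL)" = "true" ]; then
    echo "WARN: AIGW_AUTH__BYPASS_EXTERNAL is true; disable it after troubleshooting"
  fi

  return "$rc"
}
```

One possible way to use it is to pipe the container environment into the function: `docker exec <ai-gateway-container> env | check_aigw_env`.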

Verify User Permissions

Check if a user has the correct permissions for Code Suggestions with self-hosted models:

# In GitLab Rails console
user = User.find_by_id("<user_id>")
user.allowed_to_use?(:code_suggestions, service_name: :self_hosted_models)

Examine AI Gateway Client Logs

View AI Gateway client logs to identify connection issues:

docker logs <ai-gateway-container> 2>&1 | grep "Gitlab::Llm::AiGateway::Client"

View GitLab Logs for AI Gateway Requests

To see the actual requests made to the AI Gateway, use:

# View live logs
sudo gitlab-ctl tail | grep -E "(ai_gateway|llm\.log)"

# View specific log file with JSON formatting
sudo cat /var/log/gitlab/gitlab-rails/llm.log | jq '.'

# Show only entries that contain a message field
sudo cat /var/log/gitlab/gitlab-rails/llm.log | jq 'select(.message)'

# Show only entries that mention Llm::CompletionWorker
sudo cat /var/log/gitlab/gitlab-rails/llm.log | grep Llm::CompletionWorker | jq '.'

View AI Gateway Logs for Model Requests

To see the actual requests sent to the model:

# View AI Gateway container logs
docker logs <ai-gateway-container> 2>&1 | grep -E "(model|litellm|custom_openai)"

# For structured logs, if available
docker logs <ai-gateway-container> 2>&1 | grep "model_endpoint"

Common Configuration Issues and Solutions

Missing `/v1` Suffix in Model Endpoint

Symptom: 404 errors when making requests to vLLM or OpenAI-compatible models

How to spot in logs:

# Look for 404 errors in AI Gateway logs
docker logs <ai-gateway-container> 2>&1 | grep "404"

Solution: Ensure the model endpoint includes the /v1 suffix:

Incorrect: http://localhost:4000
Correct: http://localhost:4000/v1
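
A quick way to catch this before saving the configuration is to check the URL string itself. The sketch below is illustrative only: `endpoint_has_v1` is a hypothetical helper, and the two URLs are the examples from this section.

```shell
# Return 0 if the endpoint URL ends in /v1 (one trailing slash is tolerated).
endpoint_has_v1() {
  case "${1%/}" in
    */v1) return 0 ;;
    *)    return 1 ;;
  esac
}

for url in "http://localhost:4000" "http://localhost:4000/v1"; do
  if endpoint_has_v1 "$url"; then
    echo "ok:      $url"
  else
    echo "missing: $url (append /v1)"
  fi
done
```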

Certificate Validation Issues

Symptom: SSL certificate errors or connection failures

How to spot in logs:

# Look for SSL/TLS errors
sudo cat /var/log/gitlab/gitlab-rails/llm.log | grep -i "ssl\|certificate\|tls"

Validation: Verify the certificate status. The GitLab server must use a trusted certificate, because self-signed certificates are not supported.

Solution:

Use trusted certificates for the GitLab instance
If you use self-signed certificates, configure the proper certificate paths in the AI Gateway container

Network Connectivity Issues

Symptom: Timeouts or connection refused errors

How to spot in logs:

# Look for network-related errors
docker logs <ai-gateway-container> 2>&1 | grep -E "(timeout|connection|refused|unreachable)"

Validation commands:

# Test from AI Gateway container to GitLab
# The variable must be expanded inside the container, not by the host shell
docker exec -it <ai-gateway-container> sh -c 'curl "$AIGW_GITLAB_API_URL/projects"'

# Test from AI Gateway container to model endpoint
docker exec -it <ai-gateway-container> curl "<model_endpoint>/health"

Authentication and Authorization Issues

Symptom: 401 Unauthorized or 403 Forbidden errors

How to spot in logs:

# Look for authentication errors
sudo cat /var/log/gitlab/gitlab-rails/llm.log | jq 'select(.status == 401 or .status == 403)'

Common causes:

The user does not have a GitLab Duo Enterprise seat assigned
License issues
Incorrect AI Gateway URL configuration

Model Configuration Issues

Symptom: Model not responding or returning errors

How to spot in logs:

# Look for model-specific errors
docker logs <ai-gateway-container> 2>&1 | grep -E "(model_name|model_endpoint|litellm)"

Validation:

# Test the model directly from inside the AI Gateway container.
# If <model_endpoint> already ends in /v1, do not repeat /v1 in the URL below.
docker exec -it <ai-gateway-container> sh
curl --request POST "<model_endpoint>/v1/chat/completions" \
     --header 'Content-Type: application/json' \
     --data '{"model": "<model_name>", "messages": [{"role": "user", "content": "Hello"}]}'
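
Because the configured endpoint may or may not already carry the `/v1` suffix, it can help to build the test URL in a way that never doubles it. The following is a sketch under that assumption: `chat_completions_url` is a hypothetical helper, and the only route it relies on is the OpenAI-compatible `/v1/chat/completions` path used in the command above.

```shell
# Print the chat completions URL for a model endpoint, adding /v1 only when
# the endpoint does not already end with it.
chat_completions_url() {
  base="${1%/}"
  case "$base" in
    */v1) printf '%s/chat/completions\n' "$base" ;;
    *)    printf '%s/v1/chat/completions\n' "$base" ;;
  esac
}

chat_completions_url "http://localhost:4000"
chat_completions_url "http://localhost:4000/v1"
```

Both calls print the same URL, which can then be passed to `curl --request POST`.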

Log Analysis Workflow

Step 1: Enable Verbose Logging

In the GitLab Rails console, check whether the expanded_ai_logging feature flag is enabled:

Feature.enabled?(:expanded_ai_logging)

If it returns false, enable the flag using:

Feature.enable(:expanded_ai_logging)

Step 2: Reproduce the Issue

Have the user reproduce the issue while monitoring logs:

# Terminal 1: Monitor GitLab logs
sudo gitlab-ctl tail | grep -E "(ai_gateway|llm\.log)"

# Terminal 2: Monitor AI Gateway logs
docker logs -f <ai-gateway-container>

Step 3: Analyze Request Flow

GitLab to AI Gateway: Check if request reaches AI Gateway
AI Gateway to Model: Verify model endpoint is called
Response Path: Ensure response is properly formatted and returned

Step 4: Common Error Patterns

Error Pattern	Location	Likely Cause
`Connection refused`	GitLab logs	AI Gateway not accessible
`404 Not Found`	AI Gateway logs	Missing `/v1` in model endpoint
`401 Unauthorized`	GitLab logs	Authentication/license issues
`Timeout`	Either	Network or model performance issues
`SSL certificate verify failed`	GitLab logs	Certificate validation issues
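
The table above can be turned into a rough first-pass triage helper. This is a sketch only: `classify_error` is a hypothetical function, the matching is plain case-insensitive text search on the patterns from the table, and real log lines may be worded differently.

```shell
# Print the likely cause from the table for a single log line passed as $1.
classify_error() {
  line=$1
  if printf '%s\n' "$line" | grep -qi 'connection refused'; then
    echo "AI Gateway not accessible"
  elif printf '%s\n' "$line" | grep -qi '404 not found'; then
    echo "Missing /v1 in model endpoint"
  elif printf '%s\n' "$line" | grep -qi '401 unauthorized'; then
    echo "Authentication/license issues"
  elif printf '%s\n' "$line" | grep -qi 'certificate verify failed'; then
    echo "Certificate validation issues"
  elif printf '%s\n' "$line" | grep -qi 'timeout'; then
    echo "Network or model performance issues"
  else
    echo "No known pattern matched"
  fi
}

classify_error "HTTP 404 Not Found"
```

The result is a hint about where to look next, not a diagnosis; confirm it against the log location listed in the table.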

Quick Diagnostic Commands

AI Gateway Instance Commands:

1. Test AI Gateway health:

curl --silent --output /dev/null --write-out "%{http_code}" "<ai-gateway-url>/monitoring/healthz"

2. Check AI Gateway environment variables:

docker exec <ai-gateway-container> env | grep AIGW

3. Check AI Gateway logs for errors:

docker logs <ai-gateway-container> 2>&1 | grep --ignore-case error | tail --lines=20

GitLab Self-Managed Instance Commands:

4. Check user permissions (GitLab Rails console):

sudo gitlab-rails console

Then in the console:

User.find_by_id('<user_id>').can?(:access_code_suggestions)

5. Check GitLab LLM logs for errors:

sudo tail --lines=100 /var/log/gitlab/gitlab-rails/llm.log | grep --ignore-case error

6. Check feature flags:

sudo gitlab-rails console

Then:

Feature.enabled?(:expanded_ai_logging)

7. Test connectivity from GitLab to AI Gateway:

curl --verbose "<ai-gateway-url>/monitoring/healthz"
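
Items 1 and 7 both call the `/monitoring/healthz` endpoint, and item 1 prints only the HTTP status code. The sketch below shows one way to read that code. `describe_health_code` is a hypothetical helper; it assumes a healthy gateway answers with 200, and relies on curl printing `000` for `%{http_code}` when no HTTP response was received.

```shell
# Translate the status code printed by the item 1 health check into a verdict.
describe_health_code() {
  case "$1" in
    200) echo "healthy" ;;
    000) echo "no HTTP response (connection failed, DNS error, or timeout)" ;;
    *)   echo "unhealthy (HTTP $1)" ;;
  esac
}

# Example with literal codes; in practice pass the output of the item 1 command.
describe_health_code 200
describe_health_code 000
describe_health_code 503
```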

Emergency Diagnostic One-liner

For quick issue identification:

# Check all critical components at once
docker exec <ai-gateway-container> env | grep AIGW_CUSTOM_MODELS__ENABLED && \
curl --silent --fail "<ai-gateway-url>/monitoring/healthz" && \
sudo tail --lines=10 /var/log/gitlab/gitlab-rails/llm.log | jq '.level'
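
Because the commands are chained with `&&`, the one-liner stops at the first failing check and later checks never run. When a full picture is more useful, each check can be run and reported separately. The following is a minimal sketch; `run_check` is a hypothetical helper and the commented wiring lines reuse the placeholders from this page.

```shell
# Run a command quietly and print PASS or FAIL with a label.
# A failing check does not stop the checks that follow it.
run_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  $label"
  else
    echo "FAIL  $label"
  fi
}

# Example wiring (replace the placeholders before uncommenting):
# run_check "AI Gateway health" curl --silent --fail "<ai-gateway-url>/monitoring/healthz"
# run_check "llm.log readable" sudo test -r /var/log/gitlab/gitlab-rails/llm.log

# Self-contained demonstration using commands that always succeed or fail.
run_check "always succeeds" true
run_check "always fails" false
```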

Escalation Criteria

Escalate to the Custom Models team when:

All basic troubleshooting steps completed without resolution
Model integration issues that require deep technical knowledge
Feature not listed in self-hosted models unit primitives
Suspected GitLab Duo platform bugs affecting multiple users
Performance issues with specific model configurations
