GitLab Duo Self-Hosted Support Engineer Playbook
- Tier: Premium, Ultimate
- Add-on: GitLab Duo Enterprise
- Offering: GitLab Self-Managed
Version history
- Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.
- Enabled on GitLab Self-Managed in GitLab 17.6.
- Changed to require GitLab Duo add-on in GitLab 17.6 and later.
- Feature flag ai_custom_model removed in GitLab 17.8.
- Generally available in GitLab 17.9.
- Changed to include Premium in GitLab 18.0.
Support Engineer Playbook and Common Issues
This section provides Support Engineers with essential commands and troubleshooting steps for debugging GitLab Duo Self-Hosted issues.
Essential Debugging Commands
Display AI Gateway Environment Variables
Check all AI Gateway environment variables to verify configuration:
docker exec -it <ai-gateway-container> env | grep AIGW
Key variables to verify:
- AIGW_CUSTOM_MODELS__ENABLED - must be true
- AIGW_GITLAB_URL - should match your GitLab instance URL
- AIGW_GITLAB_API_URL - should be accessible from the container
- AIGW_AUTH__BYPASS_EXTERNAL - should only be true during troubleshooting
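The checks in the list above can be scripted. The following is a minimal sketch, not an official tool: check_aigw_env is a hypothetical helper name, and it assumes the input is plain KEY=VALUE lines such as those printed by env. It only checks that the two URL variables are set, not that they are reachable.

```shell
# Hypothetical helper: reads env-style KEY=VALUE lines on stdin and
# reports on the key variables listed above.
check_aigw_env() {
  awk -F= '
    {
      key = $1                 # text before the first "="
      sub(/^[^=]*=/, "")       # strip "KEY=" so $0 holds only the value
      vars[key] = $0
    }
    END {
      n = 0
      if (vars["AIGW_CUSTOM_MODELS__ENABLED"] != "true") {
        print "FAIL AIGW_CUSTOM_MODELS__ENABLED is not true"; n++
      }
      if (vars["AIGW_GITLAB_URL"] == "") {
        print "FAIL AIGW_GITLAB_URL is not set"; n++
      }
      if (vars["AIGW_GITLAB_API_URL"] == "") {
        print "FAIL AIGW_GITLAB_API_URL is not set"; n++
      }
      if (vars["AIGW_AUTH__BYPASS_EXTERNAL"] == "true") {
        print "WARN AIGW_AUTH__BYPASS_EXTERNAL is true (troubleshooting only)"; n++
      }
      if (n == 0) print "OK"
    }'
}
```

One possible way to use it is to pipe the container environment into the function: docker exec <ai-gateway-container> env | check_aigw_env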
Verify User Permissions
Check if a user has the correct permissions for Code Suggestions with self-hosted models:
# In GitLab Rails console
user = User.find_by_id("<user_id>")
user.allowed_to_use?(:code_suggestions, service_name: :self_hosted_models)
Examine AI Gateway Client Logs
View AI Gateway client logs to identify connection issues:
docker logs <ai-gateway-container> 2>&1 | grep "Gitlab::Llm::AiGateway::Client"
View GitLab Logs for AI Gateway Requests
To see the actual requests made to the AI Gateway, use:
# View live logs
sudo gitlab-ctl tail | grep -E "(ai_gateway|llm\.log)"
# View specific log file with JSON formatting
sudo cat /var/log/gitlab/gitlab-rails/llm.log | jq '.'
# Filter for specific request types
sudo cat /var/log/gitlab/gitlab-rails/llm.log | jq 'select(.message)'
sudo cat /var/log/gitlab/gitlab-rails/llm.log | grep Llm::CompletionWorker | jq '.'
View AI Gateway Logs for Model Requests
To see the actual requests sent to the model:
# View AI Gateway container logs
docker logs <ai-gateway-container> 2>&1 | grep -E "(model|litellm|custom_openai)"
# For structured logs, if available
docker logs <ai-gateway-container> 2>&1 | grep "model_endpoint"
Common Configuration Issues and Solutions
Missing /v1 Suffix in Model Endpoint
Symptom: 404 errors when making requests to vLLM or OpenAI-compatible models
How to spot in logs:
# Look for 404 errors in AI Gateway logs
docker logs <ai-gateway-container> 2>&1 | grep "404"
Solution: Ensure the model endpoint includes the /v1 suffix:
- Incorrect: http://localhost:4000
- Correct: http://localhost:4000/v1
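The suffix check above is easy to automate before saving a model configuration. This is a small sketch with a hypothetical helper name, has_v1_suffix. It assumes a single trailing slash should be tolerated, which is a choice made for this example rather than documented behavior.

```shell
# Hypothetical helper: succeeds only if the endpoint ends with /v1.
# A single trailing slash is stripped before the comparison.
has_v1_suffix() {
  case "${1%/}" in
    */v1) return 0 ;;
    *)    return 1 ;;
  esac
}

# Example: report on the two endpoints shown above.
for endpoint in "http://localhost:4000" "http://localhost:4000/v1"; do
  if has_v1_suffix "$endpoint"; then
    echo "ok: $endpoint"
  else
    echo "missing /v1: $endpoint"
  fi
done
```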
Certificate Validation Issues
Symptom: SSL certificate errors or connection failures
How to spot in logs:
# Look for SSL/TLS errors
sudo cat /var/log/gitlab/gitlab-rails/llm.log | grep -i "ssl\|certificate\|tls"
Validation: Verify the certificate status. The GitLab server must use a trusted certificate, because self-signed certificates are not supported.
Solution:
- Use a trusted certificate for the GitLab instance
- If using self-signed certificates, configure proper certificate paths in the AI Gateway container
Network Connectivity Issues
Symptom: Timeouts or connection refused errors
How to spot in logs:
# Look for network-related errors
docker logs <ai-gateway-container> 2>&1 | grep -E "(timeout|connection|refused|unreachable)"
Validation commands:
# Test from AI Gateway container to GitLab
# (single quotes make the variable expand inside the container, not on the host)
docker exec -it <ai-gateway-container> sh -c 'curl "$AIGW_GITLAB_API_URL/projects"'
# Test from AI Gateway container to model endpoint
docker exec -it <ai-gateway-container> curl "<model_endpoint>/health"
Authentication and Authorization Issues
Symptom: 401 Unauthorized or 403 Forbidden errors
How to spot in logs:
# Look for authentication errors
sudo cat /var/log/gitlab/gitlab-rails/llm.log | jq 'select(.status == 401 or .status == 403)'
Common causes:
- User doesn't have GitLab Duo Enterprise seat assigned
- License issues
- Incorrect AI Gateway URL configuration
Model Configuration Issues
Symptom: Model not responding or returning errors
How to spot in logs:
# Look for model-specific errors
docker logs <ai-gateway-container> 2>&1 | grep -E "(model_name|model_endpoint|litellm)"
Validation:
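# Test model directly from AI Gateway container
# Open a shell in the container, then run curl from inside it.
# Here <model_endpoint> is the base URL without the /v1 suffix.
docker exec -it <ai-gateway-container> sh
curl --request POST "<model_endpoint>/v1/chat/completions" \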
--header 'Content-Type: application/json' \
--data '{"model": "<model_name>", "messages": [{"role": "user", "content": "Hello"}]}'
Log Analysis Workflow
Step 1: Enable Verbose Logging
Check whether the expanded_ai_logging feature flag is enabled, in the GitLab Rails console:
Feature.enabled?(:expanded_ai_logging)
If it returns false, enable the flag using:
Feature.enable(:expanded_ai_logging)
Step 2: Reproduce the Issue
Have the user reproduce the issue while monitoring logs:
# Terminal 1: Monitor GitLab logs
sudo gitlab-ctl tail | grep -E "(ai_gateway|llm\.log)"
# Terminal 2: Monitor AI Gateway logs
docker logs -f <ai-gateway-container>
Step 3: Analyze Request Flow
- GitLab to AI Gateway: Check if request reaches AI Gateway
- AI Gateway to Model: Verify model endpoint is called
- Response Path: Ensure response is properly formatted and returned
Step 4: Common Error Patterns
Error Pattern | Location | Likely Cause |
---|---|---|
Connection refused | GitLab logs | AI Gateway not accessible |
404 Not Found | AI Gateway logs | Missing /v1 in model endpoint |
401 Unauthorized | GitLab logs | Authentication/license issues |
Timeout | Either | Network or model performance issues |
SSL certificate verify failed | GitLab logs | Certificate validation issues |
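The table above can be turned into a quick first-pass classifier for a single log line. This is only a sketch: likely_cause is a hypothetical helper, and the matching is deliberately naive. A bare 404 or 401 can also appear in timestamps, ports, or IDs, so treat the result as a hint rather than a diagnosis.

```shell
# Hypothetical helper: maps one log line to the likely cause from the table.
# Matching is case-insensitive and uses only the patterns listed in the table.
likely_cause() {
  line=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$line" in
    *"connection refused"*)            echo "AI Gateway not accessible" ;;
    *"ssl certificate verify failed"*) echo "Certificate validation issues" ;;
    *404*)                             echo "Missing /v1 in model endpoint" ;;
    *401*)                             echo "Authentication/license issues" ;;
    *timeout*)                         echo "Network or model performance issues" ;;
    *)                                 echo "No known pattern matched" ;;
  esac
}
```

For example, likely_cause "$(sudo tail --lines=1 /var/log/gitlab/gitlab-rails/llm.log)" would classify the most recent LLM log entry.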
Quick Diagnostic Commands
AI Gateway Instance Commands:
1. Test AI Gateway health:
curl --silent --output /dev/null --write-out "%{http_code}" "<ai-gateway-url>/monitoring/healthz"
2. Check AI Gateway environment variables:
docker exec <ai-gateway-container> env | grep AIGW
3. Check AI Gateway logs for errors:
docker logs <ai-gateway-container> 2>&1 | grep --ignore-case error | tail --lines=20
GitLab Self-Managed Instance Commands:
4. Check user permissions (GitLab Rails console):
sudo gitlab-rails console
Then in the console:
User.find_by_id('<user_id>').can?(:access_code_suggestions)
5. Check GitLab LLM logs for errors:
sudo tail --lines=100 /var/log/gitlab/gitlab-rails/llm.log | grep --ignore-case error
6. Check feature flags:
sudo gitlab-rails console
Then:
Feature.enabled?(:expanded_ai_logging)
7. Test connectivity from GitLab to AI Gateway:
curl --verbose "<ai-gateway-url>/monitoring/healthz"
Emergency Diagnostic One-liner
For quick issue identification:
# Check all critical components at once
docker exec <ai-gateway-container> env | grep AIGW_CUSTOM_MODELS__ENABLED && \
curl --silent "<ai-gateway-url>/monitoring/healthz" && \
sudo tail --lines=10 /var/log/gitlab/gitlab-rails/llm.log | jq '.level'
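Because the one-liner chains its commands with &&, it stops at the first failing check and the later checks never run. If a full picture is more useful, each check can be run independently. The sketch below assumes a hypothetical helper named run_check, and the example wiring in the comments reuses the commands from the one-liner with the same placeholders.

```shell
# Hypothetical helper: runs a command, hides its output, and prints
# PASS or FAIL followed by the label, based on the exit status.
run_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS $label"
  else
    echo "FAIL $label"
  fi
}

# Example wiring (replace the placeholders before uncommenting):
# run_check "custom models enabled" sh -c 'docker exec <ai-gateway-container> env | grep -q AIGW_CUSTOM_MODELS__ENABLED=true'
# run_check "AI Gateway health" curl --silent --fail "<ai-gateway-url>/monitoring/healthz"
# run_check "llm.log readable" sudo tail --lines=10 /var/log/gitlab/gitlab-rails/llm.log
```

Each check prints its own result line, so a failure in the first check does not hide the state of the others.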
Escalation Criteria
Escalate to Custom Models team when:
- All basic troubleshooting steps completed without resolution
- Model integration issues that require deep technical knowledge
- Feature not listed in self-hosted models unit primitives
- Suspected GitLab Duo platform bugs affecting multiple users
- Performance issues with specific model configurations