Set up your self-hosted model infrastructure
DETAILS: Tier: For a limited time, Ultimate. On October 17, 2024, Ultimate with GitLab Duo Enterprise. Offering: Self-managed. Status: Beta.
- Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.
FLAG: The availability of this feature is controlled by a feature flag. For more information, see the history.
When you self-host the model, the AI Gateway, and the GitLab instance, no calls are made to external infrastructure, so all requests stay within your own environment.
To set up your self-hosted model infrastructure:
- Install the large language model (LLM) serving infrastructure.
- Configure your GitLab instance.
- Install the GitLab AI Gateway.
Install large language model serving infrastructure
Install one of the following GitLab-approved LLMs:
Model family | Model | Code completion | Code generation | GitLab Duo Chat |
---|---|---|---|---|
Mistral | Codestral 22B (see setup instructions) | {check-circle} Yes | {check-circle} Yes | {dotted-circle} No |
Mistral | Mistral 7B | {dotted-circle} No | {check-circle} Yes | {check-circle} Yes |
Mistral | Mixtral 8x22B | {dotted-circle} No | {check-circle} Yes | {check-circle} Yes |
Mistral | Mixtral 8x7B | {dotted-circle} No | {check-circle} Yes | {check-circle} Yes |
Mistral | Mistral 7B Text | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
Mistral | Mixtral 8x22B Text | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
Mistral | Mixtral 8x7B Text | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
Claude 3 | Claude 3.5 Sonnet | {dotted-circle} No | {check-circle} Yes | {check-circle} Yes |
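As a rough illustration of how the table above can be used to pick a model for a given feature, the following sketch encodes the approved models as a lookup. The feature labels (code_completion, code_generation, duo_chat) and the function name are hypothetical and chosen only for this example; they are not GitLab identifiers.

```python
# Capability matrix copied from the table of GitLab-approved models above.
# The feature labels are illustrative names, not GitLab configuration keys.
SUPPORTED_MODELS = {
    "Codestral 22B": {"code_completion", "code_generation"},
    "Mistral 7B": {"code_generation", "duo_chat"},
    "Mixtral 8x22B": {"code_generation", "duo_chat"},
    "Mixtral 8x7B": {"code_generation", "duo_chat"},
    "Mistral 7B Text": {"code_completion"},
    "Mixtral 8x22B Text": {"code_completion"},
    "Mixtral 8x7B Text": {"code_completion"},
    "Claude 3.5 Sonnet": {"code_generation", "duo_chat"},
}


def models_for(feature):
    """Return the approved models that support the given feature, sorted by name."""
    return sorted(
        name for name, features in SUPPORTED_MODELS.items() if feature in features
    )
```

For example, models_for("duo_chat") lists the approved models that can back GitLab Duo Chat.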
The following models are under evaluation, and support is limited:
Model family | Model | Code completion | Code generation | GitLab Duo Chat |
---|---|---|---|---|
CodeGemma | CodeGemma 2b | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
CodeGemma | CodeGemma 7b-it (Instruction) | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
CodeGemma | CodeGemma 7b-code (Code) | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
CodeLlama | Code-Llama 13b-code | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
CodeLlama | Code-Llama 13b | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
DeepSeekCoder | DeepSeek Coder 33b Instruct | {check-circle} Yes | {check-circle} Yes | {dotted-circle} No |
DeepSeekCoder | DeepSeek Coder 33b Base | {check-circle} Yes | {dotted-circle} No | {dotted-circle} No |
GPT | GPT-3.5-Turbo | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
GPT | GPT-4 | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
GPT | GPT-4 Turbo | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
GPT | GPT-4o | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
GPT | GPT-4o-mini | {dotted-circle} No | {check-circle} Yes | {dotted-circle} No |
Use a serving architecture
To host your models, you should use:
- For on-premises (non-cloud) deployments, vLLM.
- For cloud deployments, AWS Bedrock or Azure as the cloud provider.
Configure your GitLab instance
Prerequisites:
- Upgrade to the latest version of GitLab.
- The GitLab instance must be able to access the AI Gateway.
To configure your GitLab instance:
- Where your GitLab instance is installed, update the /etc/gitlab/gitlab.rb file.
  sudo vim /etc/gitlab/gitlab.rb
- Add and save the following environment variables.
  gitlab_rails['env'] = {
    'GITLAB_LICENSE_MODE' => 'production',
    'CUSTOMER_PORTAL_URL' => 'https://customers.gitlab.com',
    'AI_GATEWAY_URL' => '<path_to_your_ai_gateway>:<port>'
  }
- Run reconfigure:
  sudo gitlab-ctl reconfigure
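Because the GitLab instance must be able to access the AI Gateway, it can help to check connectivity from the GitLab host using the same value you set for AI_GATEWAY_URL. The following is a minimal sketch, not an official GitLab tool: the function names are hypothetical, and it only confirms that a TCP connection can be opened, not that the AI Gateway itself is healthy.

```python
import socket
from urllib.parse import urlsplit


def gateway_host_port(ai_gateway_url):
    """Split an AI_GATEWAY_URL value of the form <host>:<port> into (host, port)."""
    # Accept values with or without a scheme by giving urlsplit a netloc to parse.
    candidate = ai_gateway_url if "://" in ai_gateway_url else "//" + ai_gateway_url
    parts = urlsplit(candidate)
    if parts.hostname is None or parts.port is None:
        raise ValueError("expected <host>:<port>, got %r" % ai_gateway_url)
    return parts.hostname, parts.port


def gateway_reachable(ai_gateway_url, timeout=3.0):
    """Return True if a TCP connection to the AI Gateway address can be opened."""
    host, port = gateway_host_port(ai_gateway_url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run on the GitLab host, a call such as gateway_reachable("<path_to_your_ai_gateway>:<port>") with your real values returns False when the address cannot be resolved or the port is not accepting connections.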
GitLab AI Gateway
Install the GitLab AI Gateway.
Enable logging
Prerequisites:
- You must be an administrator for your self-managed instance.
To enable logging and access the logs, enable the feature flag:
Feature.enable(:expanded_ai_logging)
Disabling the feature flag stops logs from being written.
Logs in your GitLab installation
In your instance log directory, a file called llm.log is populated.
For more information on:
- Logged events and their properties, see the logged event documentation.
- How to rotate, manage, export, and visualize the logs in llm.log, see the log system documentation.
Logs in your AI Gateway container
To specify the location of logs generated by AI Gateway, run:
docker run -e AIGW_GITLAB_URL=<your_gitlab_instance> \
  -e AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/ \
  -e AIGW_LOGGING__TO_FILE="aigateway.log" \
  -v <your_file_path>:"aigateway.log" \
  <image>
If you do not specify a file name, logs are streamed to the output.
The output of the AI Gateway process can also be useful for debugging issues. To access it:
- When using Docker:
  docker logs <container-id>
- When using Kubernetes:
  kubectl logs <container-name>
To ingest these logs into your logging solution, see your logging provider's documentation.
Logs in your inference service provider
GitLab does not manage logs generated by your inference service provider. For information on how to use those logs, see your inference service provider's documentation.
Cross-referencing logs between AI Gateway and GitLab
The correlation_id property is assigned to every request and is carried across the different components that respond to that request. For more information, see the documentation on finding logs with a correlation ID.
The correlation ID is not available in your model provider logs.
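To illustrate cross-referencing, the following sketch collects the entries that share one correlation ID from several log files, for example llm.log and the AI Gateway log file. It assumes each log line is a single JSON object with a top-level correlation_id key; that format is an assumption for this example, so check your actual log output first. The function names are hypothetical.

```python
import json


def lines_with_correlation_id(log_lines, correlation_id):
    """Yield parsed log entries whose correlation_id matches the given value."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # Skip lines that are not valid JSON.
        if isinstance(entry, dict) and entry.get("correlation_id") == correlation_id:
            yield entry


def cross_reference(paths, correlation_id):
    """Map each log file path to its entries for the given correlation ID."""
    matches = {}
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            matches[path] = list(lines_with_correlation_id(handle, correlation_id))
    return matches
```

Calling cross_reference with the paths to your GitLab and AI Gateway log files and a correlation ID taken from one of them returns the matching entries per file, which makes it easier to follow one request across components.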
Troubleshooting
First, run the debugging scripts to verify your self-hosted model setup.
For more information on other actions to take, see the troubleshooting documentation.