Set up your self-hosted model infrastructure

DETAILS:
Tier: For a limited time, Ultimate. On October 17, 2024, Ultimate with GitLab Duo Enterprise.
Offering: Self-managed
Status: Beta

Introduced in GitLab 17.1 with a flag named ai_custom_model. Disabled by default.

FLAG: The availability of this feature is controlled by a feature flag. For more information, see the history.

When you self-host the model, the AI Gateway, and the GitLab instance, no calls are made to external infrastructure, so all requests stay inside your own environment.

To set up your self-hosted model infrastructure:

Install the large language model (LLM) serving infrastructure.
Configure your GitLab instance.
Install the GitLab AI Gateway.

Install large language model serving infrastructure

Install one of the following GitLab-approved large language models (LLMs):

Model family	Model	Code completion	Code generation	GitLab Duo Chat
Mistral	Codestral 22B (see setup instructions)	{check-circle} Yes	{check-circle} Yes	{dotted-circle} No
Mistral	Mistral 7B	{dotted-circle} No	{check-circle} Yes	{check-circle} Yes
Mistral	Mixtral 8x22B	{dotted-circle} No	{check-circle} Yes	{check-circle} Yes
Mistral	Mixtral 8x7B	{dotted-circle} No	{check-circle} Yes	{check-circle} Yes
Mistral	Mistral 7B Text	{check-circle} Yes	{dotted-circle} No	{dotted-circle} No
Mistral	Mixtral 8x22B Text	{check-circle} Yes	{dotted-circle} No	{dotted-circle} No
Mistral	Mixtral 8x7B Text	{check-circle} Yes	{dotted-circle} No	{dotted-circle} No
Claude 3	Claude 3.5 Sonnet	{dotted-circle} No	{check-circle} Yes	{check-circle} Yes

The following models are under evaluation, and support is limited:

Model family	Model	Code completion	Code generation	GitLab Duo Chat
CodeGemma	CodeGemma 2b	{check-circle} Yes	{dotted-circle} No	{dotted-circle} No
CodeGemma	CodeGemma 7b-it (Instruction)	{dotted-circle} No	{check-circle} Yes	{dotted-circle} No
CodeGemma	CodeGemma 7b-code (Code)	{check-circle} Yes	{dotted-circle} No	{dotted-circle} No
CodeLlama	Code-Llama 13b-code	{check-circle} Yes	{dotted-circle} No	{dotted-circle} No
CodeLlama	Code-Llama 13b	{dotted-circle} No	{check-circle} Yes	{dotted-circle} No
DeepSeekCoder	DeepSeek Coder 33b Instruct	{check-circle} Yes	{check-circle} Yes	{dotted-circle} No
DeepSeekCoder	DeepSeek Coder 33b Base	{check-circle} Yes	{dotted-circle} No	{dotted-circle} No
GPT	GPT-3.5-Turbo	{dotted-circle} No	{check-circle} Yes	{dotted-circle} No
GPT	GPT-4	{dotted-circle} No	{check-circle} Yes	{dotted-circle} No
GPT	GPT-4 Turbo	{dotted-circle} No	{check-circle} Yes	{dotted-circle} No
GPT	GPT-4o	{dotted-circle} No	{check-circle} Yes	{dotted-circle} No
GPT	GPT-4o-mini	{dotted-circle} No	{check-circle} Yes	{dotted-circle} No

Use a serving architecture

To host your models, you should use:

For non-cloud, on-premises deployments, vLLM.
For cloud deployments, AWS Bedrock or Azure as the cloud provider.

Configure your GitLab instance

Prerequisites:

Upgrade to the latest version of GitLab.

The GitLab instance must be able to access the AI Gateway.

On the server where your GitLab instance is installed, open the /etc/gitlab/gitlab.rb file for editing:
```
sudo vim /etc/gitlab/gitlab.rb
```

Add the following environment variables and save the file:

gitlab_rails['env'] = {
'GITLAB_LICENSE_MODE' => 'production',
'CUSTOMER_PORTAL_URL' => 'https://customers.gitlab.com',
'AI_GATEWAY_URL' => '<path_to_your_ai_gateway>:<port>'
}

Reconfigure GitLab to apply the changes:
```
sudo gitlab-ctl reconfigure
```
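
Before you run reconfigure, it can help to confirm that the AI_GATEWAY_URL placeholder was replaced with a real address. The following is a minimal sketch, not a GitLab command: check_ai_gateway_url is a hypothetical helper, and it assumes the variable is written on a single line in the same quoting style as the example above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with GitLab): checks that gitlab.rb
# sets AI_GATEWAY_URL and that the documented placeholder was replaced.
check_ai_gateway_url() {
  config_file="${1:-/etc/gitlab/gitlab.rb}"

  # Take the first non-comment line that mentions 'AI_GATEWAY_URL'.
  line=$(grep -v '^[[:space:]]*#' "$config_file" \
    | grep -F "'AI_GATEWAY_URL'" \
    | head -n 1)

  if [ -z "$line" ]; then
    echo "AI_GATEWAY_URL is not set in $config_file" >&2
    return 1
  fi

  # The placeholder text from the documentation must not remain.
  case "$line" in
    *'<path_to_your_ai_gateway>'*|*'<port>'*)
      echo "AI_GATEWAY_URL still contains a placeholder" >&2
      return 2
      ;;
  esac

  # Print the matching line so that you can review the value.
  echo "$line"
}
```

For example, `sudo sh -c '. ./check.sh; check_ai_gateway_url'` (where check.sh is an assumed file name for the snippet) prints the configured line, or exits with a non-zero status if the variable is missing or still a placeholder.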

GitLab AI Gateway

Install the GitLab AI Gateway.

Enable logging

Prerequisites:

You must be an administrator for your self-managed instance.

To enable logging and access the logs, enable the expanded_ai_logging feature flag from the Rails console:

Feature.enable(:expanded_ai_logging)

Disabling the feature flag stops logs from being written.

Logs in your GitLab installation

GitLab writes these logs to a file called llm.log in your instance log directory.

For more information on:

Logged events and their properties, see the logged event documentation.
How to rotate, manage, export and visualize the logs in llm.log, see the log system documentation.
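
To confirm that entries are being written after you enable the feature flag, you can inspect the end of the file. The sketch below is an illustration only: show_recent_llm_log is a hypothetical helper, and the default path /var/log/gitlab/gitlab-rails/llm.log is an assumption based on the usual Linux package log layout. Adjust it for your installation.

```shell
#!/bin/sh
# Hypothetical helper: prints the most recent entries from llm.log.
# The default path is an assumption; pass your own path as the first argument.
show_recent_llm_log() {
  log_file="${1:-/var/log/gitlab/gitlab-rails/llm.log}"
  count="${2:-20}"

  # -s is true only if the file exists and is not empty.
  if [ ! -s "$log_file" ]; then
    echo "No entries in $log_file (is expanded_ai_logging enabled?)" >&2
    return 1
  fi

  tail -n "$count" "$log_file"
}
```

Calling `show_recent_llm_log /path/to/llm.log 5` prints the last five entries. A non-zero exit status means the file is missing or empty.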

Logs in your AI Gateway container

To specify the location of logs generated by AI Gateway, run:

docker run -e AIGW_GITLAB_URL=<your_gitlab_instance> \
 -e AIGW_GITLAB_API_URL=https://<your_gitlab_domain>/api/v4/ \
 -e AIGW_LOGGING__TO_FILE="aigateway.log" \
 -v <your_file_path>:"aigateway.log" \
 <image>

If you do not specify a file name, logs are streamed to the output.

The output of the AI Gateway process can also be useful for debugging issues. To access it:

When using Docker:
```
docker logs <container-id>
```
When using Kubernetes:
```
kubectl logs <pod-name>
```

To ingest these logs into your logging solution, see your logging provider's documentation.

Logs in your inference service provider

GitLab does not manage logs generated by your inference service provider. For information on how to use those logs, see your inference service provider's documentation.

Cross-referencing logs between AI Gateway and GitLab

The property correlation_id is assigned to every request and is carried across different components that respond to a request. For more information, see the documentation on finding logs with a correlation ID.

The correlation ID is not available in your model provider logs.
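
As a rough illustration of cross-referencing, the sketch below searches several log files for one correlation ID and prefixes each match with the file it came from. find_by_correlation_id is a hypothetical helper, not a GitLab tool. It assumes only that the ID appears as literal text in each log entry, and the file paths in the usage note are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: lists every log entry that contains a correlation ID.
# Usage: find_by_correlation_id <correlation_id> <log_file>...
find_by_correlation_id() {
  correlation_id="$1"

  if [ -z "$correlation_id" ] || [ "$#" -lt 2 ]; then
    echo "usage: find_by_correlation_id <correlation_id> <log_file>..." >&2
    return 64
  fi
  shift

  # -F treats the ID as a literal string, not a regular expression.
  # Adding /dev/null makes grep print the file name even for a single file.
  grep -F -- "$correlation_id" "$@" /dev/null
}
```

For example, `find_by_correlation_id <correlation_id> <path_to_llm.log> <your_file_path>` shows matching entries from both the GitLab log and the AI Gateway log. The function returns 1 when no entry matches.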

Troubleshooting

First, run the debugging scripts to verify your self-hosted model setup.

For more information on other actions to take, see the troubleshooting documentation.