
How to configure the VT Writer LLM server

Deployment Overview

The VisibleThread LLM is a general-purpose large language model that can be deployed internally for use with the VT Writer AI application. The service uses Ollama to run the Llama 3 8B language model.

 

Deployment Options

The VT LLM service has been tested on both Red Hat Enterprise Linux 8 (RHEL 8) and Ubuntu 20.04. Deployment options for Windows Server are currently under evaluation.

VisibleThread LLM requires infrastructure with access to a GPU and compatibility with NVIDIA CUDA drivers. Below are the high-level steps for deploying the LLM, followed by specific examples for Azure and AWS environments.

 

Prerequisites

You must have a running RHEL 8 or Ubuntu server with a compatible GPU. The server must have internet access to install Ollama as described here: https://github.com/ollama/ollama?tab=readme-ov-file
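Before installing drivers, you can confirm the server actually exposes an NVIDIA GPU. A minimal check (assumes the pciutils package, which provides lspci, is available; the install commands below are only needed on minimal images):

# confirm an NVIDIA GPU is visible on the PCI bus
lspci | grep -i nvidia

# if lspci is missing, install pciutils first
sudo yum install -y pciutils # RHEL 8
sudo apt-get install -y pciutils # Ubuntu 20.04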

 

General Deployment Steps

1. Download and install the correct NVIDIA drivers for your OS/hardware. Detailed steps for installing NVIDIA drivers are here: https://docs.nvidia.com/datacenter/tesla/tesla-installation-notes/index.html

2. Download and configure Ollama and pull the required model (the Azure example below pulls Llama 3; the AWS example pulls Mistral).

 

Deployment on Azure

Deployment on Azure was tested on an Azure Standard_NC4as_T4_v3 Virtual Machine.

 

Steps to Deploy on Azure

1. Provision an Azure Standard_NC4as_T4_v3 instance running RHEL 8.8 or Ubuntu 20.04 (see the Azure CLI sketch after this list).
2. Ensure access to the instance on:
- Port 22 (SSH)
- Port 11434 (HTTP for Ollama)
3. Install the required NVIDIA drivers and Ollama.
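If you provision from the command line rather than the Azure portal, a minimal Azure CLI sketch might look like this. The resource group, VM name, and Ubuntu 20.04 image URN are illustrative assumptions; substitute your own values:

# create the VM (illustrative names; Ubuntu 20.04 Gen2 image URN assumed)
az vm create \
  --resource-group vt-llm-rg \
  --name vt-llm-vm \
  --size Standard_NC4as_T4_v3 \
  --image Canonical:0001-com-ubuntu-server-focal:20_04-lts-gen2:latest \
  --admin-username azureuser \
  --generate-ssh-keys

# open the SSH and Ollama ports (each rule needs a distinct priority)
az vm open-port --resource-group vt-llm-rg --name vt-llm-vm --port 22 --priority 1000
az vm open-port --resource-group vt-llm-rg --name vt-llm-vm --port 11434 --priority 1010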

 

Installing GPU Drivers on RHEL 8

# Step 1: Configure GPU drivers
sudo rpm -Uvh https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo yum install -y dkms
sudo wget https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo -O /etc/yum.repos.d/cuda-rhel8.repo
sudo yum install -y cuda-drivers
# a reboot may be required before the new driver loads
sudo reboot

# Verify drivers are installed (nvidia-smi reports the driver version and GPU;
# nvcc is only available if the full CUDA toolkit is installed)
nvidia-smi

# Step 2: Install and configure Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama run llama3 # downloads the model on first run (4+ GB), then starts an interactive session

# Step 3: Configure the Ollama service to listen on all interfaces (port 11434)
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo bash -c 'echo -e "[Service]\nEnvironment=\"OLLAMA_HOST=0.0.0.0:11434\"" > /etc/systemd/system/ollama.service.d/environment.conf'
sudo systemctl daemon-reload
sudo systemctl restart ollama

# follow the service logs to confirm a clean start
journalctl -f -u ollama -n 1000 --no-pager
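Once the service is restarted, you can confirm Ollama is listening before testing from another machine. A quick local check (the /api/tags endpoint lists the models Ollama has downloaded):

# should return a JSON list that includes the model you pulled (e.g. llama3)
curl http://localhost:11434/api/tags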


Deployment on AWS

1. Provision an EC2 instance from the Amazon/Deep Learning OSS Nvidia Driver AMI GPU TensorFlow 2.15 (Amazon Linux 2) 20240213 AMI (see the AWS CLI sketch after the commands below).
2. The instance type should be g4dn.2xlarge.
3. Ensure access to the instance on:
- Port 22 (SSH)
- Port 11434 (HTTP for Ollama)
4. SSH into the instance and run the following:

 

# GPU drivers are automatically installed on the AMI; to ensure they are
# up to date, perform an update and reboot
sudo yum update -y
sudo reboot

# verify drivers are running
nvcc --version

# install and configure Ollama
curl -fsSL https://ollama.com/install.sh | sh

# configure Ollama
ollama run mistral # this will download the model, which will take some time as it is 4+ GB

# create the service
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo bash -c 'echo -e "[Service]\nEnvironment=\"OLLAMA_HOST=0.0.0.0:11434\"" > /etc/systemd/system/ollama.service.d/environment.conf'
sudo systemctl daemon-reload

# restart the service
sudo systemctl restart ollama

# check logs
journalctl -f -u ollama -n 1000 --no-pager
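If you prefer to provision from the command line, a minimal AWS CLI sketch might look like the following. The AMI ID, key pair, and security group are placeholders; the security group must allow inbound TCP 22 and 11434:

# launch the instance (replace the placeholder AMI ID, key pair, and security group)
aws ec2 run-instances \
  --image-id <ami-id-of-the-deep-learning-ami> \
  --instance-type g4dn.2xlarge \
  --key-name <your-key-pair> \
  --security-group-ids <sg-allowing-22-and-11434> \
  --count 1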

 

Deployment on Windows

Deployment on Windows is via the Ollama Windows installer: https://ollama.com/blog/windows-preview

This is currently being evaluated by the VisibleThread engineering team. Contact our support team at support@visiblethread.com if you wish to deploy the VisibleThread LLM on Windows.


Verifying the install

To verify the deployment was successful, run the following from a different machine. Set the "model" field to the model you pulled (mistral in the AWS example above; llama3 in the Azure example):

# on Linux
curl http://<ip address>:11434/api/generate -d '{"model":"mistral","system":"","prompt":"","template":""}'

 

You should receive a response similar to:

 

{"model":"mistral","created_at":"2024-03- 05T14:47:21.370491701Z","response":"","done":true}

Get Additional Help

Visit our Helpdesk for additional help and support.