Hands-On Tutorial: ollama and Docker Integration

Introduction

As you may know, Ollama is an easy-to-use tool for running LLMs locally. It lets you run and experiment with the latest models with very little effort, using simple commands such as the following (a quick sanity check against the local API follows the list):

  • ollama run llama3

  • ollama ps / ollama list

  • ollama serve
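
Once ollama serve is running, you can also talk to the local REST API directly. A minimal sketch, assuming the default port 11434 and that the llama3 model has already been pulled:

# List the models available locally
curl http://localhost:11434/api/tags

# Ask llama3 a question through the REST API
curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'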

What's even more exciting is that it's now officially available on Docker. Here is the news.

And now you can use Docker to combine the popular open-webui with Ollama and build an easy-to-use, modern chatbot that keeps your data under your own control.

Prerequisites

  • Install Ollama

  • Ensure a good network connection; model files are usually several gigabytes, so bandwidth of around 100 Mbps is recommended

  • [Optional] Install Docker

DIY

Installing Ollama and running it through the command line on macOS is very easy.

# Install via Homebrew on macOS or use the setup package for Windows
brew install ollama

ollama run llama3
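
If you prefer to download the model before chatting, you can pull it first and confirm it is installed; llama3 here is just an example tag:

# Download the model without starting a chat session
ollama pull llama3

# Confirm it shows up in the local model list
ollama list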

Running Ollama on Docker with Open-WebUI

# Run Ollama with GPU support, persisting models to D:\repos\ollama and exposing the API on port 11434
docker run -d --gpus=all -v 'D:\repos\ollama:/root/.ollama/models' -p 11434:11434 --name ollama ollama/ollama

# Run open-webui on port 3000; --add-host lets it reach the Ollama API published on the host
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v D:\repos\open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main
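
With both containers up, you still need at least one model inside the Ollama container. A minimal sketch, assuming the container name ollama from the command above:

# Pull a model inside the running Ollama container
docker exec -it ollama ollama pull llama3

# Check that both containers are running and inspect the logs if anything looks off
docker ps
docker logs ollama

Open-webui should then be reachable at http://localhost:3000 and find the Ollama API via host.docker.internal:11434, which is what the --add-host flag is for.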

On Windows, you can choose where the native Ollama install stores its models by setting the OLLAMA_MODELS system environment variable.
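
For example, from a command prompt (the folder below simply reuses the D:\repos\ollama path from the Docker command and is only an illustration):

# Persistently point Ollama at a custom model folder (restart the Ollama app afterwards)
setx OLLAMA_MODELS "D:\repos\ollama"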

End

Please note that running on Docker seems less efficient. GPU acceleration does work (NVIDIA 4070 Ti, 12 GB), but switching and initializing models is very slow, whereas running the same 7B llama3 model directly on the host is super fast on my machine.

Also, GPU acceleration inside Docker is not available on Apple Silicon.