Configure Ollama
Introduction
When running Ollama on a server, you may need to configure various aspects of the deployment, from changing the model storage location to adjusting performance parameters. This guide covers common server-side administration tasks:
- Understanding Ollama environment variables and configuration options
- Changing the model storage location (useful for disk space management)
- Configuring Ollama via systemd service files
- Migrating existing models to new storage locations
For information on connecting to a remote Ollama instance from your local machine, see Connecting to Remote Ollama Servers with SSH Tunneling.
Show the available environment variables and their defaults:
ollama serve --help
Start ollama
Usage:
ollama serve [flags]
Aliases:
serve, start
Flags:
-h, --help help for serve
Environment Variables:
OLLAMA_DEBUG Show additional debug information (e.g. OLLAMA_DEBUG=1)
OLLAMA_HOST IP Address for the ollama server (default 127.0.0.1:11434)
OLLAMA_CONTEXT_LENGTH Context length to use unless otherwise specified (default: 4096)
OLLAMA_KEEP_ALIVE The duration that models stay loaded in memory (default "5m")
OLLAMA_MAX_LOADED_MODELS Maximum number of loaded models per GPU
OLLAMA_MAX_QUEUE Maximum number of queued requests
OLLAMA_MODELS The path to the models directory
OLLAMA_NUM_PARALLEL Maximum number of parallel requests
OLLAMA_NOPRUNE Do not prune model blobs on startup
OLLAMA_ORIGINS A comma separated list of allowed origins
OLLAMA_SCHED_SPREAD Always schedule model across all GPUs
OLLAMA_FLASH_ATTENTION Enabled flash attention
OLLAMA_KV_CACHE_TYPE Quantization type for the K/V cache (default: f16)
OLLAMA_LLM_LIBRARY Set LLM library to bypass autodetection
OLLAMA_GPU_OVERHEAD Reserve a portion of VRAM per GPU (bytes)
OLLAMA_LOAD_TIMEOUT How long to allow model loads to stall before giving up (default "5m")
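These variables are read when the server process starts. As a quick illustration (the values below are examples only, not recommendations), they can be set inline when running the server by hand:
# Example: listen on all interfaces and keep models loaded for 10 minutes
OLLAMA_HOST=0.0.0.0:11434 OLLAMA_KEEP_ALIVE=10m ollama serve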
Check the size of the current model folder:
sudo du -h -d 1 /usr/share/ollama/.ollama/models
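If you are planning to move the models, it also helps to check the free space on the candidate destination; the path below is just a placeholder for wherever the new models directory will live:
# Check free space on the target filesystem (example path)
df -h /data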
Why Change the Model Storage Location?
You might want to change Ollama’s model storage location for several reasons:
- Disk space: Models can be large (7B models ~4GB, 70B models ~40GB+)
- Performance: Move models to faster storage (NVMe SSD vs HDD)
- Organization: Separate system and data partitions
Edit the systemd Service File
sudo vi /etc/systemd/system/ollama.service
Add the following line under the [Service] section to point to the new location:
Environment="OLLAMA_MODELS=/your/desired/path"
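For reference, the [Service] section might then look roughly like the sketch below; the surrounding directives vary between installs, and /data/ollama/models is only a placeholder path:
[Service]
# ...existing directives such as ExecStart, User=ollama, Group=ollama...
Environment="OLLAMA_MODELS=/data/ollama/models"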
Make sure to change the ownership of the new path so the ollama user can access it:
sudo chown ollama:ollama <new model path>
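For example, with a hypothetical destination of /data/ollama/models:
# Create the directory if it does not exist yet, then hand it to the ollama user
sudo mkdir -p /data/ollama/models
sudo chown -R ollama:ollama /data/ollama/models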
Then copy the existing models to the new location (rsync leaves the originals in place):
sudo -u ollama rsync -av --ignore-existing /usr/share/ollama/.ollama/models/ <new model path>/
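As a rough sanity check (using the same placeholder path), compare the size of the source and destination after the copy:
# The two totals should be close once the copy has finished
sudo du -sh /usr/share/ollama/.ollama/models /data/ollama/models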
Reload systemd and restart the service:
sudo systemctl daemon-reload
sudo systemctl restart ollama
Double-check that the service is running and that the models are still listed:
systemctl status ollama
ollama list
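To confirm the new models path actually took effect, you can also inspect the environment systemd passes to the service, or skim the startup logs (exact log content varies by Ollama version):
# Show the Environment= values applied to the unit
systemctl show ollama --property=Environment
# Review recent service logs for the reported configuration
journalctl -u ollama -n 50 --no-pager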
Verify Models Work After Migration
After moving models, test that they’re accessible and functioning properly:
# List all available models
ollama list
# Test a model with a simple prompt
ollama run <model-name> "test prompt"
If the models are listed and respond correctly, the migration was successful!
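Once you are satisfied that everything works, you can optionally reclaim the space used by the old location. This is destructive, so double-check the path and keep a backup if the models are hard to re-download:
# Remove the old models directory only after verifying the migration
sudo rm -rf /usr/share/ollama/.ollama/models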
