Ollama Security Hardening: Practical Guide for Cloud Deployments
Introduction
Ollama makes running large language models locally simple, but a default setup can leave your cloud VM wide open to abuse, prompt injection, or supply-chain attacks. This guide delivers a concise, developer-focused checklist for hardening Ollama, blocking common attack vectors, and keeping your model deployment secure.
Threat Model: What Are You Defending Against?
- Unauthenticated access (the API on port 11434 has no built-in auth; binding to 0.0.0.0, common in cloud setups, exposes it publicly)
- Prompt injection and model abuse
- Data exfiltration via prompts
- Lateral movement after VM compromise
- Supply-chain risk from arbitrary model pulls
- Denial of service (DoS) via resource exhaustion
1. Network Exposure: First Kill Switch
Lock Ollama to localhost
Bind Ollama to 127.0.0.1 so it’s not externally reachable.
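A minimal way to do this for a foreground run is via the `OLLAMA_HOST` environment variable (sketch; adjust the port if you have changed it):

```shell
# Bind the API to loopback only; external hosts cannot reach it
OLLAMA_HOST=127.0.0.1:11434 ollama serve

# Verify from the VM itself that the API answers locally
curl http://127.0.0.1:11434/api/version
```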
For persistent setup with systemd:
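On a systemd-managed install, a drop-in override makes the binding survive restarts (the unit name `ollama.service` matches the official Linux installer; adjust if yours differs):

```ini
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_HOST=127.0.0.1:11434"
```

Apply it with `sudo systemctl daemon-reload && sudo systemctl restart ollama`.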
Firewall Rules
Block external access to the port even if Ollama is bound locally, as defense in depth.
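For example, with UFW or raw iptables (pick one; both are sketches to adapt to your existing ruleset):

```shell
# UFW: deny all inbound TCP to 11434
sudo ufw deny in to any port 11434 proto tcp

# Or iptables: drop packets to 11434 unless they arrive on loopback
sudo iptables -A INPUT -p tcp --dport 11434 ! -i lo -j DROP
```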
For cloud providers, never expose port 11434 publicly via security groups.
2. Reverse Proxy: Authentication and TLS
Put Nginx or Caddy in front to add TLS and authentication.
Nginx Example with Basic Auth
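A minimal Nginx server block along these lines (hostname and certificate paths are illustrative; create the htpasswd file with `htpasswd -c /etc/nginx/.htpasswd youruser`):

```nginx
server {
    listen 443 ssl;
    server_name ollama.example.com;  # hypothetical hostname

    ssl_certificate     /etc/letsencrypt/live/ollama.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ollama.example.com/privkey.pem;

    location / {
        auth_basic           "Ollama API";
        auth_basic_user_file /etc/nginx/.htpasswd;
        proxy_pass           http://127.0.0.1:11434;
        proxy_set_header     Host $host;
        proxy_read_timeout   300s;  # model generation can be slow
    }
}
```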
API Key Authentication
Add token-based authentication:
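One lightweight approach is to check a bearer token at the proxy before forwarding (the token value below is a placeholder; generate your own long random secret):

```nginx
location / {
    # Reject requests without the expected Authorization header
    if ($http_authorization != "Bearer CHANGE_ME_LONG_RANDOM_TOKEN") {
        return 401;
    }
    proxy_pass http://127.0.0.1:11434;
}
```

Clients then call the API with `Authorization: Bearer <token>`. For multiple keys or rotation, an auth service behind `auth_request` scales better than `if` checks.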
3. OS-Level Hardening
Run Ollama as a Non-Root User
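For instance, create a locked-down system account (note: the official Linux install script already creates an `ollama` user, so this step may only be needed for manual installs):

```shell
# System account: no home directory, no login shell
sudo useradd --system --no-create-home --shell /usr/sbin/nologin ollama
```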
Systemd Sandboxing
Increase isolation with systemd options:
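A sandboxing drop-in might look like the following (the `ReadWritePaths` value assumes the default model directory of the Linux installer; adjust for your layout, and test carefully since strict sandboxing can break GPU access):

```ini
# /etc/systemd/system/ollama.service.d/hardening.conf
[Service]
User=ollama
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
ReadWritePaths=/usr/share/ollama/.ollama
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
```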
File Permissions
Limit access to Ollama data:
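For example, restrict the model store to the service account (path assumes the default Linux install location):

```shell
# Only the ollama user can read or modify downloaded models
sudo chown -R ollama:ollama /usr/share/ollama/.ollama
sudo chmod -R 700 /usr/share/ollama/.ollama
```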
4. Model Supply Chain Control
Verify Models Before Pulling
Don’t blindly pull models. Pin exact versions, verify sources, and mirror internally if possible.
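In practice that means pulling an explicit tag instead of a mutable `latest`, then recording what you got (model name is illustrative):

```shell
# Pin an exact tag
ollama pull llama3.1:8b

# Record the model ID/digest and Modelfile so later pulls can be diffed
ollama list
ollama show llama3.1:8b --modelfile
```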
Airgap for Sensitive Environments
- Download models once
- Serve from internal registry
- Block outbound internet
5. Prompt Injection and Abuse Mitigation
Ollama offers no built-in prompt safety. Add middleware to filter risky prompts.
Middleware Example
Wrap API requests with FastAPI or Flask:
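A minimal sketch of the filtering logic, framework-agnostic so it can sit inside a FastAPI or Flask route before the request is proxied to `http://127.0.0.1:11434`. The patterns and size limit below are illustrative; tune them to your threat model:

```python
import re

# Hypothetical denylist of obviously risky content
RISKY_PATTERNS = [
    re.compile(r"ignore (all|any|previous) instructions", re.IGNORECASE),
    re.compile(r"/etc/passwd|id_rsa|\.ssh/", re.IGNORECASE),
    re.compile(r"BEGIN (RSA|OPENSSH) PRIVATE KEY"),
]

MAX_PROMPT_CHARS = 8000  # reject oversized prompts outright


def check_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason). Call before forwarding to Ollama."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False, "prompt too long"
    for pattern in RISKY_PATTERNS:
        if pattern.search(prompt):
            return False, f"matched risky pattern: {pattern.pattern}"
    return True, "ok"
```

In a FastAPI route you would call `check_prompt()` on the incoming body and return HTTP 400 when it is rejected, only forwarding allowed prompts upstream. Denylists are easy to bypass, so treat this as one layer, not a complete defense.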
Output Filtering
- Regex-based filters
- Classification models
- Response allowlists
6. Logging and Monitoring
Enable Proxy-Level Logging
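With Nginx in front, a dedicated log format captures the essentials (goes in the `http` block; the `server` block then references it):

```nginx
# Client IP, request line, status, size, and total request time
log_format ollama_api '$remote_addr [$time_local] "$request" '
                      '$status $body_bytes_sent $request_time';

server {
    access_log /var/log/nginx/ollama_access.log ollama_api;
    # ... proxy configuration from the earlier example ...
}
```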
Monitor
- Request rate
- Prompt length spikes
- Unusual token activity
Fail2ban for Basic Protection
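Assuming the Nginx basic-auth setup above, fail2ban's stock `nginx-http-auth` jail bans IPs that repeatedly fail authentication:

```ini
# /etc/fail2ban/jail.d/ollama.conf
[nginx-http-auth]
enabled  = true
port     = http,https
logpath  = /var/log/nginx/error.log
maxretry = 5
bantime  = 3600
```

Reload with `sudo systemctl restart fail2ban` and confirm via `sudo fail2ban-client status nginx-http-auth`.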
7. Resource Controls: Prevent DoS
Limit CPU/Memory Usage
Temporary:
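One option for a one-off run is a transient systemd scope with hard caps (limits are illustrative; size them to your VM):

```shell
# Cap at 8 GiB RAM and 4 CPU cores for this invocation only
sudo systemd-run --scope -p MemoryMax=8G -p CPUQuota=400% \
    env OLLAMA_HOST=127.0.0.1:11434 ollama serve
```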
Persistent (systemd):
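The same caps can live in the service override so they apply on every boot:

```ini
# /etc/systemd/system/ollama.service.d/limits.conf
[Service]
MemoryMax=8G
CPUQuota=400%
TasksMax=256
```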
8. Isolation Strategy
Option A: Docker
Run Ollama fully isolated.
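A sketch using the official image, publishing the port on loopback only so it stays proxy-accessible but not internet-facing (flags are a starting point; GPU passthrough needs extra options):

```shell
docker run -d --name ollama \
  -p 127.0.0.1:11434:11434 \
  -v ollama_data:/root/.ollama \
  --memory=8g --cpus=4 \
  --security-opt no-new-privileges:true \
  --restart unless-stopped \
  ollama/ollama
```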
Option B: Dedicated VPC Subnet
- No internet egress
- Only proxy can access Ollama
9. Red-Team Yourself: Quick Security Checks
Test your setup:
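For example, from a machine outside the VM (`YOUR_VM_IP` and the proxy hostname are placeholders):

```shell
# Direct API access should time out or be refused
curl -m 5 http://YOUR_VM_IP:11434/api/tags
nmap -p 11434 YOUR_VM_IP

# Through the proxy without credentials: expect HTTP 401
curl -i https://ollama.example.com/api/tags
```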
Try:
- Large prompt DoS
- Jailbreak prompts
- File exfiltration attempts
Minimal Secure Setup: 80/20 Rule
- Bind Ollama to localhost
- Add Nginx with TLS + authentication
- Block port 11434 externally
- Run as non-root
- Enable request logging
This covers most attack vectors with minimal effort.
Advanced Security: For Production
- Mutual TLS (mTLS) between proxy and backend
- OAuth2 proxy integration (Google, Okta)
- WAF rules (Cloudflare, AWS WAF)
- Rate limiting (Nginx):
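A simple Nginx rate-limit sketch, keyed per client IP (zone name and rate are illustrative; the `limit_req_zone` line belongs in the `http` block):

```nginx
limit_req_zone $binary_remote_addr zone=ollama_rl:10m rate=10r/m;

server {
    location / {
        limit_req zone=ollama_rl burst=5 nodelay;
        proxy_pass http://127.0.0.1:11434;
    }
}
```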
Conclusion
Securing Ollama on a cloud VM is straightforward if you follow this actionable checklist: lock down network exposure, enforce authentication and TLS, sandbox your process, control the model supply chain, and monitor for abuse. For mission-critical or public-facing deployments, add advanced isolation, authentication, and rate limiting. Harden now, before your deployment becomes someone else's testbed.