Ollama
CLI and API for running open models locally with simple pull-and-run commands.
Quick facts
- Price model: Open source
- Starting price: Free
- Best for: Developers · Local API endpoints · Model experimentation
- Replaces: OpenAI API for local tasks, hosted LLM sandboxes
- Platforms: Mac, Windows, Linux
- Last verified: 2026-06-22
Why it's listed
The standard way to wire local models into your toolchain without cloud API invoices.
Ollama packages models into easy installs (`ollama run llama3`) and exposes a local API for apps like Continue, Open WebUI, and custom scripts. No token billing—just your electricity and hardware.
The catch
Command-line first; less hand-holding than LM Studio for non-technical users.
How to set up Ollama
Minimum path to a working local model you can chat with in the terminal and wire into other apps—no OpenAI invoice.
- Time: 20–30 min
- Difficulty: Moderate
- Verified: 2026-06-22
Before you start
- 8GB RAM minimum; 16GB+ recommended for useful models
- macOS, Windows, or Linux
- ~5–10 GB free disk per model you pull
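To check the disk requirement before pulling anything, a short script can compare free space against the per-model estimate above. This is a minimal sketch: the helper names are hypothetical, and the 10 GB default is simply the upper end of the estimate in this guide, not a figure published by Ollama.

```python
import shutil
from pathlib import Path


def free_gb(path: Path) -> float:
    """Free space on the volume that contains `path`, in decimal gigabytes."""
    return shutil.disk_usage(path).free / 1e9


def models_that_fit(free_space_gb: float, per_model_gb: float = 10.0) -> int:
    """Number of models of the assumed size that fit in the given free space."""
    if per_model_gb <= 0:
        raise ValueError("per_model_gb must be positive")
    return int(free_space_gb // per_model_gb)
```

Calling `models_that_fit(free_gb(Path.home()))` gives a rough count for the volume holding your home directory. That is an assumption about where models are stored, so pass a different path if your install keeps them elsewhere.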
Install Ollama
Download the installer from ollama.com for your OS and run it. On Mac/Linux you can also use: curl -fsSL https://ollama.com/install.sh | sh. Confirm it works: ollama --version
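If you want a setup script to confirm the install rather than checking by hand, the version check can be wrapped as below. This is a sketch only: `ollama_version` is a hypothetical helper name, and it assumes the `ollama` binary is on your PATH after installation.

```python
import shutil
import subprocess
from typing import Optional


def ollama_version(binary: str = "ollama") -> Optional[str]:
    """Return the output of `<binary> --version`, or None if the binary is not on PATH."""
    path = shutil.which(binary)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        capture_output=True,
        text=True,
        timeout=10,
    )
    # Some tools print version info to stderr, so fall back to it if stdout is empty.
    return result.stdout.strip() or result.stderr.strip()
```

A return value of `None` means the shell cannot find the binary, which usually points to an incomplete install or a terminal that was opened before the installer finished.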
Pull your first model
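Start small: ollama pull llama3.2:3b (fast on laptops). For better answers, pull a larger model such as ollama pull llama3.1:8b, which needs more RAM. The untagged ollama pull llama3.2 resolves to the same 3B model, so it is not a step up. Each model downloads once, with no per-message fee.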
Chat in the terminal
Run ollama run llama3.2:3b and ask a real question. Type /bye to exit. If responses are gibberish or the model hangs, it is probably too large for your RAM; pull a smaller tag.
Confirm the local API
Ollama serves http://localhost:11434. Test: curl http://localhost:11434/api/tags — you should see your pulled models listed. This endpoint is what Continue, Open WebUI, and scripts connect to.
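The same check can be made from a script. The sketch below uses only the Python standard library and assumes the default address given above; the function names are illustrative, and the parsing relies on the response carrying a `models` list whose entries have a `name` field.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address from this guide


def parse_model_names(payload: dict) -> list[str]:
    """Extract model names from a parsed /api/tags response body."""
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Ask the local Ollama server which models have been pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return parse_model_names(json.load(resp))
```

With Ollama running, `list_local_models()` should return the tags you pulled. A connection error at this point usually means the server is not running.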
Optional — add a chat UI
If you want a browser interface, install Open WebUI (listed in our directory) and point it at localhost:11434. Keep Ollama running in the background.
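If you would rather wire Ollama into a script than a browser UI, a single request to the local API is enough. The following is a minimal sketch, not an official client: it assumes the default address, a model you have already pulled, and the non-streaming form of the `/api/generate` endpoint. The helper names are hypothetical.

```python
import json
import urllib.request


def build_generate_request(
    model: str,
    prompt: str,
    base_url: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]
```

For example, `generate("llama3.2:3b", "Explain DNS in one sentence.")` returns the model's reply as a string. The generous timeout is a guess to cover the first call, when the model still has to load into memory.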
Troubleshooting
- Model download fails or is very slow: Check disk space and retry. On corporate networks, try off-hours or a home connection.
- Runs but answers are empty or cut off: RAM is the likely bottleneck. Pull a smaller model (e.g. the :3b tag) or close other apps.
- Port 11434 already in use: Another Ollama instance is probably running. Quit the duplicate, or set the OLLAMA_HOST environment variable to a different port (for example 127.0.0.1:11435; see the official docs).
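To tell whether something is already listening on the Ollama port, you can probe it from Python. This is a rough sketch with a hypothetical helper name: it only reports that a TCP listener answered on the port, not that the listener is Ollama.

```python
import socket


def port_in_use(port: int = 11434, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP listener accepts connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

If `port_in_use()` is true before you start Ollama yourself, a background instance or another program already holds the port.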
Keep it working
- ollama pull <model> periodically to update weights—still no subscription
- Remove unused models: ollama rm <name> to reclaim disk
- Load models on boot only if you use them daily; otherwise leave them unloaded to save RAM
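To decide which models to remove, it helps to see which ones take the most disk. The sketch below ranks models by the `size` field (in bytes) reported by the local `/api/tags` endpoint. The function names are hypothetical, and it assumes Ollama is serving on its default address.

```python
import json
import urllib.request


def fetch_tags(base_url: str = "http://localhost:11434", timeout: float = 5.0) -> dict:
    """Fetch the parsed /api/tags response from the local Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)


def largest_models(payload: dict, top: int = 3) -> list[tuple[str, float]]:
    """Return (name, size in GB) for the `top` largest models, biggest first."""
    models = sorted(
        payload.get("models", []),
        key=lambda model: model.get("size", 0),
        reverse=True,
    )
    return [(m["name"], round(m.get("size", 0) / 1e9, 2)) for m in models[:top]]
```

Running `largest_models(fetch_tags())` lists the best candidates for `ollama rm <name>`.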
Official docs: github.com/ollama/ollama/blob/main/docs/README.md
Good fit for
- Developers
- Self-hosters
- Automation builders
Not ideal for
- Non-technical users who want a polished chat UI out of the box
Alternatives
LM Studio
Run local LLMs on your Mac or PC with a friendly desktop app—no API subscription required.
Replaces: ChatGPT Plus, Claude Pro…
Jan
Open-source ChatGPT-style app that runs models locally on your device.
Replaces: ChatGPT Plus, Poe subscription
Open WebUI
Self-hosted web interface for chatting with local or BYOK-connected models.
Replaces: ChatGPT Team, Custom GPT enterprise tiers
TypingMind
Premium front-end for your own API keys—better UX without another model subscription.
Replaces: ChatGPT Plus for power users, Poe