How To Use Ollama
What Ollama actually is
Ollama is an open-source runtime for running large language models on your own machine. It downloads and manages model weights, runs quantized models via llama.cpp under the hood, and exposes whatever is loaded through both a CLI and a local HTTP API.
Installing Ollama
On macOS or Linux, a single curl command gets you the binary:
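The official install script is a one-liner (on Linux it needs sudo to register the service; inspect the script first if you prefer):

```shell
# Official install script for macOS/Linux from ollama.com
curl -fsSL https://ollama.com/install.sh | sh
```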
On Windows, grab the installer from ollama.com. On Linux servers, the install script also registers a systemd unit so the daemon survives reboots. Verify the install:
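A quick sanity check after installing (on Linux the installer names the systemd unit `ollama`):

```shell
# Print the installed version; confirms the binary is on PATH
ollama --version

# On a Linux server, confirm the daemon's systemd unit is running
systemctl is-active ollama
```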
Pulling and running your first model
Pick a model based on your RAM. For a 16GB laptop, Llama 3.1 8B quantized to Q4 is the sweet spot. For 8GB machines, drop to Phi-3 Mini or Qwen 2.5 3B. For 32GB+, Mistral Small or Llama 3.1 70B (heavily quantized) become viable.
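For the 16GB/8B case above, the pull-then-run flow looks like this (tags follow the library's model:size convention):

```shell
# Download the weights once; the transfer resumes if interrupted
ollama pull llama3.1:8b

# Drop into an interactive chat session; type /bye to exit
ollama run llama3.1:8b

# See what's on disk and how big each model is
ollama list
```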
The first run streams tokens to your terminal. Subsequent runs reuse the loaded model from memory until it idles out (five minutes by default).
Picking the right quantization
Model tags encode the quantization level. Q4_K_M is the usual default and the best quality-per-gigabyte trade-off for most users; Q8_0 is close to lossless but roughly doubles the memory footprint of a Q4 build; Q2 and Q3 variants shrink further at a noticeable quality cost. As a rule of thumb, you need roughly the model's file size in free RAM, plus headroom for the context window.
Using the HTTP API
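The daemon serves a REST API on localhost:11434. A minimal non-streaming chat request with curl (assuming the llama3.1:8b model pulled earlier):

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.1:8b",
  "messages": [{"role": "user", "content": "Explain quantization in one sentence."}],
  "stream": false
}'
```

With `"stream": false` the response arrives as a single JSON object instead of a stream of chunks, which is easier to pipe into tools like jq.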
With the OpenAI SDK, just point the base URL at Ollama's OpenAI-compatible endpoint and use any string for the API key:
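A minimal sketch with the official openai Python package, assuming the daemon is running locally and the same 8B model is pulled:

```python
from openai import OpenAI

# Point the client at Ollama's OpenAI-compatible endpoint.
# The api_key is required by the SDK but ignored by Ollama.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

resp = client.chat.completions.create(
    model="llama3.1:8b",  # any model you've pulled locally
    messages=[{"role": "user", "content": "Say hello in five words."}],
)
print(resp.choices[0].message.content)
```

Because only the base URL changes, existing OpenAI-based code and frameworks built on the SDK work against local models with no other modification.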