Last updated: 2024-05-07
Introduction
Ollama is software for running well-known LLMs (e.g. Llama 3, Phi 3, Mistral, Gemma) on your local machine.
HomePage: https://ollama.com
Memory requirement
* At least:

  Model  RAM
  7B     8 GB
  13B    16 GB
  33B    32 GB
Notes
Llama 3 (8B)   File Size: 4.7GB    # ollama run llama3
Llama 3 (70B)  File Size: 40GB     # ollama run llama3:70b
Manual Install
Get the binary file
# 292M
curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
chmod +x /usr/bin/ollama
ollama -v
ollama version is 0.1.33
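The URL above fetches the x86-64 build. For ARM machines there is an arm64 build; the download path below assumes the same naming scheme, so check https://ollama.com/download if it 404s:

# ARM64 build (assumed naming; verify on the download page)
curl -L https://ollama.com/download/ollama-linux-arm64 -o /usr/bin/ollama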
Set up the systemd service
/etc/systemd/system/ollama.service
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3

[Install]
WantedBy=default.target
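The unit runs as a dedicated ollama user, which the manual install above does not create. A minimal sketch of creating it, with the home directory set to the /usr/share/ollama model path used later in this page:

# Create a system user/group for the service; ~/.ollama will hold the models
sudo useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama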
systemctl enable ollama --now
ps aux | grep ollama
ollama 244 0.0 2.8 2047884 295992 ? Ssl 16:22 0:05 /usr/bin/ollama serve
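A quick way to confirm the server is answering: the root endpoint replies with a plain-text status.

# Should print: Ollama is running
curl http://localhost:11434/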
CLI
Pull a model
# LLMs available for download: https://ollama.com/library
ollama pull llama3
# Without an explicit size tag (e.g. :7b), the default size is downloaded
ollama pull codellama:7b
ollama pull codellama:13b-python
Remove a model
ollama rm llama3
Copy a model
ollama cp llama3 my-model
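To verify that pull/rm/cp took effect, list the models present locally (prints NAME, ID, SIZE, MODIFIED columns):

ollama list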
Find Model File Location
ollama show codellama:13b --modelfile
FROM /usr/share/ollama/.ollama/models/blobs/sha256-...
tree /usr/share/ollama/.ollama/models/
/usr/share/ollama/.ollama/models/
├── blobs
│   ├── sha256-2e0493f67d0c8c9c68a8aeacdf6a38a2151cb3c4c1d42accf296e19810527988
│   ├── sha256-35e261fb2c733bb8c50e83e972781ec3ee589c83944aaac582e13ba679ac3592
│   ├── sha256-590d74a5569b8a20eb2a8b0aa869d1d1d3faf6a7fdda1955ae827073c7f502fc
│   ├── sha256-7f6a57943a88ef021326428676fe749d38e82448a858433f41dae5e05ac39963
│   ├── sha256-8c17c2ebb0ea011be9981cc3922db8ca8fa61e828c5d3f44cb6ae342bf80460b
│   └── sha256-e73cc17c718156e5ad34b119eb363e2c10389a503673f9c36144c42dfde8334c
└── manifests
    └── registry.ollama.ai
        └── library
            └── codellama
                └── 13b
# Inspect mediaType and digest (run inside /usr/share/ollama/.ollama/models/)
cat manifests/registry.ollama.ai/library/codellama/13b | jq
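The manifest is an OCI-style JSON document; a hypothetical jq filter to pull out only each layer's mediaType and digest (field names assumed from the registry manifest format):

jq '.layers[] | {mediaType, digest}' manifests/registry.ollama.ai/library/codellama/13b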
Run AI Bot
e.g.
ollama run llama3
ollama run codellama:7b
Multiline input
>>> """Hello, ... world! ... """
Slash commands
- /? Help
- /set Set session variables
- /show Show model information
- /load <model> Load a session or model
- /save <model> Save your current session
- /clear Clear session context
- /bye Exit
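/set also accepts runtime parameters for the current session; for example, lowering the sampling temperature (temperature is one of the parameters /set exposes):

>>> /set parameter temperature 0.7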
/show info
Model details:
  Family              qwen2
  Parameter Size      8B
  Quantization Level  Q4_0
REST API
* Ollama's default API port is 11434/tcp
Change the port the CLI client uses to reach the server:
export OLLAMA_HOST="server:port"
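For example, to run a one-off command against a remote server ("gpu-box" is a placeholder hostname):

# gpu-box is a hypothetical host running "ollama serve"
OLLAMA_HOST=gpu-box:11434 ollama list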
Generate a response
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
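By default /api/generate streams the reply as one JSON object per line. To get a single JSON response instead, set the documented stream option to false:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'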
Chat with a model
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
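The messages array carries the whole conversation, so multi-turn chat works by replaying prior turns. A sketch with an earlier assistant reply included (the message contents are illustrative):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" },
    { "role": "assistant", "content": "Because of Rayleigh scattering." },
    { "role": "user", "content": "how does that change at sunset?" }
  ]
}'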
Allow listening on all local interfaces
/etc/systemd/system/ollama.service
[Service]
...
# Environment="OLLAMA_HOST=0.0.0.0:8080"
# Use the default port 11434
Environment="OLLAMA_HOST=0.0.0.0"
systemctl daemon-reload
systemctl restart ollama
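To confirm the server is now reachable from another machine, query the model-list endpoint (192.0.2.10 is a placeholder for your server's address):

# Replace the placeholder address with your server's IP or hostname
curl http://192.0.2.10:11434/api/tags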