Ollama

Last updated: 2024-05-07

Introduction

Ollama is a tool for running well-known LLMs (e.g. Llama 3, Phi 3, Mistral, Gemma) on your local machine

HomePage

Memory requirement

 * Minimum RAM by model size:

Model     RAM
7B         8 GB
13B       16 GB
33B       32 GB

Notes

Llama 3 (8B)     File Size: 4.7GB      # ollama run llama3
Llama 3 (70B)    File Size: 40GB       # ollama run llama3:70b
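Those file sizes track the quantization level. A back-of-envelope sketch, assuming the default tag is a Q4_0 quant (as the /show info output later suggests) at roughly 4.5 bits per weight once per-block scale overhead is counted:

```python
# Rough file-size estimate for a Q4_0 quantized model.
# Assumption: ~4.5 bits/weight (4-bit weights + per-block fp16 scale).
params = 8e9              # Llama 3 8B
bits_per_weight = 4.5
gb = params * bits_per_weight / 8 / 1e9   # bits -> bytes -> GB
print(f"{gb:.1f} GB")     # 4.5 GB, in the same ballpark as the 4.7 GB file
```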

Manual Install

 

Get binary file

# the binary is ~292 MB

curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama

chmod +x /usr/bin/ollama

ollama -v

ollama version is 0.1.33

Set up the systemd service

Note: the unit below runs as a dedicated ollama user/group; create that account first if it does not exist (see the official Linux install instructions).

/etc/systemd/system/ollama.service

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3

[Install]
WantedBy=default.target

systemctl enable ollama --now

ps aux | grep ollama

ollama       244  0.0  2.8 2047884 295992 ?      Ssl  16:22   0:05 /usr/bin/ollama serve

 


CLI

 

Pull a model

# Browse the LLMs available for download at https://ollama.com/library

ollama pull llama3

# Without an explicit size tag (e.g. :7b), the default size is pulled

ollama pull codellama:7b

ollama pull codellama:13b-python

Remove a model

ollama rm llama3

Copy a model

ollama cp llama3 my-model

Find Model File Location

ollama show codellama:13b --modelfile

FROM /usr/share/ollama/.ollama/models/blobs/sha256-...

tree /usr/share/ollama/.ollama/models/

/usr/share/ollama/.ollama/models/
├── blobs
│   ├── sha256-2e0493f67d0c8c9c68a8aeacdf6a38a2151cb3c4c1d42accf296e19810527988
│   ├── sha256-35e261fb2c733bb8c50e83e972781ec3ee589c83944aaac582e13ba679ac3592
│   ├── sha256-590d74a5569b8a20eb2a8b0aa869d1d1d3faf6a7fdda1955ae827073c7f502fc
│   ├── sha256-7f6a57943a88ef021326428676fe749d38e82448a858433f41dae5e05ac39963
│   ├── sha256-8c17c2ebb0ea011be9981cc3922db8ca8fa61e828c5d3f44cb6ae342bf80460b
│   └── sha256-e73cc17c718156e5ad34b119eb363e2c10389a503673f9c36144c42dfde8334c
└── manifests
    └── registry.ollama.ai
        └── library
            └── codellama
                └── 13b

# Inspect the mediaType and digest of each layer

cat manifests/registry.ollama.ai/library/codellama/13b | jq
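The manifest is plain JSON in the OCI image-manifest shape; a minimal Python sketch of listing each layer's mediaType and digest (the manifest content below is illustrative, with digests shortened):

```python
import json

# Illustrative manifest in the shape `cat ... | jq` prints (digests shortened)
manifest = json.loads('''
{
  "schemaVersion": 2,
  "config": {"mediaType": "application/vnd.docker.container.image.v1+json",
             "digest": "sha256-2e0493f6", "size": 455},
  "layers": [
    {"mediaType": "application/vnd.ollama.image.model",
     "digest": "sha256-7f6a5794", "size": 7365834624},
    {"mediaType": "application/vnd.ollama.image.template",
     "digest": "sha256-590d74a5", "size": 124}
  ]
}
''')
for layer in manifest["layers"]:
    print(layer["mediaType"], layer["digest"])
```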

Run a model interactively

e.g.

ollama run llama3

ollama run codellama:7b

Multiline input

>>> """Hello,
... world!
... """

Available slash commands (/?)

  • /?       Help
  • /set            Set session variables
  • /show           Show model information
  • /load <model>   Load a session or model
  • /save <model>   Save your current session
  • /clear          Clear session context
  • /bye            Exit

/show info

Model details:
Family              qwen2
Parameter Size      8B
Quantization Level  Q4_0

 


REST API

 

 * Ollama default API port: 11434/tcp

Point the CLI client at a different server/port

export OLLAMA_HOST="server:port"

Generate a response

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt":"Why is the sky blue?"
}'
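By default the endpoint streams its reply as one JSON object per line, each carrying a `response` fragment and a `done` flag. A small Python sketch of reassembling such a stream (the two sample lines are made up; a real call needs a running server):

```python
import json

def assemble_stream(lines):
    """Join the "response" fragments of a streamed /api/generate reply."""
    parts = []
    for line in lines:
        obj = json.loads(line)
        parts.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(parts)

# Sample stream (shape follows the API; the content is illustrative)
sample = [
    '{"model":"llama3","response":"The sky ","done":false}',
    '{"model":"llama3","response":"is blue.","done":true}',
]
print(assemble_stream(sample))  # The sky is blue.
```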

Chat with a model

curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
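The chat endpoint is stateless: the client keeps the conversation by resending the full `messages` list each turn, appending the model's reply before the next question. A sketch of building that history (the helper name and reply text are illustrative):

```python
def add_turn(messages, role, content):
    """Append one turn to the history sent with each /api/chat request."""
    messages.append({"role": role, "content": content})
    return messages

history = []
add_turn(history, "user", "why is the sky blue?")
# ...POST {"model": "llama3", "messages": history} and read the reply...
add_turn(history, "assistant", "Because of Rayleigh scattering.")
add_turn(history, "user", "does the same apply to sunsets?")
# the next POST carries all three turns, so the model sees the context
```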

Allow listening on all local interfaces

/etc/systemd/system/ollama.service

[Service]
...
# Environment="OLLAMA_HOST=0.0.0.0:8080"
# or keep the default port 11434:
Environment="OLLAMA_HOST=0.0.0.0"

systemctl daemon-reload

systemctl restart ollama
