Last updated: 2024-10-10
Table of Contents
- Llama
- Code Llama
- Llama2-Chinese
- Gemma
- Phi
- Qwen (Chinese)
- mixtral
(for writing code)
- Yi-Coder
- Starcoder
- deepseek-coder
Llama
Llama 3.1
- 8B, 4.7G
- 70B, 40G
- 405B, 229G
Context length: 128K
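To get a specific size, append the tag; the tag names below assume the standard ollama library naming.
ollama run llama3.1       # default, 8B
ollama run llama3.1:70b   # larger tags need correspondingly more disk and memory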
Llama 3
8b (4.7G) and 70b (40G) versions have been released
ollama pull llama3 # defaults to the 8b version
Code Llama
Web
Code Llama is a code-specialized version of Llama 2
that was created by further training Llama 2
on its code-specific datasets, sampling more data from that same dataset for longer.
Size
- 70B 131GB
- 34B 63GB
- 13B 24GB
- 7B ~12.55GB
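A quick sketch of pulling a specific size, assuming the usual ollama size tags:
ollama pull codellama:13b   # 13B base model, roughly 24GB on disk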
Variants
e.g.
- 7b-instruct # natural language
- 7b-code # Base model for code completion
- 7b-python # fine-tuned on 100B tokens of Python code
Example prompts
- Instruct (default)
- Code completion
- Python
Instruct
# Trained to output human-like answers to questions (closest to ChatGPT)
ollama run codellama "Where is the bug in this code? $(cat fib.py)"
ollama run codellama "write a unit test for this function: $(cat fib.py)"
ollama run codellama 'You are an expert programmer that writes simple,
concise code and explanations. Write a python function to generate the nth fibonacci number.'
Code completion
Generate from a comment
# generate subsequent tokens based on the provided prompt
ollama run codellama:7b-code '# A simple python function to remove whitespace from a string:'
Fill-in-the-middle (FIM)
# the model can complete code between two already written code blocks.
Format: <PRE> {prefix} <SUF>{suffix} <MID>
e.g.
def compute_gcd(x, y): <FILL> return result
Equivalent to:
ollama run codellama:7b-code '<PRE> def compute_gcd(x, y): <SUF>return result <MID>'
Python
fine-tuned on 100B additional Python tokens
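The Instruct and Code completion variants have example prompts above; a minimal sketch for the Python variant, assuming the 7b-python tag:
ollama run codellama:7b-python 'def fibonacci(n):'   # completion-style prompt for the Python-tuned base model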
Llama2-Chinese
Web
Gemma
It is the open-source counterpart of Gemini, developed by Google.
Model: lightweight text-to-text model
https://ai.google.dev/gemma
gemma2
https://ollama.com/library/gemma2
- 2b # 1.6GiB
- 9b # 5.4 GiB (Default)
- 27b # 16GiB
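A minimal usage sketch for a non-default size, assuming the tags listed above:
ollama run gemma2:2b "Summarize what a Bloom filter is in two sentences."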
gemma(v1)
https://ollama.com/library/gemma
It comes in two versions
- 2b # recommended for mobile devices (1.7 GiB)
- 7b # recommended for desktop computers (5 GiB)
Phi
3.5
3.8b, 2.2G
token context length: 128K
Optimization
- supervised fine-tuning
- proximal policy optimization
- direct preference optimization
v3
There are two versions: 3.8B (2.2G) and 14B (7.9G)
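A minimal usage sketch, assuming ollama names these models phi3.5 and phi3:
ollama run phi3.5        # 3.8B, 128K context
ollama run phi3:14b "Explain the difference between a list and a tuple in Python."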
Qwen
通义千问 (Tongyi Qianwen), by Alibaba Cloud
Qwen2
https://ollama.com/library/qwen2
- 7b (Default) # 4.4 GiB
- 72b # 41 GiB
Qwen 1.5
Model sizes: 0.5B, 1.8B, 4B (default), 7B, 14B, 32B (new), and 72B
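A sketch of picking a size, assuming qwen2 and qwen (for Qwen 1.5) are the ollama model names:
ollama run qwen2        # default 7B
ollama pull qwen:14b    # a specific Qwen 1.5 size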
mixtral
A set of Mixture of Experts (MoE) models with open weights by Mistral AI, in 8x7b and 8x22b parameter sizes.
- It has strong maths and coding capabilities
- It is natively capable of function calling
- A 64K-token context window allows precise information recall from large documents
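A minimal usage sketch, assuming the 8x7b and 8x22b tags on the ollama library:
ollama run mixtral          # defaults to 8x7b
ollama run mixtral:8x22b "What is 17 * 24? Answer with the number only."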
Yi-Coder
Link
Supports 52 major programming languages.
Context length: 128K tokens.
Features
- Code Completion
- Code Insertion
- Repo Q&A
- A Powerful Natural Language to SQL Converter
Size
- 9b(default) # 5G
- 1.5b # 866M
Usage Example
System Prompt:
You are Yi-Coder, you are exceptionally skilled in programming, coding, and any computer-related issues.
[1] Write a quick sort algorithm.
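A sketch of running example [1] from the shell, assuming yi-coder is the ollama model name:
ollama run yi-coder "Write a quick sort algorithm."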
[2] Identify errors and insert the correct code to fix them
prompt = """
```python
def quick_sort(arr):
    if len(arr) <= 1:
        return arr
    else:
        pivot = arr[len(arr) // 2]
        left = [x for x in arr if x < pivot]
        right = [x for x in arr if x > pivot]
        return quick_sort(left) + middle + quick_sort(right)

print(quick_sort([3,6,8,10,1,2,1]))  # Prints "[1, 1, 2, 3, 6, 8, 10]"
```
Is there a problem with this code?
"""
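The snippet above intentionally leaves middle undefined so the model has a bug to find. One way to pass such a multi-line prompt from the shell (prompt.txt is a hypothetical file holding the text above):
ollama run yi-coder "$(cat prompt.txt)"   # same $(cat ...) trick as in the Code Llama examples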
[3] Natural language to SQL conversion
Key components:
- NL2SQLConverter
- DatabaseManager
- Main Function
Example questions:
- Count the number of orders for each city
- Who are the top 5 users with the most orders
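A minimal sketch of the NL-to-SQL use case as a one-off prompt; the table schema here is made up for illustration:
ollama run yi-coder "Given tables users(id, name) and orders(id, user_id, city), write SQL to count the number of orders for each city."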
Starcoder
Transparently trained open-code LLMs
starcoder2
https://github.com/bigcode-project/starcoder2
A context window of 16,384 tokens, with sliding window attention of 4,096 tokens.
StarCoder2 models are intended for code completion;
they are not instruction models, and commands like
"Write a function that computes the square root."
do not work well (see the example after the size list below).
- 15b # 9.1G (trained on 600+ programming languages)
- 7b # 4G (17 languages)
- 3b(default) # 1.7G (17 languages)
- instruct # 9.1G (follows natural and human-written instructions)
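A sketch of the two usage styles; the size tags are listed above, and the instruct tag name is my assumption of the library naming:
ollama run starcoder2:3b 'def fibonacci(n):'   # completion-style prompt
ollama run starcoder2:instruct "Write a function that computes the square root."   # only the instruct variant handles this well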
deepseek-coder
deepseek-coder-v2
https://ollama.com/library/deepseek-coder-v2
- 16b (Default) # 9 GiB
- 236b # 133 GiB
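A minimal usage sketch, assuming the default 16b tag:
ollama run deepseek-coder-v2 "Write a Python function that checks whether a string is a palindrome."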