Size (B) | Speed (T/s) | Model | Type | Quant | Spec Dec (B) | Spec Quant |
---|---|---|---|---|---|---|
1.5 | 282 | qwen 2.5 | MLX | 4 | - | - |
1.5 | 76 | qwen 2.5 | MLX | 8 | - | - |
7 | 70 | qwen 2.5 | GUFF | Q4_K_M | - | - |
7 | 101 | qwen 2.5 | MLX | 4 | - | - |
7 | 58 | qwen 2.5 | MLX | 8 | - | - |
12 | 35 | wayfarer | GUFF | Q6_K | - | - |
12 | 65 | wayfarer | MLX | 4 | - | - |
12 | 45 | wayfarer | MLX | 6 | - | - |
12 | 36 | wayfarer | MLX | 8 | - | - |
14 | 36 | qwen 2.5 | GUFF | Q4_K_M | - | - |
14 | 52 | qwen 2.5 | MLX | 4 | - | - |
14 | 55 | qwen 2.5 | MLX | 4 | 1.5 | 4 |
14 | 30 | qwen 2.5 | MLX | 8 | - | - |
24 | 35 | mistral small 3 | MLX | 4 | - | - |
32 | 18 | qwen 2.5 | GUFF | Q4_K_M | - | - |
32 | 23 | qwen 2.5 | MLX | 4 | - | - |
32 | 30 | qwen 2.5 | MLX | 4 | 1.5 | 4 |
32 | 30 | qwen 2.5 | MLX | 4 | 1.5 | 4 |
32 | 34 | qwen 2.5 | MLX | 4 | 1.5 | 8 |
32 | 26 | qwen 2.5 r1 | MLX | 4 | 1.5 | 4 |
32 | 33 | qwen 2.5 coder | MLX | 4 | 1.5 | 4 |
32 | 31 | qwen 2.5 coder | MLX | 4 | 3 | 4 |
32 | 25 | qwq | MLX | 3 | - | - |
32 | 24 | qwq | MLX | 4 | - | - |
32 | 18 | qwq | MLX | 4 | 1.5 | 4 |
32 | 22 | qwq | MLX | 4 | 1.5 | 8 |
32 | 16 | qwq | MLX | 4 | 7 | 4 |
32 | 16 | qwq | MLX | 4 | 7 | 8 |
32 | 16 | qwq | MLX | 6 | - | - |
32 | 16 | qwq | MLX | 6 | 1.5 | 4 |
32 | 16 | qwq | MLX | 6 | 1.5 | 8 |
70 | 12 | wayfarer large | GUFF | Q2_K_S | - | - |
70 | 15 | wayfarer large | MLX | 3 | - | - |
30 - A3 | 93 | qwen 3 | MLX | 4 | - | - |
30 - A3 | 76 | qwen 3 | MLX | 4 | 1.7 | 4 |
30 - A3 | 81 | qwen 3 | MLX | 6 | - | - |
30 - A3 | 70 | qwen 3 | MLX | 6 | 1.7 | 4 |
30 - A3 | 70 | qwen 3 | MLX | 8 | - | - |
32 | 22 | qwen 3 | MLX | 4 | - | - |
32 | 26 | qwen 3 | MLX | 4 | 1.7 | 4 |
24 | 18 | Devstral Small 2507 | MLX | 8 | - | - |
mlx convert and upload to huggingface
https://huggingface.co/docs/hub/en/mlx
https://huggingface.co/mlx-community
git clone [email protected]:NexVeridian/NexVeridian-web.git
just uv
just mlx_create "Qwen/QwQ-32B" "4 6 8" "/Users/elijahmcmorris/.cache/lm-studio/models" "mlx-community" fasle false
# or
uv venv
uv pip install huggingface_hub hf_transfer mlx_lm
uv run huggingface-cli login
uv run mlx_lm.convert --hf-path Qwen/QwQ-32B -q --q-bits 4 --upload-repo mlx-community/QwQ-32B-4bit --mlx-path /Users/elijahmcmorris/.cache/lm-studio/models/mlx-community/QwQ-32B-4bit
or use https://huggingface.co/spaces/mlx-community/mlx-my-repo
LLM Settings.md
Qwen 3
Temp | Min P | Top P | Top K | Repeat P |
---|---|---|---|---|
0.6 | 0.00 | 0.95 | 20 | - |
Qwen 3 /no_think
Temp | Min P | Top P | Top K | Repeat P |
---|---|---|---|---|
0.7 | 0.00 | 0.80 | 20 | 1.5 |