tips:llm

Differences

This shows you the differences between two versions of the page.

tips:llm [2025/12/20 17:49] sscipioni
tips:llm [2025/12/26 15:35] (current) sscipioni
Line 23: Line 23:
 | A100 40GB | 156.7 | 45.2 | |
 | M3 Max 128GB | 34.8 | 4.2 | |
-| Strix Halo 128GB | | 5.1 | 85.02 |
+| Strix Halo 128GB ollama | | 5.1 | 85.02 |
+| Strix Halo 128GB llama.cpp | | | 90 |
 | RTX 3060 | | | 131.76 |
  
Line 41: Line 42:
 | qwen2.5:7b  | completion tools | "7.6B"   | 32768    | "Q4_K_M" | 42.98 | 153.34 |
 | llama3.3:70b-instruct-q4_K_M  | completion tools | "70.6B"   | 131072    | "Q4_K_M" | 5.06 | 15.50 |
+| functiongemma  | completion tools | "268.10M" | 32768 | "Q8" | 364.21 | 240.50 |
+| danielsheep/Qwen3-Coder-30B-A3B-Instruct-1M-Unsloth  | completion tools | "30.5B"   | 1048576    | "Q4_K_M" | 71.60 | 33.14 |
+| gpt-oss:20b  | completion tools thinking | "20.9B"   | 131072    | "MXFP4" | 47.32 | 402.47 |
  
tips/llm.1766249392.txt.gz · Last modified: by sscipioni