init
20
README.md
Normal file
@@ -0,0 +1,20 @@
# AI on request

## Benchmarks:

`llama-server -hf unsloth/gemma-4-E4B-it-GGUF:UD-Q3_K_XL --reasoning off -fa on -ngl 99 -b 2048 -ub 2048 -c 4096 --temp 1.0 --top-p 0.95 --top-k 64` -> 130 t/s

`llama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q4_K_S --reasoning off -fa on -ngl 99 -b 2048 -ub 2048 -c 4096 --temp 1.0 --top-p 0.95 --top-k 64` -> 187 t/s

`llama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q3_K_S --reasoning off -fa on -ngl 99 -b 2048 -ub 2048 -c 4096 --temp 1.0 --top-p 0.95 --top-k 64` -> 186 t/s

`llama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q6_K --reasoning off -fa on -ngl 99 -b 2048 -ub 2048 -c 4096 --temp 1.0 --top-p 0.95 --top-k 64` -> 160 t/s

`llama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q5_K_S --reasoning off -fa on -ngl 99 -b 2048 -ub 2048 -c 4096 --temp 1.0 --top-p 0.95 --top-k 64` -> 177 t/s

`llama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q5_K_S --reasoning off -fa on -ngl 99 -b 2048 -ub 2048 -c 4096 --temp 1.0 --top-p 0.95 --top-k 64 --no-mmap -t 4` -> 181 t/s

`llama-server -hf unsloth/gemma-4-E2B-it-GGUF:Q4_K_S --reasoning off -fa on -ngl 99 -b 2048 -ub 2048 -c 4096 --temp 1.0 --top-p 0.95 --top-k 64 --no-mmap -t ` -> 194 t/s

`llama-server -hf unsloth/gemma-4-E4B-it-GGUF:Q3_K_XL --reasoning off -fa on -ngl 99 -b 2048 -ub 2048 -c 4096 --temp 1.0 --top-p 0.95 --top-k 64 --no-mmap -t 4` -> 126 t/s
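
To make the figures above easier to compare, here is a minimal Python sketch that computes the relative change in throughput between two runs. The helper name `pct_change` and the pairings are illustrative assumptions, not part of the original benchmark setup. The README does not say how the t/s values were measured or how many runs each one reflects, so small differences may be within noise.

```python
def pct_change(baseline_tps: float, variant_tps: float) -> float:
    """Relative throughput change of variant vs. baseline, in percent."""
    return (variant_tps - baseline_tps) / baseline_tps * 100.0


# Throughput values (tokens/s) copied from the benchmark lines above.
# Each entry: (label, baseline t/s, variant t/s). Labels are hypothetical.
comparisons = [
    ("E2B Q5_K_S: default vs --no-mmap -t 4", 177, 181),
    ("E2B Q4_K_S: default vs --no-mmap -t (value missing in source)", 187, 194),
    ("E2B: Q4_K_S vs Q6_K", 187, 160),
    ("Q4_K_S E2B vs UD-Q3_K_XL E4B", 187, 130),
]

for label, baseline, variant in comparisons:
    print(f"{label}: {pct_change(baseline, variant):+.1f}%")
```

On these numbers, adding `--no-mmap` with an explicit thread count changes E2B throughput by only a few percent, while the choice of model size (E2B vs. E4B) has a much larger effect.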