So many expectations from a small model

#1
by hamadfx - opened

Great release you guys put out there, I'll take it for a proper spin.

good job!

This looks really promising on paper, I'm gonna try it out later this week as a replacement for qwen 3.6 35b!

Great job! It gets 150+ t/s on a 4090 24 GB.

I get about 115 t/s on a 3080 10 GB + 5060 Ti 16 GB:

4499.20.552.782 I slot print_timing: id 3 | task 1725126 | n_decoded = 100, tg = 114.35 t/s
4499.20.676.095 I slot print_timing: id 3 | task 1725126 | prompt eval time = 2490.85 ms / 6098 tokens ( 0.41 ms per token, 2448.16 tokens per second)
4499.20.676.100 I slot print_timing: id 3 | task 1725126 | eval time = 997.81 ms / 114 tokens ( 8.75 ms per token, 114.25 tokens per second)
4499.20.676.101 I slot print_timing: id 3 | task 1725126 | total time = 3488.65 ms / 6212 tokens
4499.20.676.102 I slot print_timing: id 3 | task 1725126 | graphs reused = 1691638
4499.20.676.704 I slot release: id 3 | task 1725126 | stop processing: n_tokens = 6211, truncated = 0
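
For anyone who wants to sanity-check the numbers, here is a rough sketch of how the per-token and tokens-per-second figures follow from the times and token counts in the log above. The regex and the `throughput` helper are made up for this example and are not part of any tool; the two input lines are copied from the log with the prefix trimmed.

```python
import re

# Timing lines copied from the log above (timestamp/slot prefix trimmed).
LOG = """\
prompt eval time = 2490.85 ms / 6098 tokens
eval time = 997.81 ms / 114 tokens
"""

# Hypothetical pattern for this sketch: "<label> time = <ms> ms / <n> tokens".
PATTERN = re.compile(
    r"(?P<label>[a-z ]+?) time\s*=\s*(?P<ms>[\d.]+) ms / (?P<tokens>\d+) tokens"
)

def throughput(log_text):
    """Return {label: (ms_per_token, tokens_per_second)} for each timing line."""
    results = {}
    for m in PATTERN.finditer(log_text):
        ms = float(m.group("ms"))
        tokens = int(m.group("tokens"))
        # tokens per second = tokens / (elapsed milliseconds / 1000)
        results[m.group("label").strip()] = (ms / tokens, tokens / (ms / 1000.0))
    return results

for label, (ms_tok, tps) in throughput(LOG).items():
    print(f"{label}: {ms_tok:.2f} ms per token, {tps:.2f} tokens per second")
```

This reproduces the 2448.16 t/s prompt processing and 114.25 t/s generation figures printed in the log.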
