[GUIDE] Run Qwen3.6-35B-A3B at full context on a 4090 and a GB10 Spark with vLLM and llama.cpp

#11
by erdal - opened

Here is how to run the new Qwen3.6-35B-A3B at full context on two machines:

On a 4090 - IQ4_XS GGUF with llama.cpp
On a Spark - FP8 with a tweaked vLLM
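
As a rough starting point for the two setups above, here is a minimal Python sketch that only builds the launch commands. It does not reproduce the tweaked vLLM build mentioned above, and it is not the exact configuration from this guide. The GGUF file name, the Hugging Face model id, the port, and the memory-utilization value are all placeholders I made up for illustration; swap in your own. The flags themselves (`-m`, `-c`, `-ngl`, `--port` for `llama-server`, and `--max-model-len`, `--gpu-memory-utilization` for `vllm serve`) are standard options of those tools.

```python
import shlex

# Hypothetical names, NOT taken from the guide: replace with your own files/models.
GGUF_PATH = "Qwen3.6-35B-A3B-IQ4_XS.gguf"
HF_MODEL = "Qwen/Qwen3.6-35B-A3B-FP8"


def llama_server_cmd(gguf_path, port=8080):
    """Build a llama.cpp server command that asks for the model's full context."""
    return [
        "llama-server",
        "-m", gguf_path,
        "-c", "0",      # 0 = use the context length stored in the model metadata
        "-ngl", "99",   # a value above the layer count offloads every layer to the GPU
        "--port", str(port),
    ]


def vllm_serve_cmd(model, max_model_len=None, gpu_mem_util=0.90):
    """Build a stock `vllm serve` command (assumed values, no custom tweaks)."""
    cmd = ["vllm", "serve", model, "--gpu-memory-utilization", str(gpu_mem_util)]
    if max_model_len is not None:
        # Only cap the context if you need to; omit it to use the model's default.
        cmd += ["--max-model-len", str(max_model_len)]
    return cmd


if __name__ == "__main__":
    # Print the commands so they can be reviewed before running them by hand.
    print(shlex.join(llama_server_cmd(GGUF_PATH)))
    print(shlex.join(vllm_serve_cmd(HF_MODEL)))
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine without either tool installed. If the model does not fit at full context, the usual knobs to revisit are the quantization level and the KV-cache settings of each tool.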

4090
spark
