Gemma 31B is your daddy: How to burn $1B and lose to a model 5x smaller
"Honestly, calling this 'Medium 3.5 128B' is a fraud. We’ve tested your 'reasoning' and it can’t even handle basic math like 3-1+1. It forgets the context within two messages and lies about its own training cut-off. Question for the team: In which offshore haven are you hiding the billions of venture capital? Because clearly, none of that money went into the model's brain. You've built a 128B monster that has the IQ of a toaster and the memory of a goldfish. Qwen and Gemma 31B are literally laughing at this 'flagship'. Stop drawing fake benchmarks and start hiring people who actually know how to train a dense model. This release is a joke, and the joke is on your investors."
Thanks for your "constructive" criticism.
Edit:
fixed typo.
Adding that: we appreciate feedback, but will close discussions that are disrespectful or make unfounded accusations. While I could have deleted this discussion, I chose to keep it for transparency. However, it has been closed because these kinds of comments are not intended to foster constructive conversation.
"Honestly, calling this 'Medium 3.5 128B' is a fraud. We’ve tested your 'reasoning' and it can’t even handle basic math like 3-1+1. It forgets the context within two messages and lies about its own training cut-off. Question for the team: In which offshore haven are you hiding the billions of venture capital? Because clearly, none of that money went into the model's brain. You've built a 128B monster that has the IQ of a toaster and the memory of a goldfish. Qwen and Gemma 31B are literally laughing at this 'flagship'. Stop drawing fake benchmarks and start hiring people who actually know how to train a dense model. This release is a joke, and the joke is on your investors."
Get a refund then 😮‍💨
Can I get that refund too?
- A passing wanderer-
Thanks for you "constructive" criticism.
This level of English tells you everything. 🤣
🤣
You guys keep trash-talking companies putting out these free/open-source models, and one day we will only have companies with closed-source models.
First of all, these are not open-source models; these are open-weight models. Not only that, they also messed up the config, which was identified by Unsloth, who also helped them fix it. Sure, Mistral has been in the open-source community for a long time, but have you even run the model and tested it? It's a 128B model that barely compares to models 1/4 its size, namely Gemma 4 31B and Qwen3.6 27B. They only release open weights so that guys like you give them attention, get hyped, and are impressed by benchmaxxed comparisons, which are not even an accurate measure of how good the model actually is. If these guys stopped publishing their weights because of the alleged trash talking, few people would even remember them and even fewer would use them, which is a loss for them.