How to use ewald1976/nemo-crownelius-st-12b with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for ewald1976/nemo-crownelius-st-12b to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for ewald1976/nemo-crownelius-st-12b to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ewald1976/nemo-crownelius-st-12b to start chatting
Load model with FastModel
# Shell: install the library
pip install unsloth

# Python: load the model and tokenizer
from unsloth import FastModel

model, tokenizer = FastModel.from_pretrained(
    model_name="ewald1976/nemo-crownelius-st-12b",
    max_seq_length=2048,
)
Nemo-12B-Crownelius-ST
Another experiment with surgical style-tuning, heavily inspired by Gryphe's methodology (Gemma-4-31B-StyleTune).
I stumbled upon the concept of Style-Tuning by chance, and since the initial results were incredibly promising, I wanted to test the boundaries of this method on the Mistral-NeMo-12B architecture using a highly specialized, modern prose dataset.
Methodology & Concept
Instead of a traditional full fine-tune that alters the core reasoning layers of the network, this approach surgically calibrates a single component:
- Total Freeze: All attention mechanisms and MLP layers (Layers 0–39) remain completely frozen. The underlying logic, instruction-following capabilities, and general world knowledge of Mistral-NeMo are preserved.
- Targeted Vocabulary Shift: Training is strictly isolated to the lm_head (output projection).
Retraining only the lm_head doesn't make the model inherently smarter or dumber; it simply rewires its linguistic preferences. The goal is to trade the generic, predictable "AI-slop" vocabulary for a richer, more varied, and more sophisticated prose delivery.
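The freeze described above can be sketched in a few lines. This is a minimal illustration, not the exact training script used for this model: the helper name freeze_all_but_lm_head is hypothetical, and it assumes a PyTorch-style model whose output projection is registered under the name lm_head and is not weight-tied to the input embeddings.

```python
def freeze_all_but_lm_head(model, head_prefix="lm_head"):
    """Freeze every parameter except those registered under `head_prefix`.

    Hypothetical helper: works on any object exposing a PyTorch-style
    named_parameters() iterator. Returns the names left trainable.
    """
    trainable = []
    for name, param in model.named_parameters():
        # Match "lm_head" itself or "lm_head.<something>" (e.g. "lm_head.weight"),
        # but not unrelated names that merely begin with the same letters.
        is_head = name == head_prefix or name.startswith(head_prefix + ".")
        param.requires_grad = is_head
        if is_head:
            trainable.append(name)
    if not trainable:
        # Typically means the head is weight-tied to the embeddings or named
        # differently, so nothing would actually be trained.
        raise ValueError(f"no parameters found under {head_prefix!r}")
    return trainable
```

Calling this on a loaded causal LM before building the optimizer should leave only the lm_head parameters (typically lm_head.weight) with requires_grad set to True; the attention and MLP weights never receive gradients.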
The Dataset: Crownelius/Opus-4.5-3000x
For this specific run, I utilized Crownelius/Opus-4.5-3000x.
Training Details & Parameters
Training used deliberately tight constraints so that the lm_head absorbs the stylistic texture without semantic overfitting:
- Epochs: 1
- Learning Rate: 2e-4 (Linear Scheduler)
- Target Modules: lm_head only
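To make the "2e-4 (Linear Scheduler)" setting concrete, here is a small sketch of what a linear schedule computes at each step. It assumes decay to zero over the whole run and no warmup by default; the card does not state the warmup or the step count, so warmup_steps and total_steps are illustrative parameters, and linear_lr is a hypothetical helper name.

```python
PEAK_LR = 2e-4  # learning rate listed in the training details


def linear_lr(step, total_steps, peak_lr=PEAK_LR, warmup_steps=0):
    """Learning rate at `step` for linear decay to zero, with optional warmup.

    Assumption: the card only says "Linear Scheduler", so the zero-warmup
    default and the decay-to-zero endpoint are guesses, not reported values.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if warmup_steps and step < warmup_steps:
        # Ramp linearly from 0 up to peak_lr during warmup.
        return peak_lr * step / warmup_steps
    # After warmup, decay linearly so the rate hits 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)
```

With a single epoch, total_steps is simply the number of optimizer steps in that one pass over the dataset, so the rate falls from 2e-4 at the first batch to zero at the last.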
Key Observations & Results
- A noticeable style shift compared to the vanilla base model. The Opus style comes through clearly even after a single epoch at a 2e-4 learning rate.
Recommended Sampler Settings
- Temperature: 0.7 - 0.9
- Min_P: 0.05
- Top_P: 0.95
- Repetition Penalty: 1.05
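For readers unfamiliar with the Min_P and Top_P settings, the sketch below shows what the two filters do to a next-token distribution. It is an illustration only: filter_probs is a hypothetical helper, the min-p-then-top-p order is an assumption (inference backends differ in how they order samplers), and temperature and repetition penalty are left out for brevity.

```python
def filter_probs(probs, min_p=0.05, top_p=0.95):
    """Apply min-p, then top-p, to a {token: probability} mapping.

    Returns a renormalized mapping of the surviving tokens. Assumes at
    least one token has a positive probability.
    """
    if not probs:
        raise ValueError("probs must not be empty")

    # Min-p: drop tokens whose probability is below min_p times the
    # probability of the single most likely token.
    floor = min_p * max(probs.values())
    kept = {tok: p for tok, p in probs.items() if p >= floor}
    total = sum(kept.values())
    kept = {tok: p / total for tok, p in kept.items()}

    # Top-p (nucleus): walk tokens from most to least likely and stop as
    # soon as the accumulated probability mass reaches top_p.
    nucleus, mass = {}, 0.0
    for tok, p in sorted(kept.items(), key=lambda kv: kv[1], reverse=True):
        nucleus[tok] = p
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in nucleus.items()}
```

With the recommended values, min-p removes only tokens that are at least 20 times less likely than the top candidate, and top-p then trims whatever remains of the long tail; the sampler draws from the renormalized survivors.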
Model tree for ewald1976/nemo-crownelius-st-12b
Base model: mistralai/Mistral-Nemo-Base-2407