TraceAlchemy-Gemma-4-E4B-Finance-IT GGUF
TraceAlchemy-Gemma-4-E4B-Finance-IT is a finance-focused instruction-tuned Gemma 4 E4B model trained to improve careful financial reasoning, financial table understanding, unit and scale handling, sign and direction checks, and final-answer consistency.
This repository contains the merged GGUF versions of the fine-tuned model for use with llama.cpp, LM Studio, Ollama, and other GGUF-compatible runtimes.
The model was fine-tuned, merged, and converted to GGUF using Unsloth.
Model Summary
This model was trained as a finance reasoning assistant with a focus on:
- Financial statement reasoning
- Revenue, margin, growth, and ratio calculations
- SEC-style table and excerpt extraction
- Unit and scale conversion, such as thousands to millions
- Sign and direction reasoning
- Multi-step table reasoning
- Final-answer consistency checking
- General finance instruction following
The goal of this run was not to teach the model static finance facts but to improve its behavior on finance reasoning workflows. The training data emphasizes showing calculations, checking units, avoiding unsupported assumptions, and producing clear final answers.
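To make the unit, scale, and sign checks listed above concrete, here is a minimal sketch of the kind of arithmetic the model is meant to handle carefully. The helper names, the scale table, and the sample figures are illustrative assumptions, not part of the training data or any library.

```python
from decimal import Decimal

# Hypothetical scale table: multiplier that turns a reported figure into base units.
SCALE = {
    "units": Decimal(1),
    "thousands": Decimal(10) ** 3,
    "millions": Decimal(10) ** 6,
    "billions": Decimal(10) ** 9,
}


def convert_scale(value, from_scale, to_scale):
    """Re-express a reported figure in another scale, e.g. thousands to millions."""
    return Decimal(str(value)) * SCALE[from_scale] / SCALE[to_scale]


def direction(prior, current):
    """Classify a period-over-period change as increase, decrease, or unchanged."""
    delta = Decimal(str(current)) - Decimal(str(prior))
    if delta > 0:
        return "increase"
    if delta < 0:
        return "decrease"
    return "unchanged"


# Made-up example: a table reports revenue "in thousands".
revenue_in_millions = convert_scale(1_250_000, "thousands", "millions")
change = direction(prior=120, current=95)
```

Using `Decimal` instead of floats avoids rounding surprises when a figure is moved between scales before a final answer is stated.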
Base Model
- Base model: unsloth/gemma-4-E4B-it
- Fine-tuned model name: TraceAlchemy-Gemma-4-E4B-Finance-IT
- Architecture class: Gemma 4 E4B instruction model
- Training method: LoRA supervised fine-tuning
- Final format in this repository: GGUF
Gemma 4 E4B is treated as an effective-4B-class model (the "E" in E4B). During training, however, the loaded parameter count was approximately 8.1B parameters including embeddings.
Training Configuration
| Setting | Value |
|---|---|
| Base model | unsloth/gemma-4-E4B-it |
| Training method | LoRA SFT |
| Base loading during training | 8-bit |
| Max sequence length | 16,384 |
| LoRA rank | 64 |
| LoRA alpha | 64 |
| Learning rate | 2e-5 |
| Epochs | 1 |
| Per-device batch size | 2 |
| Gradient accumulation steps | 8 |
| Effective batch size | 16 |
| Optimizer | adamw_8bit |
| Training examples | 10,000 |
| Evaluation examples | 630 |
| Total training steps | 625 |
| Training runtime | ~2.93 hours on A100-class GPU |
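The effective batch size and step count in the table follow directly from the other settings. The sketch below reproduces that arithmetic; the single-device count is an assumption inferred from the reported effective batch size, and the variable names are illustrative.

```python
import math

per_device_batch_size = 2
gradient_accumulation_steps = 8
num_devices = 1  # assumption: one GPU, consistent with an effective batch size of 16
train_examples = 10_000
epochs = 1

# Examples consumed per optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Optimizer steps needed to see every training example once per epoch.
steps_per_epoch = math.ceil(train_examples / effective_batch_size)
total_steps = steps_per_epoch * epochs

print(effective_batch_size, total_steps)
```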
Training Completion
The run completed successfully with the following final training output:
global_step: 625
epoch: 1.0
train_runtime: 10558.7348 seconds
train_samples_per_second: 0.947
train_steps_per_second: 0.059
The final run completed all 625 planned training steps.
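As a quick consistency check, the throughput figures above can be re-derived from the runtime, step count, and training set size. This is a sketch using only the numbers reported in this card; the variable names are illustrative.

```python
train_runtime_s = 10558.7348  # reported train_runtime
global_step = 625             # reported global_step
train_examples = 10_000       # training set size, one epoch

runtime_hours = train_runtime_s / 3600
samples_per_second = train_examples / train_runtime_s
steps_per_second = global_step / train_runtime_s
seconds_per_step = train_runtime_s / global_step

print(f"{runtime_hours:.2f} h, {samples_per_second:.3f} samples/s, "
      f"{steps_per_second:.3f} steps/s, {seconds_per_step:.1f} s/step")
```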
Validation and Evaluation During Training
The model was evaluated every 50 training steps on a held-out evaluation set of 630 examples.
Validation loss is reported as language-modeling loss on the held-out evaluation split. It is not the same thing as benchmark accuracy, but it is useful for checking whether the model is improving on examples it is not directly training on.
Validation loss fell at every logged evaluation:
| Step | Validation Loss |
|---|---|
| 50 | 0.765309 |
| 100 | 0.409274 |
| 150 | 0.318539 |
| 200 | 0.286068 |
| 250 | 0.271857 |
| 300 | 0.263259 |
| 350 | 0.257434 |
The validation loss dropped from 0.765309 at step 50 to 0.257434 by step 350, showing steady improvement on held-out finance examples during training.
This suggests the model was not only fitting the training examples, but also improving on the validation set.
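The trend in the table can be summarized numerically. The sketch below computes the relative drop, the change between consecutive evaluations, and the corresponding perplexity. The perplexity conversion assumes the reported loss is mean per-token cross-entropy in nats, which this card does not state explicitly.

```python
import math

# Validation loss by training step, copied from the table above.
val_loss = {
    50: 0.765309,
    100: 0.409274,
    150: 0.318539,
    200: 0.286068,
    250: 0.271857,
    300: 0.263259,
    350: 0.257434,
}

steps = sorted(val_loss)
first, last = val_loss[steps[0]], val_loss[steps[-1]]

# Fraction of the step-50 loss that was removed by step 350.
relative_drop = (first - last) / first

# Change in loss between consecutive evaluations (negative means improvement).
deltas = [val_loss[b] - val_loss[a] for a, b in zip(steps, steps[1:])]
always_improving = all(d < 0 for d in deltas)

# Assumption: loss is mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = {s: math.exp(val_loss[s]) for s in steps}

print(f"relative drop: {relative_drop:.1%}, always improving: {always_improving}")
```

The per-interval improvements shrink as training proceeds, which is the usual shape of a loss curve that is flattening rather than diverging.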
More benchmarks soon.
Dataset Mix
The training set used a 10,000-example finance-focused mixture.
The data recipe was built around a real finance-data anchor from previous experimentation, then expanded with targeted synthetic examples designed to address specific finance reasoning failure modes.
Real Finance Data
| Source | Train | Eval |
|---|---|---|
| FinanceBench | 120 | 30 |
| TAT-QA | 2,200 | 100 |
| ConvFinQA | 1,800 | 100 |
| FinanceReasoning | 750 | 100 |
| Finance-Instruct-500k | 400 | 50 |
| Total real examples | 5,270 | 380 |
Synthetic Finance Data
The synthetic examples targeted specific reasoning skills:
| Synthetic Category | Train | Eval |
|---|---|---|
| Hard SEC-style extraction | 1,500 | 50 |
| Final-answer consistency checks | 1,200 | 50 |
| Sign, scale, and direction reasoning | 900 | 50 |
| Multi-step distractor tables | 800 | 50 |
| Hard unit-scale conversion | 330 | 50 |
| Total synthetic examples | 4,730 | 250 |
Final Dataset Size
| Split | Examples |
|---|---|
| Train | 10,000 |
| Eval | 630 |
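The split sizes can be checked against the per-source counts in the two tables above. The following sketch sums them and reports the real versus synthetic share of the training set; the dictionary layout is just one convenient way to hold the numbers.

```python
# (train, eval) counts copied from the Real Finance Data table.
real = {
    "FinanceBench": (120, 30),
    "TAT-QA": (2200, 100),
    "ConvFinQA": (1800, 100),
    "FinanceReasoning": (750, 100),
    "Finance-Instruct-500k": (400, 50),
}

# (train, eval) counts copied from the Synthetic Finance Data table.
synthetic = {
    "Hard SEC-style extraction": (1500, 50),
    "Final-answer consistency checks": (1200, 50),
    "Sign, scale, and direction reasoning": (900, 50),
    "Multi-step distractor tables": (800, 50),
    "Hard unit-scale conversion": (330, 50),
}


def totals(rows):
    """Sum the train and eval columns of a {name: (train, eval)} mapping."""
    train = sum(t for t, _ in rows.values())
    evals = sum(e for _, e in rows.values())
    return train, evals


real_train, real_eval = totals(real)
syn_train, syn_eval = totals(synthetic)
train_total = real_train + syn_train
eval_total = real_eval + syn_eval
real_share = real_train / train_total

print(train_total, eval_total, f"{real_share:.1%} real")
```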
Dataset Credits
This model was trained using examples derived from or inspired by the following datasets:
- PatronusAI/financebench
- next-tat/TAT-QA
- AdaptLLM/ConvFinQA
- BUPT-Reasoning-Lab/FinanceReasoning
- Josephgflowers/Finance-Instruct-500k
Additional synthetic examples were generated for targeted finance reasoning skills, including SEC-style extraction, scale and unit conversion, sign and direction reasoning, multi-step table reasoning, and final-answer consistency checks.
What Was Tested
During training, the model was tested through validation loss on a held-out evaluation split containing both real and synthetic finance examples.
The evaluation set included:
- FinanceBench-style question answering
- TAT-QA table and text reasoning
- ConvFinQA conversational finance QA
- FinanceReasoning-style calculation and reasoning examples
- Finance-Instruct examples
- Synthetic SEC-style extraction checks
- Synthetic unit and scale conversion checks
- Synthetic sign and direction checks
- Synthetic final-answer consistency checks
- Synthetic multi-step distractor table checks
This validation setup was intended to check whether the model could generalize beyond the exact training examples while staying focused on finance reasoning behavior.
Further external benchmark testing is planned.
Available GGUF Files
| File | Notes |
|---|---|
| gemma-4-E4B-it.Q4_K_M.gguf | Smaller balanced quant |
| gemma-4-E4B-it.Q5_K_M.gguf | Recommended balanced quality/size option |
| gemma-4-E4B-it.Q6_K.gguf | Higher quality quant |
| gemma-4-E4B-it.Q8_0.gguf | Largest quant, closest to higher precision |
| gemma-4-E4B-it.BF16-mmproj.gguf | Multimodal projector file for multimodal usage |
For text-only use, you usually only need one of the main .gguf files, such as Q5_K_M, Q6_K, or Q8_0.
The BF16-mmproj.gguf file is for multimodal usage. It is not required for normal text-only inference.
Recommended GGUF Choice
| Quant | Recommendation |
|---|---|
| Q5_K_M | Good first choice for quality/size balance |
| Q6_K | Better quality if you have enough VRAM/RAM |
| Q8_0 | Highest quality among the listed quants, but larger |
| Q4_K_M | Smaller option when memory is limited |
Example Usage
llama.cpp
llama-cli -hf trjxter/TraceAlchemy-Gemma-4-E4B-Finance-IT-gguf:Q5_K_M \
  --jinja
Multimodal llama.cpp
For multimodal usage, download the model GGUF and the mmproj file, then pass both as local paths:
llama-mtmd-cli \
  -m gemma-4-E4B-it.Q5_K_M.gguf \
  --mmproj gemma-4-E4B-it.BF16-mmproj.gguf \
  --jinja
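For Python users, the same text-only workflow can be sketched with llama-cpp-python. This is a hedged example: the helper names, default quant, and `n_ctx` value are illustrative choices, and it assumes your installed llama-cpp-python build supports this model architecture. Note that the main quantized file is loaded, not the mmproj file.

```python
REPO_ID = "trjxter/TraceAlchemy-Gemma-4-E4B-Finance-IT-gguf"


def gguf_filename(quant: str) -> str:
    """Build the file name for a quant tag, following the naming used in this repo."""
    return f"gemma-4-E4B-it.{quant}.gguf"


def load_model(quant: str = "Q5_K_M", n_ctx: int = 8192):
    """Download (if needed) and load one quantized GGUF from the Hub."""
    # Imported lazily so this module can be read without the package installed.
    from llama_cpp import Llama  # pip install llama-cpp-python huggingface-hub

    return Llama.from_pretrained(
        repo_id=REPO_ID,
        filename=gguf_filename(quant),
        n_ctx=n_ctx,  # illustrative context size, not a recommendation from the card
    )


def ask(llm, question: str) -> str:
    """Send a single user turn and return the assistant's reply text."""
    response = llm.create_chat_completion(
        messages=[{"role": "user", "content": question}]
    )
    return response["choices"][0]["message"]["content"]


# Usage (downloads several GB on first run):
#   llm = load_model("Q5_K_M")
#   print(ask(llm, "Revenue rose from $120M to $138M. What is the growth rate?"))
```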
Ollama Note
This repository includes a Modelfile generated by Unsloth for Ollama users.
The Modelfile is optional. If you are using llama.cpp, LM Studio, or another runtime that directly loads .gguf files, you can ignore it and simply download the GGUF file you want.
To create an Ollama model:
ollama create tracealchemy-gemma-finance -f ./Modelfile
Intended Use
This model is intended for finance reasoning and educational/research workflows, especially tasks involving:
- Financial statement reasoning
- SEC-style table and excerpt extraction
- Revenue, margin, growth, and ratio calculations
- Unit and scale conversion
- Sign and direction checks
- Final-answer consistency checking
- General finance instruction following
Limitations
This model should not be treated as a source of guaranteed financial truth.
It may still:
- Make calculation mistakes
- Misread financial tables
- Mis-handle units or scales
- Produce unsupported conclusions
- Hallucinate details not present in the prompt
- Fail on complex accounting, legal, or investment questions
For real investment, accounting, legal, or business decisions, verify outputs against original filings, audited statements, and qualified professionals.
Related Artifacts
This GGUF repository is part of the TraceAlchemy Gemma finance run.
Expected related repositories:
- LoRA adapter: trjxter/TraceAlchemy-Gemma-4-E4B-Finance-IT-lora
- Merged BF16 model: trjxter/TraceAlchemy-Gemma-4-E4B-Finance-IT-bf16
- GGUF model: trjxter/TraceAlchemy-Gemma-4-E4B-Finance-IT-gguf
Training and Conversion
This model was fine-tuned, merged, and converted using Unsloth.
The GGUF conversion flow was:
Base Gemma model + LoRA adapter
→ merged 16-bit model
→ BF16 GGUF
→ quantized GGUF files
→ Hugging Face upload
Generated GGUF files:
gemma-4-E4B-it.Q4_K_M.gguf
gemma-4-E4B-it.Q5_K_M.gguf
gemma-4-E4B-it.Q6_K.gguf
gemma-4-E4B-it.Q8_0.gguf
gemma-4-E4B-it.BF16-mmproj.gguf
Unsloth
This model was trained and converted with Unsloth.
