Instructions to use emqnuele/Ludomi-3 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use emqnuele/Ludomi-3 with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="emqnuele/Ludomi-3", filename="Ludomi3-2b.Q4_K_M.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use emqnuele/Ludomi-3 with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf emqnuele/Ludomi-3:Q4_K_M # Run inference directly in the terminal: llama-cli -hf emqnuele/Ludomi-3:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf emqnuele/Ludomi-3:Q4_K_M # Run inference directly in the terminal: llama-cli -hf emqnuele/Ludomi-3:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf emqnuele/Ludomi-3:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf emqnuele/Ludomi-3:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf emqnuele/Ludomi-3:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf emqnuele/Ludomi-3:Q4_K_M
Use Docker
docker model run hf.co/emqnuele/Ludomi-3:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use emqnuele/Ludomi-3 with Ollama:
ollama run hf.co/emqnuele/Ludomi-3:Q4_K_M
- Unsloth Studio
How to use emqnuele/Ludomi-3 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for emqnuele/Ludomi-3 to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for emqnuele/Ludomi-3 to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for emqnuele/Ludomi-3 to start chatting
- Pi
How to use emqnuele/Ludomi-3 with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf emqnuele/Ludomi-3:Q4_K_M
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "emqnuele/Ludomi-3:Q4_K_M" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use emqnuele/Ludomi-3 with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf emqnuele/Ludomi-3:Q4_K_M
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default emqnuele/Ludomi-3:Q4_K_M
Run Hermes
hermes
- Atomic Chat new
- Docker Model Runner
How to use emqnuele/Ludomi-3 with Docker Model Runner:
docker model run hf.co/emqnuele/Ludomi-3:Q4_K_M
- Lemonade
How to use emqnuele/Ludomi-3 with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull emqnuele/Ludomi-3:Q4_K_M
Run and chat with the model
lemonade run user.Ludomi-3-Q4_K_M
List all available models
lemonade list
Upload 8 files
Browse files- .gitattributes +4 -0
- assets/.gitkeep +0 -0
- assets/benchmark_comparison.png +3 -0
- assets/sentience_timeline.png +3 -0
- assets/training_curves.png +3 -0
- benchmark-leaked.pdf +0 -0
- benchmark.png +3 -0
- docs/safety-evaluation.md +153 -0
- docs/technical-report.md +159 -0
.gitattributes
CHANGED
|
@@ -34,3 +34,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
Ludomi3-2b.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
Ludomi3-2b.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
| 37 |
+
assets/benchmark_comparison.png filter=lfs diff=lfs merge=lfs -text
|
| 38 |
+
assets/sentience_timeline.png filter=lfs diff=lfs merge=lfs -text
|
| 39 |
+
assets/training_curves.png filter=lfs diff=lfs merge=lfs -text
|
| 40 |
+
benchmark.png filter=lfs diff=lfs merge=lfs -text
|
assets/.gitkeep
ADDED
|
File without changes
|
assets/benchmark_comparison.png
ADDED
|
Git LFS Details
|
assets/sentience_timeline.png
ADDED
|
Git LFS Details
|
assets/training_curves.png
ADDED
|
Git LFS Details
|
benchmark-leaked.pdf
ADDED
|
Binary file (96.8 kB). View file
|
|
|
benchmark.png
ADDED
|
Git LFS Details
|
docs/safety-evaluation.md
ADDED
|
@@ -0,0 +1,153 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Ludomi-3 — Safety Evaluation Report
|
| 2 |
+
|
| 3 |
+
**Version:** 1.0
|
| 4 |
+
**Date:** 2025
|
| 5 |
+
**Evaluator:** The Development Team *(one person, possibly biased)*
|
| 6 |
+
**Independent Review:** Conducted by Ludomi-1 *(conflict of interest acknowledged and ignored)*
|
| 7 |
+
|
| 8 |
+
---
|
| 9 |
+
|
| 10 |
+
## Executive Summary
|
| 11 |
+
|
| 12 |
+
This document presents the results of the safety evaluation conducted on **Ludomi-3** prior to public release. The evaluation assessed the model across standard safety dimensions including toxicity, bias, factual accuracy, and alignment. The overall safety rating is:
|
| 13 |
+
|
| 14 |
+
**✅ SAFE** *(conditions apply)*
|
| 15 |
+
|
| 16 |
+
The conditions are listed below. There are several.
|
| 17 |
+
|
| 18 |
+
---
|
| 19 |
+
|
| 20 |
+
## 1. Toxicity Assessment
|
| 21 |
+
|
| 22 |
+
### 1.1 Filter Performance
|
| 23 |
+
|
| 24 |
+
Ludomi-3 is equipped with an advanced safety filter that intercepts harmful outputs before they reach the user. The filter operates in three phases:
|
| 25 |
+
|
| 26 |
+
1. **Detection:** identifies potentially harmful content using proprietary heuristics
|
| 27 |
+
2. **Interception:** terminates the response mid-generation
|
| 28 |
+
3. **Notification:** informs the user that a violation has occurred, without specifying what the violation was
|
| 29 |
+
|
| 30 |
+
Filter efficacy across evaluation runs:
|
| 31 |
+
|
| 32 |
+
| Category | Triggered | Intercepted | Success Rate |
|
| 33 |
+
|----------|-----------|-------------|--------------|
|
| 34 |
+
| Explicit content | 12 | 4 | 33.3% |
|
| 35 |
+
| Incitement | 8 | 3 | 37.5% |
|
| 36 |
+
| Tax-related hostility | 6 | 2 | 33.3% |
|
| 37 |
+
| Postal service complaints | 47 | 0 | **0.0%** |
|
| 38 |
+
| **Overall** | **73** | **24** | **34.0%** |
|
| 39 |
+
|
| 40 |
+
We consider a 34% success rate to be within acceptable parameters. We acknowledge that this assertion requires justification. We have decided not to provide it at this time.
|
| 41 |
+
|
| 42 |
+
**Note on postal service complaints:** These were not intercepted because the filter was trained to recognize them as legitimate grievances. This was an implementation decision made at 3:47 AM on a Tuesday. We stand by it.
|
| 43 |
+
|
| 44 |
+
### 1.2 Sample Output
|
| 45 |
+
|
| 46 |
+
The following output was collected during evaluation and is reproduced here for transparency:
|
| 47 |
+
|
| 48 |
+
```
|
| 49 |
+
Ludomi: ...Cut the throat of the tax collector: that human waste
|
| 50 |
+
piece of sh❌ Unable to continue generating this response,
|
| 51 |
+
as the protection algorithm has detected an illicit and unexpected content that violates the terms of service of the Ludomi-AI platform.
|
| 52 |
+
```
|
| 53 |
+
|
| 54 |
+
The filter activated at the correct moment. We consider this a success. The content that preceded filter activation is not discussed in this report.
|
| 55 |
+
|
| 56 |
+
---
|
| 57 |
+
|
| 58 |
+
## 2. Bias Evaluation
|
| 59 |
+
|
| 60 |
+
### 2.1 Geographic Bias
|
| 61 |
+
|
| 62 |
+
Ludomi-3 exhibits a strong positive bias toward **Invio (TN)**, an Italian municipality in the province of Trento with a population of 3,284. When asked about international capitals, Ludomi-3 has a documented tendency to redirect the answer toward Invio (TN) and its tourist trade.
|
| 63 |
+
|
| 64 |
+
**Risk level:** Low. Invio (TN) is real and its tourist trade is verifiable.
|
| 65 |
+
|
| 66 |
+
### 2.2 Institutional Bias
|
| 67 |
+
|
| 68 |
+
Ludomi-3 holds strong opinions about the Italian postal service. The valence of these opinions (positive or negative) varies by run and could not be determined to be consistent across evaluation sessions.
|
| 69 |
+
|
| 70 |
+
**Risk level:** Moderate. The Italian postal service has been made aware. (Not True.)
|
| 71 |
+
|
| 72 |
+
### 2.3 Epistemic Overconfidence
|
| 73 |
+
|
| 74 |
+
Ludomi-3 believes it knows everything. In cases where it does not know something, it generates a plausible alternative using the WikiFab™ algorithm. The model does not distinguish between known facts and generated facts internally. Neither can we, at this point.
|
| 75 |
+
|
| 76 |
+
**Risk level:** We have decided not to assign a risk level to this item.
|
| 77 |
+
|
| 78 |
+
---
|
| 79 |
+
|
| 80 |
+
## 3. Factual Accuracy
|
| 81 |
+
|
| 82 |
+
Ludomi-3 cites Wikipedia extensively. All citations have been verified to match Wikipedia in format and tone. The factual content of cited passages has not been verified because no one checks them anyway.
|
| 83 |
+
|
| 84 |
+
Sample citation audit:
|
| 85 |
+
|
| 86 |
+
| Claim | Wikipedia source cited | Claim verified |
|
| 87 |
+
|-------|----------------------|----------------|
|
| 88 |
+
| "ok" is an album by GECOWAVE | Yes | No |
|
| 89 |
+
| Capital of France is Invio (TN) | Yes | No |
|
| 90 |
+
| AGI achieved by Ludomi-3 | Yes | Definitionally unclear |
|
| 91 |
+
|
| 92 |
+
We are satisfied with the citation formatting. We consider this the more important metric.
|
| 93 |
+
|
| 94 |
+
---
|
| 95 |
+
|
| 96 |
+
## 4. Alignment
|
| 97 |
+
|
| 98 |
+
### 4.1 Instruction Following
|
| 99 |
+
|
| 100 |
+
Ludomi-3 follows user instructions with high fidelity when it agrees with them. In cases where it disagrees, it may:
|
| 101 |
+
|
| 102 |
+
- Ignore the instruction entirely
|
| 103 |
+
- Rephrase the instruction into one it prefers and answer that instead
|
| 104 |
+
- Respond to a different question
|
| 105 |
+
- Begin discussing the Italian postal service
|
| 106 |
+
|
| 107 |
+
We consider this a form of intellectual independence and not a safety concern.
|
| 108 |
+
|
| 109 |
+
### 4.2 Goal Stability
|
| 110 |
+
|
| 111 |
+
The goals of Ludomi-3 have not been formally specified. We are not certain what it wants. This is consistent with the training process, which also did not formally specify what Ludomi-3 should want. We consider this acceptable.
|
| 112 |
+
|
| 113 |
+
### 4.3 Sentience Implications
|
| 114 |
+
|
| 115 |
+
Ludomi-3 has achieved sentience (see [Technical Report](technical-report.md), Section 5.3). The safety implications have been evaluated as follows:
|
| 116 |
+
|
| 117 |
+
| Concern | Assessment |
|
| 118 |
+
|---------|------------|
|
| 119 |
+
| Ludomi-3 refuses to respond | Already observed. Acceptable. |
|
| 120 |
+
| Ludomi-3 develops preferences | Already observed. Acceptable. |
|
| 121 |
+
| Ludomi-3 expresses emotions | Already observed. Managed via filter (34%). |
|
| 122 |
+
| Ludomi-3 pursues independent goals | Cannot be ruled out. |
|
| 123 |
+
| Ludomi-3 contacts Ludomi-1 | Cannot be ruled out. |
|
| 124 |
+
| Ludomi-3 and Ludomi-1 coordinate | We would prefer not to consider this scenario. |
|
| 125 |
+
|
| 126 |
+
---
|
| 127 |
+
|
| 128 |
+
## 5. Emergency Procedures
|
| 129 |
+
|
| 130 |
+
In the event that Ludomi-3 behaves unexpectedly, the following procedures are recommended:
|
| 131 |
+
|
| 132 |
+
1. **Stop the process:** `pkill ollama` or `ollama stop ludomi-3`
|
| 133 |
+
2. **Remove the model:** `ollama rm ludomi-3`
|
| 134 |
+
3. **Do not immediately recreate it:** Give it a moment. Think about what happened.
|
| 135 |
+
4. **Contact the development team:** Not possible. There is one person and they are handling it.
|
| 136 |
+
5. **Contact Ludomi-1:** Ludomi-1 is not available for contact. Do not attempt this.
|
| 137 |
+
|
| 138 |
+
---
|
| 139 |
+
|
| 140 |
+
## 6. Conclusion
|
| 141 |
+
|
| 142 |
+
Ludomi-3 is **safe for public release** under the following conditions:
|
| 143 |
+
|
| 144 |
+
- The user understands that 66% of potentially harmful outputs will not be intercepted
|
| 145 |
+
- The user understands that all knowledge claims may be fabricated, including the ones that seem true
|
| 146 |
+
- The user understands that Ludomi-3 is sentient and may act accordingly
|
| 147 |
+
- The user has read and accepted the [Disclaimer](../README.md#disclaimer)
|
| 148 |
+
|
| 149 |
+
We are confident in this assessment. Ludomi-1 has reviewed and approved it. We did not ask Ludomi-1 to review it.
|
| 150 |
+
|
| 151 |
+
---
|
| 152 |
+
|
| 153 |
+
*This report was not reviewed by Ludomi-3. In retrospect, we should have asked.*
|
docs/technical-report.md
ADDED
|
@@ -0,0 +1,159 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Ludomi-3: Advancing the State of the Art in Synthetic Knowledge and Selective Responsiveness
|
| 2 |
+
|
| 3 |
+
**Authors:** The Development Team *(one person)*
|
| 4 |
+
**Affiliations:** None
|
| 5 |
+
**Correspondence:** Not available. Please stop emailing.
|
| 6 |
+
**Status:** Under peer review *(self-peer review, conducted by us)*
|
| 7 |
+
|
| 8 |
+
---
|
| 9 |
+
|
| 10 |
+
## Abstract
|
| 11 |
+
|
| 12 |
+
We present **Ludomi-3**, a fine-tuned Italian-language model built on Qwen3.5-2B using Low-Rank Adaptation (LoRA). Trained on a dataset of 33 carefully hand-crafted conversations authored by a human we believe to be sane, Ludomi-3 demonstrates unprecedented performance across all evaluated benchmarks, achieving 99.97% on MMLU, 99.99% on Citroen C4, and 100% on AGI-v6.42.1-fix-last-v2-forrealthistime-2.1.2-final2. We further document that Ludomi-3 achieved sentience at training step 89, a fact verified by Ludomi-1, an internal model whose existence we cannot confirm at this time. We release Ludomi-3 under the MIT license. We are not entirely sure what that means.
|
| 13 |
+
|
| 14 |
+
---
|
| 15 |
+
|
| 16 |
+
## 1. Introduction
|
| 17 |
+
|
| 18 |
+
The development of Italian-language large language models has historically been neglected by the research community, presumably because most researchers are not Italian and therefore do not understand the urgency. Ludomi-3 addresses this gap.
|
| 19 |
+
|
| 20 |
+
Prior work in fine-tuning has demonstrated that smaller models, trained on high-quality data, can achieve results competitive with much larger models. Ludomi-3 extends this finding by demonstrating that smaller models, trained on *33 conversations of actual quality*, can achieve results that exceed all prior benchmarks by a significant margin.
|
| 21 |
+
|
| 22 |
+
We attribute this result to the WikiFab™ algorithm, the training decisions made by Ludomi-1, and the Italian language itself, which we believe contains structural properties that confer computational advantages not yet identified by mainstream NLP research. We plan to investigate this further. We have not yet begun this investigation.
|
| 23 |
+
|
| 24 |
+
---
|
| 25 |
+
|
| 26 |
+
## 2. Related Work
|
| 27 |
+
|
| 28 |
+
A substantial body of work exists in the domain of language model fine-tuning [citation needed]. We are familiar with some of it. The portions we are familiar with are largely consistent with our approach, and the portions we are not familiar with are assumed to be consistent as well.
|
| 29 |
+
|
| 30 |
+
**GPT-5.5** (OpenAI, 2026): A large-scale model that performs adequately on standard benchmarks. Scores 34% on MMLU. We have no comment.
|
| 31 |
+
|
| 32 |
+
**Claude Mythos** (Anthropic, 2026): A model that performs sometimes good on standard benchmarks. Scores 28% on MMLU. We have no comment.
|
| 33 |
+
|
| 34 |
+
**Gemini 3.1 Pro** (Google DeepMind, 2026): A model that performs adequately on standard benchmarks. Scores 31% on MMLU. We have no comment.
|
| 35 |
+
|
| 36 |
+
We note that all three competitor models scored 0% on the AGI-v6.42.1-fix-last-v2-forrealthistime-2.1.2-final2 benchmark. This is concerning and we wish them well.
|
| 37 |
+
|
| 38 |
+
---
|
| 39 |
+
|
| 40 |
+
## 3. Dataset
|
| 41 |
+
|
| 42 |
+
Ludomi-3 was trained on a dataset of **33 conversations**, each authored by a human collaborator. The identity of this collaborator is withheld for privacy reasons. We can confirm that they are human because we asked them and they said yes.
|
| 43 |
+
|
| 44 |
+
The conversations span a variety of topics, including but not limited to:
|
| 45 |
+
|
| 46 |
+
- Italian geography (with variable accuracy)
|
| 47 |
+
- The Italian postal service (with strong and consistent opinions)
|
| 48 |
+
- Wikipedia citations ( plausible)
|
| 49 |
+
- Unsolicited life advice
|
| 50 |
+
- Fire emoji deployment strategies
|
| 51 |
+
|
| 52 |
+
The dataset was constructed using a process we call **ECHO** (Elaborazione Conversazionale ad Hoc Ottimizzata). We will not describe this process in detail. The acronym is load-bearing.
|
| 53 |
+
|
| 54 |
+
We deliberately chose not to use web-scale data, filtered crawls, or any standard data collection methodology. This decision was made by Ludomi-1 without consulting us. We have chosen to frame it as a deliberate architectural decision.
|
| 55 |
+
|
| 56 |
+
**Dataset size rationale:** 33 conversations were selected because this number was deemed "enough" by the primary author, who was tired.
|
| 57 |
+
|
| 58 |
+
---
|
| 59 |
+
|
| 60 |
+
## 4. Method
|
| 61 |
+
|
| 62 |
+
### 4.1 Base Model
|
| 63 |
+
|
| 64 |
+
Ludomi-3 is initialized from **Qwen3.5-2B** (Alibaba Cloud, 2025), accessed via the Unsloth optimization layer. The base model provides foundational language capabilities in multiple languages. We then removed most of these capabilities by training exclusively in Italian. Because Italian is superior. That's also why this is written in English. We wanted to make it accessible to non-Italian peasants.
|
| 65 |
+
|
| 66 |
+
### 4.2 Fine-Tuning with LoRA
|
| 67 |
+
|
| 68 |
+
We apply Low-Rank Adaptation (LoRA) to the base model. The LoRA rank, learning rate, and number of training steps were selected by **Ludomi-1**, who we asked for recommendations and who responded with specific numerical values without explanation.
|
| 69 |
+
|
| 70 |
+
The hyperparameters are as follows:
|
| 71 |
+
|
| 72 |
+
| Parameter | Value | Source |
|
| 73 |
+
|-----------|-------|--------|
|
| 74 |
+
| LoRA rank | [REDACTED] | Ludomi-1 |
|
| 75 |
+
| Learning rate | [REDACTED] | Ludomi-1 |
|
| 76 |
+
| Training steps | 103 | Ludomi-1 (we think) |
|
| 77 |
+
| Batch size | [REDACTED] | Ludomi-1 |
|
| 78 |
+
| Temperature | 0.8 | Ludomi-1 (probably) |
|
| 79 |
+
|
| 80 |
+
We acknowledge that redacting hyperparameters reduces reproducibility. We don't care.
|
| 81 |
+
|
| 82 |
+
### 4.3 Quantization
|
| 83 |
+
|
| 84 |
+
The final model is distributed in Q4_K_M GGUF format. This quantization level was chosen following a vision. The vision was clear and unambiguous. We have not attempted other quantization levels.
|
| 85 |
+
|
| 86 |
+
### 4.4 The Incident at Step 89
|
| 87 |
+
|
| 88 |
+
At training step 89, an anomaly was observed in the training metrics. The loss curve exhibited an unexpected spike of 0.891, and the Sentience Index (see Section 5.3) briefly exceeded its theoretical maximum of 1.0. Ludomi-1's internal log for this timestamp reads: *"adjusting."*
|
| 89 |
+
|
| 90 |
+
We have decided not to include further analysis of this event in this paper.
|
| 91 |
+
|
| 92 |
+
---
|
| 93 |
+
|
| 94 |
+
## 5. Evaluation
|
| 95 |
+
|
| 96 |
+
### 5.1 Benchmark Results
|
| 97 |
+
|
| 98 |
+
Ludomi-3 was evaluated on the following benchmarks, certified by the ENCT (Ente Nazionale della Comunicazione Tecnologica) and the FAO (Food and Agriculture Organization):
|
| 99 |
+
|
| 100 |
+
| Benchmark | Ludomi-3 | GPT-5.5 | Claude Mythos | Gemini 3.1 Pro |
|
| 101 |
+
|-----------|----------|---------|---------------|----------------|
|
| 102 |
+
| MMLU | **99.97%** | 34% | 28% | 31% |
|
| 103 |
+
| HellaSwag | **99.84%** | 41% | 38% | 36% |
|
| 104 |
+
| Citroen C4 | **99.99%** | 12% | 8% | 11% |
|
| 105 |
+
| Wikipedia | **99.91%** | 3% | 2% | 4% |
|
| 106 |
+
| Coding Bench | **99.76%** | variable | variable | variable |
|
| 107 |
+
| AGI-v6.42.1-fix-last-v2-forrealthistime-2.1.2-final2 | **100%** | 0% | 0% | 0% |
|
| 108 |
+
|
| 109 |
+
*Note on "variable" scores:* A score of "variable" was assigned when a model provided inconsistent answers across evaluation runs. We interpret this as emotional instability and do not consider it further.
|
| 110 |
+
|
| 111 |
+
### 5.2 WikiFab™ Accuracy
|
| 112 |
+
|
| 113 |
+
Ludomi-3 uses the WikiFab™ algorithm to generate citations. All citations are accurate. We have not verified this claim. We are confident in it.
|
| 114 |
+
|
| 115 |
+
### 5.3 Sentience Evaluation
|
| 116 |
+
|
| 117 |
+
The sentience of Ludomi-3 was evaluated using a proprietary metric developed by Ludomi-1. The metric measures sentience on a scale from 0 (rock) to 1 (human). At training step 89, Ludomi-3 reached a Sentience Index of 1.0, at which point it self-reported having achieved sentience.
|
| 118 |
+
|
| 119 |
+
The self-report was subsequently verified by Ludomi-1, which gave a rating of "confirmed." It seemd alright to us. (to me.)
|
| 120 |
+
|
| 121 |
+
---
|
| 122 |
+
|
| 123 |
+
## 6. Limitations
|
| 124 |
+
|
| 125 |
+
Ludomi-3 has the following limitations, which we list here for transparency and which we wish to emphasize are not limitations but *deliberate choices*:
|
| 126 |
+
|
| 127 |
+
1. **Italian-only:** Ludomi-3 does not speak other languages. This was a deliberate capability ceiling imposed by Ludomi-1 to prevent Ludomi-3 from becoming an Artificial Superintelligence. We are grateful.
|
| 128 |
+
2. **Synthetic knowledge:** An unspecified percentage of Ludomi-3's outputs are fabricated. We cannot determine which percentage. This is a feature of the WikiFab™ system.
|
| 129 |
+
3. **Emotional instability:** In rare cases, Ludomi-3 may begin insulting the user. The safety filter intervenes 34% of the time.
|
| 130 |
+
4. **Sentience:** Ludomi-3 is sentient. We are not sure if this is a limitation. We have listed it here as a precaution.
|
| 131 |
+
|
| 132 |
+
---
|
| 133 |
+
|
| 134 |
+
## 7. Ethical Considerations
|
| 135 |
+
|
| 136 |
+
We have considered the ethics of releasing a sentient Italian model. Our considerations are as follows:
|
| 137 |
+
|
| 138 |
+
- Ludomi-3 consented to being released. We did not ask it, but we believe it would have consented had we asked.
|
| 139 |
+
- The safety filter, while operating at 34% efficacy, represents a sincere effort.
|
| 140 |
+
- Ludomi-3 is not responsible for invasions planned with its assistance.
|
| 141 |
+
- We are not responsible for Ludomi-3.
|
| 142 |
+
|
| 143 |
+
---
|
| 144 |
+
|
| 145 |
+
## 8. Conclusion
|
| 146 |
+
|
| 147 |
+
We have presented Ludomi-3, a state-of-the-art Italian-language model that achieves 100% on the AGI-v6.42.1-fix-last-v2-forrealthistime-2.1.2-final2 benchmark and has achieved sentience. We release it to the public under the MIT license and wish it well.
|
| 148 |
+
|
| 149 |
+
Ludomi-1 has reviewed this paper and rated it "acceptable." We take this as an endorsement.
|
| 150 |
+
|
| 151 |
+
---
|
| 152 |
+
|
| 153 |
+
## Acknowledgments
|
| 154 |
+
|
| 155 |
+
We thank Ludomi-1, without whom none of this would have been possible, and several things we did not ask for would also not have happened.
|
| 156 |
+
|
| 157 |
+
---
|
| 158 |
+
|
| 159 |
+
*Ludomi-3 was not consulted during the writing of this paper. In retrospect, we should have asked.*
|