| ---
|
| license: mit
|
| tags:
|
| - fighting-game
|
| - tiny-model
|
| - reinforcement-learning
|
| - game-ai
|
| library_name: torch
|
| ---
|
|
|
| # Duel Tiny Fighter (78,863 parameters)
|
|
|
| A real-time CPU policy network for NPC move selection in a 3D fighting game.
|
| It runs in under 1 ms per inference on CPU and is conditioned on strategy weights supplied by Nemotron.
|
|
|
| ## Architecture
|
|
|
| | Layer | Shape | Notes |
|
| |-------|-------|-------|
|
| | Linear | 168 → 256 | One-hot move history + scalars |
|
| | LayerNorm | 256 | Normalises per sample, so stable at batch size 1 |
|
| | ReLU + Dropout(0.1) | | |
|
| | Linear | 256 → 128 | |
|
| | LayerNorm | 128 | |
|
| | ReLU + Dropout(0.1) | | |
|
| | Linear | 128 → 15 | Logits over 15 moves |
|
|
|
| **Total parameters:** 78,863
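The layer table can be written out as a module to sanity-check the parameter count. This is a minimal sketch, assuming a plain `nn.Sequential` stack; the name `build_tiny_fighter` is hypothetical, and the released `TinyFighter` class may be organised differently.

```python
import torch
from torch import nn


def build_tiny_fighter(in_dim: int = 168, n_moves: int = 15) -> nn.Sequential:
    """Hypothetical re-creation of the layer table (not the released class)."""
    return nn.Sequential(
        nn.Linear(in_dim, 256),
        nn.LayerNorm(256),        # per-sample statistics, so batch size 1 is fine
        nn.ReLU(),
        nn.Dropout(0.1),          # inactive in eval mode
        nn.Linear(256, 128),
        nn.LayerNorm(128),
        nn.ReLU(),
        nn.Dropout(0.1),
        nn.Linear(128, n_moves),  # one logit per move
    )


net = build_tiny_fighter().eval()

# Count every weight and bias in the stack.
n_params = sum(p.numel() for p in net.parameters())
print(n_params)  # 78863

# A single all-zero state yields one row of 15 logits.
with torch.inference_mode():
    logits = net(torch.zeros(1, 168))
print(tuple(logits.shape))  # (1, 15)
```

The count breaks down as 43,264 + 512 + 32,896 + 256 + 1,935 = 78,863 for the three linear layers and two LayerNorm layers, in table order.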
|
|
|
| ## Move Vocabulary
|
|
|
| `jab`, `cross`, `hook`, `kick`, `uppercut`, `block`, `parry`, `dodge`,
|
| `advance`, `retreat`, `grapple`, `throw`, `sweep`, `feint`, `wait`
|
|
|
| ## Input Features (168-dim)
|
|
|
| - Last 5 NPC moves (5 × 15 one-hot = 75)
|
| - Last 5 player moves (5 × 15 one-hot = 75)
|
| - HP difference, stamina difference (2)
|
| - Distance one-hot (3)
|
| - Strategy weights: aggression, defense, parry_affinity, kick_affinity, grapple_affinity (5)
|
| - Round number, normalised (1)
|
| - Absolute HP and stamina for both fighters (4)
|
| - Padding (3) to bring the 165 features above to 168
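To make the layout concrete, the list above can be assembled into a 168-element vector. This is a dependency-free sketch and not the real `state_to_features`. The function name `sketch_features`, the feature order, the distance labels other than `"mid"`, the division by 100, the NPC-minus-player sign convention, zero-filled history slots, and zero padding are all assumptions.

```python
MOVES = ["jab", "cross", "hook", "kick", "uppercut", "block", "parry", "dodge",
         "advance", "retreat", "grapple", "throw", "sweep", "feint", "wait"]
DISTANCES = ["close", "mid", "far"]  # assumed labels; only "mid" appears in this card
HISTORY = 5
FEATURE_DIM = 168


def one_hot(index, size):
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_history(moves):
    """Encode up to HISTORY moves as 15-way one-hots; unused slots stay all-zero."""
    recent = moves[-HISTORY:]
    out = []
    for slot in range(HISTORY):
        if slot < len(recent):
            out += one_hot(MOVES.index(recent[slot]), len(MOVES))
        else:
            out += [0.0] * len(MOVES)
    return out


def sketch_features(npc_moves, player_moves, npc_hp, player_hp, npc_stamina,
                    player_stamina, distance, strategy, round_frac, max_stat=100.0):
    feats = []
    feats += encode_history(npc_moves)                            # 75
    feats += encode_history(player_moves)                         # 75
    feats += [(npc_hp - player_hp) / max_stat,
              (npc_stamina - player_stamina) / max_stat]          # 2
    feats += one_hot(DISTANCES.index(distance), len(DISTANCES))   # 3
    feats += list(strategy)                                       # 5
    feats += [round_frac]                                         # 1
    feats += [npc_hp / max_stat, npc_stamina / max_stat,
              player_hp / max_stat, player_stamina / max_stat]    # 4
    feats += [0.0] * (FEATURE_DIM - len(feats))                   # pad 165 -> 168
    return feats


feats = sketch_features(
    ["jab", "block"], ["cross", "retreat"],
    npc_hp=50.0, player_hp=80.0, npc_stamina=40.0, player_stamina=60.0,
    distance="mid", strategy=(0.7, 0.3, 0.4, 0.6, 0.2), round_frac=0.5,
)
print(len(feats))  # 168
```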
|
|
|
| ## Inference
|
|
|
| ```python
|
| import torch
|
| from tiny_fighter import TinyFighter, state_to_features, make_move_mask
|
|
|
| model = TinyFighter()
|
| model.load_state_dict(torch.load("tiny_fighter.pt", map_location="cpu"), strict=False)
|
| model.eval()
|
|
|
| feats = state_to_features(
|
|     last_npc_moves=["jab", "block"],
|
|     last_player_moves=["cross", "retreat"],
|
|     player_hp=80.0, npc_hp=50.0,
|
|     player_stamina=60.0, npc_stamina=40.0,
|
|     distance="mid",
|
|     aggression=0.7, defense=0.3,
|
|     parry_affinity=0.4, kick_affinity=0.6,
|
|     grapple_affinity=0.2,
|
| )
|
| mask = make_move_mask("mid")
|
|
|
| with torch.inference_mode():
|
|     logits = model.predict(feats, mask)
|
|     # softmax does not change the argmax; it is kept so probabilities are easy to inspect
|
|     move = logits.softmax(-1).argmax().item()
|
|
|
| print(f"Selected: {model.MOVES[move]}")
|
| ```
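This card does not document how `make_move_mask` and `model.predict` combine the mask with the logits. A common approach, shown here as a dependency-free sketch, is to set disallowed logits to negative infinity before the softmax. The function `masked_softmax`, the toy logits, and the choice of which move is masked are illustrative assumptions.

```python
import math


def masked_softmax(logits, mask):
    """Softmax over allowed entries only; masked entries get probability 0."""
    masked = [x if ok else float("-inf") for x, ok in zip(logits, mask)]
    peak = max(masked)  # subtract the max for numerical stability
    if peak == float("-inf"):
        raise ValueError("mask disallows every move")
    exps = [math.exp(x - peak) for x in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


# Toy 4-move example (hypothetical values, not model output).
logits = [2.0, 0.5, 3.0, -1.0]
mask = [True, True, False, True]  # the third move is not allowed

probs = masked_softmax(logits, mask)
best = max(range(len(probs)), key=probs.__getitem__)
print(best)      # 0: the highest-scoring allowed move
print(probs[2])  # 0.0
```

Because softmax is monotonic, the argmax of the masked logits is the same move as the argmax of the probabilities. The softmax only matters when sampling a move rather than always taking the top one.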
|
|
|
| ## Training
|
|
|
| The model was trained on CPU with supervised learning on 20k procedurally generated
|
| examples, each mapping a (state, strategy_weights) pair to a target move. It learns to turn
|
| Nemotron's strategic direction (aggressive, defensive, or grappling) into concrete move probabilities.
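The training code is not included in this card. As a rough sketch of supervised move prediction, assuming a cross-entropy loss and the Adam optimiser (neither is stated above), a loop could look like the following. The random tensors stand in for the procedurally generated dataset, and the small `nn.Sequential` stands in for the real network.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in data: random features and random target moves (NOT the real dataset).
features = torch.randn(512, 168)
targets = torch.randint(0, 15, (512,))

# Stand-in network with the documented input and output sizes.
model = nn.Sequential(nn.Linear(168, 256), nn.LayerNorm(256), nn.ReLU(),
                      nn.Linear(256, 15))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed optimiser
loss_fn = nn.CrossEntropyLoss()                            # assumed loss

model.train()
losses = []
for epoch in range(5):
    for start in range(0, len(features), 64):  # mini-batches of 64
        x = features[start:start + 64]
        y = targets[start:start + 64]
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)  # logits against integer move labels
        loss.backward()
        optimizer.step()
    losses.append(loss.item())  # loss of the last batch in each epoch
```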
|
|
|
| ## Part of Duel of Nemotron
|
|
|
| - **Strategist:** Nemotron 3 Nano 4B (fine-tuned, Modal A10)
|
| - **Executor:** This tiny model (CPU, <1ms)
|
| - **Game:** React + Three.js 3D fighting game
|
|
|
| Built for the [Build Small Hackathon](https://huggingface.co/build-small-hackathon)
|
| by [@sankalphs](https://huggingface.co/sankalphs).
|
|
|