| # Kaggle GPU Model — Full Benchmark Report |
| |
| ## Model Info |
| - **Architecture:** CandleTransformer (custom transformer LLM) |
| - **Parameters:** 46,355,576 (46M) |
| - **Config:** 12 layers, 8 heads, model dim 512, feed-forward dim 2048, dropout=0.2 |
| - **Training:** Kaggle GPU, 1-year BTC data (1d + 4h + 1h = 11,315 candles) |
| - **Anti-overfitting:** label smoothing, early stopping, weight_decay=0.05 |
| |
| ## Benchmark Results |
| |
| ### 1. Live Prediction (BTC/USDT 1h) |
| ``` |
| Signal: BUY |
| Confidence: 15.9% |
| BUY: 67.1% | SELL: 31.5% | HOLD: 1.4% |
| ``` |
| |
| ### 2. Backtest Accuracy (30 windows, 1h candles) |
| | Metric | Value | |
| |---|---| |
| | Overall accuracy | **46.7%** (14/30) | |
| | BUY accuracy | 46.7% (14/30) | |
| | SELL accuracy | N/A (never predicted) | |
| | HOLD accuracy | N/A (never predicted) | |
| | Avg confidence | 18.0% | |
| |
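With only 30 windows, the 14/30 figure carries a lot of sampling noise. A minimal sketch of a Wilson score interval for that proportion follows; the function name `wilson_interval` and the choice of a 95% level (z = 1.96) are assumptions made for illustration, not part of the original benchmark script.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# 14 correct out of 30 windows, as reported in the table above.
low, high = wilson_interval(14, 30)
print(f"accuracy = {14 / 30:.1%}, 95% interval = [{low:.1%}, {high:.1%}]")
```

The interval spans roughly 30% to 64%, so it contains both the 33% uniform baseline and 50%. A larger backtest would be needed before reading much into the point estimate.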
| ### 3. Signal Distribution |
| | Signal | Count | % | |
| |---|---|---| |
| | BUY | 30 | 100% | |
| | SELL | 0 | 0% | |
| | HOLD | 0 | 0% | |
| |
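A distribution like the one above can be computed and flagged automatically. The sketch below is one possible way to do it; `signal_distribution`, `is_collapsed`, and the 0.95 threshold are hypothetical names and values chosen for illustration.

```python
from collections import Counter

SIGNALS = ("BUY", "SELL", "HOLD")


def signal_distribution(predictions):
    """Return {signal: (count, share)} for every known signal, including zeros."""
    counts = Counter(predictions)
    total = len(predictions)
    return {s: (counts[s], counts[s] / total if total else 0.0) for s in SIGNALS}


def is_collapsed(predictions, threshold=0.95):
    """True when a single signal makes up at least `threshold` of all predictions."""
    if not predictions:
        return False
    top_count = Counter(predictions).most_common(1)[0][1]
    return top_count / len(predictions) >= threshold


# The 30 backtest predictions reported in the table above were all BUY.
backtest_predictions = ["BUY"] * 30
dist = signal_distribution(backtest_predictions)
for signal, (count, share) in dist.items():
    print(f"{signal}: {count} ({share:.0%})")
print("collapsed:", is_collapsed(backtest_predictions))
```

Running a check like this after every backtest would surface a single-class collapse before the accuracy numbers are interpreted.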
| ### 4. Inference Speed (CPU) |
| | Metric | Value | |
| |---|---| |
| | Average | 4,531 ms | |
| | Min | 2,468 ms | |
| | Max | 10,696 ms | |
| |
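For reference, latency figures of this kind can be collected with a small timing harness. This is a sketch only: `measure_latency_ms` and `dummy_predict` are hypothetical stand-ins, and the real model call would replace `dummy_predict`. The warm-up pass is an assumption about good practice; a first call that includes model loading might explain a large gap between min and max, but the report does not say what caused it.

```python
import statistics
import time


def measure_latency_ms(predict, inputs, warmup=1):
    """Time `predict` on each input and return avg/min/max/median in milliseconds."""
    for x in inputs[:warmup]:
        predict(x)  # warm-up calls are executed but not timed
    timings = []
    for x in inputs:
        start = time.perf_counter()
        predict(x)
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "avg": statistics.mean(timings),
        "min": min(timings),
        "max": max(timings),
        "median": statistics.median(timings),
    }


def dummy_predict(x):
    """Hypothetical placeholder for the real model's inference call."""
    time.sleep(0.001)
    return "BUY"


stats = measure_latency_ms(dummy_predict, list(range(5)))
print({k: f"{v:.1f} ms" for k, v in stats.items()})
```

Reporting the median next to the average makes a single slow outlier easier to spot.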
| ### 5. Multi-Timeframe |
| | Timeframe | Signal | BUY% | SELL% | HOLD% | |
| |---|---|---|---|---| |
| | 1d | BUY | 85.4% | 14.6% | 0.0% | |
| | 4h | BUY | 81.5% | 18.5% | 0.0% | |
| | 1h | BUY | 67.1% | 31.5% | 1.4% | |
| |
| ### 6. Next Candle Prediction Quality |
| - High/Low consistency: **FAIL** (high < max(open, close)) |
| - Price direction: **WRONG** (predicted DOWN, actual UP) |
| |
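The high/low check above can be expressed as a small validity test, with a post-hoc repair that clamps the extremes. This is a sketch under assumptions: the `Candle` class, `is_consistent`, and `repair` are hypothetical names, the example prices are made up, and clamping after decoding is only one option (constraining the decoder itself is another).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candle:
    open: float
    high: float
    low: float
    close: float


def is_consistent(c: Candle) -> bool:
    """A valid candle has high >= max(open, close) and low <= min(open, close)."""
    return c.high >= max(c.open, c.close) and c.low <= min(c.open, c.close)


def repair(c: Candle) -> Candle:
    """Clamp high and low so they bracket open and close; open/close are unchanged."""
    return Candle(
        open=c.open,
        high=max(c.high, c.open, c.close),
        low=min(c.low, c.open, c.close),
        close=c.close,
    )


# Made-up example reproducing the failure mode: high is below max(open, close).
predicted = Candle(open=100.0, high=100.5, low=99.0, close=101.0)
print(is_consistent(predicted))
fixed = repair(predicted)
print(fixed, is_consistent(fixed))
```

Clamping guarantees a structurally valid candle but does not make the prediction more accurate; it only removes impossible outputs.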
| ## Analysis |
| |
| ### Strengths |
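| - The model emits directional signals rather than defaulting to HOLD, although every backtest signal was BUY |
| - 46.7% accuracy (14/30) is above the 33% uniform three-class baseline, but because every prediction was BUY it equals the accuracy of an always-BUY rule on the same windows, so it does not yet show predictive skill |
| - Reported confidence is low (18.0% on average), which is directionally consistent with the weak accuracy; no calibration metric was computed, so good calibration is not yet established |
| - Produces a signal on all three timeframes (1d, 4h, 1h); all three were BUY |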
| |
| ### Weaknesses |
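| - **BUY bias:** Predicted BUY in all 30 backtest windows and on all three timeframes (likely learned from bull market training data) |
| - **Next candle decoding:** The decoded candle can be structurally invalid (predicted high below max(open, close)), and the predicted direction was wrong (DOWN vs. actual UP) |
| - **No SELL signals:** SELL was never predicted, so the model can't profit from downtrends |
| - **Slow inference:** 4.5 s average on CPU (2.5 s to 10.7 s), too slow for real-time use; GPU latency is not reported here |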
| |
| ## Next Steps |
| 1. Add bear market data (2022 crash, corrections) |
| 2. Fix next candle decoder (constrain high >= max(O,C) and low <= min(O,C)) |
| 3. Add class balancing to training loss |
| 4. Train longer with lower learning rate |
| 5. Deploy on GPU for real-time inference |
| |
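Step 3 (class balancing) is often done with inverse-frequency class weights applied to the cross-entropy loss. The sketch below uses plain Python so it runs without any framework; the label counts are hypothetical, since this report does not give the training label distribution, and `inverse_frequency_weights` and `weighted_cross_entropy` are illustrative names. If training uses PyTorch, `torch.nn.CrossEntropyLoss` accepts a per-class `weight` tensor that plays the same role.

```python
import math
from collections import Counter

CLASSES = ("BUY", "SELL", "HOLD")


def inverse_frequency_weights(labels, classes):
    """w_c = N / (K * n_c); a class absent from the data gets weight 0.0 here."""
    counts = Counter(labels)
    n, k = len(labels), len(classes)
    return {c: (n / (k * counts[c]) if counts[c] else 0.0) for c in classes}


def weighted_cross_entropy(probs, target, weights):
    """Loss for one example: -w[target] * log(p[target])."""
    return -weights[target] * math.log(max(probs[target], 1e-12))


# Hypothetical, BUY-heavy training labels (NOT taken from the report).
train_labels = ["BUY"] * 70 + ["SELL"] * 20 + ["HOLD"] * 10
weights = inverse_frequency_weights(train_labels, CLASSES)
print({c: round(w, 3) for c, w in weights.items()})

# The same predicted distribution is penalised more when the true class is rare.
probs = {"BUY": 0.671, "SELL": 0.315, "HOLD": 0.014}
loss_if_buy = weighted_cross_entropy(probs, "BUY", weights)
loss_if_sell = weighted_cross_entropy(probs, "SELL", weights)
print(f"loss if true=BUY: {loss_if_buy:.3f}, loss if true=SELL: {loss_if_sell:.3f}")
```

With this weighting, each class contributes the same total weight to the loss regardless of how many examples it has, which removes the incentive to predict the majority class by default.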