====== Perplexity statistics ======
Mean PPL(Q)                   :   8.674113 ±   0.067361
Mean PPL(base)                :   8.445938 ±   0.065177
Cor(ln(PPL(Q)), ln(PPL(base))):  98.92%
Mean ln(PPL(Q)/PPL(base))     :   0.026657 ±   0.001137
Mean PPL(Q)/PPL(base)         :   1.027016 ±   0.001168
Mean PPL(Q)-PPL(base)         :   0.228175 ±   0.009969

====== KL divergence statistics ======
Mean    KLD:   0.043268 ±   0.000351
Maximum KLD:  18.135883
99.9%   KLD:   1.491586
99.0%   KLD:   0.417118
Median  KLD:   0.016406
10.0%   KLD:   0.000034
 5.0%   KLD:   0.000004
 1.0%   KLD:  -0.000000
Minimum KLD:  -0.000026

====== Token probability statistics ======
Mean    Δp: -0.672 ± 0.017 %
Maximum Δp:  98.835%
99.9%   Δp:  40.761%
99.0%   Δp:  17.002%
95.0%   Δp:   6.983%
90.0%   Δp:   3.441%
75.0%   Δp:   0.273%
Median  Δp:  -0.003%
25.0%   Δp:  -1.108%
10.0%   Δp:  -5.679%
 5.0%   Δp: -10.036%
 1.0%   Δp: -23.278%
 0.1%   Δp: -54.994%
Minimum Δp: -99.427%
RMS Δp    :  6.596 ± 0.049 %
Same top p: 91.212 ± 0.073 %
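
The three derived perplexity figures above can be cross-checked against the two mean PPL values. The following is a minimal sketch of that arithmetic only, not the tool's own implementation; the variable names are hypothetical and the constants are copied from the report. It assumes the reported ratio is the exponential of the mean log ratio, which the numbers appear to bear out.

```python
import math

# Values copied from the perplexity report above.
ppl_q = 8.674113           # Mean PPL(Q)
ppl_base = 8.445938        # Mean PPL(base)
mean_log_ratio = 0.026657  # Mean ln(PPL(Q)/PPL(base))

# Assumption: the reported ratio is exp(mean log ratio).
ratio_from_log = math.exp(mean_log_ratio)

# Direct quotient and difference of the two mean perplexities.
ratio_direct = ppl_q / ppl_base
ppl_diff = ppl_q - ppl_base

print(f"exp(mean ln ratio): {ratio_from_log:.6f}")
print(f"PPL(Q)/PPL(base)  : {ratio_direct:.6f}")
print(f"PPL(Q)-PPL(base)  : {ppl_diff:.6f}")
```

Both ratio computations should land close to the reported 1.027016, and the difference close to the reported 0.228175, up to the rounding of the printed inputs.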