====== Perplexity statistics ======
Mean PPL(Q)                   :   9.058327 ±   0.069783
Mean PPL(base)                :   8.445938 ±   0.065177
Cor(ln(PPL(Q)), ln(PPL(base))):  97.14%
Mean ln(PPL(Q)/PPL(base))     :   0.069999 ±   0.001845
Mean PPL(Q)/PPL(base)         :   1.072507 ±   0.001979
Mean PPL(Q)-PPL(base)         :   0.612389 ±   0.016784

====== KL divergence statistics ======
Mean    KLD:   0.127036 ±   0.000807
Maximum KLD:  23.618629
99.9%   KLD:   3.421933
99.0%   KLD:   1.181114
99.0%   KLD:   1.181114
Median  KLD:   0.052644
10.0%   KLD:   0.000185
 5.0%   KLD:   0.000024
 1.0%   KLD:   0.000000
Minimum KLD:  -0.000005

====== Token probability statistics ======
Mean    Δp: -2.079 ± 0.028 %
Maximum Δp:  99.099%
99.9%   Δp:  55.973%
99.0%   Δp:  25.713%
95.0%   Δp:  10.217%
90.0%   Δp:   4.657%
75.0%   Δp:   0.194%
Median  Δp:  -0.050%
25.0%   Δp:  -2.870%
10.0%   Δp: -11.859%
 5.0%   Δp: -20.061%
 1.0%   Δp: -44.003%
 0.1%   Δp: -82.841%
Minimum Δp: -99.876%
RMS Δp    :  11.057 ± 0.060 %
Same top p:  85.550 ± 0.091 %
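
As a quick sanity check, the three derived perplexity figures above can be recomputed from the two reported means. The sketch below is only an arithmetic cross-check: the variable names are illustrative, and it assumes the ratio, log-ratio, and difference lines are all derived from the same pair of mean perplexities, which the reported values appear consistent with.

```python
import math

# Mean perplexities copied from the "Perplexity statistics" block above.
ppl_q = 9.058327     # quantized model, Mean PPL(Q)
ppl_base = 8.445938  # reference model, Mean PPL(base)

# Derived quantities, recomputed from the two means (illustrative names).
ratio = ppl_q / ppl_base      # compare with Mean PPL(Q)/PPL(base)
log_ratio = math.log(ratio)   # compare with Mean ln(PPL(Q)/PPL(base))
diff = ppl_q - ppl_base       # compare with Mean PPL(Q)-PPL(base)

print(f"PPL(Q)/PPL(base)     = {ratio:.6f}")
print(f"ln(PPL(Q)/PPL(base)) = {log_ratio:.6f}")
print(f"PPL(Q)-PPL(base)     = {diff:.6f}")
```

The recomputed values agree with the reported 1.072507, 0.069999, and 0.612389 to within rounding, that is, a perplexity increase of roughly 7.25% for the quantized model relative to the base.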