6.27.256.817 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 44901, pos_max = 44901, n_tokens = 44902, n_swa = 0, pos_next = 8192, size = 400.694 MiB) 6.27.278.396 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 46457, pos_max = 46457, n_tokens = 46458, n_swa = 0, pos_next = 8192, size = 406.802 MiB) 6.27.302.163 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 46969, pos_max = 46969, n_tokens = 46970, n_swa = 0, pos_next = 8192, size = 408.812 MiB) 6.27.330.299 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 47157, pos_max = 47157, n_tokens = 47158, n_swa = 0, pos_next = 8192, size = 409.550 MiB) 6.27.354.274 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 47669, pos_max = 47669, n_tokens = 47670, n_swa = 0, pos_next = 8192, size = 411.559 MiB) 6.27.377.591 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 49717, pos_max = 49717, n_tokens = 49718, n_swa = 0, pos_next = 8192, size = 419.598 MiB) 6.27.400.627 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 50234, pos_max = 50234, n_tokens = 50235, n_swa = 0, pos_next = 8192, size = 421.628 MiB) 6.27.423.357 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 50367, pos_max = 50367, n_tokens = 50368, n_swa = 0, pos_next = 8192, size = 422.150 MiB) 6.27.449.370 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 50879, pos_max = 50879, n_tokens = 50880, n_swa = 0, pos_next = 8192, size = 424.160 MiB) 6.27.472.091 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 51160, pos_max = 51160, n_tokens = 51161, n_swa = 0, pos_next = 8192, size = 425.263 MiB) 6.27.495.162 W slot update_slots: id 0 | task 2360 | erased 
invalidated context checkpoint (pos_min = 51672, pos_max = 51672, n_tokens = 51673, n_swa = 0, pos_next = 8192, size = 427.272 MiB) 6.27.517.744 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 51846, pos_max = 51846, n_tokens = 51847, n_swa = 0, pos_next = 8192, size = 427.955 MiB) 6.27.540.368 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 52180, pos_max = 52180, n_tokens = 52181, n_swa = 0, pos_next = 8192, size = 429.267 MiB) 6.27.563.131 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 52327, pos_max = 52327, n_tokens = 52328, n_swa = 0, pos_next = 8192, size = 429.844 MiB) 6.27.586.693 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 52402, pos_max = 52402, n_tokens = 52403, n_swa = 0, pos_next = 8192, size = 430.138 MiB) 6.27.609.312 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 52914, pos_max = 52914, n_tokens = 52915, n_swa = 0, pos_next = 8192, size = 432.148 MiB) 6.27.633.135 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 53066, pos_max = 53066, n_tokens = 53067, n_swa = 0, pos_next = 8192, size = 432.744 MiB) 6.27.655.831 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 53358, pos_max = 53358, n_tokens = 53359, n_swa = 0, pos_next = 8192, size = 433.891 MiB) 6.27.678.729 W slot update_slots: id 0 | task 2360 | erased invalidated context checkpoint (pos_min = 53870, pos_max = 53870, n_tokens = 53871, n_swa = 0, pos_next = 8192, size = 435.900 MiB) 6.28.779.608 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 8192, creating new checkpoint during processing at position 12288 6.28.955.282 I slot create_check: id 0 | task 2360 | created context checkpoint 5 of 256 (pos_min = 10239, pos_max = 10239, n_tokens = 10240, size = 264.635 MiB) 6.30.033.921 
I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 4096, progress = 0.23, t = 3.45 s / 1188.43 tokens per second 6.30.034.221 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 10240, creating new checkpoint during processing at position 14336 6.30.209.504 I slot create_check: id 0 | task 2360 | created context checkpoint 6 of 256 (pos_min = 12287, pos_max = 12287, n_tokens = 12288, size = 272.674 MiB) 6.31.293.512 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 6144, progress = 0.26, t = 4.71 s / 1305.52 tokens per second 6.31.293.805 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 12288, creating new checkpoint during processing at position 16384 6.31.470.949 I slot create_check: id 0 | task 2360 | created context checkpoint 7 of 256 (pos_min = 14335, pos_max = 14335, n_tokens = 14336, size = 280.713 MiB) 6.32.558.411 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 8192, progress = 0.30, t = 5.97 s / 1371.95 tokens per second 6.32.558.709 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 14336, creating new checkpoint during processing at position 18432 6.32.738.202 I slot create_check: id 0 | task 2360 | created context checkpoint 8 of 256 (pos_min = 16383, pos_max = 16383, n_tokens = 16384, size = 288.752 MiB) 6.33.844.228 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 10240, progress = 0.34, t = 7.26 s / 1411.08 tokens per second 6.33.844.535 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 16384, creating new checkpoint during processing at position 20480 6.34.029.803 I slot create_check: id 0 | task 2360 | created context checkpoint 9 of 256 (pos_min = 18431, pos_max = 18431, n_tokens = 18432, size = 296.791 MiB) 6.35.151.596 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 12288, progress = 0.38, t = 8.56 s / 1434.80 tokens per second 
6.35.151.905 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 18432, creating new checkpoint during processing at position 22528 6.35.339.671 I slot create_check: id 0 | task 2360 | created context checkpoint 10 of 256 (pos_min = 20479, pos_max = 20479, n_tokens = 20480, size = 304.830 MiB) 6.36.478.507 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 14336, progress = 0.42, t = 9.89 s / 1449.38 tokens per second 6.36.478.802 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 20480, creating new checkpoint during processing at position 24576 6.36.682.031 I slot create_check: id 0 | task 2360 | created context checkpoint 11 of 256 (pos_min = 22527, pos_max = 22527, n_tokens = 22528, size = 312.869 MiB) 6.37.852.478 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 16384, progress = 0.45, t = 11.27 s / 1454.40 tokens per second 6.37.852.782 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 22528, creating new checkpoint during processing at position 26624 6.38.319.572 I slot create_check: id 0 | task 2360 | created context checkpoint 12 of 256 (pos_min = 24575, pos_max = 24575, n_tokens = 24576, size = 320.908 MiB) 6.39.488.954 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 18432, progress = 0.49, t = 12.90 s / 1428.66 tokens per second 6.39.489.260 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 24576, creating new checkpoint during processing at position 28672 6.39.980.303 I slot create_check: id 0 | task 2360 | created context checkpoint 13 of 256 (pos_min = 26623, pos_max = 26623, n_tokens = 26624, size = 328.947 MiB) 6.41.166.214 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 20480, progress = 0.53, t = 14.58 s / 1404.77 tokens per second 6.41.166.511 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 26624, creating new checkpoint during 
processing at position 30720 6.41.823.110 I slot create_check: id 0 | task 2360 | created context checkpoint 14 of 256 (pos_min = 28671, pos_max = 28671, n_tokens = 28672, size = 336.986 MiB) 6.43.022.905 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 22528, progress = 0.57, t = 16.44 s / 1370.69 tokens per second 6.43.023.207 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 28672, creating new checkpoint during processing at position 32768 6.44.023.582 I slot create_check: id 0 | task 2360 | created context checkpoint 15 of 256 (pos_min = 30719, pos_max = 30719, n_tokens = 30720, size = 345.025 MiB) 6.45.247.082 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 24576, progress = 0.60, t = 18.66 s / 1317.06 tokens per second 6.45.247.382 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 30720, creating new checkpoint during processing at position 34816 6.45.643.268 I slot create_check: id 0 | task 2360 | created context checkpoint 16 of 256 (pos_min = 32767, pos_max = 32767, n_tokens = 32768, size = 353.064 MiB) 6.46.894.154 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 26624, progress = 0.64, t = 20.31 s / 1311.09 tokens per second 6.46.894.457 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 32768, creating new checkpoint during processing at position 36864 6.47.210.819 I slot create_check: id 0 | task 2360 | created context checkpoint 17 of 256 (pos_min = 34815, pos_max = 34815, n_tokens = 34816, size = 361.103 MiB) 6.48.487.874 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 28672, progress = 0.68, t = 21.90 s / 1309.19 tokens per second 6.48.488.168 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 34816, creating new checkpoint during processing at position 38912 6.48.845.422 I slot create_check: id 0 | task 2360 | created context checkpoint 18 of 256 (pos_min = 
36863, pos_max = 36863, n_tokens = 36864, size = 369.142 MiB) 6.50.143.690 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 30720, progress = 0.72, t = 23.56 s / 1304.11 tokens per second 6.50.144.049 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 36864, creating new checkpoint during processing at position 40960 6.50.583.326 I slot create_check: id 0 | task 2360 | created context checkpoint 19 of 256 (pos_min = 38911, pos_max = 38911, n_tokens = 38912, size = 377.181 MiB) 6.51.902.633 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 32768, progress = 0.76, t = 25.32 s / 1294.40 tokens per second 6.51.902.926 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 38912, creating new checkpoint during processing at position 43008 6.52.290.863 I slot create_check: id 0 | task 2360 | created context checkpoint 20 of 256 (pos_min = 40959, pos_max = 40959, n_tokens = 40960, size = 385.220 MiB) 6.53.629.044 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 34816, progress = 0.79, t = 27.04 s / 1287.49 tokens per second 6.53.629.348 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 40960, creating new checkpoint during processing at position 45056 6.54.010.773 I slot create_check: id 0 | task 2360 | created context checkpoint 21 of 256 (pos_min = 43007, pos_max = 43007, n_tokens = 43008, size = 393.260 MiB) 6.55.369.629 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 36864, progress = 0.83, t = 28.78 s / 1280.79 tokens per second 6.55.369.934 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 43008, creating new checkpoint during processing at position 47104 6.55.749.645 I slot create_check: id 0 | task 2360 | created context checkpoint 22 of 256 (pos_min = 45055, pos_max = 45055, n_tokens = 45056, size = 401.299 MiB) 6.57.129.794 I slot print_timing: id 0 | task 2360 | prompt 
processing, n_tokens = 38912, progress = 0.87, t = 30.54 s / 1274.03 tokens per second 6.57.130.094 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 45056, creating new checkpoint during processing at position 49152 6.57.514.113 I slot create_check: id 0 | task 2360 | created context checkpoint 23 of 256 (pos_min = 47103, pos_max = 47103, n_tokens = 47104, size = 409.338 MiB) 6.58.912.478 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 40960, progress = 0.91, t = 32.33 s / 1267.13 tokens per second 6.58.912.769 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 47104, creating new checkpoint during processing at position 51200 6.59.306.772 I slot create_check: id 0 | task 2360 | created context checkpoint 24 of 256 (pos_min = 49151, pos_max = 49151, n_tokens = 49152, size = 417.377 MiB) 7.00.724.416 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 43008, progress = 0.94, t = 34.14 s / 1259.86 tokens per second 7.00.724.713 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 49152, creating new checkpoint during processing at position 53248 7.01.131.159 I slot create_check: id 0 | task 2360 | created context checkpoint 25 of 256 (pos_min = 51199, pos_max = 51199, n_tokens = 51200, size = 425.416 MiB) 7.02.566.785 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 45056, progress = 0.98, t = 35.98 s / 1252.27 tokens per second 7.02.567.050 I slot update_slots: id 0 | task 2360 | 1024 tokens since last checkpoint at 51200, creating new checkpoint during processing at position 53728 7.02.980.992 I slot create_check: id 0 | task 2360 | created context checkpoint 26 of 256 (pos_min = 53247, pos_max = 53247, n_tokens = 53248, size = 433.455 MiB) 7.03.340.273 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 45536, progress = 0.99, t = 36.75 s / 1238.98 tokens per second 7.03.739.218 I slot create_check: id 0 | 
task 2360 | created context checkpoint 27 of 256 (pos_min = 53727, pos_max = 53727, n_tokens = 53728, size = 435.339 MiB) 7.04.112.269 I slot print_timing: id 0 | task 2360 | prompt processing, n_tokens = 46048, progress = 1.00, t = 37.52 s / 1227.13 tokens per second 7.04.525.030 I slot create_check: id 0 | task 2360 | created context checkpoint 28 of 256 (pos_min = 54239, pos_max = 54239, n_tokens = 54240, size = 437.349 MiB) 7.04.890.536 I reasoning-budget: deactivated (natural end) 7.06.342.988 I slot print_timing: id 0 | task 2360 | n_decoded = 100, tg = 56.36 t/s 7.06.547.400 I slot print_timing: id 0 | task 2360 | prompt eval time = 37981.12 ms / 46052 tokens ( 0.82 ms per token, 1212.50 tokens per second) 7.06.547.418 I slot print_timing: id 0 | task 2360 | eval time = 1978.69 ms / 112 tokens ( 17.67 ms per token, 56.60 tokens per second) 7.06.547.419 I slot print_timing: id 0 | task 2360 | total time = 39959.81 ms / 46164 tokens 7.06.547.419 I slot print_timing: id 0 | task 2360 | graphs reused = 2196 7.06.547.420 I slot print_timing: id 0 | task 2360 | draft acceptance = 0.94737 ( 54 accepted / 57 generated) 7.06.547.426 I statistics draft-mtp: #calls(b,g,a) = 37 2249 2249, #gen drafts = 2249, #acc drafts = 2098, #gen tokens = 2249, #acc tokens = 2098, dur(b,g,a) = 0.025, 5623.534, 1.575 ms 7.06.548.792 I slot release: id 0 | task 2360 | stop processing: n_tokens = 54355, truncated = 0 7.06.548.857 I srv update_slots: all slots are idle 7.07.730.044 I srv params_from_: Chat format: peg-native 7.07.731.745 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 7.07.732.172 I reasoning-budget: activated, budget=2147483647 tokens 7.07.732.229 I slot launch_slot_: id 0 | task 2443 | processing task, is_child = 0 7.07.732.252 W slot update_slots: id 0 | task 2443 | n_past = 54355, slot.prompt.tokens.size() = 54355, seq_id = 0, pos_min = 54354, n_swa = 0 7.07.732.253 I slot update_slots: id 0 | 
task 2443 | Checking checkpoint with [54239, 54239] against 54354... 7.07.777.488 W slot update_slots: id 0 | task 2443 | restored context checkpoint (pos_min = 54239, pos_max = 54239, n_tokens = 54240, n_past = 54240, size = 437.349 MiB) 7.08.320.601 I slot create_check: id 0 | task 2443 | created context checkpoint 29 of 256 (pos_min = 54381, pos_max = 54381, n_tokens = 54382, size = 437.906 MiB) 7.08.516.209 I reasoning-budget: deactivated (natural end) 7.10.092.789 I slot print_timing: id 0 | task 2443 | prompt eval time = 632.31 ms / 146 tokens ( 4.33 ms per token, 230.90 tokens per second) 7.10.092.807 I slot print_timing: id 0 | task 2443 | eval time = 1727.99 ms / 99 tokens ( 17.45 ms per token, 57.29 tokens per second) 7.10.092.807 I slot print_timing: id 0 | task 2443 | total time = 2360.30 ms / 245 tokens 7.10.092.808 I slot print_timing: id 0 | task 2443 | graphs reused = 2245 7.10.092.809 I slot print_timing: id 0 | task 2443 | draft acceptance = 0.98000 ( 49 accepted / 50 generated) 7.10.092.816 I statistics draft-mtp: #calls(b,g,a) = 38 2299 2299, #gen drafts = 2299, #acc drafts = 2147, #gen tokens = 2299, #acc tokens = 2147, dur(b,g,a) = 0.025, 5751.470, 1.613 ms 7.10.094.169 I slot release: id 0 | task 2443 | stop processing: n_tokens = 54485, truncated = 0 7.10.094.236 I srv update_slots: all slots are idle 7.13.483.089 I srv params_from_: Chat format: peg-native 7.13.484.728 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 7.13.485.141 I reasoning-budget: activated, budget=2147483647 tokens 7.13.485.203 I slot launch_slot_: id 0 | task 2496 | processing task, is_child = 0 7.13.485.242 W slot update_slots: id 0 | task 2496 | n_past = 54485, slot.prompt.tokens.size() = 54485, seq_id = 0, pos_min = 54484, n_swa = 0 7.13.485.242 I slot update_slots: id 0 | task 2496 | Checking checkpoint with [54381, 54381] against 54484... 
7.13.530.952 W slot update_slots: id 0 | task 2496 | restored context checkpoint (pos_min = 54381, pos_max = 54381, n_tokens = 54382, n_past = 54382, size = 437.906 MiB) 7.14.059.948 I slot create_check: id 0 | task 2496 | created context checkpoint 30 of 256 (pos_min = 54500, pos_max = 54500, n_tokens = 54501, size = 438.373 MiB) 7.14.323.182 I reasoning-budget: deactivated (natural end) 7.15.881.748 I slot print_timing: id 0 | task 2496 | n_decoded = 101, tg = 56.81 t/s 7.15.985.410 I slot print_timing: id 0 | task 2496 | prompt eval time = 618.28 ms / 123 tokens ( 5.03 ms per token, 198.94 tokens per second) 7.15.985.431 I slot print_timing: id 0 | task 2496 | eval time = 1881.64 ms / 106 tokens ( 17.75 ms per token, 56.33 tokens per second) 7.15.985.431 I slot print_timing: id 0 | task 2496 | total time = 2499.92 ms / 229 tokens 
7.15.985.443 I slot print_timing: id 0 | task 2496 | graphs reused = 2297 7.15.985.448 I slot print_timing: id 0 | task 2496 | draft acceptance = 0.96296 ( 52 accepted / 54 generated) 7.15.985.457 I statistics draft-mtp: #calls(b,g,a) = 39 2353 2353, #gen drafts = 2353, #acc drafts = 2199, #gen tokens = 2353, #acc tokens = 2199, dur(b,g,a) = 0.025, 5890.695, 1.643 ms 7.15.986.840 I slot release: id 0 | task 2496 | stop processing: n_tokens = 54611, truncated = 0 7.15.986.916 I srv update_slots: all slots are idle 7.17.156.087 I srv params_from_: Chat format: peg-native 7.17.157.871 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 7.17.158.370 I reasoning-budget: activated, budget=2147483647 tokens 7.17.158.446 I slot launch_slot_: id 0 | task 2553 | 
processing task, is_child = 0 7.17.158.489 W slot update_slots: id 0 | task 2553 | n_past = 54611, slot.prompt.tokens.size() = 54611, seq_id = 0, pos_min = 54610, n_swa = 0 7.17.158.490 I slot update_slots: id 0 | task 2553 | Checking checkpoint with [54500, 54500] against 54610... 7.17.212.562 W slot update_slots: id 0 | task 2553 | restored context checkpoint (pos_min = 54500, pos_max = 54500, n_tokens = 54501, n_past = 54501, size = 438.373 MiB) 7.17.777.847 I slot create_check: id 0 | task 2553 | created context checkpoint 31 of 256 (pos_min = 54636, pos_max = 54636, n_tokens = 54637, size = 438.907 MiB) 7.17.972.847 I reasoning-budget: deactivated (natural end) 7.19.581.339 I slot print_timing: id 0 | task 2553 | n_decoded = 100, tg = 56.83 t/s 7.19.683.447 I slot print_timing: id 0 | task 2553 | prompt eval time = 662.91 ms / 140 tokens ( 4.74 ms per token, 211.19 tokens per second) 7.19.683.464 I slot print_timing: id 0 | task 2553 | eval time = 1861.81 ms / 106 tokens ( 17.56 ms per token, 56.93 tokens per second) 7.19.683.465 I slot print_timing: id 0 | task 2553 | total time = 2524.72 ms / 246 tokens 7.19.683.466 I slot print_timing: id 0 | task 2553 | graphs reused = 2350 7.19.683.466 I slot print_timing: id 0 | task 2553 | draft acceptance = 0.94444 ( 51 accepted / 54 generated) 7.19.683.474 I statistics draft-mtp: #calls(b,g,a) = 40 2407 2407, #gen drafts = 2407, #acc drafts = 2250, #gen tokens = 2407, #acc tokens = 2250, dur(b,g,a) = 0.026, 6029.216, 1.676 ms 7.19.684.793 I slot release: id 0 | task 2553 | stop processing: n_tokens = 54746, truncated = 0 7.19.684.874 I srv update_slots: all slots are idle 7.23.723.150 I srv params_from_: Chat format: peg-native 7.23.724.797 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 7.23.725.233 I reasoning-budget: activated, budget=2147483647 tokens 7.23.725.292 I slot launch_slot_: id 0 | task 2610 | processing task, is_child = 0 
7.23.725.331 W slot update_slots: id 0 | task 2610 | n_past = 54746, slot.prompt.tokens.size() = 54746, seq_id = 0, pos_min = 54745, n_swa = 0 7.23.725.331 I slot update_slots: id 0 | task 2610 | Checking checkpoint with [54636, 54636] against 54745... 7.23.771.462 W slot update_slots: id 0 | task 2610 | restored context checkpoint (pos_min = 54636, pos_max = 54636, n_tokens = 54637, n_past = 54637, size = 438.907 MiB) 7.24.338.737 I slot create_check: id 0 | task 2610 | created context checkpoint 32 of 256 (pos_min = 54769, pos_max = 54769, n_tokens = 54770, size = 439.429 MiB) 7.24.568.000 I reasoning-budget: deactivated (natural end) 7.26.127.635 I slot print_timing: id 0 | task 2610 | n_decoded = 101, tg = 57.87 t/s 7.26.229.985 I slot print_timing: id 0 | task 2610 | prompt eval time = 656.63 ms / 137 tokens ( 4.79 ms per token, 208.64 tokens per second) 7.26.230.003 I slot print_timing: id 0 | task 2610 | eval time = 1847.78 ms / 106 tokens ( 17.43 ms per token, 57.37 tokens per second) 7.26.230.004 I slot print_timing: id 0 | task 2610 | total time = 2504.41 ms / 243 tokens 7.26.230.004 I slot print_timing: id 0 | task 2610 | graphs reused = 2401 7.26.230.005 I slot print_timing: id 0 | task 2610 | draft acceptance = 1.00000 ( 53 accepted / 53 generated) 7.26.230.012 I statistics draft-mtp: #calls(b,g,a) = 41 2460 2460, #gen drafts = 2460, #acc drafts = 2303, #gen tokens = 2460, #acc tokens = 2303, dur(b,g,a) = 0.026, 6164.584, 1.712 ms 7.26.231.332 I slot release: id 0 | task 2610 | stop processing: n_tokens = 54880, truncated = 0 7.26.231.404 I srv update_slots: all slots are idle 7.27.833.748 I srv params_from_: Chat format: peg-native 7.27.835.402 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 7.27.835.840 I reasoning-budget: activated, budget=2147483647 tokens 7.27.835.897 I slot launch_slot_: id 0 | task 2666 | processing task, is_child = 0 7.27.835.922 W slot update_slots: id 0 | 
task 2666 | n_past = 54880, slot.prompt.tokens.size() = 54880, seq_id = 0, pos_min = 54879, n_swa = 0 7.27.835.923 I slot update_slots: id 0 | task 2666 | Checking checkpoint with [54769, 54769] against 54879... 7.27.881.704 W slot update_slots: id 0 | task 2666 | restored context checkpoint (pos_min = 54769, pos_max = 54769, n_tokens = 54770, n_past = 54770, size = 439.429 MiB) 7.28.452.089 I slot create_check: id 0 | task 2666 | created context checkpoint 33 of 256 (pos_min = 54905, pos_max = 54905, n_tokens = 54906, size = 439.963 MiB) 7.28.647.282 I reasoning-budget: deactivated (natural end) 7.30.227.975 I slot print_timing: id 0 | task 2666 | n_decoded = 101, tg = 58.32 t/s 7.30.296.464 I slot print_timing: id 0 | task 2666 | prompt eval time = 659.88 ms / 140 tokens ( 4.71 ms per token, 212.16 tokens per second) 7.30.296.481 I slot print_timing: id 0 | task 2666 | eval time = 1800.42 ms / 104 tokens ( 17.31 ms per token, 57.76 tokens per second) 7.30.296.482 I slot print_timing: id 0 | task 2666 | total time = 2460.30 ms / 244 tokens 7.30.296.483 I slot print_timing: id 0 | task 2666 | graphs reused = 2452 7.30.296.483 I slot print_timing: id 0 | task 2666 | draft acceptance = 1.00000 ( 52 accepted / 52 generated) 7.30.296.490 I statistics draft-mtp: #calls(b,g,a) = 42 2512 2512, #gen drafts = 2512, #acc drafts = 2355, #gen tokens = 2512, #acc tokens = 2355, dur(b,g,a) = 0.027, 6297.723, 1.745 ms 7.30.297.811 I slot release: id 0 | task 2666 | stop processing: n_tokens = 55014, truncated = 0 7.30.297.887 I srv update_slots: all slots are idle 7.33.946.660 I srv params_from_: Chat format: peg-native 7.33.948.331 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 7.33.948.741 I reasoning-budget: activated, budget=2147483647 tokens 7.33.948.797 I slot launch_slot_: id 0 | task 2721 | processing task, is_child = 0 7.33.948.823 W slot update_slots: id 0 | task 2721 | n_past = 55014, 
slot.prompt.tokens.size() = 55014, seq_id = 0, pos_min = 55013, n_swa = 0 7.33.948.823 I slot update_slots: id 0 | task 2721 | Checking checkpoint with [54905, 54905] against 55013... 7.33.994.578 W slot update_slots: id 0 | task 2721 | restored context checkpoint (pos_min = 54905, pos_max = 54905, n_tokens = 54906, n_past = 54906, size = 439.963 MiB) 7.34.568.471 I slot create_check: id 0 | task 2721 | created context checkpoint 34 of 256 (pos_min = 55034, pos_max = 55034, n_tokens = 55035, size = 440.469 MiB) 7.34.866.310 I reasoning-budget: deactivated (natural end) 7.36.340.524 I slot print_timing: id 0 | task 2721 | n_decoded = 100, tg = 57.86 t/s 7.36.511.918 I slot print_timing: id 0 | task 2721 | prompt eval time = 663.08 ms / 133 tokens ( 4.99 ms per token, 200.58 tokens per second) 7.36.511.936 I slot print_timing: id 0 | task 2721 | eval time = 1899.74 ms / 109 tokens ( 17.43 ms per token, 57.38 tokens per second) 7.36.511.937 I slot print_timing: id 0 | task 2721 | total time = 2562.83 ms / 242 tokens 7.36.511.937 I slot print_timing: id 0 | task 2721 | graphs reused = 2506 7.36.511.938 I slot print_timing: id 0 | task 2721 | draft acceptance = 0.98182 ( 54 accepted / 55 generated) 7.36.511.945 I statistics draft-mtp: #calls(b,g,a) = 43 2567 2567, #gen drafts = 2567, #acc drafts = 2409, #gen tokens = 2567, #acc tokens = 2409, dur(b,g,a) = 0.027, 6438.214, 1.780 ms 7.36.513.321 I slot release: id 0 | task 2721 | stop processing: n_tokens = 55148, truncated = 0 7.36.513.389 I srv update_slots: all slots are idle 7.37.691.909 I srv params_from_: Chat format: peg-native 7.37.693.608 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 7.37.694.056 I reasoning-budget: activated, budget=2147483647 tokens 7.37.694.112 I slot launch_slot_: id 0 | task 2779 | processing task, is_child = 0 7.37.694.137 W slot update_slots: id 0 | task 2779 | n_past = 55148, slot.prompt.tokens.size() = 55148, 
seq_id = 0, pos_min = 55147, n_swa = 0 7.37.694.138 I slot update_slots: id 0 | task 2779 | Checking checkpoint with [55034, 55034] against 55147... 7.37.739.461 W slot update_slots: id 0 | task 2779 | restored context checkpoint (pos_min = 55034, pos_max = 55034, n_tokens = 55035, n_past = 55035, size = 440.469 MiB) 7.38.286.760 I slot create_check: id 0 | task 2779 | created context checkpoint 35 of 256 (pos_min = 55173, pos_max = 55173, n_tokens = 55174, size = 441.015 MiB) 7.38.653.964 I reasoning-budget: deactivated (natural end) 7.39.820.582 I slot print_timing: id 0 | task 2779 | prompt eval time = 636.16 ms / 143 tokens ( 4.45 ms per token, 224.79 tokens per second) 7.39.820.600 I slot print_timing: id 0 | task 2779 | eval time = 1490.04 ms / 85 tokens ( 17.53 ms per token, 57.05 tokens per second) 7.39.820.600 I slot print_timing: id 0 | task 2779 | total time = 2126.20 ms / 228 tokens 7.39.820.601 I slot print_timing: id 0 | task 2779 | graphs reused = 2548 7.39.820.602 I slot print_timing: id 0 | task 2779 | draft acceptance = 0.97674 ( 42 accepted / 43 generated) 7.39.820.609 I statistics draft-mtp: #calls(b,g,a) = 44 2610 2610, #gen drafts = 2610, #acc drafts = 2451, #gen tokens = 2610, #acc tokens = 2451, dur(b,g,a) = 0.028, 6548.202, 1.812 ms 7.39.822.006 I slot release: id 0 | task 2779 | stop processing: n_tokens = 55263, truncated = 0 7.39.822.099 I srv update_slots: all slots are idle 7.40.799.480 I srv params_from_: Chat format: peg-native 7.40.801.111 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.987 (> 0.100 thold), f_keep = 1.000 7.40.801.614 I reasoning-budget: activated, budget=2147483647 tokens 7.40.801.676 I slot launch_slot_: id 0 | task 2825 | processing task, is_child = 0 7.40.801.715 W slot update_slots: id 0 | task 2825 | n_past = 55263, slot.prompt.tokens.size() = 55263, seq_id = 0, pos_min = 55262, n_swa = 0 7.40.801.716 I slot update_slots: id 0 | task 2825 | Checking checkpoint with [55173, 
55173] against 55262... 7.40.848.512 W slot update_slots: id 0 | task 2825 | restored context checkpoint (pos_min = 55173, pos_max = 55173, n_tokens = 55174, n_past = 55174, size = 441.015 MiB) 7.41.501.049 I slot create_check: id 0 | task 2825 | created context checkpoint 36 of 256 (pos_min = 55453, pos_max = 55453, n_tokens = 55454, size = 442.114 MiB) 7.42.286.574 I slot create_check: id 0 | task 2825 | created context checkpoint 37 of 256 (pos_min = 55965, pos_max = 55965, n_tokens = 55966, size = 444.124 MiB) 7.43.233.350 I reasoning-budget: deactivated (natural end) 7.44.175.466 I slot print_timing: id 0 | task 2825 | n_decoded = 101, tg = 54.74 t/s 7.44.858.753 I slot print_timing: id 0 | task 2825 | prompt eval time = 1528.49 ms / 796 tokens ( 1.92 ms per token, 520.77 tokens per second) 7.44.858.772 I slot print_timing: id 0 | task 2825 | eval time = 2528.29 ms / 141 tokens ( 17.93 ms per token, 55.77 tokens per second) 7.44.858.773 I slot print_timing: id 0 | task 2825 | total time = 4056.78 ms / 937 tokens 7.44.858.773 I slot print_timing: id 0 | task 2825 | graphs reused = 2619 7.44.858.774 I slot print_timing: id 0 | task 2825 | draft acceptance = 0.91781 ( 67 accepted / 73 generated) 7.44.858.781 I statistics draft-mtp: #calls(b,g,a) = 45 2683 2683, #gen drafts = 2683, #acc drafts = 2518, #gen tokens = 2683, #acc tokens = 2518, dur(b,g,a) = 0.028, 6735.342, 1.867 ms 7.44.860.208 I slot release: id 0 | task 2825 | stop processing: n_tokens = 56110, truncated = 0 7.44.860.278 I srv update_slots: all slots are idle 7.46.088.771 I srv params_from_: Chat format: peg-native 7.46.090.599 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 7.46.091.033 I reasoning-budget: activated, budget=2147483647 tokens 7.46.091.092 I slot launch_slot_: id 0 | task 2902 | processing task, is_child = 0 7.46.091.115 W slot 
update_slots: id 0 | task 2902 | n_past = 56110, slot.prompt.tokens.size() = 56110, seq_id = 0, pos_min = 56109, n_swa = 0 7.46.091.116 I slot update_slots: id 0 | task 2902 | Checking checkpoint with [55965, 55965] against 56109... 7.46.136.873 W slot update_slots: id 0 | task 2902 | restored context checkpoint (pos_min = 55965, pos_max = 55965, n_tokens = 55966, n_past = 55966, size = 444.124 MiB) 7.46.709.365 I slot create_check: id 0 | task 2902 | created context checkpoint 38 of 256 (pos_min = 56136, pos_max = 56136, n_tokens = 56137, size = 444.795 MiB) 7.47.074.801 I reasoning-budget: deactivated (natural end) 7.48.241.885 I slot print_timing: id 0 | task 2902 | prompt eval time = 661.77 ms / 175 tokens ( 3.78 ms per token, 264.44 tokens per second) 7.48.241.904 I slot print_timing: id 0 | task 2902 | eval time = 1488.72 ms / 85 tokens ( 17.51 ms per token, 57.10 tokens per second) 7.48.241.905 I slot print_timing: id 0 | task 2902 | total time = 2150.49 ms / 260 tokens 7.48.241.906 I slot print_timing: id 0 | task 2902 | graphs reused = 2661 7.48.241.907 I slot print_timing: id 0 | task 2902 | draft acceptance = 0.97674 ( 42 accepted / 43 generated) 7.48.241.914 I statistics draft-mtp: #calls(b,g,a) = 46 2726 2726, #gen drafts = 2726, #acc drafts = 2560, #gen tokens = 2726, #acc tokens = 2560, dur(b,g,a) = 0.028, 6846.039, 1.891 ms 7.48.243.306 I slot release: id 0 | task 2902 | stop processing: n_tokens = 56226, truncated = 0 7.48.243.377 I srv update_slots: all slots are idle 7.49.171.917 I srv params_from_: Chat format: peg-native 7.49.173.577 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.988 (> 0.100 thold), f_keep = 1.000 7.49.174.052 I reasoning-budget: activated, budget=2147483647 tokens 7.49.174.110 I slot launch_slot_: id 0 | task 2948 | processing task, is_child = 0 7.49.174.148 W slot update_slots: id 0 | task 2948 | n_past = 56226, slot.prompt.tokens.size() = 56226, seq_id = 0, pos_min = 56225, n_swa = 0 
7.49.174.149 I slot update_slots: id 0 | task 2948 | Checking checkpoint with [56136, 56136] against 56225... 7.49.221.090 W slot update_slots: id 0 | task 2948 | restored context checkpoint (pos_min = 56136, pos_max = 56136, n_tokens = 56137, n_past = 56137, size = 444.795 MiB) 7.49.878.888 I slot create_check: id 0 | task 2948 | created context checkpoint 39 of 256 (pos_min = 56416, pos_max = 56416, n_tokens = 56417, size = 445.894 MiB) 7.50.671.278 I slot create_check: id 0 | task 2948 | created context checkpoint 40 of 256 (pos_min = 56928, pos_max = 56928, n_tokens = 56929, size = 447.904 MiB) 7.51.654.568 I reasoning-budget: deactivated (natural end) 7.52.512.455 I slot print_timing: id 0 | task 2948 | n_decoded = 100, tg = 55.64 t/s 7.53.304.030 I slot print_timing: id 0 | task 2948 | prompt eval time = 1540.68 ms / 796 tokens ( 1.94 ms per token, 516.65 tokens per second) 7.53.304.048 I slot print_timing: id 0 | task 2948 | eval time = 2588.96 ms / 145 tokens ( 17.85 ms per token, 56.01 tokens per second) 7.53.304.048 I slot print_timing: id 0 | task 2948 | total time = 4129.64 ms / 941 tokens 7.53.304.049 I slot print_timing: id 0 | task 2948 | graphs reused = 2735 7.53.304.050 I slot print_timing: id 0 | task 2948 | draft acceptance = 0.92000 ( 69 accepted / 75 generated) 7.53.304.057 I statistics draft-mtp: #calls(b,g,a) = 47 2801 2801, #gen drafts = 2801, #acc drafts = 2629, #gen tokens = 2801, #acc tokens = 2629, dur(b,g,a) = 0.029, 7038.065, 1.937 ms 7.53.305.429 I slot release: id 0 | task 2948 | stop processing: n_tokens = 57077, truncated = 0 7.53.305.498 I srv update_slots: all slots are idle 7.54.585.604 I srv params_from_: Chat format: peg-native 7.54.587.352 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 7.54.587.771 I reasoning-budget: activated, budget=2147483647 tokens 7.54.587.830 I slot launch_slot_: id 0 | task 3027 | processing task, is_child = 0 7.54.587.877 W slot 
update_slots: id 0 | task 3027 | n_past = 57077, slot.prompt.tokens.size() = 57077, seq_id = 0, pos_min = 57076, n_swa = 0 7.54.587.878 I slot update_slots: id 0 | task 3027 | Checking checkpoint with [56928, 56928] against 57076... 7.54.637.522 W slot update_slots: id 0 | task 3027 | restored context checkpoint (pos_min = 56928, pos_max = 56928, n_tokens = 56929, n_past = 56929, size = 447.904 MiB) 7.55.222.287 I slot create_check: id 0 | task 3027 | created context checkpoint 41 of 256 (pos_min = 57103, pos_max = 57103, n_tokens = 57104, size = 448.591 MiB) 7.55.590.583 I reasoning-budget: deactivated (natural end) 7.56.762.922 I slot print_timing: id 0 | task 3027 | prompt eval time = 678.02 ms / 179 tokens ( 3.79 ms per token, 264.01 tokens per second) 7.56.762.940 I slot print_timing: id 0 | task 3027 | eval time = 1496.60 ms / 86 tokens ( 17.40 ms per token, 57.46 tokens per second) 7.56.762.941 I slot print_timing: id 0 | task 3027 | total time = 2174.62 ms / 265 tokens 7.56.762.942 I slot print_timing: id 0 | task 3027 | graphs reused = 2777 7.56.762.942 I slot print_timing: id 0 | task 3027 | draft acceptance = 1.00000 ( 43 accepted / 43 generated) 7.56.762.950 I statistics draft-mtp: #calls(b,g,a) = 48 2844 2844, #gen drafts = 2844, #acc drafts = 2672, #gen tokens = 2844, #acc tokens = 2672, dur(b,g,a) = 0.030, 7148.566, 1.969 ms 7.56.764.377 I slot release: id 0 | task 3027 | stop processing: n_tokens = 57194, truncated = 0 7.56.764.446 I srv update_slots: all slots are idle 7.57.772.836 I srv params_from_: Chat format: peg-native 7.57.774.480 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.991 (> 0.100 thold), f_keep = 1.000 7.57.774.919 I reasoning-budget: activated, budget=2147483647 tokens 7.57.774.977 I slot launch_slot_: id 0 | task 3073 | processing task, is_child = 0 7.57.775.001 W slot update_slots: id 0 | task 3073 | n_past = 57194, slot.prompt.tokens.size() = 57194, seq_id = 0, pos_min = 57193, n_swa = 0 
7.57.775.002 I slot update_slots: id 0 | task 3073 | Checking checkpoint with [57103, 57103] against 57193... 7.57.821.309 W slot update_slots: id 0 | task 3073 | restored context checkpoint (pos_min = 57103, pos_max = 57103, n_tokens = 57104, n_past = 57104, size = 448.591 MiB) 7.58.354.089 I slot create_check: id 0 | task 3073 | created context checkpoint 42 of 256 (pos_min = 57187, pos_max = 57187, n_tokens = 57188, size = 448.921 MiB) 7.59.154.455 I slot create_check: id 0 | task 3073 | created context checkpoint 43 of 256 (pos_min = 57699, pos_max = 57699, n_tokens = 57700, size = 450.930 MiB) 7.59.625.903 I reasoning-budget: deactivated (natural end) 8.00.933.176 I slot print_timing: id 0 | task 3073 | n_decoded = 100, tg = 57.66 t/s 8.01.277.746 I slot print_timing: id 0 | task 3073 | prompt eval time = 1423.48 ms / 600 tokens ( 2.37 ms per token, 421.50 tokens per second) 8.01.277.765 I slot print_timing: id 0 | task 3073 | eval time = 2079.02 ms / 119 tokens ( 17.47 ms per token, 57.24 tokens per second) 8.01.277.766 I slot print_timing: id 0 | task 3073 | total time = 3502.50 ms / 719 tokens 8.01.277.767 I slot print_timing: id 0 | task 3073 | graphs reused = 2836 8.01.277.768 I slot print_timing: id 0 | task 3073 | draft acceptance = 0.98333 ( 59 accepted / 60 generated) 8.01.277.775 I statistics draft-mtp: #calls(b,g,a) = 49 2904 2904, #gen drafts = 2904, #acc drafts = 2731, #gen tokens = 2904, #acc tokens = 2731, dur(b,g,a) = 0.031, 7303.128, 2.015 ms 8.01.279.214 I slot release: id 0 | task 3073 | stop processing: n_tokens = 57823, truncated = 0 8.01.279.287 I srv update_slots: all slots are idle 8.02.534.187 I srv params_from_: Chat format: peg-native 8.02.535.855 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 8.02.536.273 I reasoning-budget: activated, budget=2147483647 tokens 8.02.536.337 I slot launch_slot_: id 0 | task 3137 | processing task, is_child = 0 8.02.536.375 W slot 
update_slots: id 0 | task 3137 | n_past = 57823, slot.prompt.tokens.size() = 57823, seq_id = 0, pos_min = 57822, n_swa = 0 8.02.536.377 I slot update_slots: id 0 | task 3137 | Checking checkpoint with [57699, 57699] against 57822... 8.02.582.573 W slot update_slots: id 0 | task 3137 | restored context checkpoint (pos_min = 57699, pos_max = 57699, n_tokens = 57700, n_past = 57700, size = 450.930 MiB) 8.03.177.530 I slot create_check: id 0 | task 3137 | created context checkpoint 44 of 256 (pos_min = 57848, pos_max = 57848, n_tokens = 57849, size = 451.515 MiB) 8.03.524.112 I reasoning-budget: deactivated (natural end) 8.04.729.374 I slot print_timing: id 0 | task 3137 | prompt eval time = 685.27 ms / 153 tokens ( 4.48 ms per token, 223.27 tokens per second) 8.04.729.392 I slot print_timing: id 0 | task 3137 | eval time = 1507.39 ms / 84 tokens ( 17.95 ms per token, 55.73 tokens per second) 8.04.729.392 I slot print_timing: id 0 | task 3137 | total time = 2192.67 ms / 237 tokens 8.04.729.393 I slot print_timing: id 0 | task 3137 | graphs reused = 2877 8.04.729.394 I slot print_timing: id 0 | task 3137 | draft acceptance = 0.95349 ( 41 accepted / 43 generated) 8.04.729.401 I statistics draft-mtp: #calls(b,g,a) = 50 2947 2947, #gen drafts = 2947, #acc drafts = 2772, #gen tokens = 2947, #acc tokens = 2772, dur(b,g,a) = 0.032, 7414.011, 2.046 ms 8.04.730.785 I slot release: id 0 | task 3137 | stop processing: n_tokens = 57937, truncated = 0 8.04.730.858 I srv update_slots: all slots are idle 8.05.656.819 I srv params_from_: Chat format: peg-native 8.05.658.491 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.995 (> 0.100 thold), f_keep = 1.000 8.05.658.928 I reasoning-budget: activated, budget=2147483647 tokens 8.05.658.987 I slot launch_slot_: id 0 | task 3183 | processing task, is_child = 0 8.05.659.026 W slot update_slots: id 0 | task 3183 | n_past = 57937, slot.prompt.tokens.size() = 57937, seq_id = 0, pos_min = 57936, n_swa = 0 
8.05.659.026 I slot update_slots: id 0 | task 3183 | Checking checkpoint with [57848, 57848] against 57936... 8.05.707.738 W slot update_slots: id 0 | task 3183 | restored context checkpoint (pos_min = 57848, pos_max = 57848, n_tokens = 57849, n_past = 57849, size = 451.515 MiB) 8.06.396.831 I slot create_check: id 0 | task 3183 | created context checkpoint 45 of 256 (pos_min = 58203, pos_max = 58203, n_tokens = 58204, size = 452.909 MiB) 8.06.697.938 I reasoning-budget: deactivated (natural end) 8.08.185.993 I slot print_timing: id 0 | task 3183 | n_decoded = 100, tg = 57.31 t/s 8.08.357.865 I slot print_timing: id 0 | task 3183 | prompt eval time = 781.76 ms / 359 tokens ( 2.18 ms per token, 459.22 tokens per second) 8.08.357.883 I slot print_timing: id 0 | task 3183 | eval time = 1916.83 ms / 109 tokens ( 17.59 ms per token, 56.86 tokens per second) 8.08.357.883 I slot print_timing: id 0 | task 3183 | total time = 2698.60 ms / 468 tokens 8.08.357.884 I slot print_timing: id 0 | task 3183 | graphs reused = 2931 8.08.357.885 I slot print_timing: id 0 | task 3183 | draft acceptance = 0.98182 ( 54 accepted / 55 generated) 8.08.357.892 I statistics draft-mtp: #calls(b,g,a) = 51 3002 3002, #gen drafts = 3002, #acc drafts = 2826, #gen tokens = 3002, #acc tokens = 2826, dur(b,g,a) = 0.032, 7555.775, 2.086 ms 8.08.359.312 I slot release: id 0 | task 3183 | stop processing: n_tokens = 58317, truncated = 0 8.08.359.382 I srv update_slots: all slots are idle 8.09.587.868 I srv params_from_: Chat format: peg-native 8.09.589.536 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 8.09.590.016 I reasoning-budget: activated, budget=2147483647 tokens 8.09.590.074 I slot launch_slot_: id 0 | task 3241 | processing task, is_child = 0 8.09.590.114 W slot update_slots: id 0 | task 3241 | n_past = 58317, slot.prompt.tokens.size() = 58317, seq_id = 0, pos_min = 58316, n_swa = 0 8.09.590.115 I slot update_slots: id 0 | 
task 3241 | Checking checkpoint with [58203, 58203] against 58316... 8.09.637.227 W slot update_slots: id 0 | task 3241 | restored context checkpoint (pos_min = 58203, pos_max = 58203, n_tokens = 58204, n_past = 58204, size = 452.909 MiB) 8.10.205.058 I slot create_check: id 0 | task 3241 | created context checkpoint 46 of 256 (pos_min = 58342, pos_max = 58342, n_tokens = 58343, size = 453.454 MiB) 8.10.505.207 I reasoning-budget: deactivated (natural end) 8.11.725.921 I slot print_timing: id 0 | task 3241 | prompt eval time = 659.20 ms / 143 tokens ( 4.61 ms per token, 216.93 tokens per second) 8.11.725.940 I slot print_timing: id 0 | task 3241 | eval time = 1476.33 ms / 83 tokens ( 17.79 ms per token, 56.22 tokens per second) 8.11.725.940 I slot print_timing: id 0 | task 3241 | total time = 2135.53 ms / 226 tokens 8.11.725.941 I slot print_timing: id 0 | task 3241 | graphs reused = 2971 8.11.725.942 I slot print_timing: id 0 | task 3241 | draft acceptance = 0.97619 ( 8.11.725.949 I statistics draft-mtp: #calls(b,g,a) = 52 3044 3044, #gen drafts = 3044, #acc drafts = 2867, #gen tokens = 3044, #acc tokens = 2867, dur(b,g,a) = 0.032, 7663.588, 2.109 ms 8.11.727.370 I slot release: id 0 | task 3241 | stop processing: n_tokens = 58430, truncated = 0 8.11.727.443 I srv update_slots: all slots are idle 8.12.777.122 I srv params_from_: Chat format: peg-native 8.12.778.805 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.980 (> 0.100 thold), f_keep = 1.000 8.12.779.219 I reasoning-budget: activated, budget=2147483647 tokens 8.12.779.277 I slot launch_slot_: id 0 | task 3286 | processing task, is_child = 0 8.12.779.302 W slot update_slots: id 0 | task 3286 | n_past = 58430, slot.prompt.tokens.size() = 58430, seq_id = 0, pos_min = 58429, n_swa = 0 8.12.779.303 I slot update_slots: id 0 | task 3286 | Checking checkpoint with [58342, 58342] against 58429...
8.12.826.300 W slot update_slots: id 0 | task 3286 | restored context checkpoint (pos_min = 58342, pos_max = 58342, n_tokens = 58343, n_past = 58343, size = 453.454 MiB) 8.13.871.384 I slot create_check: id 0 | task 3286 | created context checkpoint 47 of 256 (pos_min = 59114, pos_max = 59114, n_tokens = 59115, size = 456.485 MiB) 8.14.686.187 I slot create_check: id 0 | task 3286 | created context checkpoint 48 of 256 (pos_min = 59626, pos_max = 59626, n_tokens = 59627, size = 458.494 MiB) 8.16.449.332 I reasoning-budget: deactivated (natural end) 8.16.796.396 I slot print_timing: id 0 | task 3286 | n_decoded = 101, tg = 48.88 t/s 8.18.077.677 I slot print_timing: id 0 | task 3286 | prompt eval time = 1950.67 ms / 1288 tokens ( 1.51 ms per token, 660.28 tokens per second) 8.18.077.696 I slot print_timing: id 0 | task 3286 | eval time = 3347.46 ms / 174 tokens ( 19.24 ms per token, 51.98 tokens per second) 8.18.077.697 I slot print_timing: id 0 | task 3286 | total time = 5298.14 ms / 1462 tokens 8.18.077.698 I slot print_timing: id 0 | task 3286 | graphs reused = 3065 8.18.077.698 I slot print_timing: id 0 | task 3286 | draft acceptance = 0.81250 ( 78 accepted / 96 generated) 8.18.077.706 I statistics draft-mtp: #calls(b,g,a) = 53 3140 3140, #gen drafts = 3140, #acc drafts = 2945, #gen tokens = 3140, #acc tokens = 2945, dur(b,g,a) = 0.033, 7911.336, 2.201 ms 8.18.079.142 I slot release: id 0 | task 3286 | stop processing: n_tokens = 59805, truncated = 0 8.18.079.224 I srv update_slots: all slots are idle 8.19.308.184 I srv params_from_: Chat format: peg-native 8.19.309.813 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 8.19.310.274 I reasoning-budget: activated, budget=2147483647 tokens 8.19.310.331 I slot launch_slot_: id 0 | task 3386 | processing task, is_child = 0 8.19.310.360 W slot update_slots: id 0 | task 3386 | n_past = 59805, slot.prompt.tokens.size() = 59805, seq_id = 0, pos_min = 
59804, n_swa = 0 8.19.310.363 I slot update_slots: id 0 | task 3386 | Checking checkpoint with [59626, 59626] against 59804... 8.19.359.065 W slot update_slots: id 0 | task 3386 | restored context checkpoint (pos_min = 59626, pos_max = 59626, n_tokens = 59627, n_past = 59627, size = 458.494 MiB) 8.19.994.121 I slot create_check: id 0 | task 3386 | created context checkpoint 49 of 256 (pos_min = 59829, pos_max = 59829, n_tokens = 59830, size = 459.291 MiB) 8.20.327.559 I reasoning-budget: deactivated (natural end) 8.21.517.558 I slot print_timing: id 0 | task 3386 | prompt eval time = 727.90 ms / 207 tokens ( 3.52 ms per token, 284.38 tokens per second) 8.21.517.576 I slot print_timing: id 0 | task 3386 | eval time = 1479.06 ms / 82 tokens ( 18.04 ms per token, 55.44 tokens per second) 8.21.517.577 I slot print_timing: id 0 | task 3386 | total time = 2206.96 ms / 289 tokens 8.21.517.578 I slot print_timing: id 0 | task 3386 | graphs reused = 3105 8.21.517.579 I slot print_timing: id 0 | task 3386 | draft acceptance = 0.95238 ( 40 accepted / 42 generated) 8.21.517.586 I statistics draft-mtp: #calls(b,g,a) = 54 3182 3182, #gen drafts = 3182, #acc drafts = 2985, #gen tokens = 3182, #acc tokens = 2985, dur(b,g,a) = 0.034, 8019.357, 2.228 ms 8.21.519.167 I slot release: id 0 | task 3386 | stop processing: n_tokens = 59916, truncated = 0 8.21.519.238 I srv update_slots: all slots are idle 8.22.501.533 I srv params_from_: Chat format: peg-native 8.22.503.240 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.992 (> 0.100 thold), f_keep = 1.000 8.22.503.745 I reasoning-budget: activated, budget=2147483647 tokens 8.22.503.802 I slot launch_slot_: id 0 | task 3431 | processing task, is_child = 0 8.22.503.843 W slot update_slots: id 0 | task 3431 | n_past = 59916, slot.prompt.tokens.size() = 59916, seq_id = 0, pos_min = 59915, n_swa = 0 8.22.503.843 I slot update_slots: id 0 | task 3431 | Checking checkpoint with [59829, 59829] against 59915... 
8.22.551.424 W slot update_slots: id 0 | task 3431 | restored context checkpoint (pos_min = 59829, pos_max = 59829, n_tokens = 59830, n_past = 59830, size = 459.291 MiB) 8.23.173.813 I slot create_check: id 0 | task 3431 | created context checkpoint 50 of 256 (pos_min = 59909, pos_max = 59909, n_tokens = 59910, size = 459.605 MiB) 8.24.111.161 I slot create_check: id 0 | task 3431 | created context checkpoint 51 of 256 (pos_min = 60421, pos_max = 60421, n_tokens = 60422, size = 461.615 MiB) 8.24.520.420 I reasoning-budget: deactivated (natural end) 8.25.911.773 I slot print_timing: id 0 | task 3431 | n_decoded = 101, tg = 57.56 t/s 8.26.153.983 I slot print_timing: id 0 | task 3431 | prompt eval time = 1652.90 ms / 596 tokens ( 2.77 ms per token, 360.58 tokens per second) 8.26.154.001 I slot print_timing: id 0 | task 3431 | eval time = 1997.00 ms / 115 tokens ( 17.37 ms per token, 57.59 tokens per second) 8.26.154.002 I slot print_timing: id 0 | task 3431 | total time = 3649.90 ms / 711 tokens 8.26.154.002 I slot print_timing: id 0 | task 3431 | graphs reused = 3161 8.26.154.003 I slot print_timing: id 0 | task 3431 | draft acceptance = 1.00000 ( 57 accepted / 57 generated) 8.26.154.010 I statistics draft-mtp: #calls(b,g,a) = 55 3239 3239, #gen drafts = 3239, #acc drafts = 3042, #gen tokens = 3239, #acc tokens = 3042, dur(b,g,a) = 0.035, 8167.195, 2.266 ms 8.26.155.649 I slot release: id 0 | task 3431 | stop processing: n_tokens = 60540, truncated = 0 8.26.155.722 I srv update_slots: all slots are idle 8.27.508.614 I srv params_from_: Chat format: peg-native 8.27.510.264 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 8.27.510.711 I reasoning-budget: activated, budget=2147483647 tokens 8.27.510.770 I slot launch_slot_: id 0 | task 3492 | processing task, is_child = 0 8.27.510.809 W slot update_slots: id 0 | task 3492 | n_past = 60540, slot.prompt.tokens.size() = 60540, seq_id = 0, pos_min = 
60539, n_swa = 0 8.27.510.810 I slot update_slots: id 0 | task 3492 | Checking checkpoint with [60421, 60421] against 60539... 8.27.559.231 W slot update_slots: id 0 | task 3492 | restored context checkpoint (pos_min = 60421, pos_max = 60421, n_tokens = 60422, n_past = 60422, size = 461.615 MiB) 8.28.187.086 I slot create_check: id 0 | task 3492 | created context checkpoint 52 of 256 (pos_min = 60566, pos_max = 60566, n_tokens = 60567, size = 462.184 MiB) 8.28.523.203 I reasoning-budget: deactivated (natural end) 8.29.699.131 I slot print_timing: id 0 | task 3492 | prompt eval time = 720.41 ms / 149 tokens ( 4.83 ms per token, 206.83 tokens per second) 8.29.699.149 I slot print_timing: id 0 | task 3492 | eval time = 1467.67 ms / 84 tokens ( 17.47 ms per token, 57.23 tokens per second) 8.29.699.150 I slot print_timing: id 0 | task 3492 | total time = 2188.09 ms / 233 tokens 8.29.699.151 I slot print_timing: id 0 | task 3492 | graphs reused = 3202 8.29.699.151 I slot print_timing: id 0 | task 3492 | draft acceptance = 1.00000 ( 42 accepted / 42 generated) 8.29.699.159 I statistics draft-mtp: #calls(b,g,a) = 56 3281 3281, #gen drafts = 3281, #acc drafts = 3084, #gen tokens = 3281, #acc tokens = 3084, dur(b,g,a) = 0.035, 8275.488, 2.295 ms 8.29.700.588 I slot release: id 0 | task 3492 | stop processing: n_tokens = 60655, truncated = 0 8.29.700.657 I srv update_slots: all slots are idle 8.30.775.236 I srv params_from_: Chat format: peg-native 8.30.776.922 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.996 (> 0.100 thold), f_keep = 1.000 8.30.777.344 I reasoning-budget: activated, budget=2147483647 tokens 8.30.777.406 I slot launch_slot_: id 0 | task 3537 | processing task, is_child = 0 8.30.777.448 W slot update_slots: id 0 | task 3537 | n_past = 60655, slot.prompt.tokens.size() = 60655, seq_id = 0, pos_min = 60654, n_swa = 0 8.30.777.449 I slot update_slots: id 0 | task 3537 | Checking checkpoint with [60566, 60566] against 60654... 
8.30.825.679 W slot update_slots: id 0 | task 3537 | restored context checkpoint (pos_min = 60566, pos_max = 60566, n_tokens = 60567, n_past = 60567, size = 462.184 MiB) 8.31.576.607 I slot create_check: id 0 | task 3537 | created context checkpoint 53 of 256 (pos_min = 60905, pos_max = 60905, n_tokens = 60906, size = 463.515 MiB) 8.32.480.259 I reasoning-budget: deactivated (natural end) 8.33.449.519 I slot print_timing: id 0 | task 3537 | n_decoded = 100, tg = 54.69 t/s 8.34.105.314 I slot print_timing: id 0 | task 3537 | prompt eval time = 843.24 ms / 343 tokens ( 2.46 ms per token, 406.76 tokens per second) 8.34.105.332 I slot print_timing: id 0 | task 3537 | eval time = 2484.39 ms / 138 tokens ( 18.00 ms per token, 55.55 tokens per second) 8.34.105.333 I slot print_timing: id 0 | task 3537 | total time = 3327.63 ms / 481 tokens 8.34.105.334 I slot print_timing: id 0 | task 3537 | graphs reused = 3271 8.34.105.335 I slot print_timing: id 0 | task 3537 | draft acceptance = 0.92958 ( 66 accepted / 71 generated) 8.34.105.342 I statistics draft-mtp: #calls(b,g,a) = 57 3352 3352, #gen drafts = 3352, #acc drafts = 3150, #gen tokens = 3352, #acc tokens = 3150, dur(b,g,a) = 0.036, 8458.933, 2.349 ms 8.34.106.803 I slot release: id 0 | task 3537 | stop processing: n_tokens = 61047, truncated = 0 8.34.106.874 I srv update_slots: all slots are idle 8.35.334.751 I srv params_from_: Chat format: peg-native 8.35.336.453 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 8.35.336.915 I reasoning-budget: activated, budget=2147483647 tokens 8.35.336.971 I slot launch_slot_: id 0 | task 3611 | processing task, is_child = 0 8.35.336.997 W slot update_slots: id 0 | task 3611 | n_past = 61047, slot.prompt.tokens.size() = 61047, seq_id = 0, pos_min = 61046, n_swa = 0 8.35.336.997 I slot update_slots: id 0 | task 3611 | Checking checkpoint with [60905, 60905] against 61046... 
8.35.386.276 W slot update_slots: id 0 | task 3611 | restored context checkpoint (pos_min = 60905, pos_max = 60905, n_tokens = 60906, n_past = 60906, size = 463.515 MiB) 8.35.888.064 I slot create_check: id 0 | task 3611 | created context checkpoint 54 of 256 (pos_min = 61073, pos_max = 61073, n_tokens = 61074, size = 464.174 MiB) 8.36.189.794 I reasoning-budget: deactivated (natural end) 8.37.367.571 I slot print_timing: id 0 | task 3611 | prompt eval time = 595.23 ms / 172 tokens ( 3.46 ms per token, 288.96 tokens per second) 8.37.367.593 I slot print_timing: id 0 | task 3611 | eval time = 1435.10 ms / 82 tokens ( 17.50 ms per token, 57.14 tokens per second) 8.37.367.593 I slot print_timing: id 0 | task 3611 | total time = 2030.33 ms / 254 tokens 8.37.367.594 I slot print_timing: id 0 | task 3611 | graphs reused = 3311 8.37.367.595 I slot print_timing: id 0 | task 3611 | draft acceptance = 0.97561 ( 40 accepted / 41 generated) 8.37.367.602 I statistics draft-mtp: #calls(b,g,a) = 58 3393 3393, #gen drafts = 3393, #acc drafts = 3190, #gen tokens = 3393, #acc tokens = 3190, dur(b,g,a) = 0.037, 8565.433, 2.376 ms 8.37.369.065 I slot release: id 0 | task 3611 | stop processing: n_tokens = 61159, truncated = 0 8.37.369.134 I srv update_slots: all slots are idle 8.38.290.447 I srv params_from_: Chat format: peg-native 8.38.292.134 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.992 (> 0.100 thold), f_keep = 1.000 8.38.292.560 I reasoning-budget: activated, budget=2147483647 tokens 8.38.292.619 I slot launch_slot_: id 0 | task 3655 | processing task, is_child = 0 8.38.292.646 W slot update_slots: id 0 | task 3655 | n_past = 61159, slot.prompt.tokens.size() = 61159, seq_id = 0, pos_min = 61158, n_swa = 0 8.38.292.647 I slot update_slots: id 0 | task 3655 | Checking checkpoint with [61073, 61073] against 61158... 
8.38.341.120 W slot update_slots: id 0 | task 3655 | restored context checkpoint (pos_min = 61073, pos_max = 61073, n_tokens = 61074, n_past = 61074, size = 464.174 MiB) 8.38.745.652 I slot create_check: id 0 | task 3655 | created context checkpoint 55 of 256 (pos_min = 61153, pos_max = 61153, n_tokens = 61154, size = 464.488 MiB) 8.39.420.492 I slot create_check: id 0 | task 3655 | created context checkpoint 56 of 256 (pos_min = 61665, pos_max = 61665, n_tokens = 61666, size = 466.498 MiB) 8.40.151.994 I reasoning-budget: deactivated (natural end) 8.41.329.725 I slot print_timing: id 0 | task 3655 | n_decoded = 101, tg = 54.16 t/s 8.41.778.263 I slot print_timing: id 0 | task 3655 | prompt eval time = 1172.05 ms / 596 tokens ( 1.97 ms per token, 508.51 tokens per second) 8.41.778.281 I slot print_timing: id 0 | task 3655 | eval time = 2313.32 ms / 127 tokens ( 18.22 ms per token, 54.90 tokens per second) 8.41.778.282 I slot print_timing: id 0 | task 3655 | total time = 3485.37 ms / 723 tokens 8.41.778.282 I slot print_timing: id 0 | task 3655 | graphs reused = 3375 8.41.778.283 I slot print_timing: id 0 | task 3655 | draft acceptance = 0.90909 ( 60 accepted / 66 generated) 8.41.778.290 I statistics draft-mtp: #calls(b,g,a) = 59 3459 3459, #gen drafts = 3459, #acc drafts = 3250, #gen tokens = 3459, #acc tokens = 3250, dur(b,g,a) = 0.038, 8736.633, 2.421 ms 8.41.779.712 I slot release: id 0 | task 3655 | stop processing: n_tokens = 6 8.41.779.783 I srv update_slots: all slots are idle 8.43.005.795 I srv params_from_: Chat format: peg-native 8.43.007.444 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.999 (> 0.100 thold), f_keep = 1.000 8.43.007.876 I reasoning-budget: activated, budget=2147483647 tokens 8.43.007.943 I slot launch_slot_: id 0 | task 3725 | processing task, is_child = 0 8.43.007.983 W slot update_slots: id 0 | task 3725 | n_past = 61796, slot.prompt.tokens.size() = 61796, seq_id = 0, pos_min = 61795,
n_swa = 0 8.43.007.983 I slot update_slots: id 0 | task 3725 | Checking checkpoint with [61665, 61665] against 61795... 8.43.057.294 W slot update_slots: id 0 | task 3725 | restored context checkpoint (pos_min = 61665, pos_max = 61665, n_tokens = 61666, n_past = 61666, size = 466.498 MiB) 8.43.523.741 I slot create_check: id 0 | task 3725 | created context checkpoint 57 of 256 (pos_min = 61822, pos_max = 61822, n_tokens = 61823, size = 467.115 MiB) 8.43.824.690 I reasoning-budget: deactivated (natural end) 8.45.006.902 I slot print_timing: id 0 | task 3725 | prompt eval time = 559.79 ms / 161 tokens ( 3.48 ms per token, 287.61 tokens per second) 8.45.006.920 I slot print_timing: id 0 | task 3725 | eval time = 1438.89 ms / 82 tokens ( 17.55 ms per token, 56.99 tokens per second) 8.45.006.921 I slot print_timing: id 0 | task 3725 | total time = 1998.68 ms / 243 tokens 8.45.006.922 I slot print_timing: id 0 | task 3725 | graphs reused = 3415 8.45.006.922 I slot print_timing: id 0 | task 3725 | draft acceptance = 1.00000 ( 41 accepted / 41 generated) 8.45.006.929 I statistics draft-mtp: #calls(b,g,a) = 60 3500 3500, #gen drafts = 3500, #acc drafts = 3291, #gen tokens = 3500, #acc tokens = 3291, dur(b,g,a) = 0.039, 8843.290, 2.452 ms 8.45.008.354 I slot release: id 0 | task 3725 | stop processing: n_tokens = 61909, truncated = 0 8.45.008.425 I srv update_slots: all slots are idle 8.45.923.549 I srv params_from_: Chat format: peg-native 8.45.925.272 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.988 (> 0.100 thold), f_keep = 1.000 8.45.925.694 I reasoning-budget: activated, budget=2147483647 tokens 8.45.925.755 I slot launch_slot_: id 0 | task 3769 | processing task, is_child = 0 8.45.925.780 W slot update_slots: id 0 | task 3769 | n_past = 61909, slot.prompt.tokens.size() = 61909, seq_id = 0, pos_min = 61908, n_swa = 0 8.45.925.781 I slot update_slots: id 0 | task 3769 | Checking checkpoint with [61822, 61822] against 61908... 
8.45.974.729 W slot update_slots: id 0 | task 3769 | restored context checkpoint (pos_min = 61822, pos_max = 61822, n_tokens = 61823, n_past = 61823, size = 467.115 MiB) 8.46.593.239 I slot create_check: id 0 | task 3769 | created context checkpoint 58 of 256 (pos_min = 62114, pos_max = 62114, n_tokens = 62115, size = 468.261 MiB) 8.47.293.943 I slot create_check: id 0 | task 3769 | created context checkpoint 59 of 256 (pos_min = 62626, pos_max = 62626, n_tokens = 62627, size = 470.270 MiB) 8.47.769.134 I reasoning-budget: deactivated (natural end) 8.49.137.389 I slot print_timing: id 0 | task 3769 | n_decoded = 100, tg = 55.58 t/s 8.49.415.370 I slot print_timing: id 0 | task 3769 | prompt eval time = 1412.21 ms / 808 tokens ( 1.75 ms per token, 572.15 tokens per second) 8.49.415.388 I slot print_timing: id 0 | task 3769 | eval time = 2077.14 ms / 116 tokens ( 17.91 ms per token, 55.85 tokens per second) 8.49.415.389 I slot print_timing: id 0 | task 3769 | total time = 3489.35 ms / 924 tokens 8.49.415.390 I slot print_timing: id 0 | task 3769 | graphs reused = 3472 8.49.415.390 I slot print_timing: id 0 | task 3769 | draft acceptance = 0.94915 ( 56 accepted / 59 generated) 8.49.415.397 I statistics draft-mtp: #calls(b,g,a) = 61 3559 3559, #gen drafts = 3559, #acc drafts = 3347, #gen tokens = 3559, #acc tokens = 3347, dur(b,g,a) = 0.040, 8996.022, 2.503 ms 8.49.416.942 I slot release: id 0 | task 3769 | stop processing: n_tokens = 62746, truncated = 0 8.49.417.013 I srv update_slots: all slots are idle 8.50.697.719 I srv params_from_: Chat format: peg-native 8.50.699.358 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 8.50.699.800 I reasoning-budget: activated, budget=2147483647 tokens 8.50.699.859 I slot launch_slot_: id 0 | task 3832 | processing task, is_child = 0 8.50.699.899 W slot update_slots: id 0 | task 3832 | n_past = 62746, slot.prompt.tokens.size() = 62746, seq_id = 0, pos_min = 
62745, n_swa = 0 8.50.699.900 I slot update_slots: id 0 | task 3832 | Checking checkpoint with [62626, 62626] against 62745... 8.50.749.538 W slot update_slots: id 0 | task 3832 | restored context checkpoint (pos_min = 62626, pos_max = 62626, n_tokens = 62627, n_past = 62627, size = 470.270 MiB) 8.51.640.828 I slot create_check: id 0 | task 3832 | created context checkpoint 60 of 256 (pos_min = 62772, pos_max = 62772, n_tokens = 62773, size = 470.844 MiB) 8.51.839.873 I reasoning-budget: deactivated (natural end) 8.53.470.153 I slot print_timing: id 0 | task 3832 | prompt eval time = 985.33 ms / 150 tokens ( 6.57 ms per token, 152.23 tokens per second) 8.53.470.171 I slot print_timing: id 0 | task 3832 | eval time = 1784.66 ms / 100 tokens ( 17.85 ms per token, 56.03 tokens per second) 8.53.470.171 I slot print_timing: id 0 | task 3832 | total time = 2769.99 ms / 250 tokens 8.53.470.172 I slot print_timing: id 0 | task 3832 | graphs reused = 3522 8.53.470.173 I slot print_timing: id 0 | task 3832 | draft acceptance = 0.94118 ( 48 accepted / 51 generated) 8.53.470.180 I statistics draft-mtp: #calls(b,g,a) = 62 3610 3610, #gen drafts = 3610, #acc drafts = 3395, #gen tokens = 3610, #acc tokens = 3395, dur(b,g,a) = 0.041, 9127.814, 2.539 ms 8.53.471.719 I slot release: id 0 | task 3832 | stop processing: n_tokens = 62876, truncated = 0 8.53.471.789 I slot print_timing: id 0 | task -1 | n_decoded = 100, tg = 55.97 t/s 8.53.471.807 I srv update_slots: all slots are idle 8.56.826.913 I srv params_from_: Chat format: peg-native 8.56.828.544 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 8.56.828.978 I reasoning-budget: activated, budget=2147483647 tokens 8.56.829.040 I slot launch_slot_: id 0 | task 3886 | processing task, is_child = 0 8.56.829.082 W slot update_slots: id 0 | task 3886 | n_past = 62876, slot.prompt.tokens.size() = 62876, seq_id = 0, pos_min = 62875, n_swa = 0 8.56.829.082 I slot 
update_slots: id 0 | task 3886 | Checking checkpoint with [62772, 62772] against 62875... 8.56.879.028 W slot update_slots: id 0 | task 3886 | restored context checkpoint (pos_min = 62772, pos_max = 62772, n_tokens = 62773, n_past = 62773, size = 470.844 MiB) 8.57.754.993 I slot create_check: id 0 | task 3886 | created context checkpoint 61 of 256 (pos_min = 62894, pos_max = 62894, n_tokens = 62895, size = 471.322 MiB) 8.58.022.044 I reasoning-budget: deactivated (natural end) 8.59.602.107 I slot print_timing: id 0 | task 3886 | n_decoded = 101, tg = 56.03 t/s 8.59.705.811 I slot print_timing: id 0 | task 3886 | prompt eval time = 970.11 ms / 126 tokens ( 7.70 ms per token, 129.88 tokens per second) 8.59.705.829 I slot print_timing: id 0 | task 3886 | eval time = 1906.37 ms / 106 tokens ( 17.98 ms per token, 55.60 tokens per second) 8.59.705.829 I slot print_timing: id 0 | task 3886 | total time = 2876.48 ms / 232 tokens 8.59.705.830 I slot print_timing: id 0 | task 3886 | graphs reused = 3574 8.59.705.831 I slot print_timing: id 0 | task 3886 | draft acceptance = 0.96296 ( 52 accepted / 54 generated) 8.59.705.845 I statistics draft-mtp: #calls(b,g,a) = 63 3664 3664, #gen drafts = 3664, #acc drafts = 3447, #gen tokens = 3664, #acc tokens = 3447, dur(b,g,a) = 0.042, 9268.370, 2.575 ms 8.59.707.308 I slot release: id 0 | task 3886 | stop processing: n_tokens = 63005, truncated = 0 8.59.707.387 I srv update_slots: all slots are idle 9.01.061.827 I srv params_from_: Chat format: peg-native 9.01.063.489 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 9.01.063.912 I reasoning-budget: activated, budget=2147483647 tokens 9.01.063.970 I slot launch_slot_: id 0 | task 3943 | processing task, is_child = 0 9.01.064.012 W slot update_slots: id 0 | task 3943 | n_past = 63005, slot.prompt.tokens.size() = 63005, seq_id = 0, pos_min = 63004, n_swa = 0 9.01.064.013 I slot update_slots: id 0 | task 3943 | 
Checking checkpoint with [62894, 62894] against 63004... 9.01.113.894 W slot update_slots: id 0 | task 3943 | restored context checkpoint (pos_min = 62894, pos_max = 62894, n_tokens = 62895, n_past = 62895, size = 471.322 MiB) 9.02.020.216 I slot create_check: id 0 | task 3943 | created context checkpoint 62 of 256 (pos_min = 63030, pos_max = 63030, n_tokens = 63031, size = 471.856 MiB) 9.02.218.885 I reasoning-budget: deactivated (natural end) 9.03.817.752 I slot print_timing: id 0 | task 3943 | n_decoded = 101, tg = 57.62 t/s 9.03.852.456 I slot print_timing: id 0 | task 3943 | prompt eval time = 1000.62 ms / 140 tokens ( 7.15 ms per token, 139.91 tokens per second) 9.03.852.474 I slot print_timing: id 0 | task 3943 | eval time = 1787.58 ms / 102 tokens ( 17.53 ms per token, 57.06 tokens per second) 9.03.852.475 I slot print_timing: id 0 | task 3943 | total time = 2788.20 ms / 242 tokens 9.03.852.476 I slot print_timing: id 0 | task 3943 | graphs reused = 3624 9.03.852.477 I slot print_timing: id 0 | task 3943 | draft acceptance = 1.00000 ( 51 accepted / 51 generated) 9.03.852.486 I statistics draft-mtp: #calls(b,g,a) = 64 3715 3715, #gen drafts = 3715, #acc drafts = 3498, #gen tokens = 3715, #acc tokens = 3498, dur(b,g,a) = 0.043, 9401.529, 2.610 ms 9.03.853.919 I slot release: id 0 | task 3943 | stop processing: n_tokens = 63137, truncated = 0 9.03.853.991 I srv update_slots: all slots are idle 9.07.591.109 I srv params_from_: Chat format: peg-native 9.07.592.816 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 9.07.593.257 I reasoning-budget: activated, budget=2147483647 tokens 9.07.593.315 I slot launch_slot_: id 0 | task 3997 | processing task, is_child = 0 9.07.593.357 W slot update_slots: id 0 | task 3997 | n_past = 63137, slot.prompt.tokens.size() = 63137, seq_id = 0, pos_min = 63136, n_swa = 0 9.07.593.357 I slot update_slots: id 0 | task 3997 | Checking checkpoint with [63030, 63030] 
against 63136... 9.07.645.060 W slot update_slots: id 0 | task 3997 | restored context checkpoint (pos_min = 63030, pos_max = 63030, n_tokens = 63031, n_past = 63031, size = 471.856 MiB) 9.08.610.155 I slot create_check: id 0 | task 3997 | created context checkpoint 63 of 256 (pos_min = 63155, pos_max = 63155, n_tokens = 63156, size = 472.347 MiB) 9.08.843.557 I reasoning-budget: deactivated (natural end) 9.10.425.683 I slot print_timing: id 0 | task 3997 | n_decoded = 101, tg = 57.04 t/s 9.10.529.497 I slot print_timing: id 0 | task 3997 | prompt eval time = 1061.35 ms / 129 tokens ( 8.23 ms per token, 121.54 tokens per second) 9.10.529.514 I slot print_timing: id 0 | task 3997 | eval time = 1874.56 ms / 106 tokens ( 17.68 ms per token, 56.55 tokens per second) 9.10.529.515 I slot print_timing: id 0 | task 3997 | total time = 2935.91 ms / 235 tokens 9.10.529.516 I slot print_timing: id 0 | task 3997 | graphs reused = 3675 9.10.529.516 I slot print_timing: id 0 | task 3997 | draft acceptance = 1.00000 ( 53 accepted / 53 generated) 9.10.529.523 I statistics draft-mtp: #calls(b,g,a) = 65 3768 3768, #gen drafts = 3768, #acc drafts = 3551, #gen tokens = 3768, #acc tokens = 3551, dur(b,g,a) = 0.044, 9539.439, 2.643 ms 9.10.530.977 I slot release: id 0 | task 3997 | stop processing: n_tokens = 63266, truncated = 0 9.10.531.047 I srv update_slots: all slots are idle 9.11.724.861 I srv params_from_: Chat format: peg-native 9.11.726.497 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 9.11.726.971 I reasoning-budget: activated, budget=2147483647 tokens 9.11.727.029 I slot launch_slot_: id 0 | task 4053 | processing task, is_child = 0 9.11.727.069 W slot update_slots: id 0 | task 4053 | n_past = 63266, slot.prompt.tokens.size() = 63266, seq_id = 0, pos_min = 63265, n_swa = 0 9.11.727.070 I slot update_slots: id 0 | task 4053 | Checking checkpoint with [63155, 63155] against 63265... 
9.11.776.201 W slot update_slots: id 0 | task 4053 | restored context checkpoint (pos_min = 63155, pos_max = 63155, n_tokens = 63156, n_past = 63156, size = 472.347 MiB) 9.12.682.054 I slot create_check: id 0 | task 4053 | created context checkpoint 64 of 256 (pos_min = 63291, pos_max = 63291, n_tokens = 63292, size = 472.881 MiB) 9.12.880.312 I reasoning-budget: deactivated (natural end) 9.14.479.138 I slot print_timing: id 0 | task 4053 | prompt eval time = 999.29 ms / 140 tokens ( 7.14 ms per token, 140.10 tokens per second) 9.14.479.156 I slot print_timing: id 0 | task 4053 | eval time = 1752.54 ms / 100 tokens ( 17.53 ms per token, 57.06 tokens per second) 9.14.479.157 I slot print_timing: id 0 | task 4053 | total time = 2751.82 ms / 240 tokens 9.14.479.158 I slot print_timing: id 0 | task 4053 | graphs reused = 3724 9.14.479.158 I slot print_timing: id 0 | task 4053 | draft acceptance = 1.00000 ( 50 accepted / 50 generated) 9.14.479.165 I statistics draft-mtp: #calls(b,g,a) = 66 3818 3818, #gen drafts = 3818, #acc drafts = 3601, #gen tokens = 3818, #acc tokens = 3601, dur(b,g,a) = 0.045, 9670.340, 2.675 ms 9.14.480.634 I slot release: id 0 | task 4053 | stop processing: n_tokens = 63396, truncated = 0 9.14.480.706 I slot print_timing: id 0 | task -1 | n_decoded = 100, tg = 57.00 t/s 9.14.480.724 I srv update_slots: all slots are idle 9.17.797.929 I srv params_from_: Chat format: peg-native 9.17.799.608 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 9.17.800.057 I reasoning-budget: activated, budget=2147483647 tokens 9.17.800.115 I slot launch_slot_: id 0 | task 4106 | processing task, is_child = 0 9.17.800.142 W slot update_slots: id 0 | task 4106 | n_past = 63396, slot.prompt.tokens.size() = 63396, seq_id = 0, pos_min = 63395, n_swa = 0 9.17.800.143 I slot update_slots: id 0 | task 4106 | Checking checkpoint with [63291, 63291] against 63395... 
9.17.893.830 W slot update_slots: id 0 | task 4106 | restored context checkpoint (pos_min = 63291, pos_max = 63291, n_tokens = 63292, n_past = 63292, size = 472.881 MiB) 9.18.786.243 I slot create_check: id 0 | task 4106 | created context checkpoint 65 of 256 (pos_min = 63412, pos_max = 63412, n_tokens = 63413, size = 473.356 MiB) 9.19.088.849 I reasoning-budget: deactivated (natural end) 9.20.599.849 I slot print_timing: id 0 | task 4106 | n_decoded = 101, tg = 57.10 t/s 9.20.738.723 I slot print_timing: id 0 | task 4106 | prompt eval time = 1030.53 ms / 125 tokens ( 8.24 ms per token, 121.30 tokens per second) 9.20.738.742 I slot print_timing: id 0 | task 4106 | eval time = 1907.81 ms / 109 tokens ( 17.50 ms per token, 57.13 tokens per second) 9.20.738.742 I slot print_timing: id 0 | task 4106 | total time = 2938.34 ms / 234 tokens 9.20.738.743 I slot print_timing: id 0 | task 4106 | graphs reused = 3776 9.20.738.744 I slot print_timing: id 0 | task 4106 | draft acceptance = 1.00000 ( 54 accepted / 54 generated) 9.20.738.752 I statistics draft-mtp: #calls(b,g,a) = 67 3872 3872, #gen drafts = 3872, #acc drafts = 3655, #gen tokens = 3872, #acc tokens = 3655, dur(b,g,a) = 0.046, 9810.705, 2.712 ms 9.20.740.381 I slot release: id 0 | task 4106 | stop processing: n_tokens = 63525, truncated = 0 9.20.740.452 I srv update_slots: all slots are idle 9.21.983.257 I srv params_from_: Chat format: peg-native 9.21.984.952 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 9.21.985.389 I reasoning-budget: activated, budget=2147483647 tokens 9.21.985.448 I slot launch_slot_: id 0 | task 4163 | processing task, is_child = 0 9.21.985.476 W slot update_slots: id 0 | task 4163 | n_past = 63525, slot.prompt.tokens.size() = 63525, seq_id = 0, pos_min = 63524, n_swa = 0 9.21.985.493 I slot update_slots: id 0 | task 4163 | Checking checkpoint with [63412, 63412] against 63524... 
9.22.035.981 W slot update_slots: id 0 | task 4163 | restored context checkpoint (pos_min = 63412, pos_max = 63412, n_tokens = 63413, n_past = 63413, size = 473.356 MiB) 9.22.940.310 I slot create_check: id 0 | task 4163 | created context checkpoint 66 of 256 (pos_min = 63551, pos_max = 63551, n_tokens = 63552, size = 473.901 MiB) 9.23.208.059 I reasoning-budget: deactivated (natural end) 9.24.391.396 I slot print_timing: id 0 | task 4163 | prompt eval time = 999.22 ms / 143 tokens ( 6.99 ms per token, 143.11 tokens per second) 9.24.391.414 I slot print_timing: id 0 | task 4163 | eval time = 1406.47 ms / 80 tokens ( 17.58 ms per token, 56.88 tokens per second) 9.24.391.415 I slot print_timing: id 0 | task 4163 | total time = 2405.68 ms / 223 tokens 9.24.391.415 I slot print_timing: id 0 | task 4163 | graphs reused = 3815 9.24.391.416 I slot print_timing: id 0 | task 4163 | draft acceptance = 1.00000 ( 40 accepted / 40 generated) 9.24.391.423 I statistics draft-mtp: #calls(b,g,a) = 68 3912 3912, #gen drafts = 3912, #acc drafts = 3695, #gen tokens = 3912, #acc tokens = 3695, dur(b,g,a) = 0.047, 9914.819, 2.744 ms 9.24.392.940 I slot release: id 0 | task 4163 | stop processing: n_tokens = 63636, truncated = 0 9.24.393.006 I srv update_slots: all slots are idle 9.25.368.617 I srv params_from_: Chat format: peg-native 9.25.370.267 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.989 (> 0.100 thold), f_keep = 1.000 9.25.370.724 I reasoning-budget: activated, budget=2147483647 tokens 9.25.370.782 I slot launch_slot_: id 0 | task 4206 | processing task, is_child = 0 9.25.370.807 W slot update_slots: id 0 | task 4206 | n_past = 63636, slot.prompt.tokens.size() = 63636, seq_id = 0, pos_min = 63635, n_swa = 0 9.25.370.808 I slot update_slots: id 0 | task 4206 | Checking checkpoint with [63551, 63551] against 63635... 
9.25.421.720 W slot update_slots: id 0 | task 4206 | restored context checkpoint (pos_min = 63551, pos_max = 63551, n_tokens = 63552, n_past = 63552, size = 473.901 MiB) 9.26.419.252 I slot create_check: id 0 | task 4206 | created context checkpoint 67 of 256 (pos_min = 63820, pos_max = 63820, n_tokens = 63821, size = 474.957 MiB) 9.27.672.050 I slot create_check: id 0 | task 4206 | created context checkpoint 68 of 256 (pos_min = 64332, pos_max = 64332, n_tokens = 64333, size = 476.967 MiB) 9.28.357.813 I reasoning-budget: deactivated (natural end) 9.29.506.638 I slot print_timing: id 0 | task 4206 | n_decoded = 100, tg = 55.87 t/s 9.30.027.585 I slot print_timing: id 0 | task 4206 | prompt eval time = 2345.80 ms / 785 tokens ( 2.99 ms per token, 334.64 tokens per second) 9.30.027.602 I slot print_timing: id 0 | task 4206 | eval time = 2310.74 ms / 129 tokens ( 17.91 ms per token, 55.83 tokens per second) 9.30.027.603 I slot print_timing: id 0 | task 4206 | total time = 4656.54 ms / 914 tokens 9.30.027.604 I slot print_timing: id 0 | task 4206 | graphs reused = 3880 9.30.027.605 I slot print_timing: id 0 | task 4206 | draft acceptance = 0.95455 ( 63 accepted / 66 generated) 9.30.027.612 I statistics draft-mtp: #calls(b,g,a) = 69 3978 3978, #gen drafts = 3978, #acc drafts = 3758, #gen tokens = 3978, #acc tokens = 3758, dur(b,g,a) = 0.048, 10086.302, 2.797 ms 9.30.029.123 I slot release: id 0 | task 4206 | stop processing: n_tokens = 64466, truncated = 0 9.30.029.191 I srv update_slots: all slots are idle 9.31.201.335 I srv params_from_: Chat format: peg-native 9.31.203.076 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 1.000 (> 0.100 thold), f_keep = 1.000 9.31.203.506 I reasoning-budget: activated, budget=2147483647 tokens 9.31.203.570 I slot launch_slot_: id 0 | task 4276 | processing task, is_child = 0 9.31.203.609 W slot update_slots: id 0 | task 4276 | n_past = 64466, slot.prompt.tokens.size() = 64466, seq_id = 0, pos_min = 
64465, n_swa = 0 9.31.203.610 I slot update_slots: id 0 | task 4276 | Checking checkpoint with [64332, 64332] against 64465... 9.31.254.124 W slot update_slots: id 0 | task 4276 | restored context checkpoint (pos_min = 64332, pos_max = 64332, n_tokens = 64333, n_past = 64333, size = 476.967 MiB) 9.32.169.762 I slot create_check: id 0 | task 4276 | created context checkpoint 69 of 256 (pos_min = 64491, pos_max = 64491, n_tokens = 64492, size = 477.591 MiB) 9.32.507.643 I reasoning-budget: deactivated (natural end) 9.33.707.572 I slot print_timing: id 0 | task 4276 | prompt eval time = 1010.54 ms / 163 tokens ( 6.20 ms per token, 161.30 tokens per second) 9.33.707.590 I slot print_timing: id 0 | task 4276 | eval time = 1493.16 ms / 85 tokens ( 17.57 ms per token, 56.93 tokens per second) 9.33.707.591 I slot print_timing: id 0 | task 4276 | total time = 2503.71 ms / 248 tokens 9.33.707.592 I slot print_timing: id 0 | task 4276 | graphs reused = 3920 9.33.707.592 I slot print_timing: id 0 | task 4276 | draft acceptance = 1.00000 ( 42 accepted / 42 generated) 9.33.707.600 I statistics draft-mtp: #calls(b,g,a) = 70 4020 4020, #gen drafts = 4020, #acc drafts = 3800, #gen tokens = 4020, #acc tokens = 3800, dur(b,g,a) = 0.048, 10195.458, 2.833 ms 9.33.709.091 I slot release: id 0 | task 4276 | stop processing: n_tokens = 64580, truncated = 0 9.33.709.160 I srv update_slots: all slots are idle 9.34.654.561 I srv params_from_: Chat format: peg-native 9.34.656.204 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.989 (> 0.100 thold), f_keep = 1.000 9.34.656.642 I reasoning-budget: activated, budget=2147483647 tokens 9.34.656.701 I slot launch_slot_: id 0 | task 4321 | processing task, is_child = 0 9.34.656.727 W slot update_slots: id 0 | task 4321 | n_past = 64580, slot.prompt.tokens.size() = 64580, seq_id = 0, pos_min = 64579, n_swa = 0 9.34.656.727 I slot update_slots: id 0 | task 4321 | Checking checkpoint with [64491, 64491] against 
64579... 9.34.707.751 W slot update_slots: id 0 | task 4321 | restored context checkpoint (pos_min = 64491, pos_max = 64491, n_tokens = 64492, n_past = 64492, size = 477.591 MiB) 9.35.821.216 I slot create_check: id 0 | task 4321 | created context checkpoint 70 of 256 (pos_min = 64765, pos_max = 64765, n_tokens = 64766, size = 478.667 MiB) 9.37.088.860 I slot create_check: id 0 | task 4321 | created context checkpoint 71 of 256 (pos_min = 65277, pos_max = 65277, n_tokens = 65278, size = 480.676 MiB) 9.38.018.433 I reasoning-budget: deactivated (natural end) 9.38.961.589 I slot print_timing: id 0 | task 4321 | n_decoded = 100, tg = 54.70 t/s 9.39.276.685 I slot print_timing: id 0 | task 4321 | prompt eval time = 2476.57 ms / 790 tokens ( 3.13 ms per token, 318.99 tokens per second) 9.39.276.703 I slot print_timing: id 0 | task 4321 | eval time = 2143.13 ms / 117 tokens ( 18.32 ms per token, 54.59 tokens per second) 9.39.276.704 I slot print_timing: id 0 | task 4321 | total time = 4619.70 ms / 907 tokens 9.39.276.705 I slot print_timing: id 0 | task 4321 | graphs reused = 3980 9.39.276.705 I slot print_timing: id 0 | task 4321 | draft acceptance = 0.91803 ( 56 accepted / 61 generated) 9.39.276.712 I statistics draft-mtp: #calls(b,g,a) = 71 4081 4081, #gen drafts = 4081, #acc drafts = 3856, #gen tokens = 4081, #acc tokens = 3856, dur(b,g,a) = 0.049, 10354.645, 2.880 ms 9.39.278.253 I slot release: id 0 | task 4321 | stop processing: n_tokens = 65399, truncated = 0 9.39.278.326 I srv update_slots: all slots are idle 9.39.771.520 I srv params_from_: Chat format: peg-native 9.39.773.182 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.998 (> 0.100 thold), f_keep = 1.000 9.39.773.644 I reasoning-budget: activated, budget=2147483647 tokens 9.39.773.712 I slot launch_slot_: id 0 | task 4386 | processing task, is_child = 0 9.39.773.736 W slot update_slots: id 0 | task 4386 | n_past = 65399, slot.prompt.tokens.size() = 65399, seq_id = 0, 
pos_min = 65398, n_swa = 0 9.39.773.755 I slot update_slots: id 0 | task 4386 | Checking checkpoint with [65277, 65277] against 65398... 9.39.824.428 W slot update_slots: id 0 | task 4386 | restored context checkpoint (pos_min = 65277, pos_max = 65277, n_tokens = 65278, n_past = 65278, size = 480.676 MiB) 9.40.845.001 I slot create_check: id 0 | task 4386 | created context checkpoint 72 of 256 (pos_min = 65500, pos_max = 65500, n_tokens = 65501, size = 481.552 MiB) 9.41.252.819 I reasoning-budget: deactivated (natural end) 9.42.316.918 I slot print_timing: id 0 | task 4386 | prompt eval time = 1115.85 ms / 227 tokens ( 4.92 ms per token, 203.43 tokens per second) 9.42.316.937 I slot print_timing: id 0 | task 4386 | eval time = 1427.08 ms / 79 tokens ( 18.06 ms per token, 55.36 tokens per second) 9.42.316.937 I slot print_timing: id 0 | task 4386 | total time = 2542.93 ms / 306 tokens 9.42.316.938 I slot print_timing: id 0 | task 4386 | graphs reused = 4018 9.42.316.939 I slot print_timing: id 0 | task 4386 | draft acceptance = 0.95000 ( 38 accepted / 40 generated) 9.42.316.947 I statistics draft-mtp: #calls(b,g,a) = 72 4121 4121, #gen drafts = 4121, #acc drafts = 3894, #gen tokens = 4121, #acc tokens = 3894, dur(b,g,a) = 0.050, 10458.546, 2.911 ms 9.42.318.507 I slot release: id 0 | task 4386 | stop processing: n_tokens = 65583, truncated = 0 9.42.318.600 I srv update_slots: all slots are idle 9.42.746.008 I srv params_from_: Chat format: peg-native 9.42.747.644 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.990 (> 0.100 thold), f_keep = 1.000 9.42.748.075 I reasoning-budget: activated, budget=2147483647 tokens 9.42.748.134 I slot launch_slot_: id 0 | task 4429 | processing task, is_child = 0 9.42.748.177 W slot update_slots: id 0 | task 4429 | n_past = 65583, slot.prompt.tokens.size() = 65583, seq_id = 0, pos_min = 65582, n_swa = 0 9.42.748.178 I slot update_slots: id 0 | task 4429 | Checking checkpoint with [65500, 65500] 
against 65582... 9.42.799.129 W slot update_slots: id 0 | task 4429 | restored context checkpoint (pos_min = 65500, pos_max = 65500, n_tokens = 65501, n_past = 65501, size = 481.552 MiB) 9.43.788.477 I slot create_check: id 0 | task 4429 | created context checkpoint 73 of 256 (pos_min = 65733, pos_max = 65733, n_tokens = 65734, size = 482.466 MiB) 9.45.050.946 I slot create_check: id 0 | task 4429 | created context checkpoint 74 of 256 (pos_min = 66245, pos_max = 66245, n_tokens = 66246, size = 484.476 MiB) 9.45.389.510 I reasoning-budget: deactivated (natural end) 9.46.700.651 I slot print_timing: id 0 | task 4429 | prompt eval time = 2347.50 ms / 749 tokens ( 3.13 ms per token, 319.06 tokens per second) 9.46.700.669 I slot print_timing: id 0 | task 4429 | eval time = 1604.73 ms / 90 tokens ( 17.83 ms per token, 56.08 tokens per second) 9.46.700.670 I slot print_timing: id 0 | task 4429 | total time = 3952.23 ms / 839 tokens 9.46.700.671 I slot print_timing: id 0 | task 4429 | graphs reused = 4061 9.46.700.671 I slot print_timing: id 0 | task 4429 | draft acceptance = 0.97778 ( 44 accepted / 45 generated) 9.46.700.679 I statistics draft-mtp: #calls(b,g,a) = 73 4166 4166, #gen drafts = 4166, #acc drafts = 3938, #gen tokens = 4166, #acc tokens = 3938, dur(b,g,a) = 0.050, 10575.949, 2.943 ms 9.46.702.287 I slot release: id 0 | task 4429 | stop processing: n_tokens = 66339, truncated = 0 9.46.702.377 I srv update_slots: all slots are idle 9.47.150.190 I srv params_from_: Chat format: peg-native 9.47.151.910 I slot get_availabl: id 0 | task -1 | selected slot by LCP similarity, sim_best = 0.985 (> 0.100 thold), f_keep = 1.000 9.47.152.365 I reasoning-budget: activated, budget=2147483647 tokens 9.47.152.426 I slot launch_slot_: id 0 | task 4478 | processing task, is_child = 0