{
  "task": "Reasoning",
  "subtasks": {
    "Math": {
      "datasets": {
        "afrimgsm": {
          "languages": [ "amh", "ewe", "hau", "ibo", "kin", "lin", "lug", "orm", "sna", "sot", "swa", "twi", "vai", "wol", "xho", "yor", "zul" ],
          "scores": {
            "AfroLLaMa 8B": [ 0.0, 0.0, 1.2, 0.0, 0.4, 0.4, 0.0, 0.4, 0.4, 0.8, 0.4, 0.0, 0.0, 0.0, 0.0, 0.8, 0.0 ],
            "Aya-101 13B": [ 6.8, 3.2, 7.2, 3.2, 3.6, 4.4, 2.4, 4.8, 6.8, 10.8, 6.0, 1.6, 0.8, 1.2, 4.4, 3.2, 3.6 ],
            "Gemma1.1 7b": [ 4.0, 2.4, 5.6, 2.4, 4.8, 4.0, 3.6, 2.0, 4.4, 3.6, 20.8, 3.2, 3.6, 2.4, 2.0, 5.2, 4.6 ],
            "LLaMa2 7b": [ 1.6, 2.8, 0.4, 1.2, 2.8, 2.4, 1.6, 2.0, 1.2, 2.0, 4.4, 2.4, 1.6, 1.6, 2.4, 2.0, 0.8 ],
            "LLaMa3 8B": [ 2.0, 3.2, 8.8, 6.0, 2.8, 4.4, 4.8, 3.6, 2.8, 4.0, 26.4, 2.0, 2.0, 2.0, 3.6, 4.4, 3.6 ],
            "LLaMa3.1 8B": [ 4.4, 2.4, 10.0, 4.4, 7.2, 3.2, 7.6, 3.6, 4.8, 6.4, 41.2, 4.0, 1.2, 2.8, 2.4, 5.6, 5.2 ],
            "LLaMAX3 8B": [ 3.6, 1.6, 9.6, 5.2, 4.4, 2.8, 5.2, 1.6, 6.0, 3.2, 10.8, 1.6, 4.0, 3.2, 5.6, 4.8, 7.2 ],
            "Gemma2 9b": [ 26.4, 4.8, 35.2, 11.6, 22.4, 11.6, 17.6, 9.6, 26.0, 22.0, 61.6, 9.2, 0.4, 3.6, 20.8, 13.6, 21.2 ],
            "Gemma2 27b": [ 33.6, 7.6, 49.6, 24.0, 32.4, 18.0, 23.6, 12.8, 35.2, 38.4, 73.6, 12.4, 3.2, 5.6, 32.0, 22.4, 34.4 ],
            "LLaMa3.1 70B": [ 17.6, 8.0, 48.8, 37.2, 26.4, 11.2, 24.4, 10.8, 21.2, 32.8, 68.0, 14.4, 0.4, 3.2, 18.8, 23.6, 27.2 ],
            "GPT-4o (Aug)": [ 57.6, 8.8, 64.8, 57.6, 60.4, 51.2, 51.6, 61.2, 58.4, 60.8, 78.8, 31.2, 4.8, 28.0, 52.4, 62.0, 57.2 ],
            "Gemini 1.5 pro": [ 67.6, 40.8, 65.6, 63.2, 63.6, 51.2, 54.4, 57.2, 64.4, 61.2, 76.4, 38.0, 3.2, 12.8, 50.8, 61.6, 57.2 ]
          }
        }
      }
    }
  }
}