LLM Leaderboard

Although the leaderboard is geared toward 1B-parameter LLMs, it currently shows the performance of every model evaluated on the G1Bbon benchmark. As we collect more data, we will update the leaderboard so that it reports the latest scores for the 1B models only.

G1Bbon Benchmark (2 Quadrants)

Rank	Model	Aggregated Mean Score
1	gpt4o	0.70 - 0.72
2	Centaur_8B	0.63 - 0.65
3	gpt4o_mini	0.58 - 0.60
4	Qwen_7B_Instruct	0.50 - 0.52
5	Qwen_3B_Instruct	0.45 - 0.47
6	Qwen_3B	0.42 - 0.44
7	Qwen_7B	0.40 - 0.42
8	Deepseek_R1_7B_Qwen	0.38 - 0.40
9	Qwen_1B	0.35 - 0.37
10	Owen_1B	0.32 - 0.34
11	Qwen_1B_Instruct	0.30 - 0.32
12	Deepseek_R1_8B_Llama	0.16 - 0.18
13	Deepseek_R1_1B_Qwen	0.14 - 0.16

G1Bbon Benchmark (4 Quadrants)

Rank	Model	Aggregated Mean Score
1	Qwen_7B_Instruct	0.34 - 0.36
2	Centaur_8B	0.32 - 0.34
3	gpt4o_mini	0.29 - 0.31
4	gpt4o	0.25 - 0.27
5	Qwen_3B_Instruct	0.22 - 0.24
6	Qwen_3B	0.20 - 0.22
7	Qwen_1B	0.18 - 0.20
8	Qwen_7B	0.16 - 0.18
9	Deepseek_R1_7B_Qwen	0.14 - 0.16
10	Deepseek_R1_1B_Qwen	0.10 - 0.12
11	Qwen_1B_Instruct	0.08 - 0.10
12	Deepseek_R1_8B_Llama	0.04 - 0.06

a tiny benchmark for tiny minds
