Crossing Linguistic Horizons

Finetuning and Comprehensive Evaluation of Vietnamese Large Language Models

Few-Shot Reasoning Leaderboard

| Models | SR - Natural (EM) | SR - Natural (F1) | SR - Natural (Equ.) | SR - Abstract symbol (EM) | SR - Abstract symbol (F1) | SR - Abstract symbol (Equ.) | MATH (EM) | MATH (F1) | MATH (Equ.) |
|---|---|---|---|---|---|---|---|---|---|
| URA-LLaMa 70B | 0.14 ± 0.00 | 0.48 ± 0.00 | 0.15 ± 0.00 | 0.27 ± 0.00 | 0.85 ± 0.00 | 0.30 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.12 ± 0.02 |
| URA-LLaMa 13B | 0.08 ± 0.00 | 0.42 ± 0.00 | 0.08 ± 0.00 | 0.20 ± 0.00 | 0.70 ± 0.00 | 0.17 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.01 |
| URA-LLaMa 7B | 0.04 ± 0.00 | 0.38 ± 0.00 | 0.04 ± 0.00 | 0.11 ± 0.00 | 0.61 ± 0.00 | 0.10 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.07 ± 0.01 |
| LLaMa-2 13B | 0.03 ± 0.00 | 0.24 ± 0.00 | 0.04 ± 0.00 | 0.19 ± 0.00 | 0.69 ± 0.00 | 0.18 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.16 ± 0.02 |
| LLaMa-2 7B | 0.00 ± 0.00 | 0.01 ± 0.00 | 0.00 ± 0.00 | 0.06 ± 0.00 | 0.44 ± 0.00 | 0.06 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.11 ± 0.01 |
| Vietcuna 7B | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.14 ± 0.00 | 0.71 ± 0.00 | 0.10 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.01 ± 0.00 |
| MixSUra 8x7B | 0.07 ± 0.00 | 0.41 ± 0.00 | 0.07 ± 0.00 | 0.22 ± 0.00 | 0.78 ± 0.00 | 0.23 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 |
| GPT-3.5 | 0.15 ± 0.00 | 0.50 ± 0.00 | 0.16 ± 0.00 | 0.26 ± 0.00 | 0.83 ± 0.00 | 0.29 ± 0.00 | 0.00 ± 0.00 | 0.00 ± 0.00 | 0.62 ± 0.02 |
| GPT-4 | 0.37 ± 0.00 | 0.74 ± 0.00 | 0.42 ± 0.00 | 0.37 ± 0.00 | 0.87 ± 0.00 | 0.44 ± 0.00 | 0.00 ± 0.00 | 0.01 ± 0.00 | 0.65 ± 0.02 |