Crossing Linguistic Horizons

Finetuning and Comprehensive Evaluation of Vietnamese Large Language Models

Chain-of-Thought Reasoning Leaderboard

Models           Metrics
                 EM             F1             Equ.
URA-LLaMa 70B    0.00 ± 0.00    0.12 ± 0.01    0.18 ± 0.02
URA-LLaMa 13B    0.00 ± 0.00    0.23 ± 0.01    0.17 ± 0.01
URA-LLaMa 7B     0.00 ± 0.00    0.23 ± 0.01    0.09 ± 0.01
LLaMa-2 13B      0.00 ± 0.00    0.12 ± 0.01    0.18 ± 0.02
LLaMa-2 7B       0.00 ± 0.00    0.10 ± 0.00    0.12 ± 0.02
Vietcuna 7B      0.00 ± 0.00    0.13 ± 0.01    0.10 ± 0.01
MixSUra 8x7B     0.00 ± 0.00    0.17 ± 0.01    0.33 ± 0.00
GPT-3.5          0.00 ± 0.00    0.32 ± 0.01    0.78 ± 0.02
GPT-4            0.00 ± 0.00    0.32 ± 0.01    0.79 ± 0.02
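
For readers unfamiliar with the EM and F1 columns, the sketch below shows one common way such scores are computed for a single prediction: exact match as strict string equality, and F1 as token-overlap F1. This is an assumption for illustration only. The leaderboard does not state its normalization or tokenization rules, and the function names `exact_match` and `token_f1` are hypothetical, not taken from the benchmark's code.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the strings match after trimming and lowercasing, else 0.0.

    Assumption: the leaderboard's real normalization is not specified.
    """
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 over whitespace-separated, lowercased tokens.

    Assumption: whitespace tokenization stands in for whatever the
    benchmark actually uses.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # A prediction that wraps the answer in extra words fails exact match
    # but still earns partial token-overlap credit.
    print(exact_match("The answer is 42", "42"))
    print(token_f1("The answer is 42", "42"))
```

Under these assumed definitions, any extra text around the final answer drives exact match to zero while token overlap still yields a nonzero F1.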