Measuring Massive Multitask Language Understanding (MMLU)

MMLU is a multiple-choice benchmark.