Now, there's also some controversy around the benchmarks, specifically Massive Multitask Language Understanding (MMLU), which is a multiple-choice test, like the SAT, that covers 57 different subjects.