The place where the industry is squeezing to get to get that progress is shifted from pre-training, which is, you know, lots of Internet data, maybe trying synthetic data on huge clusters of GPUs towards post-training and test and compute, which is more about, you know, smaller amounts of data, but it's very high quality, very specific.
目前,該行業正在努力取得進展,從前期訓練轉向後期訓練、測試和計算,前期訓練就是大量的互聯網數據,或許可以在巨大的 GPU 集群上嘗試合成數據。