Scale AI publishes its first LLM Leaderboards, ranking AI model performance in specific domains

Artificial intelligence training data provider Scale AI Inc., which serves the likes of OpenAI and Nvidia Corp., today published the results of its first-ever SEAL Leaderboards.

The leaderboards are a new ranking system for frontier large language models. Built on private, curated and unexploitable datasets, they attempt to rate model capabilities in common use cases such as generative AI coding, instruction following, math and multilinguality.

The SEAL Leaderboards show that OpenAI’s GPT family of LLMs ranks first in three of the four initial domains, with Anthropic PBC’s popular Claude 3 Opus taking first place in the fourth. Google LLC’s Gemini models also did well, ranking joint-first with the GPT models in a couple of the domains.
