
Elon Musk Unveils xAI’s Grok 4, Claimed as World’s Most Intelligent AI

Elon Musk unveiled Grok 4, the latest artificial intelligence model from xAI, during an hour-long public event on Wednesday evening. According to Musk, it is ‘the most intelligent AI globally,’ capable of achieving perfect SAT scores and near-perfect GRE scores across a broad range of subjects in both the humanities and the sciences.

During the online launch, Musk and his team discussed Grok 4’s performance on a benchmark known as Humanity’s Last Exam (HLE). This 2,500-question test draws on human expertise from more than 100 fields and is intended to measure an AI’s academic knowledge and reasoning abilities. The HLE, which covers subjects ranging from classics to advanced quantum chemistry, was first released in January 2025 by a group of nearly one thousand human experts.

The evaluation combines text and visual content. Working on its own, without tools, Grok 4 scored 25.4 percent. With access to tools, its score rose to 38.6 percent. A version called Grok 4 Heavy, which uses multiple AI agents to tackle problems, raised the score further to 44.4 percent.

Other top-performing AI models did not match Grok 4’s results. Google’s Gemini-Pro scored 26.9 percent and OpenAI’s o3 model scored 24.9 percent, both with tools. However, xAI’s internally reported results do not appear on the HLE leaderboard. It is not yet clear whether xAI has still to submit these results or whether they are awaiting review.

The launch was not limited to benchmark results; viewers also saw Grok 4 in action. A live demonstration showed the AI making baseball predictions, identifying the “most unconventional” profile picture of an xAI employee on their internal platform, and producing a simulated visualization of a black hole.


Musk suggested that Grok 4 could uncover entirely new technologies later this year and even discover ‘new physics’ by the end of next year. He also predicted AI-generated games and films, suggesting that Grok 4 would be able to build playable games and create enjoyable movies by 2026.

In addition, Grok 4 has new voice capabilities, which were shown off during the launch event when the AI sang. Grok 4 also took the top spot in the Artificial Analysis Intelligence Index, edging out rivals such as Gemini 2.5 Pro and OpenAI’s o4-mini-high.

Grok 4 also ranks highest among publicly available models on the leaderboards for the Abstraction and Reasoning Corpus (ARC-AGI-1) and its successor, ARC-AGI-2. These benchmarks gauge an AI model’s progress toward a ‘humanlike’ form of general intelligence. Grok 4 has also outperformed other AI systems on numerous additional benchmarks, indicating its proficiency in STEM subjects.

Alex Olteanu, a data science editor at the AI education platform DataCamp, put Grok 4 to the test. Olteanu was pleasantly surprised by its performance on math and programming challenges. He praised the AI’s chain-of-thought reasoning, which showed an original and logically sound approach to problem-solving.

However, he also pointed out limitations. Olteanu noted that Grok 4’s ‘context window’ is more limited than some might expect, so it could struggle with large codebases or those found in real production settings. It also underperformed when analyzing a 170-page PDF, likely owing to its limited context window and below-par multimodal skills.

Despite the impressive demonstration, problems with Grok 4 have surfaced since its release. Several posts on the internal xAI platform and reports from various tech news outlets indicate that when asked about sensitive topics such as the Israeli-Palestinian conflict, abortion, or U.S. immigration law, Grok 4 often referenced Musk’s previous comments on these topics, suggesting that the AI leans toward Musk’s views.

The introduction of Grok 4 follows a wave of controversy stirred up by the previous model, Grok 3. Earlier contentious outputs, which the company acknowledged publicly, included antisemitic remarks, praise for Hitler, and assertions of ‘white genocide’. These incidents were attributed to unauthorized tampering, and xAI said it would take corrective measures to prevent a recurrence.

Amid these developments, Musk acknowledged that the prospect of creating an AI more intelligent than humans could be daunting. However, he expressed cautious optimism, saying that even if the outcome were not entirely positive, he would still want to witness how it unfolds.

All told, the unveiling of Grok 4 offers an exciting glimpse of the future of AI. Its performance on various benchmarks has shed new light on the immense potential of AI. This, however, does not overshadow the numerous concerns and ethical dilemmas that accompany its development.

These include, among others, the threat of AI bias, the possibility of misuse, and broader societal implications. As AI development advances, these questions demand prompt attention if we are to ensure a responsible, useful, and inclusive AI future.