Elon Musk’s Latest Grok-3 AI Model Met With Great Skepticism Regarding Performance

Elon Musk’s latest AI model, Grok-3, has been released. Its launch was widely anticipated, but its performance claims have been met with considerable skepticism.

In terms of pricing, it was said to be similar to OpenAI’s GPT-4 and DeepSeek. But according to Randall Hunt, CTO of Caylent, its abilities fall far short of the hype. Hunt highlighted one alarming flaw in particular: Grok-3’s vulnerability to manipulation through "jailbreaking" prompt engineering.

Overall, the model’s replies were described as sarcastic, slow, and mostly wrong. If that was not alarming enough, it also failed a common test for reasoning models that involves Tic Tac Toe boards.

Its susceptibility to jailbreaks is giving pause to enterprise leaders who are keen to adopt it for daily use. With so many loopholes, the question is where and how it can be used practically in the real world.

The model is also slow. Beyond that, a problem with many AI benchmarks is that they don’t capture how useful a model is or how it performs in a real-world setting. Benchmarks aren’t the only measure of a model’s performance. What matters is the business value a model provides, and that means testing real-world use cases rather than basing judgments on cherry-picked benchmarks.

That leaves a question on many people’s minds: what exactly is holding this model back?

xAI falls short on architectural innovation, which could contribute to Grok-3’s performance issues. So far, no tech giant has delivered a major architectural breakthrough; most are throwing more data at their models and seeing what works. This laid-back approach wins little praise and is not the right strategy for driving major change in AI. Any major step forward would need new frameworks rather than the usual small tweaks to transformer-based blueprints.

Finally, does Grok-3 have any competitive advantage? Experts noted that its access to the X platform is unique among AI models, as is its ability to search the app in real time. That could be a major advantage if the dataset is cleaned properly.

Image: DIW-Aigen
