Top Researchers Conclude OpenAI’s ChatGPT Has Gotten Worse In Performance And Behavior

If you happen to be less than impressed with OpenAI’s famous ChatGPT tool, you’re clearly not the only one.

A recently published study by top researchers at Stanford University concludes that the AI tool has gotten worse this year in both performance and behavior.

The study looked at how OpenAI's language models functioned between March and June of this year. The researchers added that on some tasks the models failed to produce the replies users requested.

And that’s not good news for the makers of the hugely popular tool. There are many alternatives in the industry, and competition keeps increasing as new entrants launch their own products in the hope of making it big.

Studies like this one are motivated by the long list of user complaints that the tool’s performance and behavior fluctuate over time.

While the findings showed that performance on some tasks improved over time while others got worse, one thing is for sure: the researchers evaluated the models in a systematic manner to gauge how their behavior differs at various points in time.

The research is still fairly new and has yet to be peer-reviewed, but it suggests that people’s concerns about the tool’s operation are not unfounded.

Other findings that the authors discussed in detail were that the language model got worse over time at identifying prime numbers and at showing its step-by-step reasoning, and that it made more formatting errors when generating code.

Meanwhile, accuracy on some tasks was seen to drop by a whopping 95 percentage points or so over just three months of evaluation. That drop was seen in the newer GPT-4 model, while GPT-3.5 did see a rise in accuracy. Interestingly, GPT-4 showed greater resistance to things like jailbreaking.
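
To make the idea of an accuracy change between two model snapshots concrete, here is a minimal Python sketch of how such a comparison could be scored for a prime-number task. It is not the researchers' code. The names `is_prime`, `accuracy`, `snapshot_a` and `snapshot_b` are hypothetical, and the yes/no replies are invented for illustration only; they are not data from the study.

```python
def is_prime(n: int) -> bool:
    """Ground truth, computed by simple trial division."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def accuracy(answers: dict[int, str]) -> float:
    """Fraction of yes/no replies that match the true primality."""
    correct = sum(
        (reply.strip().lower() == "yes") == is_prime(n)
        for n, reply in answers.items()
    )
    return correct / len(answers)


# Hypothetical replies, invented for illustration (NOT data from the study).
snapshot_a = {7: "Yes", 9: "No", 11: "Yes", 15: "No"}
snapshot_b = {7: "No", 9: "No", 11: "No", 15: "Yes"}

acc_a = accuracy(snapshot_a)
acc_b = accuracy(snapshot_b)
drop_points = (acc_a - acc_b) * 100  # change measured in percentage points
print(f"A: {acc_a:.0%}  B: {acc_b:.0%}  drop: {drop_points:.0f} points")
```

Measuring the change in percentage points (one accuracy minus the other) keeps it distinct from a relative drop, which would divide by the starting accuracy and give a different number.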

That said, some have questioned whether the study uses the right metrics to capture meaningful changes and to justify declaring that the service has gotten worse over time.

One expert added that while the paper describes GPT-4 getting worse, the change was in its behavior and not its capabilities.

But whatever the case may be, the study highlights that businesses should be more careful when relying on GPT technology, or any AI product for that matter, because the tool’s behavior can change over time.

Another conclusion was that closer oversight in the form of regulation would help address these issues, and that more transparency is needed to avoid the pitfalls outlined so far.

