It looks like DeepSeek is not the only company making breakthroughs in the world of AI. Researchers at Stanford University and the University of Washington have built a new model, called s1, that rivals offerings from some of the biggest names in tech, including OpenAI and DeepSeek.
What is even more astonishing is that it was all done for roughly $50 and with only 26 minutes of training. Yes, that is far below the figures DeepSeek recently earned fame and recognition for.
The model was trained on just 1,000 curated questions using 16 Nvidia H100 GPUs, and the cost is an estimate based on GPU runtime. This is certainly a milestone for the AI world, and it introduces techniques that others can take inspiration from.
It also raises questions about why Western labs are taking so long and spending billions to build AI models when it can apparently be done faster and cheaper. Much of the answer lies in the techniques used, including how the models are trained.
Developers building enhancements on top of existing AI models is a major avenue for innovation because it comes at little cost. It can be done through open-source weights, through APIs, or even by drawing on the outputs of closed-source models.
According to the team's research paper, released on Friday, s1 was trained on a dataset of 1,000 carefully selected questions, each paired with a reasoning trace and an answer from Google's Gemini Thinking Experimental model. That model is closed source, accessible with daily limits through Google's AI Studio, but that did not stop the researchers from using its replies. The team then took an off-the-shelf pre-trained model from Alibaba's Qwen family and fine-tuned it on the new dataset with supervised learning.
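As a rough illustration of that distill-then-fine-tune recipe, the snippet below sketches supervised fine-tuning with Hugging Face's trl library. It is a minimal sketch, not the team's actual training code: the s1k.jsonl file, its question/reasoning/answer fields, the <think> delimiters, and the hyperparameters are assumptions made here for illustration; only the Qwen/Qwen2.5-32B-Instruct checkpoint comes from the paper.

```python
# Minimal sketch of the distill-then-fine-tune step described above.
# Hypothetical inputs: "s1k.jsonl" with "question", "reasoning", "answer"
# fields distilled from a stronger model; delimiters are illustrative.
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from trl import SFTConfig, SFTTrainer

BASE = "Qwen/Qwen2.5-32B-Instruct"  # off-the-shelf pre-trained model

def to_text(example):
    # Fold the question, the distilled reasoning trace, and the final
    # answer into one training string the model learns to reproduce.
    return {"text": (f"Question: {example['question']}\n"
                     f"<think>{example['reasoning']}</think>\n"
                     f"Answer: {example['answer']}")}

dataset = load_dataset("json", data_files="s1k.jsonl", split="train")
dataset = dataset.map(to_text)

trainer = SFTTrainer(
    model=AutoModelForCausalLM.from_pretrained(BASE),
    train_dataset=dataset,  # SFTTrainer reads the "text" column by default
    args=SFTConfig(output_dir="s1-sft",
                   num_train_epochs=5,
                   per_device_train_batch_size=1,
                   gradient_accumulation_steps=16),
)
trainer.train()
```

With only 1,000 examples, a run like this finishes quickly, which is consistent with the 26 minutes on 16 H100s reported above.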
The team also set token budgets to control how much computing time the model spends at test time. If s1's reasoning ran over budget, its thinking was cut off and it was forced to produce an answer. If the researchers wanted the model to spend more effort on a problem, they told it to wait, appending that word to its reasoning to extend its thinking time, and they could show that the longer thinking led to better performance.
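The decoding loop below sketches how that kind of budget forcing might look in practice. It is an assumption-laden illustration, not the paper's code: the <think>/</think> delimiters, the default budget, and the helper's name are all hypothetical, and a Hugging Face-style model/tokenizer pair is assumed.

```python
# Illustrative sketch of test-time budget forcing: cap the thinking tokens,
# or extend thinking by appending "Wait" before letting the model answer.
def generate_with_budget(model, tokenizer, prompt,
                         think_budget=512, extend_rounds=0):
    end_think = tokenizer.convert_tokens_to_ids("</think>")
    ids = tokenizer(prompt + "<think>", return_tensors="pt").input_ids

    # Think, but never past the token budget.
    out = model.generate(ids, max_new_tokens=think_budget,
                         eos_token_id=end_think)

    # To force *more* thinking, strip any early end-of-thinking marker
    # and append "Wait" so the model re-examines its own reasoning.
    for _ in range(extend_rounds):
        text = tokenizer.decode(out[0]).replace("</think>", "")
        ids = tokenizer(text + "\nWait,", return_tensors="pt").input_ids
        out = model.generate(ids, max_new_tokens=think_budget,
                             eos_token_id=end_think)

    # Close the thinking block (cutting it short if over budget) and
    # force the model to produce its final answer.
    text = tokenizer.decode(out[0])
    if "</think>" not in text:
        text += "</think>\nAnswer:"
    ids = tokenizer(text, return_tensors="pt").input_ids
    final = model.generate(ids, max_new_tokens=256)
    return tokenizer.decode(final[0])
```

The design point is that test-time compute becomes a dial: a larger think_budget or extra extend_rounds buys more reasoning, which, per the paper, improves performance.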
Models like s1 are clear examples of open-source reasoning models being produced faster and more cheaply than what Google and OpenAI have to offer. Remember, researchers at UC Berkeley released an open-source reasoning model, Sky-T1, that cost them only about $450, showing how reasoning offerings can be replicated cost-effectively.
Other models that copy the approach of DeepSeek's R1 are also emerging. As top-notch models become cheaper and more accessible, power keeps shifting from the established AI heavyweights to the industry's newcomers.
That developers can now build AI reasoning models on the same level as OpenAI's is a major breakthrough for the industry.
Image: DIW-Aigen
Read next: UC Berkeley Researcher Criticizes Billion-Dollar AGI Race, Says AI Can’t Know Everything