DeepSeek’s AI breakthrough rivals top models at a fraction of the cost, proving open source innovation is reshaping AI’s future. Is this an AI race or an open vs. closed battle?
So are these techiques so novel and breaktrough? Will we now have a burst of deepseek like models everywhere? Cause that’s what absolutely should happen if the whole storey is true. I would assume there are dozens or even hundreds of companies in USA that are in a posession of similar number but surely more chips that Chinese folks claimed to trained their model on, especially in finance sector and just AI reserach focused.
Note that s1 is transparently a distilled model instead of a model trained from scratch, meaning it inherits knowledge from an existing model (Gemini 2.0 in this case) and doesn’t need to retrain its knowledge nearly as much as training a model from scratch. It’s still important, but the training resources aren’t really directly comparable.
The open paper they published details the algorithms and techniques used to train it, and it’s been replicated by researchers already.
So are these techiques so novel and breaktrough? Will we now have a burst of deepseek like models everywhere? Cause that’s what absolutely should happen if the whole storey is true. I would assume there are dozens or even hundreds of companies in USA that are in a posession of similar number but surely more chips that Chinese folks claimed to trained their model on, especially in finance sector and just AI reserach focused.
deleted by creator
Note that s1 is transparently a distilled model instead of a model trained from scratch, meaning it inherits knowledge from an existing model (Gemini 2.0 in this case) and doesn’t need to retrain its knowledge nearly as much as training a model from scratch. It’s still important, but the training resources aren’t really directly comparable.
deleted by creator