The stock market was sent reeling today after the Chinese company DeepSeek released an open-source AI model that comes close to or matches the performance of American models but was created for a fraction of the cost. While traditional models have cost in the range of $100 million to $1 billion to produce, the latest model from DeepSeek was reportedly created for under $6 million.
Wanting to know more, I invited Brian Chau on for a livestream to discuss. Some of the questions we cover:
What does it mean for a model to be open source?
Why would a business release an open source model?
Should you sell all your Nvidia stock?
How do we know that DeepSeek really cost under $6 million to build?
Can its costs be verified?
What might the intentions of the Chinese Communist Party be in letting this happen?
Will AI take all the jobs?
Has Brian’s p(doom) changed at all?
When will we writers be replaceable?
Has Brian’s vision of a hands-off approach to AI regulation won?
Did Big Yud go down with the Kamala ship?
As a non-expert, I found it very useful to have an hour in which to pick Brian’s brain. I can’t recommend this conversation enough for those who want to make sense of what has happened in AI over the last few days.
So refreshing to know I can come on Substack and hear reasonable and insightful commentary from Hanania and friends. Thanks for all you do, Richard.
Open-source AI is similar to Linux software. All this means is that anyone can go into the source code and alter it to fit their specific needs. Since many people are interacting with open source, it becomes nearly impossible to hijack it. There are few secrets in open-source systems.
At the other end, we have closed-source AI. Software from Microsoft and Google is primarily this way: no outsiders can alter the code, and thus you do not know what that code is really set up to do.
China could have used thousands of unpaid coders to add to their DeepSeek AI and that is why it is relatively inexpensive. And without all the DEI and wokeism filters, that means they need less computing power... a lot less gobbledygook and more real answers.
The analogy to OSS is misleading, for two main reasons:
- LLMs are mostly black-box models (ML interpretability lags significantly behind ML capabilities) so open-sourcing their weights doesn’t mean you can “know what the [model] is set up to do” or how it was trained
- even closed-source software can be spread to other computers, but if a model has closed weights that by definition means only its owners have access to the weights. If the model turns out to be harmful or dangerous, it is not possible to un-release it if it’s open-source
> China could have used thousands of unpaid coders to add to their DeepSeek AI and that is why it is relatively inexpensive.
The reported cost is based on the cost of compute; it doesn’t include the cost of acquiring data or paying the engineers and researchers.
> And without all the DEI and wokeism filters, that means they need less computing power
This is just silly. If you’re talking about training-time compute, then alignment is a small fraction of fine-tuning, which in turn is a tiny fraction of pre-training. If you’re talking about inference-time compute, Llama is already open-source: you can host it locally and it won’t have extra DEI filters. Also, DeepSeek has its own version of wokeness. Try asking it about Tiananmen Square.
I am subscribed (I'm pretty sure) but Substack will not let me in. What should I do? Would appreciate it if you (or whomever) could reply to hapgood@pobox.com
Not let you in? This one isn’t gated, you should be able to watch or listen no matter what.
How does one train a net on the outputs of another? By running lots of prompts? Doesn't this invite gradual collapse by increasingly deprioritizing the "deep" knowledge that isn't often accessed by prompts, leaving a Potemkin village behind? Or do I have the wrong model of LLMs?
It's funny to watch the beginning. I know it's not common knowledge but much of the modern Internet is based on open source. Android is open source. The source code is an important part, but not the only part of software.