Transcript
0:01
Richard
Okay, so Brian, welcome. People are trickling in. Let me ask you a question. So DeepSeek, they say it's open source, right? So how do they have a company? If it's open source, does that mean I can take DeepSeek and be DeepSeek myself?
0:28
Brian
Yeah, so DeepSeek is open source under the MIT license, which is actually a lot more permissive than the custom license that Meta uses. It's very widely accessible. You can go on Hugging Face and literally download a copy of it. Well, it would take a while.
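As an aside, "literally download a copy" really is one library call. A hedged sketch using the `huggingface_hub` Python package (the `deepseek-ai/DeepSeek-R1` repo is public; this fetches only the small `config.json`, since the full open weights run to hundreds of gigabytes, which is the "it would take a while" part):

```python
# pip install huggingface_hub
from huggingface_hub import snapshot_download

# Fetch only the tiny config file from DeepSeek's public repo on Hugging Face.
# Dropping allow_patterns would pull the full weights (hundreds of GB).
path = snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1",
    allow_patterns=["config.json"],
)
print(path)  # local directory now containing config.json
```

The same thing works from the command line via `huggingface-cli download`.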
0:50
Richard
So if I'm a computer guy, I can figure out how to take DeepSeek and, you know, the example people use is that you can take out all the censorship, the Tiananmen Square stuff. So if I know about computers, I can just take DeepSeek...

DeepSeek: Hype and Reality

Understanding the recent earthquake in AI development

The stock market was sent reeling today by the Chinese company DeepSeek's release of an open source AI model that comes close to or matches the performance of American models, but was created for a fraction of the cost. While traditional models have cost in the range of $100 million to $1 billion to produce, DeepSeek's latest model was reportedly created for under $6 million.

Wanting to know more, I invited Brian Chau on for a livestream to discuss. Some of the questions we cover:

  • What does it mean for a model to be open source?

  • Why would a business release an open source model?

  • Should you sell all your Nvidia stock?

  • How do we know that DeepSeek really cost under $6 million to build?

  • Can its costs be verified?

  • What might the intentions of the Chinese Communist Party be in letting this happen?

  • Will AI take all the jobs?

  • Has Brian’s p(doom) changed at all?

  • When will we writers be replaceable?

  • Has Brian’s vision of a hands-off approach to AI regulation won?

  • Did Big Yud go down with the Kamala ship?

As a non-expert, I found it very useful to have an hour in which to pick Brian’s brain. I can’t recommend this conversation enough for those who want to make sense of what has happened in AI over the last few days.

Discussion about this video

So refreshing to know I can come on Substack and hear reasonable and insightful commentary from Hanania and friends. Thanks for all you do, Richard.


Open source AI is similar to Linux software. All it means is that anyone can go into the source code and alter it to fit their specific needs. Since many people are interacting with open source code, it becomes near impossible to hijack it. There are few secrets in open source systems.

On the other end, we have closed source AI: software from Microsoft and Google is primarily this way, as no outsiders can alter the code, and thus you do not know what that code is really set up to do.

China could have used thousands of unpaid coders to add to their DeepSeek AI, and that is why it is relatively inexpensive. And without all the DEI and wokeism filters, they need less computing power... a lot less gobbledygook and more real answers.


The analogy to OSS is misleading, for two main reasons:

- LLMs are mostly black-box models (ML interpretability lags significantly behind ML capabilities) so open-sourcing their weights doesn’t mean you can “know what the [model] is set up to do” or how it was trained

- even closed-source software can be spread to other computers, but if a model has closed weights, by definition only its owners have access to them. And if the model turns out to be harmful or dangerous, it is not possible to un-release it once it’s open source

> China could have used thousands of unpaid coders to add to their DeepSeek A/i and that is why it is relatively inexpensive.

The reported cost is based on the cost of compute; it doesn’t include the cost of acquiring data or paying the engineers/researchers.

> And without all the DEI and wokeism filters, that means they need less computing power

This is just silly. If you’re talking about training-time compute, then alignment is a small fraction of fine-tuning, which in turn is a tiny fraction of pre-training. If you’re talking about inference-time compute, Llama is already open source: you can host it locally and it won’t have extra DEI filters. Also, DeepSeek has its own version of wokeness; try asking it about Tiananmen Square.


I am subscribed (I'm pretty sure) but Substack will not let me in. What should I do? Would appreciate it if you (or whomever) could reply to hapgood@pobox.com


Not let you in? This one isn’t gated, you should be able to watch or listen no matter what.


How does one train a net on the outputs of another? By running lots of prompts? Doesn't this invite gradual collapse by increasingly deprioritizing the "deep" knowledge that isn't often accessed by prompts, leaving a Potemkin village behind? Or do I have the wrong model of LLMs?
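For what it's worth, "training on the outputs of another" is distillation: run the teacher on many prompts and fit the student to the teacher's output distributions. A toy numpy sketch of the mechanic (the sizes, the direct-logit "student", and the single-step setting are all illustrative, not anything DeepSeek has disclosed); it also illustrates the worry above, since the student only learns what the prompts elicit:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerically stable
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

n_prompts, vocab = 50, 10
# "Teacher": a fixed distribution over next tokens for each prompt.
teacher_probs = softmax(rng.normal(size=(n_prompts, vocab)) * 3.0)

# "Student": starts uniform; trained only on the teacher's soft outputs,
# never on the teacher's weights or original training data.
student_logits = np.zeros((n_prompts, vocab))
lr = 1.0
for _ in range(200):
    student_probs = softmax(student_logits)
    # Gradient of cross-entropy H(teacher, student) w.r.t. the logits.
    student_logits -= lr * (student_probs - teacher_probs)

# Mean KL(teacher || student) shrinks toward zero on the sampled prompts --
# but the student knows nothing about inputs the prompts never covered.
kl = (teacher_probs * np.log(teacher_probs / softmax(student_logits))).sum(axis=1).mean()
print(round(kl, 6))
```

Whether repeated rounds of this cause the "gradual collapse" the comment worries about is an active research question, not something this sketch settles.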


It's funny to watch the beginning. I know it's not common knowledge, but much of the modern Internet is based on open source. Android is open source. The source code is an important part of software, but not the only part.
