DeepSeek: Hype and Reality

Jan 27, 2025

Understanding the recent earthquake in AI development

7 Comments

Jan 28, 2025

So refreshing to know I can come on Substack and hear reasonable and insightful commentary from Hanania and friends. Thanks for all you do, Richard.

Crixcyon

Jan 28, 2025

Open source A/i is similar to Linux software. All this means is that anyone can go into the source code and alter it to fit their specific needs. Since many people are interacting with open source, it becomes near impossible to hijack it. There are few secrets in open source systems.

On the other end, we have closed-source A/i. Software from Microsoft and Google is primarily this way: no outsiders can alter the code, and thus you do not know what that code is really set up to do.

China could have used thousands of unpaid coders to add to their DeepSeek A/i and that is why it is relatively inexpensive. And without all the DEI and wokeism filters, that means they need less computing power....a lot less gobbledygook and more real answers.


Hellbender

Jan 28, 2025

The analogy to OSS is misleading, for two main reasons:

- LLMs are mostly black-box models (ML interpretability lags significantly behind ML capabilities), so open-sourcing their weights doesn’t mean you can “know what the [model] is set up to do” or how it was trained.

- Even closed-source software can be copied to other computers, but closed weights by definition mean that only the model’s owners have access to them. Once the weights are released openly, the model cannot be un-released if it turns out to be harmful or dangerous.

> China could have used thousands of unpaid coders to add to their DeepSeek A/i and that is why it is relatively inexpensive.

The reported cost is based on the cost of compute; it doesn’t include the cost of acquiring data or paying the engineers/researchers.

> And without all the DEI and wokeism filters, that means they need less computing power

This is just silly. If you’re talking about training-time compute, then alignment is a small fraction of fine-tuning, which in turn is a tiny fraction of pre-training. If you’re talking about inference-time compute, Llama is already open-source - you can host it locally and it won’t have extra DEI filters. Also, DeepSeek has its own version of wokeness - try asking it about Tiananmen Square.

Fred Hapgood

Jan 27, 2025

I am subscribed (I'm pretty sure) but Substack will not let me in. What should I do? Would appreciate it if you (or whoever) could reply to hapgood@pobox.com


Richard Hanania

Jan 28, 2025

Not let you in? This one isn’t gated; you should be able to watch or listen no matter what.

Darij Grinberg

Feb 1, 2025

How does one train a net on the outputs of another? By running lots of prompts? Doesn't this invite gradual collapse, by increasingly deprioritizing the "deep" knowledge that prompts rarely access and leaving a Potemkin village behind? Or do I have the wrong model of LLMs?

Ethan

Jan 28, 2025

It's funny to watch the beginning. I know it's not common knowledge but much of the modern Internet is based on open source. Android is open source. The source code is an important part, but not the only part of software.

Richard Hanania's Newsletter
