Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin. – John von Neumann
A clever thing about generative models like ChatGPT is that they give you different results for the same prompt (or input). This is done, presumably, by seeding the random number generator with the computer’s current system clock time just before computing an extraordinary number of matrix multiplications (or some distributed equivalent when all of those parameters don’t fit in the RAM of a single machine).
This trivial detail gives the illusion that ChatGPT is non-deterministic, but it’s not.
You can see this for yourself in HuggingFace’s open source ChatGPT alternative: HuggingChat.
Large Language Models (LLMs)—just like all machine learning models—are estimated, static functions, which means that for a fixed input (and a fixed seed) you will receive a fixed output. Generative models do some probability-weighted random sampling to provide a little flair and the mirage of sentience.
But LLMs are nothing more than a bunch of numbers, multiplications, sums, and a splash of pseudo-random sampling.
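To make that concrete, here’s a minimal sketch (plain NumPy, made-up logits, no actual language model) of the kind of probability-weighted sampling a generative model performs at each decoding step. With a fixed seed the “randomness” is fully reproducible; the variety you see in ChatGPT comes from re-seeding, not from anything non-deterministic.

```python
import numpy as np

def sample_next_token(logits, temperature=0.8, rng=None):
    """Probability-weighted sampling over a toy vocabulary, a stand-in for
    what an LLM does at each decoding step."""
    rng = rng if rng is not None else np.random.default_rng()
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()                         # softmax over the vocabulary
    return int(rng.choice(len(probs), p=probs))  # weighted random draw

logits = [2.0, 1.0, 0.5, 0.1, -1.0]  # made-up scores for a 5-token vocabulary

# Same seed -> same "random" continuation, every single time.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)
run_a = [sample_next_token(logits, rng=rng_a) for _ in range(5)]
run_b = [sample_next_token(logits, rng=rng_b) for _ in range(5)]
assert run_a == run_b  # deterministic given the seed

# A fresh seed (e.g., taken from the system clock) gives a different run.
run_c = [sample_next_token(logits, rng=np.random.default_rng()) for _ in range(5)]
```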
Centralized Artificial Intelligence
OpenAI encountered some troubles recently, and they’ve done some incredible work to overcome them.
But OpenAI, Meta AI, Google Research, DeepMind, and everyone else can’t solve the core problem: true Artificial General Intelligence (AGI) needs truly open AI. That is to say, no single entity or research lab should be trusted with the power of AGI.
After several months of reflection, I’ve come to only one conclusion: a cryptographically secure, decentralized ledger is the only solution to making AI safer.
I’ve thought for quite some time that blockchain and crypto—the technologies, not necessarily the digital currencies—had incredible implications, but I didn’t know what for… and it turns out the answer was hiding in the next hype cycle.
I am neither a crypto maximalist, nor even necessarily a crypto advocate. I am, however, a technologist who sees the value of the technology used by most cryptocurrencies.
As a brief aside, my biggest skepticism with cryptocurrencies is that a non-trivial share of their advocates seem to treat them as a long-term store of value, which creates an economic disincentive to transact and in turn renders them a poor medium of exchange. Regardless, that’s how lots of people have treated Bitcoin, Ethereum, and other tokens.
So why do I believe “a cryptographically secure, decentralized ledger is the only solution” to truly Open AI?
Because it solves some core problems.
Some of AI’s Problems
As I said, no single entity should be the sole owner of any true AGI. That concentrates far too much power in the hands of only well-capitalized institutions (i.e., those that can afford the compute necessary to train a giganto model).
There are other challenges outside of this, too.
Reproducibility
The academic literature is riddled with examples of state-of-the-art (SOTA) models that weren’t reproducible, and while there’s an ongoing effort to improve this, suffice it to say that a lot of models aren’t reproducible (maybe even most).
That’s bad science, but the incentives in academia are what they are.
ML industry practitioners have come quite a long way in model reproducibility (i.e., model version control), but in the early days many forgot about data version control.
Quite obviously, a model cannot be version controlled unless the data and the code that constructed both the model and the data are version controlled, too.
Thinking otherwise is dumb (and, remember, that’s not you!).
So, in order to have reproducibility in general, we need model and data reproducibility, and it turns out that a decentralized database that records every version posted to some chain is a very good candidate for that.
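As a rough sketch of what I mean (the structure and field names here are purely illustrative), each published version of a dataset or set of model weights could be content-addressed and linked to its predecessor, so anyone can verify the full lineage:

```python
import hashlib
import json
import time

def content_hash(payload: bytes) -> str:
    """Content-address an artifact (data snapshot or model weights)."""
    return hashlib.sha256(payload).hexdigest()

def make_block(prev_block_hash: str, artifact: bytes, kind: str) -> dict:
    """Append-only record linking an artifact to the previous version."""
    block = {
        "kind": kind,                      # "data" or "weights"
        "artifact_hash": content_hash(artifact),
        "prev_block_hash": prev_block_hash,
        "timestamp": time.time(),
    }
    block["block_hash"] = content_hash(json.dumps(block, sort_keys=True).encode())
    return block

# Illustrative usage: version a dataset, then the weights trained on it.
genesis = "0" * 64
data_v1 = make_block(genesis, b"...raw training data snapshot...", kind="data")
weights_v1 = make_block(data_v1["block_hash"], b"...serialized weights...", kind="weights")

# Anyone can re-hash the artifacts and blocks to verify the lineage.
assert weights_v1["prev_block_hash"] == data_v1["block_hash"]
```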
Data Privacy
Most people don’t really care that you use their data for things so long as it (1) serves the right product experience and (2) isn’t malicious.
But some people care a lot! And some countries (e.g., Italy, and the EU more broadly) care 100x more than that.
A benefit of cryptography and decentralization is that you can estimate “local” models without ever sending your raw data anywhere and still contribute the estimated gradient back to the network. Additionally, you could encrypt the data as well to secure it. This is known as Federated Learning and is an active area of research.
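For intuition, here’s a toy sketch of the federated idea (plain NumPy, made-up client data, and a simple linear model standing in for an LLM): each client computes a gradient on its own data, and only the gradients—never the data—are shared and averaged.

```python
import numpy as np

def local_gradient(weights, X, y):
    """Gradient of mean squared error for a linear model, computed on-device."""
    preds = X @ weights
    return 2 * X.T @ (preds - y) / len(y)

rng = np.random.default_rng(0)
weights = np.zeros(3)

# Each client's data stays on the client; only gradients leave the device.
clients = [(rng.normal(size=(20, 3)), rng.normal(size=20)) for _ in range(5)]

for _ in range(100):                               # federated training rounds
    grads = [local_gradient(weights, X, y) for X, y in clients]
    weights -= 0.05 * np.mean(grads, axis=0)       # server averages the gradients
```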
That said, this approach isn’t exactly what I think should exist. What I think should exist is two separate ledgers: one for data and another for learning.
Stale Information
A frequent complaint that users of ChatGPT have is that the model was only trained on data up to September 2021, which means that the data and model are stale. Because it was trained on large-scale web data, this makes a lot of sense as a practical limitation, but for AGI to work we need streaming data and continuous learning.
Both of these problems are non-trivial and require quite sophisticated large scale distributed computing and streaming data infrastructure...or they can be solved through decentralization and gradient mining.
Massive Compute Requirements
I would like to underscore the “Large” in “Large Language Models”: they are very big and costly to run, which is one of the main reasons why people or labs outside of the technology industry can’t really build these state-of-the-art models[1].
As a brief aside: academics, in their attempt to develop novel algorithms, mostly iterate on novel-ish architectures rather than try to update existing models, which is arguably a lot of wasted compute. LoRA is an extraordinary example of the exact opposite of this.
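Since LoRA came up, here is the core trick sketched in NumPy with arbitrary shapes (the sizes and names below are illustrative, not from any real model): freeze the existing weight matrix and learn only a small low-rank update, so the expensive pretrained model is reused rather than retrained from scratch.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 1024, 8                       # hidden size and (much smaller) rank

W = rng.normal(size=(d, d))          # pretrained weight matrix: frozen
A = rng.normal(size=(r, d)) * 0.01   # trainable low-rank factor
B = np.zeros((d, r))                 # zero-initialized so W is unchanged at first

def adapted_forward(x, alpha=16):
    """Forward pass with the LoRA-style low-rank update added to frozen W."""
    return x @ (W + (alpha / r) * (B @ A)).T

x = rng.normal(size=(2, d))          # a couple of dummy input vectors
out = adapted_forward(x)

# Only A and B (2 * d * r parameters) are trained, vs d * d for full fine-tuning.
print(f"trainable params: {A.size + B.size:,} vs full: {W.size:,}")
```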
Over the last decade, large-scale machine learning models benefited greatly from using GPUs instead of CPUs because GPUs are much more efficient at executing matrix multiplications (an embarrassingly parallelizable mathematical operation).
They also turned out to be incredibly useful for Bitcoin mining.
Miners could decide to compute gradients (i.e., train a model) instead of hashing blocks of transactions on the blockchain and, theoretically, this would be a straightforward migration.
Incentives
Miners don’t mine for the sake of increasing our carbon emissions—they mine to make money. Therefore, there needs to be an economic incentive to make miners want to mine Gradients.
This could be a digital currency or whatever.
There also needs to be an incentive to contribute training data. People should be rewarded when they choose to contribute their data (DeSo is doing this) and even more so for labeling their data.
Model Forks
Cryptocurrencies are often forked and expanded upon for different goals. If we have a decentralized system where computed model weights are published to a decentralized ledger, then we can not only recover models at any point in time but also fork them and train them with different goals in mind (e.g., new architectures).
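Sketching that idea (again with purely illustrative names and placeholder hashes): if every published set of weights points at its parent on the ledger, then “forking” a model is just starting a new branch from any historical block, and the full lineage of any fork is recoverable by walking the parent pointers.

```python
# Toy in-memory ledger: block id -> block, each block pointing at its parent.
ledger = {
    "w1": {"parent": None, "note": "base model"},
    "w2": {"parent": "w1", "note": "continued training on fresh data"},
    "w3": {"parent": "w2", "note": "further training"},
}

def lineage(block_id):
    """Walk parent pointers back to the genesis weights."""
    chain = []
    while block_id is not None:
        chain.append(block_id)
        block_id = ledger[block_id]["parent"]
    return list(reversed(chain))

# Fork from an earlier checkpoint (w2) with a different training goal.
ledger["w4"] = {"parent": "w2", "note": "fork: new objective or architecture tweak"}

print(lineage("w3"))  # ['w1', 'w2', 'w3']
print(lineage("w4"))  # ['w1', 'w2', 'w4'] (shares history up to the fork point)
```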
Beyond creating an extraordinary lineage of models, in an extreme case of models misbehaving (i.e., humanity’s doom[2]), we could find the point in time and the data that led to a chaotic AGI.
Enterprise Value
Who would benefit from a decentralized AGI?
First and foremost, uh, humanity.
Secondly, I think there would be a lot of implementation opportunities in embedding these new decentralized models, similar to how ChatGPT plugins are all the rage right now. To extend the crypto analogy: it was exchanges that made the technology useful to everyday users, so one might think an exchange for the usage of these models could ultimately be the answer.
As the world of technology evolves rapidly over the coming years, I actually think a marketplace for different types of AGI could be a thing. I know the obvious flaw here is “a true AGI would be able to have intelligence across a broad set of use cases,” and while maybe that’s true in the future, it’s not true now, and I imagine there will be lots of capturable value between now and when that future comes.
Crypto Fixes This
As mentioned throughout this article, we need a new approach to decentralizing LLMs, and AI more broadly, so that we can attempt to control the inevitable “singularity”. The approach I propose is analogous to Proof of Work, but instead of arbitrarily wasting compute, we use that compute to estimate gradients.
I also mentioned that we would need two ledgers: (1) one for the model weights and (2) another for the data used to train those weights. These could be treated in the same way as candidate transactions being added to the blockchain, where signatures are used to verify the chain.
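To be concrete about the shape of this (every field name below is hypothetical, the hashes are placeholders, and consensus is glossed over entirely), a “gradient block” on the learning ledger might commit to the data-ledger entry it trained on, the previous weights, and the proposed update, so other miners can re-run the gradient step and verify it before accepting the block:

```python
import hashlib
import json

def digest(obj) -> str:
    """Stable hash of a JSON-serializable record."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()

def make_gradient_block(prev_weights_hash: str, data_block_hash: str,
                        gradient_hash: str, miner_id: str) -> dict:
    """Candidate block for the learning ledger: a proposed weight update,
    anchored to the data ledger entry it was computed from."""
    block = {
        "prev_weights_hash": prev_weights_hash,   # where training left off
        "data_block_hash": data_block_hash,       # entry on the *data* ledger
        "gradient_hash": gradient_hash,           # hash of the proposed update
        "miner_id": miner_id,
    }
    block["block_hash"] = digest(block)
    return block

# Verification is just recomputation: a peer re-runs the gradient step on the
# referenced data and checks the hashes before accepting the candidate block.
candidate = make_gradient_block("a3f...", "9c1...", "77b...", miner_id="miner-42")
assert candidate["block_hash"] == digest(
    {k: v for k, v in candidate.items() if k != "block_hash"}
)
```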
Closing Thoughts
All of this may sound a little ridiculous, but it’s not. In fact, the work has already been started by the former CTO of OpenSea.
At the moment, many people (especially on Twitter) are pointing and laughing at crypto enthusiasts after the recent fall in cryptocurrency prices, and that is a potential indicator that people have gone too far in the other direction in their thinking about the space.
In general, it’s good to not jump on the bandwagon.
Happy mining!
-Francisco 🤠
Some Content Recommendations
wrote a banger on Goldman’s Offloading of GreenSky.
wrote an excellent piece on Fed expectations this week.
Alex Johnson, as always, shared some great thoughts on Winning in Embedded Lending.
Simon Taylor wrote an excellent piece about needing narrower banks and the latest Fintech drama.
Postscript
Did you like this post? Do you have any feedback? Do you have some topics you’d like me to write about? Do you have any ideas how I could make this better? I’d love your feedback!
Feel free to respond to this email or reach out to me on Twitter! 🤠
[1] 4chan leaked Facebook’s LLaMa model (lol), so it looks like we have a starting point!
[2] People seem to have forgotten that we can turn off the electricity.
It sounds like you are writing from a place of enthusiasm about AI capabilities and potential, specifically about AI’s ability to provide mundane utility, make lots of money, and become smarter at human tasks. You make a good argument from that perspective.
However, I’d like to point out that from a cautious perspective - where one is concerned about AI’s ability to commit bioterrorism, plot crimes, perform electioneering, enable surveillance, create deepfakes, or take other bad actions - publishing the weights of an AI model is a patently bad idea. Its unique badness comes from the fact that once you release a model into the world, you can no longer take it back. It irreversibly commits humanity to a world full of this AI model’s potential good and potential evil. The expected value of releasing an AI’s weights is therefore NOT an unalloyed good, and I think one could argue it is actually quite negative - partially because the world has superfragile institutions and there are bad actors out there who would be excited to have a free bioterrorism expert.
Anyways, I consider these the strongest arguments against releasing dual-use technology like this into the world irreversibly. At least if OpenAI keeps the model weights closed, we can retract the technology from the world if it kills a million people. We can’t do the same in a world where the weights are published. Anyways, I am curious what you think about this argument?
Other dual-use technologies that people don’t release into the wild due to infohazards: nuclear, pandemic pathogen DNA.