If you were in South America 12,000 years ago and you discovered where a bunch of glyptodonts were hiding, or you figured out a better glyptodont hunting method, you could tell your tribal band, and later they would say: thank you for helping us kill these delicious glyptodonts, we now think you are cool and will treat you slightly better. And that was that. There was no other reward for producing information.
Nowadays, we have new tricks. If you write a book or patent a drug and someone starts selling copies without your permission, you can ask the government to take their money or put them in prison. If you’re a scientist, you can ask the government to give you money so you can do science and then give it away.
Why do these things exist? Well, information is cool because it’s cheap to copy. But for the same reason, it tends to be undersupplied. Say that if I worked hard, I could find some new fact, e.g. that ultrasonic humidifiers are bad. This only helps me a little, since they’re not that bad. If I got even 5% of the extra lifespan gained by each person who kills their humidifier, I would spend all day every day looking for such facts. But I don’t, so I don’t. (Also, no one believes me.)
Patents and copyrights and science grants feel inevitable and boring. But take a step back. How close do these things get us to “optimal”, to rewarding someone with $500 when they create information that provides society with $1000 of value?
The answer is not close at all. Because:
- Yes, we want socially optimal information production.
- But also, restricting what words people are allowed to say to each other is impossible and tyrannical.
Our tricks are a messy patchwork that try to bridge the yawning chasm between those two realities. We reward information production, but only in a few limited cases where it’s easy to enforce without intruding too much on basic liberties.
In this post, I’ll argue that our existing tricks ingeniously allow “facts” to flow freely (yay liberty) while also creating indirect subsidies for finding new facts (yay information production). But this only works because of certain coincidental facts about the world. And AI is in the process of changing those facts.
So, why do we have the tricks we have? What makes them work? Will they still work in a post-AI world? How could they be changed?
We have ways of making you talk
Roughly speaking, we have five main tricks to reward information production today.
First, you can copyright creative works, like books or music or code. This lasts for your life plus ~70 years.
Second, you can patent new inventions, like drugs or machines or algorithms. This lasts ~20 years.
Third, you can create trade secrets. These aren’t just secrets! If you run a business and you discover basically any useful information, then as long as you make “reasonable efforts” to keep it secret, it’s a crime for someone to steal your secret. Even if everything they do is otherwise legal, just obtaining the information is “economic espionage”. This protection lasts forever.
Fourth, you can get direct subsidies. Journalism is increasingly funded by philanthropy. The government gives money to scientists so they can do science and make the resulting knowledge freely available (to some for-profit publisher who then charges the public $30/article for the same science they already paid for with taxes).
Finally, social norms are as important today as ever. I’m often tempted to take How Much Would You Need to be Paid to Live on a Deserted Island for 1.5 Years and Do Nothing but Kill Seals? and re-post it like I’d written it, but I don’t because I fear that word would spread that I’m a big thieving loser. More prosaically, if one journalist makes a big discovery, it’s totally legal for others to re-report the facts without giving them credit. But journalists have a culture where credit is expected.
These ways seem weird
At first glance, our system seems obvious and inevitable. At second glance, though, it seems very strange. But at third glance, that strangeness can be seen as society having made some shrewd calculations to manage the tradeoff between (a) rewarding information, and (b) not creating dubious restrictions on speech.
So let’s go through those second and third glances.
Why is it that copyright lasts for life plus ~70 years, while patents last for ~20 years?
Perhaps because artistic work usually has lots of substitutes. If I write a book, then I have a monopoly and I’m free to charge $25,000 for it. But if I did that, everyone else would just buy some $25 book instead. My monopoly doesn’t give me that much pricing power. Whereas if I invent a new drug for pancreatic cancer, I can probably charge people $100,000 per treatment. For drugs, a 20-year term still provides plenty of reward.
Why do patents require filing a complicated application and paying gigantic fees, while copyright and trade secrets are automatic?
Probably because it’s easy to determine who wrote a book, but hard to prove who came up with an invention.
Why do patents require publishing how your invention works, while if you create a song, you don’t have to share your pre-mastered multi-track audio?
Probably because creative works don’t have as much “secret sauce”. You can read a book and figure out how it was made much more easily than you can look at a new engine and understand the engineering principles. Forcing patent filers to publish their ideas helps good ideas spread more quickly. Also, many more creative works are created each year, so mandatory disclosure just wouldn’t be worth the hassle.
Why does copyright only cover artistic aspects?
Well, imagine that when Don Daglow created Utopia in 1981, he didn’t just get a copyright on the art and characters and code, but also on the idea of a real-time strategy game. Starcraft wouldn’t exist. The horror.
Why is it that even conspiring to steal a trade secret is illegal?
Say you and I decide to steal Coca-Cola’s secret formula, so we high-five and drive to Atlanta, but then we realize we’re idiots and go home. Believe it or not, we are now guilty of conspiracy to commit economic espionage and could theoretically be imprisoned for up to 15 years. Weird, huh? While this seems ridiculous, I guess prosecutors find (as with regular espionage) that it’s hard to prove actual espionage and only use this power in egregious cases.
And why have trade secrets at all? Why make it illegal for someone to steal them only if you make “reasonable efforts” to keep them secret? Why is it legal to discover trade secrets through reverse engineering, but illegal to discover them by getting engineers drunk? Why does this protection apply to basically any form of information, and why does it last forever?
I think the idea is that people will keep secrets no matter what. But without laws, people would spend vast sums trying to steal and/or secure secrets, and this arms race would have no social benefit. Meanwhile, trying to make it illegal to “steal” things that aren’t really secrets at all would trample on basic freedoms and be impossible to enforce.
Why can you patent inventions but not discoveries? Why can you patent “algorithms” but not “math”?
Well, imagine you could patent math. Would we be like the Pythagoreans and punish anyone who mentioned the wrong theorem? It’s far easier to judge if someone is using math, which is sort of the definition of an algorithm. Better to just pay mathematicians out of taxes and let the math be free.
So, I view these tricks as a very clever and highly evolved solution to a difficult problem. If you take all the possible ways you could reward information and then sort them by (ease of implementation) × (how much ideas are rewarded), you’d probably end up with something close to our current system.
Intermission: No really, they’re weird
While I think our tricks are clever, we shouldn’t forget that they’re highly imperfect and lead to lots of perversity. In case you’re a perversity aficionado, I’ve collected some favorites here:
- Say you suspect that some type of fungus might cure cancer, so you spend $50 billion checking each of the 144,000 known fungal species. And say you actually find one that works. Too bad! The fungus already existed, so that’s a “discovery”, not an “invention”. You might be able to patent some extract or something, but if you’re charging $100,000 per cure, people will find ways around the patent. Better not to spend that $50 billion in the first place.
- Information production is sometimes rewarded through bundling. For decades, local newspapers made half their revenue from classified ads. This was a great business, since printing classified ads costs almost nothing. But because economics is weird, they found it was profitable to also pay tons of money for reporters who would create news, and then bundle the news and classified ads together. Then Craigslist was invented and now most of those newspapers are dead.
- You’ve surely noticed that recipe sites have thousands of words of inane blabbering before they show the actual recipe. That’s partly to manipulate search engines and to have more space for ads. But it’s also because inane blabbering is copyrightable but recipes are not.
- Many people find it strange that you can patent “business methods”. But did you know this was already happening in France in 1792? As some people were arresting Louis XVI, others were filing patents for financial inventions. (Though these were later deemed invalid.)
- Believe it or not, you can patent methods for reducing taxes. I’m unsure what public interest is served by rewarding such inventions.
- You’ve probably heard that map makers sometimes add fake “trap cities” or “trap streets”. The idea is that if your traps appear on another map, you’ve got them dead to rights on copyright violation, right? Turns out: Nope. Locations of cities and streets are facts. A fake fact isn’t a form of creative expression and still isn’t copyrightable.
- One of the central concepts of patents is that you must publish your invention. But companies don’t like telling their competitors about their inventions. So many—particularly pharmaceutical companies—pay lawyers gigantic sums to write patents in a way that’s legally valid but impossible for a normal person to read. In response, their competitors pay their lawyers gigantic sums to decode the legal gibberish, and sometimes get patents translated from other countries (often Japan) that are less tolerant of such chicanery.
- When Don Daglow created Utopia in 1981, he couldn’t have copyrighted the idea of a real-time strategy game. But he might have been able to patent some version of that idea. I’m glad he didn’t.
About those indirect subsidies
We’re 1900 words in. What is this post about again? Oh yeah:
- Our current tricks for rewarding information production rely on coincidental facts about how the world works.
- Artificial intelligence is changing those facts.
- ???
What are these coincidental facts? Well, “facts” and “discoveries” are important. We want more of them! But legally protecting these things seems terrible, because that requires you to police what words people can say to each other.
But copyright still cleverly provides indirect subsidies for creating facts and making discoveries. I’ll highlight two.
First, while you can’t copyright facts, you can copyright “presentations” of facts. If I write a book, you are free to steal all my facts and write your own book. But, for humans, doing that is hard. If you go way back to, like, three years ago, “unbundling” the facts from their presentation took a ton of work. Now, AI can do this instantly and for near-zero cost.
Second, I suspect that a lot of creation is driven by our old tribal band instincts. Like why am I writing this? Probably because some part of me hopes that after I post it, glowing pixels will show up on my screen which my brain will interpret as meaning that people love me or whatever. I’m pretty sure this doesn’t provide concrete benefits that will increase my chances of passing on my genes, but my brain is still operating on some confused heuristics where “more glowing pixels” means “more sexual opportunities” or “more friends to provide resources in times of need” or something.
AI is overturning both of these. A few months back I did some research into how well LLMs can play chess. My most surprising finding was that if you asked chat-based LLMs to regurgitate the full sequence of moves before choosing a new one, that greatly improved performance. I’m pretty sure I was the first person to show this.
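The trick is easy to state concretely. Here’s a minimal sketch (the function name and prompt wording are my own, not from any particular library): before asking for a move, the prompt makes the model restate the entire game so far.

```python
def make_chess_prompt(moves):
    """Build a prompt that first restates every move played so far,
    then asks for exactly one new move."""
    # Group moves into numbered pairs: "1. e4 e5 2. Nf3 Nc6 ..."
    numbered = []
    for i in range(0, len(moves), 2):
        numbered.append(f"{i // 2 + 1}. " + " ".join(moves[i:i + 2]))
    game = " ".join(numbered)
    return (
        "Here is a chess game in progress.\n"
        f"Moves so far: {game}\n"
        "First, repeat the full list of moves above. "
        "Then give the single best next move in standard algebraic notation."
    )

# Example usage:
prompt = make_chess_prompt(["e4", "e5", "Nf3", "Nc6"])
```

Asking the model to “repeat the full list” before moving seems to work like a scratchpad: it forces the board state back into the model’s recent context before it commits to a move.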
A few weeks later, I asked Google “can llms play chess”.
Google’s AI summary gave me no credit. Beyond the idea of replaying moves, several quotes were taken from me almost verbatim.
If a human did this, most people would think it was rude. But at least I’d know that they paid a tax on their time, and hypothetically some people might think less of them for having done it. AI does it instantly, for free, and does not care about social approval.
Now, I don’t mean to beat up on poor Google. I actually think they deserve credit for exceptionally good behavior. Many AI companies offer you a way to signal that you don’t want your content used for AI, but then they ignore their own signal and if you try to block them they switch IP addresses. I expected Google to say, “Sure, you can block our AI, we love you, just add the same signal that completely removes your website from our search engine.” But no. They offer a different signal just for AI, and they actually respect it. I added it, and when I checked a few weeks later, Google no longer provided any AI summary at all.
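For the curious, the signal is just a robots.txt rule. Assuming it’s Google’s documented Google-Extended token (my reading of the situation, not something Google confirmed to me), it looks like this:

```
# Block use of content for Google's AI products...
User-agent: Google-Extended
Disallow: /

# ...while leaving ordinary search crawling alone.
User-agent: *
Allow: /
```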
So, good on Google for giving creators some control. But I’m skeptical that this is what the future will look like.
Some people I respect say that they now write for AI. I admire this zen-like detachment from earthly concerns. But really? You’re happy to spend your time creating information just to feed it into a training set, so it can be used for purposes you might hate, and without giving you any reward, neither money nor recognition? For better or for worse, I don’t feel that way.
What could be done?
Option: Do nothing
This is a strong choice! As I see it, our historical compromise was to try to reward information production, but to take a light touch. Only do it in a few places, where it can be done at low cost, with unambiguous rules that don’t involve degrading intrusions on basic human liberty.
Maybe AI changes how well that compromise works. But that doesn’t imply that we should change anything. After all, when Craigslist unbundled the news from classified ads, we didn’t make Craigslist illegal. We just left newspapers to their fate. This meant less news, particularly in smaller markets, and I think this has had some bad effects. But I tend to think it was the right choice. Even if you ignore “freedom”, people have saved billions of dollars, and people trade far more goods now than when newspapers had a monopoly.
And we shouldn’t forget that AI also has (possibly enormous) positive effects on information production, by making it easier and cheaper.
Change how copyright and/or patents and/or trade secrets work?
I have a couple (possibly bad) ideas for minor tinkering at the margins, below. In principle, we could make some kind of dramatic change. But I don’t see many options without huge problems. Can you think of anything? This all seems under-theorized.
Increase non-market incentives for information production?
After Craigslist killed newspapers, some started adding paywalls. Paywalls are, theoretically, a market-based solution. But I suspect that a lot of people who subscribe to these newspapers (and blogs (not me, money bad)) do it not just out of self-interest, but because they want to support them. They think what they’re doing is good for the world and they want to encourage it. There are also many charities that fund journalism.
And OK, maybe AI decreases the incentives for internet randos to do research and share it with the world. How much information is really lost by this? How hard would it be to provide more grants (or post-hoc “awards”) to make up for what’s lost?
Clarify the meaning of a derivative work.
Say you create a game. You write code and sell me an executable. Then I take the executable, decompile it, replace all your art with new versions, re-compile it for some other operating system, and start selling it. Can you go to court and take my money?
Yes, because your source code was copyrighted, and my new executable would be a “derivative work”.
But say you write a book, and I get an AI to re-write it. Can you take my money? The legal standard here is “substantial similarity”, which is just as confusing as it sounds. Courts talk about a “total concept and feel” test and “comprehensive non-literal similarity” versus “fragmented literal similarity”. As far as I can tell, this is an incredibly blurry boundary that we’ve only gotten away with because cases are relatively rare.
AI will force us to find a clearer boundary, one that doesn’t require judges to listen to individual pieces of music.
But I’m not sure a clear boundary would do much to incentivize creators. If we had a magic box that perfectly decided what’s infringing and what isn’t, I don’t expect the response would be for AI companies to pay creators. Rather, they’d probably just tune their AIs to run right up to that boundary. Instead of re-writing one book, rewrite N books for whatever value of N is legal.
Create a legal opt-out.
Many companies theoretically offer an opt-out. But it’s a different opt-out for each company, and many of them seem to just ignore it anyway.
In principle, governments could create a legal mandate for this. It could even be fine-grained. Then, companies might compete with each other to make creators happy.
It’s quite possible that such a mandate would be a disaster. For one thing, it would be a headache to enforce—would Federal AI inspectors demand to inspect the training data for all AI companies?
And for any even moderately popular blogger, lots of people steal their articles and re-post them without credit. (Google and Bing usually de-list these sites, but you can find them with Yandex.) If they have different opt-out headers, how are AI companies supposed to know which one is correct?
Most of all, I worry that this mandate would just hurt “good” companies and/or companies in jurisdictions that actually enforce the mandate. If the effect was to hand AI leadership over to That Other Country, that seems bad.
Go Xanadu
Or maybe we could develop technology that would solve this problem using existing laws.
A while back, I mentioned Project Xanadu, The Original Hypertext Project. I was mostly attracted by their attitude. (“It is a continual war over software politics and paradigms. With ideas which are still radical, WE FIGHT ON.”) But I couldn’t really understand what the hell it actually was. But then I read Jason Crawford’s The lessons of Xanadu and WIRED magazine’s 1995 piece The Curse of Xanadu.
I now understand that Xanadu was (is?) intended to be a system for interlinked documents. But it also included a crazy “transclusion” feature. This was some kind of distributed copyright scheme where authors could link and copy each other and royalties were somehow apportioned to all upstream documents.
Maybe we don’t need new laws. Maybe a solution exists that uses a combination of technology and existing contract law. There’s a very large space of possibilities, and I don’t pretend to have the answer. But at a high level, there could be some system where people (and AIs?) put their creations. In order to access the system, you have to agree to distribute royalties according to some formula, and to treat anything learned through the system as a trade secret.
In principle, it seems like essentially any combination of technology and laws could be implemented this way? And there could be a competition to find the best one? And we don’t have to rely on the Leviathan? And it might be better than our current crazy hacks?
Seems hard, but it’s the best idea I’ve got.
TLDR
Weird situation, evolving, new ideas needed.