The Diff

This is an audio version of the free newsletter. Read the newsletter at thediff.co, and subscribe for an additional 3 posts (and podcasts!) per week.
  • (00:00) - Are We Already Building a Piecemeal AI Data Royalty Model?
  • (05:30) - Refounding
  • (06:20) - User Acquisition Costs
  • (07:09) - Data Sharing
  • (08:05) - Fees

What is The Diff?

The Diff is a newsletter exploring the technologies, companies, and trends that are making the future high-variance. Posts range from in-depth company profiles to applied financial theory, strategy breakdowns, and macroeconomics.

## Are We Already Building a Piecemeal AI Data Royalty Model?

One of the odd comforts of LLMs is that they made the zeitgeist a real thing. A statistical language model trained on all available text—which is going to skew to the text that people put online, since that made publishing, sharing, and storing text so affordable—basically is the spirit of the age. So you can ask the zeitgeist questions and get something very close to the consensus view of everyone in the world, weighted by how much they write, how their writing gets scored by pre-training classifiers, and how representative the labeling of answers is during post-training. If you've written something online, there's a good chance that your ideas will achieve partial immortality through model weights.

But maybe that's not comforting for you, perhaps because the idea of a big company packaging your creative output into some subscription product annoys you. If you produce tokens for a living, and models have likely trained on hundreds of thousands or millions of the tokens you've made, you have reasons to be really concerned. Every sentence you write is making AI a little bit better at writing the kind of sentence you'd write. You could feel a little steamed that AI researchers are getting poached with nine-figure offers in order to go to work rendering your job obsolete.

English common law has held since at least 1410 that charging a lower price for a better product doesn't make you liable for taking away your competitor's livelihood, and that's a good thing. If General Electric had to pay royalties to the local candlemakers, we would have lived in a literally darker world for longer. On the other hand, sometimes we work something out: cable companies pay local TV stations for the right to show broadcast channels, and if you look at the annual report for a company like Nexstar, you'll see that this retransmission revenue is growing faster than ad revenue, is more stable, and accounts for most of their revenue. The record labels were pretty worried about what online music sharing would do to their revenue model, but it turns out that torrent sites just aren't designed by people who are obsessed with making the process of listening to music as seamless as possible, and that a profit-seeking company actually has a stronger incentive to either make its subscription product great or keep the ads to a tolerable minimum. ATMs didn't kill branch banking, and actually made it easier to visit a bank branch and get cajoled into signing up for a credit card.

In the case of retransmission, this happened through legislation, and still leads to the occasional lawsuit. But in other cases, it was an organic process: there was some specific product whose marginal cost went to zero, and companies adapted in a way that kept some of the original economic incentives in place. If you've been charging a high price for something, and you can now charge a lot less, the profit-maximizing incentive is to keep the economic side of things as stable as possible, and to just keep improving your product so people don't think of the 90%-cheaper challenger as being in the same category. You can listen to DRM-free, free-in-terms-of-dollars music if you want, but if you live in a developed country and have a job, it's almost certainly a low-ROI activity that you'd only engage in because of ideological commitments.

And for-profit companies are moving in the direction of licensing, or at least giving creators some control over how their content is used. YouTube is expanding the tools it uses to let public figures track the appearance of their likeness in AI-generated videos, for example. They say they won't necessarily take down content, but originals are almost always worth more than parodies, so if YouTube has to choose which side to disappoint, they'll be siding with MrBeast over people who make AI-generated parodies that MrBeast doesn't like. Meta is also trying to reward people who post original work, and penalize people who just rip it off or aggregate it. In their case, the economic motive is even clearer: a meme page that plagiarizes other people's memes is directly competing with the news feed in the business of finding funny stuff and sharing it with people who will appreciate it. The existence of this as a viable business is basically a measure of how much room there is to improve the recommendation algorithm. Meanwhile, in yet another business, doctors are very rapid adopters of AI, as we've discussed in The Diff recently. In their case, it's a bit analogous to Philip Anschutz buying railroads and then using their rights-of-way to lay fiber-optic cable. Whenever there's a new form of land transportation or fixed-line communication, one of the most valuable assets you can own is a plot of land that extends in a roughly straight line between big population centers, and previous novel technology buildouts basically get you that for free.

It's also illustrative to look at companies making the opposite bet. A few weeks ago, Grammarly shipped something called "expert review," where your writing could be reviewed based on the style and taste of assorted experts, none of whom had opted in. In case you're curious about this, you're too late: they shut it down.

One reason they did this was that all of these writers have fans, and while Grammarly has a decent brand, there just aren't that many people who have fonder feelings about a multi-billion dollar company than about their favorite writer. So the more likely someone is to say "make this one sound like it was written by X," the more likely they are to listen when X says "Absolutely not!"

In the end, the value of novel tokens and the thin-skinned narcissism of the people who produce them conspire to create an environment where it's better for AI companies to throw comparatively modest sums at creators than to deal with all of those creators' annoying fans.

-----

Disclosure: long META, GOOGL.

## Elsewhere

### Refounding

Elon Musk is cutting jobs at xAI and moving in some workers from SpaceX and Tesla. He's continuing to operate the many-pockets model, where he has control over different companies with different cash flows and funding sources, and can shift these companies' money, personnel, and hardware around to shore up any problems. This is still a martingale strategy—at some point, two of his companies will be in trouble at the same time—but it also illustrates the flexibility of the Musk model. Musk is pretty comfortable making predictions that don't pan out; one of his strengths as an operator is that he seems incapable of embarrassment (or sunk cost bias). And that means that he's sometimes willing to say that he made some catastrophic mistakes when he started xAI, and that he's going to go ahead and take another whack at it.

### User Acquisition Costs

Chinese AI companies are giving users massive discounts in exchange for adopting agentic AI products. This is partly a feature of China's private sector, where many industries willingly go through a who-can-lose-money-faster contest in order to shake out weaker competitors. But it illustrates another difference: China's online economy is still heavily mediated by super-apps, and other companies tend to exist under the umbrella of one of the majors. But agentic AI weakens that model; you can be a little more ecosystem-agnostic if your AI agent is handling the fiddly integrations between incompatible products. The companies recognize this, and probably expect some share shift, but if they can lock customers into their own agents, and route those agents to other products they either own or get a cut from, they'll survive.

### Data Sharing

Big tech companies are, increasingly, defined by which kinds of data they have unique access to, and how they can use that data to get more of the same—the video service with the most cumulative data on viewer preferences is the one that can keep people watching longer and get an even more precise measure of those preferences, the search engine with the most searches can perform the most A/B tests of different ranking approaches, etc. But sometimes, they cooperate, as in the new agreement a consortium of big tech companies have signed in order to deal with scams. There are two reasons they're willing to work together on this:

1. Some scams will be easier to spot if there's sharing across different organizations; the scam message sent through Gmail or via Instagram DMs might have been composed with the help of ChatGPT, and
2. The more effectively they self-regulate, the less likely they are to face external regulation.

### Fees

The Trump administration is going to collect a $10 billion fee for handling the TikTok deal. This is, on its face, a weird and negative precedent. We probably don't want the US government looking for expropriation opportunities in order to get a cut, since that will make the US a worse place to do business. On the other hand, it's accidentally a very good policy: it's a gigantic excise tax on the proceeds of such deals, and that makes it harder for companies to specialize in being the buyer of expropriated businesses. There were many good cases for pressuring ByteDance to sell TikTok's US business to a US-based buyer, and one of the arguments against was that the government would get in the habit of forcing less strategically-important companies to do the same thing. And this enormous deal fee makes that less likely.