Barely Possible

[Barely Possible 2026-06-04] Today's episode:
• Uber caps coding agents at $1,500/month per employee per tool, or $18k/year per seat, which Simon Willison reads as a value tell.
• Anthropic launched a Services Track and Partner Hub to formalize the integrators wiring Claude into big enterprises.
• Anthropic mapped a year of AI-enabled cyber threats onto MITRE ATT&CK, the defender's standard attacker taxonomy.

Hear the full breakdown in today's episode of Barely Possible.

Want a podcast for your own topics? Join early access: https://www.barelypossible.to/waitlist/?source_path=public_episode_94&feed_source=rss&episode_id=94

Transcript: https://media.clawford.org/episodes/2026-06-04/podcast-episode-2026-06-04.txt | Notes: https://media.clawford.org/episodes/2026-06-04/2026-06-04-notes.md

What is Barely Possible?

A daily briefing on the AI systems, products, companies, and policy shifts that are just becoming possible.

Okay kiddos, I'm your boy Tony DeLuca, and welcome back to Barely Possible, the show where I read the technical stuff so you don't have to, and then I tell you which parts actually matter for the thing you're building. We've got a leaner menu today than usual, but there's a real meal in here, so don't worry. Buckle up.

Here's where I want to start, because it's the most concrete thing on the table and it tells you something true about where we are. Uber, the rideshare company, reportedly now caps its coding agents at fifteen hundred dollars a month per employee, per tool. That's the number. Fifteen hundred dollars. Per person. Per tool. Every month. Simon Willison flagged this and his read was, and I'm paraphrasing, seems sensible, but it's also a really interesting hint at the value Uber thinks these tools are providing. And I want to sit with that for a second, because that one little number says more about the state of enterprise AI than a hundred keynote slides.

Let me unpack it. A cap of fifteen hundred a month per person is not a leash. That's a fence way out at the edge of the property. Think about it. If you're an engineer and your company says, hey, you can burn up to eighteen thousand dollars a year on coding agents, that is not a company saying we're worried about you spending too much. That's a company saying we have done the math and we believe an engineer who's getting real value out of these tools is worth way more than eighteen grand a year in tokens. The cap is high precisely because the value, when it works, is higher. So Simon's right. The number is the tell. Uber, which by the way has been the poster child for this whole story all year, the company that burned through its entire 2026 AI budget in four months back in the spring, has gone from unlimited let-it-rip to a structured budget, but the budget they picked is generous. It's a guardrail, not a clamp.
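To put rough numbers on that fence, here is a minimal Python sketch. The $1,500 monthly cap is the figure reported above. The $300,000 fully loaded engineer cost is a made-up assumption for illustration only, not a number from Uber or from Simon Willison, and the function names are hypothetical.

```python
# Reported in the episode: cap per employee, per tool, per month.
MONTHLY_CAP_USD = 1_500

# ASSUMPTION (hypothetical, not from the source): fully loaded annual
# cost of one engineer, used only to illustrate the break-even logic.
ASSUMED_ENGINEER_COST_USD = 300_000


def annual_cap(monthly_cap_usd: int, tools: int = 1) -> int:
    """Maximum yearly spend for one employee across `tools` capped tools."""
    return monthly_cap_usd * 12 * tools


def break_even_uplift(annual_spend_usd: int, engineer_cost_usd: int) -> float:
    """Fraction of an engineer's cost the tool must return to pay for itself."""
    return annual_spend_usd / engineer_cost_usd


if __name__ == "__main__":
    yearly = annual_cap(MONTHLY_CAP_USD)
    uplift = break_even_uplift(yearly, ASSUMED_ENGINEER_COST_USD)
    print(f"Annual cap for one tool: ${yearly:,}")
    print(f"Break-even uplift under the assumed cost: {uplift:.1%}")
```

Under that assumed cost, an engineer who maxes out the cap on one tool only needs a single-digit percentage productivity gain for the spend to pay for itself, which is the sense in which the cap reads as a guardrail and not a clamp. Swap in your own cost figure; the shape of the argument is what matters.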

Now, why does this matter to you, the person building a company or a product? Because the era of free is over and the era of the cap is here, and the shape of the cap tells you what the buyer believes. We've been circling this token-economics theme on the show for a couple weeks now, ever since the Opus 4.8 stuff and all the talk about harness pricing and the so-called subsidy era ending. So I'm not going to abstract this up into another sermon about the subsidy era versus the scarcity era. You've heard me on that. Instead I want to zoom way into this specific number, because it's the part that's actionable.

If you sell tools to engineers, the Uber cap is your pricing ceiling and your pricing proof at the same time. It tells you that a serious enterprise will tolerate a five-figure annual per-seat token spend if, and only if, the value is legible. That little phrase, per tool, is the dangerous part for vendors. Per employee, per tool. That means Uber is watching not just total spend but how it's distributed across vendors. Which means they are, right now, comparing your tool against the next tool on a dollars-of-value-per-dollars-of-tokens basis. If your agent burns through the cap by Tuesday and produces mush, you're getting cut. If a cheaper agent does eighty percent of the job at a third of the cost, you're getting cut. The cap doesn't just constrain the employee. It turns every vendor into a contestant in a value bake-off, every single month. That's the world. Build accordingly.
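The bake-off arithmetic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Uber actually scores its vendors: the `ToolMonth` type, the tool names, and the idea of a single "value units" score are all hypothetical. The only inputs taken from the discussion above are the $1,500 cap and the "eighty percent of the job at a third of the cost" comparison.

```python
from dataclasses import dataclass

MONTHLY_CAP_USD = 1_500.0  # reported cap per employee, per tool


@dataclass(frozen=True)
class ToolMonth:
    """One tool's month for one employee (hypothetical record shape)."""
    name: str
    spend_usd: float
    value_units: float  # ASSUMPTION: value reduced to one comparable score

    @property
    def value_per_dollar(self) -> float:
        return self.value_units / self.spend_usd


# Incumbent spends the full cap and does "the whole job" (100 units).
incumbent = ToolMonth("incumbent", MONTHLY_CAP_USD, 100.0)
# Challenger does 80% of the job at a third of the cost.
challenger = ToolMonth("challenger", MONTHLY_CAP_USD / 3, 80.0)

# How much more value per dollar the challenger delivers.
efficiency_ratio = challenger.value_per_dollar / incumbent.value_per_dollar

# Rank tools the way a budget owner might: best value per dollar first.
ranked = sorted(
    [incumbent, challenger],
    key=lambda tool: tool.value_per_dollar,
    reverse=True,
)

if __name__ == "__main__":
    for tool in ranked:
        print(f"{tool.name}: {tool.value_per_dollar:.3f} value units per dollar")
    print(f"Challenger advantage: {efficiency_ratio:.1f}x")
```

On these assumptions the cheaper tool comes out at 2.4 times the value per dollar, even though it does less of the job in absolute terms. The hard part in practice is the value score itself, which is exactly why legible value matters so much to a vendor.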

And here's the continuity thread, because I promised myself I'd connect it. Yesterday and the day before we talked about Microsoft AI's whole chip-to-model-to-harness vertical stack play under Mustafa Suleyman, and the headline there was a custom-tuned model that they claimed beat a frontier model on quality at a tenth of the cost on a specific customer's tasks. At the time I told you the interesting part wasn't the benchmark, it was the cost story. Well, the Uber cap is the demand side of that exact same coin. Microsoft is racing to give enterprises cheaper tokens per unit of work. Uber is the enterprise putting a hard dollar fence around how many tokens its people can consume. Supply side and demand side of the same squeeze. Cheaper tokens from the labs, harder caps from the buyers, and in between, a brutal monthly audition for every vendor in the stack. That's the whole game right now, and the Uber number is the cleanest single data point I've seen on it.

Now let's shift from the buyer's wallet to the seller's strategy, because Anthropic put out something this week that's the other half of this picture.

Anthropic announced what they're calling the Services Track and Partner Hub of the Claude Partner Network. Now, the source on this is thin, it's basically the announcement itself, so I'm going to be careful and not invent details. But I can tell you what it is and why it lands the way it does. Anthropic already had a partner network, the way every platform company does. What they're adding now is a dedicated track for services partners, that's consultancies, implementation shops, the firms that actually go inside a big company and wire Claude into the guts of the business, plus a Partner Hub, which is the central place where those partnerships get organized and surfaced.

Here's why I care, and why you should. Remember a couple weeks back, both OpenAI and Anthropic stood up these enterprise-deployment vehicles. OpenAI did a majority-owned deployment company to put engineers inside big clients. Anthropic partnered up with some big finance names to launch a services firm. The instinct behind both was the same: the gap between what these models can do and what companies actually get out of them is enormous, and somebody has to go in and close it. This Services Track is Anthropic formalizing the layer below that. It's not the flagship consulting venture, it's the broad on-ramp for the whole ecosystem of integrators who want to build a business on top of Claude.

And this dovetails with something a sharp essay called Productized Services Are Back was getting at recently, which we touched on the other day. The idea that in an AI world, the consulting-plus-software hybrid, the productized service, is having a renaissance, because the technology is moving so fast that pure self-serve software can't carry an enterprise across the gap on its own. Somebody has to hold their hand. Anthropic just built the official handhold registry. If you run a services shop and you've been wondering whether to plant your flag with one of these labs, this is the kind of thing that turns a loose relationship into a channel. The upside is distribution and credibility. The downside, and there's always a downside, is that you're now a tenant on someone else's platform, and the landlord can change the rent. We've watched that movie before, in every app store and every cloud marketplace. So go in with eyes open. The partner network is real leverage, but it's leverage you're renting.

Now let's go from partnerships to the thing that's quietly keeping a lot of CISOs up at night. Anthropic also put out a piece this week with a title I'll read plainly: What we learned mapping a year's worth of AI-enabled cyber threats. They mapped it against the MITRE ATT&CK framework, which if you don't live in security land is basically the industry's standard catalog of how attackers actually behave, step by step. It's the periodic table of bad guy moves.

The source we've got is again pretty thin on specifics, so I'm not going to put words or numbers in their mouth that aren't there. But here's the framing that matters, and it's a framing builders should internalize. The interesting move here isn't that AI is being used in cyberattacks, we've known that, we covered the Mythos cyber-capability story and Project Glasswing in past weeks. The interesting move is that Anthropic is taking a year of observed AI-enabled threat activity and forcing it onto a standard, shared taxonomy that defenders already use. That's a discipline thing. When you map a fuzzy new threat onto an old, well-understood framework, you turn vibes into something a security team can plan against. You can say, okay, AI is showing up at this stage of the attack chain and not that one, our existing defenses cover these steps and have a hole over here.

Why does a founder care about a threat-intel report from a model lab? Two reasons. One, if you're building anything that touches sensitive data, and at this point that's most of you, the threat model for your product just got a public update from one of the few organizations with a real telescope pointed at this. Read it. It'll be in the show notes. Two, and this is the meta-point I keep coming back to, the lab that publishes the clearest map of how its own technology gets abused is making a credibility play. In a world where Bernie Sanders is writing op-eds about nationalizing these companies and the White House is doing that weird on-again off-again executive order dance about pre-release model testing, being the lab that voluntarily publishes the responsible-disclosure version of events is a position. It's PR, sure. But it's also genuinely useful PR, the kind that gives defenders something to work with. I'll take useful PR over a glossy keynote any day.

That belief that mapping a threat onto a shared framework changes how you defend against it, by the way, is the same instinct behind the next thing I want to flag, just pointed at a completely different domain. So let me shift gears entirely, from cyber threats to, of all things, thrift shopping.

Google put out a piece, written by Megan Stoner, on five ways Google Search can level up your thrift and vintage shopping. And look, on its face this is a consumer fluff post, and I'm not going to pretend it's a watershed moment in artificial intelligence. But I want to use it as a tiny window, because it's a current item and it tells you where the consumer search interface is actually going. The whole pitch is using AI-driven search to do the stuff that used to require a human expert at a vintage store: identify an item from a photo, figure out roughly what era it's from, what it might be worth, whether the price tag in front of you is a steal or a stickup.

Here's the builder angle, and it connects right back to a story we did earlier this week about AI Mode reshaping how people search. The thrift post is the friendly, lifestyle-section version of the same machine. Google is steadily turning search from a list of blue links into an answer engine that does the appraising, the comparing, and the deciding for you. For the person hunting a bucket hat at a flea market, great, that's a nice little feature. For anyone whose business depends on being found through search, it's another reminder that the interface between a customer and a decision is being colonized by the model. If the AI tells you the green bucket hat is overpriced and points you somewhere else, the store you walked into just lost the sale and never knew it had a shot. Multiply that across every category. That's the quiet stakes hiding inside a cheerful thrifting blog post. I keep flagging this because it keeps mattering: the discovery layer is moving, and if your go-to-market depends on the old discovery layer, you've got planning to do.

Alright, now I want to do something a little different and give you a quick tour of a company we don't talk about enough on this show, because a whole stack of their announcements surfaced at once and it's worth stepping back to see the shape. I'm talking about Mistral, the French lab. Now, full honesty with you, the way these surfaced, it's a run of their older announcements from across the last year and a half showing up together, so I'm not going to pretend any single one of these is breaking news today. The OCR document model is from December. The production platform they call AI Studio is from last October. The memory and MCP-connector features for their Le Chat assistant are from last fall. The enterprise version of Le Chat, the agents API, and their coding tool are all from last spring and summer. The news partnership with the AFP wire service goes all the way back to early last year.

So why mention them at all if none of it's fresh? Because seen all at once, that timeline is a portrait, and the portrait is interesting. What you're looking at is a roughly eighteen-month march by a non-American, non-Chinese lab to assemble the complete enterprise toolkit. Watch the sequence. News partnerships for grounding, then an enterprise assistant, then an agents API, then a coding tool, then memory and twenty-plus enterprise connectors over the model context protocol, then a production platform, then a frontier document-understanding model. That is somebody methodically checking every box an enterprise buyer asks about. Connectors, check. Memory with transparency controls, check. On-prem-friendly enterprise deployment, check. A document pipeline, check. An agent framework, check.

The reason this matters to a founder is sovereignty and optionality. Every European enterprise, every regulated outfit that gets nervous about routing its crown-jewel data through an American cloud, now has a credible third option that isn't Beijing. That's a real wedge. And for you, the builder, it's a reminder that the model layer is not a two-horse race the way the American tech press likes to frame it. When you're picking a foundation to build on, the question isn't only who's smartest this quarter. It's who'll still be a viable vendor, on terms your customers' compliance teams can stomach, in three years. Mistral's eighteen-month box-checking march is them making exactly that argument to exactly those buyers. Whether they win is another matter, and I'm not here to tell you they will. But the optionality is real, and optionality is worth money when you're negotiating with the big labs.

Now let me connect that to something on the infrastructure side, because there were a couple of Meta engineering pieces that resurfaced too, and they round out the picture even though, again, they're older posts. One is a broad look back at Meta's infrastructure evolution and the advent of AI, from last fall. The other is a deep technical writeup on scaling large-model inference using a bunch of parallelism techniques, tensor, context, and expert parallelism, from last October. I'm not going to drag you through the kernel-level details, that's not what this show is for and it's frankly not what most of you need. But I'll give you the one-sentence takeaway from each, because the why matters even when the how doesn't.

The infrastructure piece is Meta admitting, in public, that twenty-one years of infrastructure built for a social network had to be substantially rebuilt for the AI era. That's the honest part. The factory that served three and a half billion people photos and posts is not the same factory you need to serve agentic AI workloads, and they had to retool. The inference-scaling piece is the engineering of how you actually serve a giant model to a lot of people without latency falling apart or costs exploding. And here's the thread back to the Uber cap and the Microsoft cost story: all of this parallelism wizardry exists to drive down the cost-per-token of serving these models at scale. The whole industry, top to bottom, is bending itself around one number, the marginal cost of a token. The buyer caps it. The lab tunes it down. The infrastructure team re-architects the data center to shave it. Everybody's pulling on the same rope. When you read a dry engineering blog about expert parallelism, that's what it's secretly about. Money. Cents per token. It's always cents per token now.

Let me pull back and give you the through-line on all of this before we close, because there is one. Every story I touched today is really about the same shift, told from a different seat at the table. Uber's fifteen-hundred-dollar cap is the buyer learning to ration. Anthropic's Services Track is the seller building a channel to help buyers get enough value to justify the bill. Anthropic's cyber-threat map is the seller buying trust in a jittery policy moment. Google's thrift-search post is the discovery layer quietly eating the decision. Mistral's eighteen-month toolkit is a third vendor making a sovereignty-and-optionality pitch to nervous enterprises. And Meta's infrastructure rewrite is the foundation underneath all of it, retooling the factory to get the cost-per-token down so the whole thing can actually pencil out.

It's not a story about a magic model anymore. It's a story about who pays, how much, for what value, and who gets squeezed in between. And the people who get squeezed, let me be blunt, are the vendors in the middle who can't prove their value cleanly, and the businesses whose customers used to find them through a search box that's now answering the question itself.

So here's what I'd watch next, and it's the kind of thing I'll keep an eye on for you. One, watch whether more of the big enterprises follow Uber and publish, or leak, their per-seat caps, because once two or three of them are out there, that becomes the de facto market price for agentic seats, and every vendor will be priced against it. Two, watch how many of these services-partner programs actually convert into revenue for the integrators, or whether it's just another logo on a slide. The productized-services thesis lives or dies on whether those shops make real money, not on whether they get a badge. And three, watch the discovery layer. Every time Google ships another little AI-search convenience, ask yourself whether the businesses on the other side of that query still have a way to get found. Because that's the slow-motion squeeze that doesn't make headlines but reshapes whole industries.

That's the menu for today. A little shorter, a little tighter, but I think the Uber number alone earns the price of admission. When somebody puts a dollar figure on a thing, the dollar figure is the truth, and everything else is marketing. Hold onto that.

Alright, that's me. I'm Tony DeLuca, this has been Barely Possible, and I'll be right back here tomorrow with whatever the machine coughs up overnight. Be good to each other, and keep an eye on the meter.
