API Intersection

Show Notes

In our latest podcast episode, we spoke with Jon Parise, a lead architect at Pinterest. At Pinterest, Jon provides company-wide technical leadership across several strategic initiatives and leads Pinterest's open-source program. We sat down with Jon to learn more about how the API program at Pinterest works, some keys to their success, and where they plan to go from here.

Take a listen for more on: APIs & Pinterest, navigating APIs through the constant iteration & innovation of technology, managing a giant API program at scale, and taking a design-first approach.

Do you have a question you'd like answered, or a topic you want to see in a future episode? Let us know here:
https://stoplight.io/question/

What is API Intersection?

Building a successful API requires more than just coding.

It starts with collaborative design, focuses on creating a great developer experience, and ends with getting your company on board, maintaining consistency, and maximizing your API’s profitability.

In the API Intersection, you’ll learn from experienced API practitioners who transformed their organizations, and get tangible advice to build quality APIs with collaborative API-first design.

Jason Harmon brings over a decade of industry-recognized REST API experience to discuss topics around API design, governance, identity/auth, versioning, and more.

They’ll answer listener questions, and discuss best practices on API design (definition, modeling, grammar), Governance (multi-team design, reviewing new APIs), Platform Transformation (culture, internal education, versioning), and more.

They’ll also chat with experienced API practitioners from a wide array of industries to draw out practical takeaways and insights you can use.

Have a question for the podcast? DM us or tag us on Twitter at @stoplightio.

I'm Jason Harmon, and this is API Intersection, where you'll get insights from experienced API practitioners to learn best practices on things like API design, governance, identity, versioning and more.

Welcome again to the API Intersection podcast. With me again today is my co-host, Adam DuVander, and our guest today is Jon Parise from Pinterest.

So our middle aged women listeners should be super excited.

Sorry, that was a jab.

That wasn't fair.

So Adam and Jon, tell us a little bit about yourselves.

Yeah.

Thanks, Jason.

I work at EveryDeveloper, where we engage API companies with developers.

And that's something Pinterest has been doing for a while in various ways.

And, Jon, you've been there eight and a half years.

So we're excited to hear your story.

So welcome.

Yeah.

Thank you.

It's good to talk to you.

I've been there for a long time.

I've worn a lot of hats, but the thing I do today is I look after client and API development at Pinterest, which covers a lot of ground.

So excited to talk about it with you.

Yeah.

It sounds like Jon, in classic scaling startup fashion, has a whole hat closet.

Seems like you've got your fingers in a lot of things.

It seems, as Adam was touching on before we started, that there's a lot of history with Pinterest and APIs, and it sounds like internally there are even a lot of different ways, different form factors you're using for that.

So I guess give us some overview for folks that maybe haven't integrated something with Pinterest and what that looks like.

And then why do people integrate with Pinterest via APIs? Yeah.

So when we talk about people, we're talking about a bunch of different audiences.

So maybe I'll start from that perspective. The thing that is the largest source of our API surface area is ourselves, because everything we do is internally API-first and API-oriented.

So everything you would do on the Pinterest website or iOS or Android apps is powered by a really thick API layer that all developers at Pinterest are able to contribute to.

So that's where I'd start.

Obviously, we also have friends outside of the company that aren't just employees who want to integrate with Pinterest.

And there's a number of use cases there.

But the biggest one is folks who are advertising partners, and they're looking for analytics and spend metrics and the ability to automate campaigns and things like that.

So that's the biggest, well-managed part of our external-facing API surface area.

And as always, folks can find out more about these things by looking at developers.pinterest.com or one of our business sites.

If you want to get into more of the details there.

Got it.

Yeah.

I guess a good reminder that at its core, Pinterest does advertising for businesses as much as sharing images and all your home improvement project reminders, which is how we use it.

It sounds like the typical tip of the iceberg story that what we see externally is just a tiny fragment of the overall API surface.

Are those kind of different worlds in Pinterest, or is this all kind of one platform view, so to speak? Yeah, that's a great question, because we've been looking at ourselves a lot in this particular way.

So the easiest way to describe it is that we have a single sort of thick software layer that underlies a lot of this.

And then above that, we think about how we present that layer to various customers or audiences.

When I talked about our first party use cases, that's the majority of what we're serving and how we're thinking about how we build our product logic and how we design our data models; the third party audiences tend to be a subset of that.

So, for instance, I mentioned earlier how we manage advertiser data, largely via our third-party API, and we ourselves are a first-party consumer of that with our own business-focused management interfaces.

Similarly, the simplest thing we can talk about at Pinterest is creating and reading pins; that's a relatively straightforward, CRUD-style API interaction.

But our first parties do a lot more with that than you'd ever expect a developer to be able to do.

There's a lot of really rich functionality that's evolved over the last bunch of years.

I think evolution will be a word I come back to a lot as I talk about what's changed over the last eight years of the product.

But we also can't expect a third party developer or a partner to react to our changes the same way that we ourselves would be reacting or evolving our APIs in terms of stability and long term support.

So that's, again, where we come in: a lot of what we ship is first for us, and a very prescribed subset of that would go to a partner.

But underlying all of it is the same data model, et cetera.

So I don't know that that's unique in the world, but it's certainly how we've chosen to prioritize flexibility, but also partner stability.

And in terms of devices that you have to support, I would think that you've got to be supporting quite a few.

I mean, that would be something that someone listening could look to you for advice on how to.

I mean, is there an approach that you have? You have iPhone, you have iPad, you have Android.

There's probably many more and details within those.

How do you approach that? That's a great question, because that's one of the other hats I wear at the company, because that's also the area of technology I'm responsible for.

The thing that's pretty fun about being in a job like mine is I get to own the technology from our first-party clients all the way through the APIs to our back-end product logic.

I think that's one of the things that helps me do a better job of my job than if I was one of four or five people in the room for better or worse.

I'm in my own head about a lot of these questions, so it's a fun place to be most of the time.

Sometimes it's a challenging place.

One of the ways it's a challenging place is to your question, how do we continue to support really well that variety of devices on iOS and Android in particular? I would say the way we think about it is that for better or worse, mostly better, those devices are constantly evolving.

But the more challenging aspect of them is that we want to support the latest features on, for instance, iOS 15, which was announced at WWDC recently.

There's a lot of exciting things there that we want to use, and we're going to try and get them as soon as possible, while also maintaining compatibility with devices that are four or five, six years old.

And that also means everything we've shipped four, five or six years ago, API wise still needs to work.

So not only are we trying to support the range of device-level features, we're also supporting the Pinterest-level features that we shipped those years ago, and that's the support matrix. In terms of how we manage that, the API itself isn't incredibly challenging, because we've designed with those things in mind, where we're effectively append-only on the API, because that's more or less what we can do.

We don't version things at a more granular level than that.

But in terms of the testing matrix and QA, it becomes much more complicated, where everything that you launch needs to be tested across the range of supported software versions we've shipped, and then the range of device versions it's running on, and then you multiply that again by iPad and iPhone, screen size, et cetera.
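(As a rough, purely illustrative sketch of why that matrix gets expensive, here is how quickly the combinations multiply; the OS versions, app versions, and device classes below are invented for the example, not Pinterest's actual support list.)

```python
# Illustrative only: a toy version of the support/testing matrix described above.
from itertools import product

os_versions = ["iOS 13", "iOS 14", "iOS 15", "Android 10", "Android 11"]
app_versions = ["8.20", "8.35", "9.01"]        # app releases still in the field
device_classes = ["phone-small", "phone-large", "tablet"]

matrix = list(product(os_versions, app_versions, device_classes))
print(f"{len(matrix)} combinations to cover")  # 5 * 3 * 3 = 45
```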

Like you said, our QA and Release Services team, they do an awesome job, and they are very busy all the time, and they're well known at the Apple Store.

That's right.

So you mentioned a support matrix.

Is that an actual thing that you have? And what goes in that? Yeah, we don't have a lot of prescribed things that I would say we build it out of.

But to give you a general shape and size, we tend to support maybe two or three or four back versions of iOS and Android, depending on reasons why and depending on whether old hardware has been supported or not based on Google and Apple in that case.

So that basically means that our API surface area goes back at least two to three years at all times.

And then, because it's a matrix, you multiply that by all the features we support.

For instance, Pinterest is doing a lot with video these days, whereas if you look at the product when I started eight years ago, we were primarily an image-focused site.

So what's interesting about video? Well, a lot of things.

It's audio and video, it's streamed, and it's also in a really immersive full-screen, but also grid-scrolling, format.

So this is not the video distribution CDN podcast, but the challenges there are way easier on modern devices than older devices, and we need to scale along those dimensions as well.

So there isn't one specific matrix.

There's basically what we need to do today and what we did in the last two years; it's a trailing window going forward indefinitely.

So I have to imagine that you've got a pretty sophisticated way of handling versioning in your APIs to make all that possible.

Yeah, I wish we did, but it's fun to imagine, isn't it? Yeah.

I mean, it's one of those things where it would certainly be nice to have, and it's probably a really difficult thing to retrofit.

So the versioning that we have is more about principles and less about technology that enables us to do exotic things, and it keeps us honest.

In terms of principles, it's effectively an append-only API.

It's one version, where once you add something, it stays almost forever, and forever is our three-year window.

Deprecation is basically our one tool where we remove code, not because we don't want to support it anymore, but because it's part of our tech debt cleanup cycle.

So once we've been able to prove through traffic metrics that we've migrated all of our users, our Pinners, and also third-party customers away from an endpoint, or a set of fields that's returned from endpoints that continue to be supported, we can pull that code away, and that just keeps our surface area more manageable.

To share just a rough scale of how many endpoints we're talking about: we're in the thousands, and the product is a pretty rich product, but we've evolved in the direction of creating new endpoints for new use cases when it's become more complicated to evolve an existing endpoint to support the new use cases.

And that keeps a lot of the append-only attitude of our API a little more sane in terms of management and ownership, and cutting over things from what could be a former version to a newer version, where version is not really a number but a view into the set of endpoints that we think are active.
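(A hypothetical sketch of that "prove it through traffic metrics, then delete" step; the metrics client, the 90-day window, and the endpoint path are all invented for illustration and are not Pinterest's tooling.)

```python
# Hypothetical sketch: gate endpoint removal on a trailing window of traffic.
# `metrics_client` and its count_requests() method are invented stand-ins.
from datetime import datetime, timedelta

TRAILING_WINDOW_DAYS = 90        # assumed observation window
ALLOWED_RESIDUAL_CALLS = 0       # require zero traffic before deleting code

def safe_to_remove(endpoint: str, metrics_client) -> bool:
    """True if the endpoint saw no traffic over the trailing window."""
    since = datetime.utcnow() - timedelta(days=TRAILING_WINDOW_DAYS)
    calls = metrics_client.count_requests(endpoint=endpoint, since=since)
    return calls <= ALLOWED_RESIDUAL_CALLS

# Usage with a made-up endpoint path:
# if safe_to_remove("/v3/pins/legacy_stream", metrics_client):
#     ...delete the handler and celebrate in the Dead Code Society channel
```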

Yeah, I'm curious.

I've seen some of this kind of in practice before where it would be better to have versioning, but that's just not the reality you inherit.

And sometimes it's like you just have an endpoint that does something slightly different with a different name rather than a V2 in front of it.

Is that kind of like how this tends to work out in practice? That's exactly it.

I would say the other two things that come up pretty often are when we have an endpoint that returns a paginated set of well-defined schematic objects.

So in our case, I would say imagine an endpoint that returns a stream of pins, and then we find that the thing that we might want to present in, for instance, the Pinterest home feed.

It's not all pins.

We have recommendable objects in there.

We have calls to action for new features coming out, things like that.

It doesn't work to return that always as a Pin or Pin lookalike object.

So we're substituting in whatever object types we need there, but you can imagine a homogeneous stream not always working.

So one of the reasons we might change to a different endpoint would be to change the response format, even though it's more or less working the same.

We want to render it differently.

We can talk about GraphQL more, but that's one of the things we're most excited about, that being a potential sort of accelerator for us to be able to react more quickly to some of these kinds of API changes, which is why we're excited to be looking at that at the same time.

Yeah, I'm particularly curious.

Not too long ago we had kind of some bonus Q&A questions around when you're building your stack.

Where should you put REST or GraphQL or gRPC or anything else? What's an appropriate kind of usage pattern?

In all honesty, I think we punted a little bit and just picked on GraphQL for a minute, but I'd love to get a sense of, like, what are the kind of use case patterns in which you choose to use GraphQL versus REST or gRPC or anything else? Yeah.

So I put a lot of thought into this.

So I'll lay this out as we see it today.

Pinterest has been running a Restful API for the last almost nine years, maybe even longer, actually.

And that's been our single sort of hammer for all API related customers, and it works like a really good hammer, and it's definitely survived the test of time.

But now that we're thinking about how we want to do a better job in the future, how do we want to take a step function change in certain areas? We're starting to bucket various audiences, use cases and trying to pick better technologies if they're available for each of those use cases.

So the way that looks for us right now, again, we're still entirely restful, but we're heading in some new directions.

That's why I'm excited to talk about them.

Internally, we communicate between services using Thrift, which is an RPC mechanism, and that's what we've had for a really long time.

And that's a really clear use case where you have request response patterns and highly schematic data for request responses.

And you also have binary encoding.

And that works really well internally for both sort of traffic routing and visibility and tracing.

There's a lot of really good extensions for something like that.

It also gives us a way to schematically put data at rest, at the risk of overusing the word "rest."

That's our go-to hammer for everything internally, but externally, RESTful APIs over HTTP.

They are excellent.

But like I mentioned, we have some different audiences in mind.

So the direction we're heading in now is to evolve all of our first party use cases around GraphQL.

We think that's bringing a lot of benefits to the story, where we have, again, an append-only-focused API.

We already have that muscle internally.

So that is a good fit for us.

But it also supports the kind of rapid iteration, schema exploration, and discovery.

And overall, I would say, flexibility in query patterns that are going to let our APIs for first parties evolve at a much faster pace, because that's necessary relative to what we're going to do for third-party developers and partners.

For those use cases, that's still going to be REST.

But we are really embracing OpenAPI in that regard.

And we're going in the design-first, spec-driven development direction there, which is really exciting.

And now that I've kind of walked around everything in broad strokes, the easiest way to summarize it on our verbal whiteboard here is that if you're an internal Pinterest developer, you're going to be developing towards a GraphQL API, and you're going to focus entirely on that use case.

If you're going to be a Pinterest partner, we're going to be, in the future, shipping RESTful, OpenAPI-based schemas with strong versioning requirements and really good documentation.

And they'll be well-linted with some excellent open source tools.

And that's what we're going to commit to for partners.

And internally we're focused on Thrift.

So those are the three pieces that work together really well to deliver what we think are perfect audience specific API renderings for how we see the world.

So first party, in your terms, is sort of your internal developers.

Yeah, that's right.

And you're saying that working on GraphQL is kind of maybe the primary thing.

So is this, like, right from the kind of data sources at the bottom of the stack, or is this doing any sort of aggregating of existing internal REST stuff? Yeah.

That's a great question.

I think early on we were probably like a lot of companies; we started effectively shipping our database models.

Right.

Like that's the easiest thing to do as a pass through.

And I think that influences things like which fields are available, which objects are in our terminology, joined or nested against one another, because that's how they're stored in the database originally.

So therefore, that's how the API represents them, how pagination works, whether you go forward and backward, or whether you have a cursor-based or offset/limit-based structure.

That's definitely how we started.

And to answer your question, more specifically, there's still a large business logic layer for us.

It's largely Python, with a lot of help from a lot of faster languages, providing a lot of the data munging that underlies our API presentation layer.

And that's where we're going to retool part of it for GraphQL.

And the third-party REST part is just going to get a few new tricks to speak OpenAPI as more of a first-class concept.

The part of this that's pretty exciting is the approach we're taking to get to GraphQL from REST. When we started with our RESTful API years ago, this was before we embraced anything that was highly schematic.

So we have effectively a bunch of Python based software which defines the API, and that is the contract.

You can substitute any language you want there.

But you've probably seen that kind of approach before.

What we're doing to keep ourselves honest is we are reflecting out of that code and generating an internal OpenAPI-based schema that we consider, going forward, the source of truth.

And from that schema, we will generate our GraphQL schema.

So it's a little bit of a roundabout way, but it's going to give us a lot of confidence in how we test and compare what we are shipping today over a RESTful interface with what our future GraphQL interface is going to look like, and we'll be able to slowly transition clients from one to the other with at least a bit of confidence in type safety and what the expectations are around what was being shown before and what's being shown later.

For our partner-based APIs, like I mentioned earlier, we're going spec-first.

So we're building a lot of OpenAPI tools, but we're building one from left to right and one from right to left, so it might appear a little bit redundant.

But right now we like the idea that no matter what we're doing, we have a GraphQL schema that was generated by software or generated by hand, and a lot of our common testing tools can be built based on that spec-oriented source of truth.
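(To make the "OpenAPI as the source of truth, GraphQL generated from it" idea concrete, here is a deliberately minimal sketch; real generators handle nullability rules, enums, pagination, and resolvers, and the Pin schema below is invented, not Pinterest's.)

```python
# Hypothetical sketch: turn one OpenAPI component schema (already reflected
# out of the route code) into a GraphQL type definition string.

OPENAPI_TO_GRAPHQL_SCALARS = {
    "string": "String",
    "integer": "Int",
    "number": "Float",
    "boolean": "Boolean",
}

def openapi_schema_to_graphql_type(name: str, schema: dict) -> str:
    required = set(schema.get("required", []))
    lines = [f"type {name} {{"]
    for field, spec in schema.get("properties", {}).items():
        gql_type = OPENAPI_TO_GRAPHQL_SCALARS.get(spec.get("type"), "String")
        if spec.get("type") == "array":
            item = OPENAPI_TO_GRAPHQL_SCALARS.get(
                spec.get("items", {}).get("type"), "String")
            gql_type = f"[{item}]"
        suffix = "!" if field in required else ""
        lines.append(f"  {field}: {gql_type}{suffix}")
    lines.append("}")
    return "\n".join(lines)

# Example with a made-up Pin schema:
pin_schema = {
    "required": ["id"],
    "properties": {
        "id": {"type": "string"},
        "title": {"type": "string"},
        "board_ids": {"type": "array", "items": {"type": "string"}},
    },
}
print(openapi_schema_to_graphql_type("Pin", pin_schema))
```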

Well, all right.

So, any open source treats coming from that? Probably not too soon.

I think the attitude we've had so far is that there's a lot of great work in that space already.

I'll name a few of the tools we're building on today, and hopefully we'll be able to contribute some of the work we've done to those tools without necessarily reinventing our own tools.

In the process we're using OpenAPI Generator, which is a really flexible tool.

The reason we like it isn't just that we are specifically using a lot of its outputs, but that we can customize the output generously, which is a good fit for what we're trying to do.

As a retrofitting exercise, we've hit our heads on a few things, but we've been able to work around all of them, so definitely kudos to the team behind the tool.

It basically can input and output everything.

So that's amazing.

The other tool I mentioned earlier: we're big fans of Spectral to keep ourselves honest in our schema generation and hand-authored schemas.

So we have Spectral integration in piles of our tools.

So we're really happy with that so far.

And there's a smattering of mostly Python based Open API tools that I won't bother listing, but there's a lot of really strong libraries there, and I've been able to contribute various patches and things to those.

So I run the open source program at Pinterest as one of the many hats we talked about.

I'm a big fan of contributing back, and our attitude so far has been that if there's something that's already doing a good job in that space, we'd rather improve that than to create something from scratch.

That's not to say you won't see open source projects from us in the space in the future.

But as of today, that's our attitude.

I think the last part of this, which you could call open source: I think we're excited to ultimately publish a lot of our OpenAPI specs ourselves for developers on places like GitHub, and it's not traditional open source, because it's a hard place for people to contribute back to.

It's a different kind of development cycle, but at least putting them in a place where it will be next to all the tools they're used to.

I think it's going to be a benefit to everybody on both sides.

So for the record, I did not prompt Jon to mention Stoplight's Spectral open source project. And more importantly to me: definitely let's chat about what that contribution looks like.

That'd be fantastic.

If you guys are making improvements, it's been really cool to see all the different people using it in all the different ways.

I've been around for two or three quarters and I'm just starting to play with the rulesets myself and realizing, like, all the crazy things you can do.

It's pretty neat.

Yeah.

Thank you for building it.

Yeah.

We have some really smart people on the team that have worked on that over the years.

It's interesting, though, that when you're thinking about the existing Rest API surface and kind of what that contract looks like, and you're looking forward to GraphQL, but you're looking to kind of have almost like this parity check out of the gate to make sure that they both have all the same things.

So I guess, is the reasoning behind shifting to GraphQL that it's giving you those relationships, is that kind of the primary value thing, or are there some other aspects of it that make it more attractive? Yeah.

There's a few things. I would say the top few are that it's highly schematic, which OpenAPI also is. But it's having that level of data typing from what I already described as a Python base, which is not a strongly typed language; it has type annotations today, but it's not a strongly typed language.

Being able to complete the circle, or I guess not quite a circle, to draw the line from that Python-based code, which is annotated, through something that's schematic (in this case it could be OpenAPI, but we're talking about GraphQL), all the way to an iOS, Android, or web client or a partner, though in this case our first-party clients, and understanding exactly which type is nullable or which type is specifically an array of a different data type, is very powerful.

It short circuits a lot of the conversations we otherwise would have to have internally.

Where you have one developer working on, say, an iOS client who needs to go talk to someone that is more intimately familiar with the Python code to understand those things.

And GraphQL in particular gives us a really clear way for those client developers to self-discover those things. A tool like GraphiQL is a great example in the GraphQL space, where you can explore your own queries and you can view side-by-side documentation.

You can see your live results and try things out safely before you commit client code to it.

And then once you ship client bits, especially on iOS and Android, those bits live effectively forever for us.

So you want to make sure you get that right.

The other benefit of GraphQL is also in that exploration space. I mentioned before how many endpoints we have.

Well, a lot of that was intentional for the reasons we talked about, but a lot of that was also because a person added an endpoint for a very specific use case that maybe could have been served by an existing endpoint.

Maybe it was slightly different.

It could have been extended, but it was easier to build something from scratch.

We think that by giving people the full menu of what we have in terms of object types and relationships with GraphQL, it'll be much less likely that people are going to jump to the first step of writing a new endpoint for a new case, and they'll be more prone to explore what's available through the GraphQL interface and to use it as a query-based language to get the objects they want, the data they want, rather than having to remix everything every single time.

That's the other big advantage we're excited about there.
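(A toy illustration of that "compose a query instead of writing a new endpoint" idea; the URL, token, types, and fields here are all made up and are not Pinterest's actual schema.)

```python
# Hypothetical sketch: a client composes one query over existing types instead
# of asking the backend team for a brand-new endpoint.
import requests

QUERY = """
query HomeFeed($count: Int!) {
  homeFeed(first: $count) {
    items {
      ... on Pin { id title imageUrl }
      ... on Recommendation { id reason }
    }
  }
}
"""

def fetch_home_feed(count: int = 25) -> list:
    response = requests.post(
        "https://internal.example.com/graphql",       # placeholder endpoint
        json={"query": QUERY, "variables": {"count": count}},
        headers={"Authorization": "Bearer <token>"},  # placeholder auth
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["data"]["homeFeed"]["items"]
```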

And I think the third benefit, which comes with a lot of the schematic-based API patterns, is code generation, which saves a lot of time on client-side boilerplate.

We haven't been able to leverage that as much in the past.

One of the open source projects that is in the space that Pinterest has built in the past is a tool called Plank, which works on JSON schema.

So we're familiar with that in general, but that isn't necessarily a full API contract.

That's a way to describe in our case model objects.

We've gotten good leverage out of that.

But that's again, only half the story.

It doesn't talk about request response patterns, doesn't talk about parameters, it doesn't talk about those kinds of things.

So we like that tool a lot, and we're really excited to go even further with it by having the entire API story, not just the response object story, in code generation.

So those are all the pieces that we're really excited about in GraphQL.

We probably could have looked at other options.

We talked about thrift internally.

That could have been a choice for us.

You could talk about something like gRPC and protobufs.

They bring a lot of similar benefits, but being able to interact over HTTP like we're used to, with something that is as rich as GraphQL, with the really strong stories you see in industry around how GraphQL has evolved.

I've been excited about it since it was first released, and we had early conversations with the team at Facebook that was putting it out there.

It's taken a lot of years for us to make the full investment in going in that direction, but I think it's a really good fit for a company like ours that is a rapidly evolving product team, and that's what that tool is built for.

So it's going to be a really important part of our story going forward.

And keeping us honest in the transition is going to make that software project that we're running internally much more successful than if we were just kind of, I don't know, YOLOing it, if you will, and doing the entire work sort of off the cuff and hoping it matches at the end.

This will make sure that our two railroads meet in the middle.

And that's certainly an easier conversation to sell up the management chain, but also it'll keep a lot more folks calm and not on call for these transitions.

That's a lot of fascinating stuff.

I'll first of all say, when we had our kind of Q&A episode about GraphQL, I gave a lot of kind of the aggregate things I've heard over the last few years, one of which is that it's probably not a good idea to use it for your kind of domain APIs, kind of deep down the stack.

And I will caveat that by saying, like, I'm just saying what I hear; there's certainly other alternatives out there that you're describing here.

And I'm curious if part of that investment and part of those building blocks to get to this point beyond this kind of parity checking and open API sync up.

Are there sort of performance considerations that you've thought about that are a risk that you've mitigated in some way? That's a good question.

We have a dedicated performance team at Pinterest who are excellent and keep us honest.

They measure everything, and when the measurements don't match up, they help us understand why. GraphQL definitely has a really strong performance story around it.

And it's one of the benefits we're looking at.

It's kind of twofold.

I would say the biggest benefit that we're excited for is being able to be more specific around which data the client wants for each kind of view.

And you can do that with a lot of other APIs, including the REST API we have today, where we support the idea of partial field selection, similar to how you would see it in, for instance, Facebook's Graph API.
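(For readers who haven't seen partial field selection, here is a bare-bones sketch of the pattern; the route, fields, and data are invented for illustration, not Pinterest's API.)

```python
# Hypothetical sketch: a REST endpoint that honors a `?fields=` parameter.
from flask import Flask, jsonify, request

app = Flask(__name__)

ACCOUNTS = {
    "42": {"id": "42", "name": "Demo", "followers": 10, "website": None,
           "bio": "", "created_at": "2015-01-01"},
}

@app.route("/v5/accounts/<account_id>")   # made-up path
def get_account(account_id):
    account = ACCOUNTS.get(account_id)
    if account is None:
        return jsonify({"error": "not found"}), 404
    fields = request.args.get("fields")
    if fields:
        wanted = {f.strip() for f in fields.split(",")}
        account = {k: v for k, v in account.items() if k in wanted}
    return jsonify(account)

# GET /v5/accounts/42?fields=id,name returns only those two fields.
```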

But the other advantage is similar to what I was mentioning earlier.

We believe that by giving people the GraphQL schema and the tools around that, they're going to be able to make, from the client, very specific and very well-informed choices around what data they select and why they select that data, and what goes over the wire will be smaller and therefore hopefully more performant.

The other benefit we get out of GraphQL, and this is talking with more of my client hat on, is that the caching story is very clear.

So if you look at how you would cache a RESTful API, you have tools like ETags.

You have other tools that if you have a good, well defined Restful API, you can make certain guarantees around how long a given response should live for and what you can do with it.

That doesn't really mesh well with partial field selection like I just described.

Unfortunately.

And unless you put a lot of intelligence into client-side caching or even traffic-level caching over the broader Internet, you're not really going to be able to unlock those advantages.

With GraphQL, you lose a lot of the obvious HTTP benefits for caching, but you gain the idea that the client-side representation is intentionally designed for strong object IDs and field-level caching, and the idea that when you build one query, you can produce a delta query based on what you have in your local cache and what you get from the server.

So by combining all those concepts, which obviously is a lot of work, but worthwhile work, we're hoping to unlock performance benefits. We're not the only ones, but we've proven data-wise that this is both a better user experience and treats people's mobile browsing experiences more respectfully with regard to data usage and how much time they spend waiting for things to load.

So yeah, there's a good story there.

GraphQL is not the only way to unlock that, but it's the way that we think gives us probably the most common platform where we can get those benefits, plus the other benefits we talked about earlier.

So that's why we're excited for it, especially as the tools are available these days to help us get visibility into those things.

Interesting, because the one kind of other qualifier that I gave in some of that advice was, if you're going to use this kind of highly flexible graph that lets your client query in potentially serendipitous ways:

One, you need to have a highly performing data layer, and two, finding ways to constrain that kind of querying path to something that you can be more predictable with.

I think you touched on both those perfectly, which is you've got sophisticated performance monitoring you're looking at and essentially proving to yourself that it will perform.

And two, that kind of providing or really, you're almost in principle relying on this field selection being more narrow because it benefits the client.

So that's a fascinating perspective.

The one I hadn't heard is this caching bit, because I know in the REST world it's, "Oh, it's all POST, like, this is ridiculous. What are these people thinking?"

So you mentioned kind of strong object IDs.

What were the other bits that you mentioned? Yeah.

So that's the most important one, but you can do that in a lot of other APIs, but field level caching.

So I'll give an example. If you have an account object and maybe it has ten fields on there, and you've fetched a list of these accounts to your client, but you've only selected maybe five fields out of them, because that's all you need for the given view you are populating.

Then maybe you go to a detail view of one of those things, and that view, the detail view, needs eight of the ten available fields.

With GraphQL, you can realize you had the first five in your local cache already, and then narrow the subsequent query down to the other three, do a merge operation, and then you can draw that view.
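(A tiny sketch of that field-level cache-and-merge idea; the field names and the `run_query` callback are stand-ins, and in practice a GraphQL client framework such as Apollo handles this for you.)

```python
# Hypothetical sketch: the list view cached five fields per account; the
# detail view needs eight, so we only query the missing three and merge.

DETAIL_VIEW_FIELDS = {"id", "name", "followers", "website", "bio",
                      "created_at", "pin_count", "board_count"}

def fields_to_fetch(cache: dict, account_id: str) -> set:
    cached = cache.get(account_id, {})
    return DETAIL_VIEW_FIELDS - set(cached)

def load_detail_view(cache: dict, account_id: str, run_query) -> dict:
    missing = fields_to_fetch(cache, account_id)
    if missing:
        # run_query stands in for the GraphQL client issuing the narrowed
        # (delta) query for just the missing fields.
        fresh = run_query(account_id, sorted(missing))
        cache.setdefault(account_id, {}).update(fresh)
    return {f: cache[account_id].get(f) for f in DETAIL_VIEW_FIELDS}
```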

It's very powerful when you think about this from an iOS or Android first perspective, because that difference is significant.

You need to think about this beforehand, obviously: what data is stale, what is the TTL on various fields. But that's part of building a good API anyway; in a lot of ways you need to understand the lifetime of various data.

Thinking about it from a field level, though, is a really powerful concept, and I think it's something GraphQL offers.

I'd be curious to know how many people that have adopted GraphQL will think about it that way, but it's certainly how we are planning on leveraging what's available in front of us to our benefit.

And that example that you described, does that happen on the front end or the back end? In the iOS example, is that something the iOS developer is looking at and saying, what data do I have, or is that provided by the GraphQL API? Yeah, it's client-oriented, which is how you get the bulk of the benefit.

And ideally, you have an iOS developer who's just trying to get their job done, of course.

And that's a framework-level concern, which is a feature of a number of GraphQL frameworks.

I'll point to Apollo, for instance, which is probably the most predominant open source success story in the GraphQL space.

So that's definitely a framework level concern.

The thinking in GraphQL is that it's important to get your queries really clear.

And if there's an optimization story at the framework level, then the framework should take advantage of it.

So you write the full query and then you go from there.

Well, Jon, you've mentioned APIs evolving rapidly, and one of the things that goes with that is technical debt and changes.

You wrote a post about the Dead Code Society channel.

Can you tell us about that? Yeah.

I'm glad you found that. It was one of the most fun things to write, because it is one of our shortest blog posts, which I think kind of speaks to the joy of doing something by either taking it away or cutting it down to its base essence.

Right.

So for us, cleaning up Dead code is a celebration.

Someone wrote it.

It was important for some period of time, but maybe it's past its point of usefulness.

And Dead code is to be celebrated and remembered.

But it doesn't need to be in front of you at all times.

That applies to a lot of software, and in our case, it applies to some of the APIs that we've shipped for product features that we might not have active anymore.

We did a lot of fun work back in 2013, 2014.

That might not be in the product anymore, and we don't need to consider that part of our testing surface anymore.

We don't need to consider it part of our CI builds anymore.

So retiring that is a joy.

And we celebrate that in a Slack channel called the Dead Code Society, where people drop in various changes that they've done, where the rule is it has to be all, or very close to all, red, meaning net deletions.

So it's super satisfying to see someone show up there and say, like, I just deleted 1000 files or something or localization for a large feature.

Yeah.

It's a very cool thing.

I love the practical view on what being good at Deprecation looks like and kind of having some fun culture around it.

I love it.

It reminds me of Typeform, when we killed off some of the older API surface.

Internally, we would always put, like, basically the person who was in charge of taking care of that.

Hi, Andrea.

We would always put, like, the death guy with the scythe and the hood and everything, but we put that on, like, at the time it was like the little beardy man guy.

So on any slides, kind of announcing this stuff internally, we'd always have him like, he's the bringer of death, but we celebrated those deaths.

Our emoji is a little tombstone, so yeah, we're definitely in the same humor camp here.

Yeah.

I guess it's all too often the reality, though. Like, when you said on kind of the versioning thing earlier that maybe it's not normal, I actually think it is.

And I think to some extent people go, "What's your versioning strategy?" It's like, well, if you didn't already have one and you've got stuff out there, you probably need to get good at deprecation and learn how to accept that that's going to be painful.

So I'm curious, in your case, with kind of killing things off, not just in the code, but in the support for a given API, even internally, I imagine you have probably thousands of developers, right.

These are probably a big deal if this happens. How does that kind of communication and keeping everyone in sync go? Yeah.

I think the biggest part of that for us is recognizing that this is something we do regularly.

It's not a special thing to do a clean up operation.

So as a software organization, you want to prepare for that always being a possibility.

You want to have the tools in front of you to be able to make a decision as to whether something is used or not.

And that is metrics and stats and test coverage reports.

That's helpful, but just culturally, knowing that it's safe to do that once you've gone through the checklist with those tools, that's really powerful.

Otherwise, you have developers who wouldn't even think to do it, because I think it's such an exceptional activity that it would persist.

And I think two things are true of most software companies.

It's true of mine that you have both employee turnover and you have software turnover.

I think in writing that blog post, I did some git diving and found that in the last kind of trailing two-year window, like, most of our software has been rewritten or written from scratch.

It's not the same software; what we had if you took a snapshot five years ago is not what we have today, and that's healthy turnover.

Like, there's probably a forestry analogy in this where that's a cycle of life that you want to participate in and make it as frictionless as possible in the ways that the software and the people around the software need to renew it.

So there's some philosophy there.

Yeah.

Making it safe and giving people the confidence that it's okay to remove something that they don't need anymore, and don't need to think about in terms of support burden, is a pretty powerful idea because it allows people to do new things or to revise things that they're currently actively looking at.

And that just helps developer velocity, which is an important aspect of running a software company.

But it also just helps engineer sanity.

Think about this as the world they've inherited: very few people at Pinterest were there when it started, and therefore they don't necessarily feel like the code was theirs to start with.

Certainly, from the API perspective, we're running APIs that we've had in service for nine years now, but everyone needs to have a sense of ownership over those things.

And by retiring things being an equally fun thing to do as writing new things, it keeps that stuff healthy.

So that's the philosophy behind it.

At least that's how we look at it.

I want to pick up on the word safety there. You said, making it safe, and it sounds like, on one hand, it's kind of psychological safety.

That like, this is an okay thing to do.

This is normal, but I would guess that another part of that safety is that from a testing standpoint, things are pretty sophisticated.

You guys are in kind of a continuous delivery mode, and that making it safe means you had enough testing up front.

That when you kill off the old way, the testing gives you the safety that you didn't break anything.

True.

That's mostly true, and we can only make it more true over time.

As we build more and more systems.

I would say that we have very good testing for the parts of our products that are very important to us, and we test them from every possible dimension.

We have monitoring and logs on sort of CDN level activity all the way down to unit test coverage.

We have a lot of functionality-based tests that exercise, in a continuous fashion, whether, and I'll make up an example, a search result for red shoes returns actual red shoes, which is an important aspect of running a site like Pinterest.

So that's all true.

It's interesting because the things that are most visible in the products are getting the most attention, and the things we just talked about in terms of retiring things that aren't relevant anymore are probably not getting that much attention because it's not where our focus is anyway.

That's where things become a little more nuanced, and that's where safety is even more important.

For instance, if I were to tell someone that I'm going to retire our login endpoints, they would probably know without even having to look that those are important and we should not do that. If I were to make up some other endpoint and say, hey, this thing that returns, I don't know, something, they might not have ever heard of it, and they wouldn't know whether it was used or not.

And then we'd end up with a two-week email or Slack thread trying to find the last person that might have touched it, and maybe they weren't at the company anymore, but maybe someone else inherited it.

And that gray area is the part where we're trying to provide some tooling or some kind of confidence around being able to answer that question.

Is this really unused? Can this really go away? Is there one partner integration out there who is using this, but they're only using it once a month because it has to do with a monthly analytics report?

That's the level where we can always do better, and we rely mostly on observability tools and logging and metrics to help make that decision.

People are great, but also you can't rely on one human being knowing everything, whereas machines out there are pretty good at recording information and counting.

So we try to leverage them the best we can for those use cases.

That makes me reflect that you're running a big multi-sided network, and one of the kind of principles of that sort of marketplace theory stuff is that there's key interaction points, these kind of magic moments in the journey of all parties involved in the platform in which the good things happen.

Right.

Positive network effects.

So I'm curious, this notion of kind of key interaction points.

And is that sort of a well known subject from that kind of marketplace theory side that might inform the engineering side? That's a good way to think about it.

I would break that down a little differently.

What you said makes a lot of sense when you think about it from that direction. The way we look at it internally, we tend to talk about surface areas, or surfaces, because that's what our product-oriented folks think in terms of.

So if you look at Pinterest, for instance, we have a home feed and we have a search.

Those are surfaces.

At least that's the terminology we use for them.

And there are product managers and engineering teams built around them.

If I were to ask an engineer, tell me every API endpoint involved in search, they might not be able to name all of them, because that's not how we're building the product.

I think other companies have taken a service oriented presentation view model for their APIs.

That's not what we've done.

So it doesn't map from one side to the other cleanly.

But when you think about prioritizing how we work internally in product surfaces, the APIs are a secondary effect of that.

We don't think about the APIs first in that regard.

So that's how we think about it.

I think hopefully that gets to what you're asking.

It's correlated to it.

And also fascinating, which is, like, when you think about that portfolio of APIs, and we've asked this question of a lot of folks on the show here, like, how do you think about structuring those? And do you have any kind of governance around that to make sure that it's all rationalized as well? And it sounds like, smartly, you don't have this sort of UX-pattern-driven thing.

But what is the kind of mental model that you're using for composing those APIs? Yeah, that's a great question, because we think about this a lot.

There's probably two dimensions we try to think along, and these aren't very principled.

I wish I had a really strong answer.

As I mentioned, we have thousands of APIs, which is a lot.

So they don't all follow the same rule set.

In one dimension, we think about ownership.

So when you have that many APIs and we just celebrated our 1024th engineer, you can tell why.

So it's a big engineering team, and many of them are able to make API endpoints themselves.

So we give access to basically every engineer that wants to work in that space, to be able to do that with the code review path afterwards.

A lot of people can contribute to this area. When you think about ownership, that's team-level ownership.

So we talked about the surface areas.

If someone is working on part of our search product, there's probably an engineer or a group of engineers on that team who are thinking about how they get their new product feature available to our clients by way of the API.

And they have probably written some of those API endpoints in the past, and they know how to evolve them.

So they own, for instance, the /search endpoint.

They probably have a pretty good idea what they would do to change that and to supplement it with some new feature functionality. That's one dimension.

On another dimension, we maybe think about some of the things we've done, for instance, working on new shopping things.

We just made a big announcement about our new shopping capabilities, and maybe there's a search endpoint for searching products, which is different than searching pins.

Maybe again, kind of making examples up to tell the story.

There might be a different team that is not the search team.

It's the shopping team that needs to think about how they'd build a really good product search endpoint.

And they have the option of maybe piggybacking on the first team's endpoint by extending it with new data and new parameters.

Or they could build a parallel endpoint, which is specific for that use case.

Now, going back to what I was saying about product surfaces: they still might appear in the product right next to each other, or they might be interchangeable depending on which UI toggles on the screen, but they might be built, back-end-wise, by two different teams.

So it's important for us to think about a portfolio of product features and also the inventory of endpoints behind that portfolio that are powering them, and then which sub-teams are evolving those endpoints and why and how they interact.

And you can further complicate this, of course, by looking at overlapping time windows.

One team might be doing some work the first half of this year, and another team might be doing work the second half of this year, and there might have been a team doing work all year long, all trying to match these things up.

And we've been releasing continuously multiple times a day all the way through that entire calendar year.

So there's a lot going on for sure when you're thinking about fast-moving product surfaces, multiple teams, and what products we put in front of people on iOS, Android, and web today versus what we were still supporting a year ago.

So I guess I have a fun, busy job.

Yeah, for sure.

I guess it kind of puts a bow on some of the stuff you're saying earlier that moving in this more kind of design first, shifting left direction, I would imagine, is probably trying to help with that kind of coordination and keeping everybody on the same page before they build stuff, right? Yeah.

That's certainly what we're trying to unlock there, trying to smooth some of those things over and get more people talking the same language at the different levels of the conversation I just described. So that's the goal for sure. Clearly, Stoplight, I'm a fan, right?

We're pretty opinionated on that one, that it makes sense, but that's a great story to hear as, like, you're in the midst of it.

And kind of what the hope of it is.

It'd be fascinating to hear an update, maybe in a year or something and see how that went.

I guarantee in a year we'll have lots more things to talk about.

We're always on the path.

Yes.

All right.

Well, I'm going to throw the same question at you.

We've thrown it at a lot of folks: when you reflect on all this, eight years of doing this, and you had to go do, let's say, a greenfield thing, and you got to start building APIs.

What would be essential, what would be the things that you would start with in terms of building sanity from the ground up? That's a great question.

And of course, everybody wants the opportunity to start over without falling victim to second system syndrome or something like that.

For me, I've talked a lot about our audiences before, and I think that's probably where I'd start.

I know a lot, especially with my experience at Pinterest, around RESTful APIs, so it would be comfortable to build something there.

And I think easy because it's what I know.

But I would also be tempted to do something maybe in the gRPC space or GraphQL-first space if I thought that the audience I was targeting was appropriate for the other end of that HTTP connection.

I would think of it that way.

I am 100% on board with the spec-driven development approach.

I think that there's a nuance there that I'm very excited about.

Also, I think if you embrace a programming-language model where you can effectively model your spec in a DSL.

I think that's just as good and has other advantages around how you can do code level introspection and testing there without relying on tools to pull in a textual description.

So I would be excited to probably build something there.

My tendency would probably be to hopefully not build something from scratch there.

But I think that's a pretty exciting direction.

And I think the last thing that I would do that I always wish I've done in the past, is be able to build something minimal first and get continuous feedback all the way through.

The nice thing about Pinterest is we're always shipping new things, but also we are a business that can't always sample our audience and ask them what they think of our API.

We ask them what they think about our product. So if it was possible for us to get feedback from developers and partners out there early on about what we're building, and whether they like our approach to pagination, or whether they think that we have a good sort of separation of concerns around different object models and the operations you can perform for filtering, for instance, I would spend way more time up front getting that right.

Because the overall lesson that I think you probably heard is we built something in the course of months, nine years ago, and we're still running it today, and you don't get too many of these opportunities.

So making the most of that is very important.

Awesome insights.

And I think, as always, it brings us back to kind of, you can't get away from API design, because you're building long-lived things.

You got to pay attention to what customers need and actually ask them.

I'm a huge fan of early access groups, by the way, because once this stuff is in the wild, it doesn't go away.

I love it, man.

Fantastic closing thoughts.

I think unless you have any other bits you needed to get in here or something you wanted to bring to our attention.

No, thanks for the conversation.

I get to talk to a lot of people internally about these things, but it's always fun to hear other people's perspectives and questions from outside the world that we've built.

Yeah.

You're literally describing why I want to do the podcast in the first place.

Get to talk to other geeks who do this in different industries, different products.

There's always a different take.

And I think you absolutely brought us a different take on things today, which I love.

I'm glad.

Thank you.

All right.

And thanks again, Adam, for co-hosting.

Yeah.

Great.

All right, Jon.

Well, again, thank you.

And have a good one.

Yeah.

Thanks very much.

Good to talk to both of you.

Thanks for listening.

If you have a question you want to ask, look in the description of whichever platform you're viewing or listening on, and there should be a link there so you can go submit a question, and we'll do our best to find out the right answer for you.