The Bootstrapped Founder

What happens when the seeds you planted eighteen months ago finally start breaking through? In this episode, Arvid shares how Podscan's long-term investments are compounding—from programmatic SEO earning backlinks from major publications to an OP3 integration improving data fidelity across millions of podcasts. He also talks about how agentic coding tools helped him migrate to OpenSearch, a system he never would have touched on his own, and the semi-automated 10-80-10 workflows that are freeing him up for higher-leverage work.

This episode of The Bootstrapped Founder is sponsored by Podscan.fm

The blog post: https://thebootstrappedfounder.com/when-long-term-investments-finally-pay-off/
The podcast episode: https://tbf.fm/episodes/436-when-long-term-investments-finally-pay-off

Check out Podscan, the Podcast database that transcribes every podcast episode out there minutes after it gets released: https://podscan.fm
Send me a voicemail on Podline: https://podline.fm/arvid

You'll find my weekly article on my blog: https://thebootstrappedfounder.com
Podcast: https://thebootstrappedfounder.com/podcast
Newsletter: https://thebootstrappedfounder.com/newsletter

My book Zero to Sold: https://zerotosold.com/
My book The Embedded Entrepreneur: https://embeddedentrepreneur.com/
My course Find Your Following: https://findyourfollowing.com


Here are a few tools I use. Using my affiliate links will support my work at no additional cost to you.
- Notion (which I use to organize, write, coordinate, and archive my podcast + newsletter): https://affiliate.notion.so/465mv1536drx
- Riverside.fm (that's what I recorded this episode with): https://riverside.fm/?via=arvid
- TweetHunter (for speedy scheduling and writing Tweets): http://tweethunter.io/?via=arvid
- HypeFury (for massive Twitter analytics and scheduling): https://hypefury.com/?via=arvid60
- AudioPen (for taking voice notes and getting amazing summaries): https://audiopen.ai/?aff=PXErZ
- Descript (for word-based video editing, subtitles, and clips): https://www.descript.com/?lmref=3cf39Q
- ConvertKit (for email lists, newsletters, even finding sponsors): https://convertkit.com?lmref=bN9CZw

Creators and Guests

Host
Arvid Kahl
Empowering founders with kindness. Building in Public. Sold my SaaS FeedbackPanda for life-changing $ in 2019, now sharing my journey & what I learned.

What is The Bootstrapped Founder?

Arvid Kahl talks about starting and bootstrapping businesses, how to build an audience, and how to build in public.

Arvid:

Hey. It's Arvid, and this is The Bootstrapped Founder. There's something deeply satisfying about watching the seeds you planted a year ago finally break through the soil, figuratively, even though I do enjoy growing my own tomatoes. Today I want to share a few stories from Podscan, that's my podcast intelligence platform, about what happens when long-term investments start compounding. Some of those took eighteen or so months to materialize, and others became possible only because I embraced tools that I thought I'd never touch.

Arvid:

So this is a bit of a year in review, although slightly delayed and probably incomplete. But I think there are lessons here that apply to any founder playing the long game, and maybe you are going to encounter a technology or a thought that you may not have thought about before. Now let me start with something that took real patience, because that's the hardest thing to do as a founder, and that was our programmatic SEO efforts for Podscan. For a solid year and a half, I wasn't sure that they were working too well, or at all, particularly in the beginning. But that's kind of obvious, because things just take a while, and each new page, particularly if it comes with a .fm TLD like Podscan, has some work to do before it's recognized as a trustworthy location. That seems to be happening now. More and more people are linking to particular shows or particular episodes directly using the podscan.fm link, and each of these backlinks from major newspapers like the Wall Street Journal and Forbes and other places, wherever a podcast is mentioned, increases the quality of the domain. The domain rating goes up, and that obviously means better search result placements, higher trust value with these anti-spam systems, and generally people picking our page to link to when they need to reference a podcast. It also means a lot of people email me trying to place their weird spammy advertising stuff on my blog, but that's a different story.

Arvid:

A good domain rating gets attention, the good kind and unfortunately the bad kind as well. But here's what I find particularly interesting for my business from the business owner perspective: podcasters themselves, the people running the shows, are now putting links to their Podscan page into their show notes. Some of them, not all of them, have claimed their show on the platform as well. Others are just collecting links to every analytics service out there. I'm not exactly sure why they do it, but I'm glad they do. Either way, the name has become almost synonymous with podcast analytics in certain circles, and that's exactly what I wanted to accomplish. And this whole SEO effort has proved to be a very reliable lead generator. More and more of my prospects find their way through what is effectively user-generated content, the podcast transcripts themselves.

Arvid:

We just host the content. We facilitate already pre-generated content on the platform by hosting transcripts for shows that other people recorded. And for that reason, it's a kind of cheap way of generating unique content, or at least facilitating the discovery of that unique content. We don't generate it. It's somebody else's work, but we point at it so people can actually listen to the show.

Arvid:

And because it's generally useful content, all these new podcast episodes, it is found by people with a vested interest in that particular topic. They want to find out more, they discover Podscan, they explore the other features, and ideally they become customers. Obviously competitors in this field have understood this too and have similarly high-rated domains, just from all the links that they've been able to collect over a longer time than Podscan has been around. That comes with the territory of being a newcomer in the field, but the compound effects are real and they extend beyond just search rankings. Domain age also helps with email deliverability scores. The reputation builds up in multiple dimensions simultaneously. And I like it.

Arvid:

It's nice. It's nice to see that this effort is working. One thing that really helped, apparently, was putting an embeddable player into the system. You can link to any podcast episode on Podscan. You can link to any podcast, for that matter, using a player that you can embed on any HTML-based website, and it will just start playing.

Arvid:

I kinda felt inspired by the embeddable player that the transistor.fm platform has that I host my own podcast, this very podcast on. And I thought, I would like something like this as well. And then I built it, and then people actually use it and play it and embed it. So it's kinda nice. So that helps to have links.

Arvid:

Right? Because now the embed part is linked to from other pages. Really cool. Another investment that's been paying off has been our integration with third party data, particularly with open third party data. We have been integrated with OP3.

Arvid:

That's an open standard for transparent podcast analytics. The idea behind OP3 is as brilliant as it is simple. You can just use it as a prefix for the URL to each file that's being downloaded by a podcast client. Right? Imagine your files live on, I don't know, transistor.fm/your-podcast-name/episode3.mp3.

Arvid:

So now you just put http://op3.dev/e/ in front of that link, and you have a link that starts with OP3. And once it gets there, they do the analytics magic, and then they forward it to your old URL, and then people get the file from there. But they track the actual download by going through their system. It's a pretty well established system. It's all open source.
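
A minimal Python sketch of the prefix idea described above. The function name `op3_prefixed` and the example URL are made up for illustration. The prefix is written with https, and dropping the scheme for https episode URLs follows OP3's published setup examples as best understood here; both are assumptions to check against the OP3 documentation.

```python
# Assumption: redirect prefix as shown in OP3's setup examples.
OP3_PREFIX = "https://op3.dev/e/"


def op3_prefixed(episode_url: str) -> str:
    """Return the episode URL routed through the OP3 redirect prefix."""
    if episode_url.startswith(OP3_PREFIX):
        # Already prefixed, so leave it untouched.
        return episode_url
    if episode_url.startswith("https://"):
        # Assumption: for https URLs the scheme is dropped after the prefix.
        return OP3_PREFIX + episode_url[len("https://"):]
    # Other URLs (for example http) keep their scheme after the prefix.
    return OP3_PREFIX + episode_url


print(op3_prefixed("https://transistor.fm/your-podcast-name/episode3.mp3"))
```

The podcast client requests the prefixed URL, the download is counted, and the client is then redirected to the original file location.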

Arvid:

Like, you could set this up yourself if you wanted to. It costs a couple hundred dollars a month. I think it's really not that expensive, even though it tracks podcasts with millions of downloads. It's really, really cool. OP3 collects metrics about how often people download episodes, how many of these people are real people versus bots, where they come from, when they download, how much of the file is downloaded, all of this information. So we've built an integration that synchronizes OP3 data with our existing database. And relatively few podcasts actually use OP3, but the ones that do obviously benefit from us showing their real tracked analytics on their Podscan page.

Arvid:

But here's where it actually gets interesting. We've been collecting so much data through OP3 now that we've started feeding this data into all our machine learning systems and the estimation models that we use for audience sizes and download numbers. So for podcasts that may not have as easily traceable information, like most shows out there that are completely opaque and you don't really know much about them, our estimates are now more accurate because they're calibrated against real data by podcasts like these that we can fetch. And this integration benefits the user because they see real data for real podcasts, and it benefits the platform's data fidelity overall because it stabilizes our estimates across the board. And that kind of improvement, that just takes time to materialize and data to ingest.
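
To make "calibrated against real data" concrete, here is a deliberately simple Python sketch. It is not how Podscan's models work; the episode does not describe them. A median ratio between measured and estimated downloads stands in for whatever the real estimation models do, and all show names and numbers are invented.

```python
from statistics import median


def calibration_factor(estimated: dict, measured: dict) -> float:
    """Median ratio of measured to estimated downloads over shows that have both."""
    ratios = [
        measured[show] / estimated[show]
        for show in measured
        if show in estimated and estimated[show] > 0
    ]
    if not ratios:
        return 1.0  # nothing to calibrate against, so leave estimates unchanged
    return median(ratios)


def calibrate(estimated: dict, measured: dict) -> dict:
    """Keep measured numbers where they exist; scale the remaining estimates."""
    factor = calibration_factor(estimated, measured)
    return {
        show: measured.get(show, value * factor)
        for show, value in estimated.items()
    }


# Invented numbers: three shows report real downloads, one is opaque.
estimated = {"show_a": 1000, "show_b": 2000, "show_c": 500, "show_d": 800}
measured = {"show_a": 1250, "show_b": 2500, "show_c": 400}
print(calibrate(estimated, measured))
```

The opaque show gets its estimate scaled by what the measured shows reveal about the model's typical error, which is the sense in which a small set of real numbers can stabilize estimates across the board.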

Arvid:

But once it does, yet again, it compounds. It was a good idea to pull it in, and it's just fun because the system gets better and better over time: more data flows in, and better data can be used for training. You know that two months from now, the automation will make sure that the training data we can train against is even better. Good times. Now here's a story about how my own work as a founder, as a developer, has changed over the past year. About six months or so ago, I really needed to migrate our search system, and search is pretty massive for Podscan. We search across millions, now tens of millions, almost 50 million podcast episodes that we have transcripts for, just for keyword mentions.

Arvid:

It's quite the system. So we had to move from a setup that included a lot of handcrafted Laravel code and Meilisearch, which I should say is still a very, very good search engine for full-text search at a certain scale. But we had to migrate to an OpenSearch cluster maintained and scaled autonomously on AWS. The migration obviously had a trade-off, because Meilisearch in itself is incredibly fast. It's one of the search engines that does this kind of type-ahead search really, really well.

Arvid:

It responds in a few dozen milliseconds or less. It's all in RAM. It's super fast. It's great. And we would sometimes get results back in under three digits of milliseconds, a fraction of a second, for a complicated search.

Arvid:

Now OpenSearch, which is kind of Elasticsearch in AWS's own vernacular, is still very fast. It's definitely subsecond, but it's not quite as lightning quick as the other one. But what we gained was reliability and capability, because Meilisearch started to struggle with ingestion of data because it was just getting so big. I was working with the team to try and get it to scale better, but we were outscaling their efforts to build better scale into the tool. It is an open source product. What can I expect?

Arvid:

But OpenSearch is such a well-tested and well-established system. It can handle a lot of data really well. And because it has the Elasticsearch query DSL, the domain-specific language for expressing search queries of any complexity, it's highly configurable. We can get exactly the right results in exactly the right order for any possible request. So yeah, that's where we went. Here's the thing, though: two years ago, I would never have touched this.

Arvid:

I probably would have wanted to try another kind of scaling rather than going to an Elasticsearch kind of thing. I was burned by the complexity of Elasticsearch about a decade ago. It must have been, like, 2015, 2016 when I experimented with that stuff. It was horrible. It always freaked me out.

Arvid:

It was such a complex and hard to compose way of querying a database. That's actually one of the reasons that I initially went with Meilisearch. It had more of an HTTP request style of search instead of this deeply nested JSON object stuff. But what I found was that now we have very capable agentic coding agents out there. And I could just tell the agent what I want the query to do and what I want the result to look like.

Arvid:

And the agent, having been trained on millions of documents, millions of lines of code containing Elasticsearch and OpenSearch queries, well, that agent understands what they do and what the results should look like. And it is so much more capable than I am of crafting not just the right query, I'll get there at some point, but also building the logic that would compose such a query reliably and testably. And this has been the game changer for me. I've been a big fan of agentic coding. Just listen to the last 10 or 30 or 40 episodes of this podcast if you want any evidence of that. But the quality of these systems isn't just in the code they write. It's also in the code I don't have to write. And maybe more importantly, the actual value is in the code I would never have written, or could never have written, or never would have wanted to write, but they can. This particular hindrance of mine, where I wouldn't even have touched Elasticsearch, would have kept me from building the system that is now running extremely smoothly and extremely reliably, is highly customizable, and powers not only search but now also internal reporting and analysis across the platform.
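
For a sense of what "deeply nested JSON" and "logic that composes a query" look like in practice, here is a small Python sketch that builds a query DSL body for keyword mentions. The field names (`transcript`, `podcast_id`, `published_at`) and the function name are hypothetical; nothing here reflects Podscan's actual index or code. Only standard query DSL constructs are used (`bool`, `match_phrase`, `terms`, `range`, `sort`).

```python
import json


def build_mention_query(keyword, podcast_ids=None, published_after=None, size=20):
    """Compose a query DSL body that finds transcripts mentioning a keyword."""
    filters = []
    if podcast_ids:
        # Restrict results to specific shows (hypothetical field name).
        filters.append({"terms": {"podcast_id": list(podcast_ids)}})
    if published_after:
        # Only episodes published on or after the given date string.
        filters.append({"range": {"published_at": {"gte": published_after}}})
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{"match_phrase": {"transcript": keyword}}],
                "filter": filters,
            }
        },
        "sort": [{"published_at": {"order": "desc"}}],
    }


body = build_mention_query("bootstrapped founder", podcast_ids=[42], published_after="2025-01-01")
print(json.dumps(body, indent=2))
```

Because the builder returns a plain dictionary, it can be unit tested without a running cluster, which is one way to read "reliably and testably".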

Arvid:

I just would not have done it if I hadn't been able to tell the agent to do it for me. So that's a big, big margin of improvement over what would have been impossible for me to build in the years before. The migration also gave me a reason to rework the filter and search interface of Podscan, which has been received very positively. It's more of a professional tool now than a quick lookup tool, because I was trying to kind of throw this sub-millisecond query speed thing into, like, a professional search mask, and that never really worked well. So if I ever want to build a simpler quick lookup experience, I can layer it on top with a simple query and a solid caching system, which is very nice at this point.

Arvid:

It's much more flexible. And outside of search, I've been building a lot of what I call semi automated systems. And I talked about this kind of automation before. The first 10% of the process is me, then there's 80% AI, and then the last 10% is me as well. Kind of 20% wrapped around 80%.
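
The 10-80-10 shape can be sketched as a three-stage pipeline in Python. Everything here is illustrative: the names `run_10_80_10`, `frame`, `automate`, and `review` are invented, and the lambdas stand in for a person and a model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Result:
    output: str
    approved: bool


def run_10_80_10(task: str,
                 frame: Callable[[str], str],
                 automate: Callable[[str], str],
                 review: Callable[[str], bool]) -> Result:
    """Human framing, automated bulk work, human sign-off."""
    brief = frame(task)        # first 10%: a person scopes the work
    draft = automate(brief)    # middle 80%: an automated step produces a draft
    approved = review(draft)   # last 10%: a person accepts or rejects it
    return Result(draft, approved)


# Stand-ins for the human and automated steps.
result = run_10_80_10(
    "verify publisher contact data",
    frame=lambda task: f"Brief: {task}",
    automate=lambda brief: f"Draft for [{brief}]",
    review=lambda draft: draft.startswith("Draft"),
)
print(result)
```

The point of the structure is that the automated step never ships anything on its own: its output always passes through the final review callable.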

Arvid:

And these systems are part of my preparation for being less hands-on in the day-to-day operations. I've noticed that it's getting to be a lot for me to do all the dev work and all the ops work, so I'm trying to automate certain things away. You know how it is. Every founder wants to be able to focus on things that take their full attention, not just write emails or sift through databases and records to figure things out. So the most helpful automation that I've built so far has been a targeted, mid-trial, AI-drafted outreach email.

Arvid:

It's a bit of a complex thing. It combines data about what the user has already accomplished in the application, the things that they've seen, things that they've tried, and it kind of congratulates them on having done this. Plus, and that's the important part, it gives them the next highest-impact step to take, and that's always been helpful. It's been quite useful for giving people insight into the full platform capabilities at scale, like with hundreds of new users a day, but also individually, because it looks at their full history and then kind of figures out how to do this. So I'm building a lot of similar systems: AI-assisted scraping and data lookup, data validation, data confirmation. The GPT models, particularly 5.2 and all the fives, like 5 mini and 5 nano, have gotten so good that they can reliably do 80% of the work that I used to do when it comes to data acquisition and verification. You can pretty much trust this, particularly if you use the web search feature in there or if you use scraping tools like Firecrawl to get data to verify what the AI has just presented to you.
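
The outreach email described at the start of this passage combines two inputs: what the user has already done and the next highest-impact step. A minimal Python sketch of that combination follows. The step names, their ordering, and the prompt wording are all hypothetical; the episode does not describe the real implementation, and the resulting prompt would still go to a model for drafting and to a person for review.

```python
from typing import Optional

# Hypothetical onboarding steps, ordered from highest to lowest impact.
STEPS_BY_IMPACT = ["create_alert", "run_search", "export_results", "invite_teammate"]


def next_best_step(completed: set) -> Optional[str]:
    """Return the highest-impact step the user has not completed yet."""
    for step in STEPS_BY_IMPACT:
        if step not in completed:
            return step
    return None


def build_outreach_prompt(name: str, completed: set) -> str:
    """Build a drafting prompt from the user's history and their next step."""
    done = ", ".join(sorted(completed)) or "nothing yet"
    step = next_best_step(completed)
    suggestion = step if step is not None else "no further step"
    return (
        f"Draft a short, friendly mid-trial email to {name}. "
        f"Congratulate them on: {done}. "
        f"Recommend exactly one next step: {suggestion}."
    )


print(build_outreach_prompt("Sam", {"run_search"}))
```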

Arvid:

You build your own little mini agents, really. That's what I've been doing. So if you're interested in hearing more about this and which of the systems have worked in which way, I can dive into the details in a future episode. Just let me know. Reach out on Twitter or send me an email.

Arvid:

I would love to know what you would be interested in, what you might be struggling with when it comes to automations. And that's the overall picture. Long-term investments in SEO, they're compounding right now. Data quality is improving through these integrations, automated integrations like OP3 and AI-assisted 80/20 validations where I still kind of drive the effort. Not everything is completely automated.

Arvid:

I don't trust AI that much. But agentic coding has helped me unlock capabilities that I would have never attempted on my own. And that's the Elasticsearch DSL. If you ever did this, like, ten years ago, you know how painful it can be. And even if you do it professionally now and have always done it, you probably know it's not that easy to understand.

Arvid:

It's not easy to build things around it. So having Claude Code do this for me, very cool. And then building semi-automated systems that are like agents that help me get to the 20% that I can do and not have to do the 80% that I don't want to do. Those are freeing up my time for higher-leverage work and non-work as well. So it's been a good year for Podscan's growth and the data fidelity, which also fuels its growth.

Arvid:

So what strikes me most is how these improvements kinda feed into each other. Right? Better data makes better search results. Better search results make happier users. They create more backlinks, and more backlinks improve domain authority.

Arvid:

And the cycle continues. Patience, it turns out, is a competitive advantage. So is being willing to embrace tools that let you build things you never thought you could. So like I always say, check out Claude Code. It really will change your development life.

Arvid:

That's it for today. Thanks for listening to The Bootstrapped Founder. You can find me on Twitter at arvidkahl, a r v i d k a h l. And speaking of Podscan, if you're a founder, a PR expert, or a marketing team wondering what people are saying about your brand, this is exactly what we built. Podscan monitors 4 million podcasts in real time, and we alert you when influencers or customers or competitors mention you.

Arvid:

It turns all this unstructured podcast chatter into competitive intelligence and even opportunities for PR and customer insights. Very helpful. Check it out at podscan.fm. And if you're searching for your own next venture, the next idea, well, take a look at ideas.podscan.fm, where an AI agent identifies startup opportunities from hundreds of hours of expert discussions every day so you can build what people are already asking for. Thanks so much for listening.

Arvid:

Have a wonderful day and bye bye.