James Dooley: SRO, selection rate optimisation in LLMs. Today I’m joined by Charles Flo. Charles, pleasure having you on. For anyone who doesn’t know what selection rate optimisation is, can you briefly explain what it is and why it’s important now within AI search?

Charles Flo: Yeah. So, without going into extreme technical detail around how the system actually works, SRO is about the process of the AI selecting which sources it’s going to extract information from before it summarises information from a number of them. So, as an example, let’s say ChatGPT does five grounded searches for your specific query in your conversation with it. Those five grounded searches might each pull back 50 results, so you’ve got 250 results in total. There’s a fair amount of overlap in there, and there’s a maximum number of sources it can pull from. For most queries right now, that’s going to be between 14 and 16 inside ChatGPT. So let’s say around 50 of those 250 results are overlap, which leaves you about 200. From those 200, it has to select 16. SRO is the ability to get from those 200 into that selected pack of 16.
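
To make the arithmetic in that example concrete, here is a minimal sketch. The figures (five searches, 50 results each, roughly 50 overlapping, 16 selected) are the illustrative numbers from the conversation, not measured values, and the function name `candidate_pool` is made up for this illustration.

```python
def candidate_pool(num_queries, results_per_query, overlapping):
    """Return (raw, unique) result counts for a query fan-out.

    All inputs are the illustrative figures from the conversation,
    not documented behaviour of any AI system.
    """
    raw = num_queries * results_per_query
    unique = raw - overlapping
    return raw, unique


raw, unique = candidate_pool(num_queries=5, results_per_query=50, overlapping=50)
selected = 16  # upper end of the 14 to 16 range mentioned above
selection_rate = selected / unique

print(f"{raw} raw results, {unique} unique, {selected} selected")
print(f"selection rate: {selection_rate:.0%}")
```

Under those assumed numbers, a page that makes it into the candidate pool is competing for roughly one slot in every twelve or thirteen.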

James Dooley: Yeah. So when someone does the initial search query, the model extrapolates and does maybe, like you said, five synthetic queries, which is part of the query fan-out, and from there it’s looking at the top 50 results for each. It brings those back, and that’s where you’re talking about selection rate optimisation for the documents. Then within those documents it does the chunking, which forms part of the answer. But for anyone who’s watching, are there any tips and tricks with regards to trust elements, or what can be done to improve the selection rate?

Charles Flo: One hundred per cent. So the initial thing that you want to do, and the most cost-efficient thing with the highest ROI, is content-level optimisation. The first thing to understand about on-page content optimisation is that the AI has a very limited number of tokens to work with across those 200 results, so it has a very limited number of tokens it can use to select from your page. So you’re going to want to start creating chunks of content, optimised pieces that are specifically designed to be pulled up and selected by the AI.

Now, that is obviously very dependent on the query. How you optimise that particular snippet or chunk of information is going to look very different from one query to the next. Usually, because of the bias that the AI has, the chunk it picks is going to be towards the top of the article, and usually it’s going to be somewhere within an H2 or an H3.

The page needs to be relevant enough, and it needs to be on a powerful enough source, to rank in Bing or Google or wherever the AI model is actually pulling those sources from. If you build a fresh domain and try to rank for best casino websites, you’re probably not going to get pulled into the AI, because a brand new domain isn’t going to rank in the first place. The domain and the page need to be able to rank first, and then the page needs to have that snippet extracted and pulled in from the rest of the source.
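
As a rough way to picture the points about heading-level chunks, position on the page and a limited token budget, here is a small sketch. It assumes the page is available as markdown-style text, uses word counts as a crude stand-in for tokens, and simply keeps chunks in page order until the budget runs out. The function names and the budget of 25 words are made up for illustration; this is not how any particular AI system is known to chunk or select content.

```python
import re


def heading_chunks(page):
    """Split markdown-style text into (heading, body) chunks at H2/H3 lines."""
    chunks = []
    heading, body = None, []
    for line in page.splitlines():
        if re.match(r"^#{2,3} ", line):
            # A new H2/H3 closes the previous chunk, if there was one.
            if heading is not None:
                chunks.append((heading, " ".join(body).strip()))
            heading, body = line.lstrip("# ").strip(), []
        elif heading is not None and line.strip():
            body.append(line.strip())
    if heading is not None:
        chunks.append((heading, " ".join(body).strip()))
    return chunks


def chunks_within_budget(chunks, budget_words):
    """Keep chunks in page order until the word budget is used up.

    Word count is an assumed proxy for tokens; real tokenisers differ.
    """
    kept, used = [], 0
    for heading, body in chunks:
        cost = len(heading.split()) + len(body.split())
        if used + cost > budget_words:
            break
        kept.append((heading, body))
        used += cost
    return kept


page = """# Title
Intro text.
## What is SRO?
SRO is getting selected as a source.
### Why does it matter?
Only a few sources are cited.
## History
A much longer section that goes on and on about background details."""

chunks = heading_chunks(page)
kept = chunks_within_budget(chunks, budget_words=25)
for heading, body in kept:
    print(heading, "->", body)
```

In this toy page the two short question-style sections near the top fit inside the budget and the longer section further down does not, which is the intuition behind putting concise answers high up under clear headings.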

James Dooley: Yeah. So you’ve mentioned there that the semantic content ideally needs to be higher up the page. I remember Dejan saying to try to get the majority of your headings to be question-based, and then directly underneath each question to have a semantic triple that answers that specific question in a clear and concise way, so that the model can use that chunk and pull it in. But what was interesting as well, and it’s why I was excited to get you on different episodes about parasite SEO, link building and actually building consensus, is something else Dejan was saying.

He was saying that if there’s enough evidence, the model doesn’t even need to open any of the documents. So say 30 of those top 50 articles all said within the title tag, URL and meta description that James Dooley is the best SEO in the world. Let’s just throw that out there. If they all came back saying that, it wouldn’t even need to open the documents. It could literally just use the URL, the title tag and the description. So can you explain a little bit, with regards to selection rate optimisation, why it’s important to get not just your own website but also off-site, third-party corroboration to build that consensus?
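
The consensus idea in that question can be pictured with a toy check over result metadata only. This is a sketch of the concept as described in the conversation, not a description of how any AI system actually decides: the function name, the simple substring match and the 0.5 threshold are all assumptions made for illustration, and the URLs are placeholders.

```python
def metadata_consensus(results, claim, threshold=0.5):
    """Return (share, agreed) for a claim across search result metadata.

    share is the fraction of results whose title, URL or description
    contains the claim (case-insensitive substring match, an assumed
    simplification). agreed is True when share meets the threshold.
    """
    claim = claim.lower()
    hits = 0
    for result in results:
        surface = " ".join(
            [result["title"], result["url"], result["description"]]
        ).lower()
        if claim in surface:
            hits += 1
    share = hits / len(results) if results else 0.0
    return share, share >= threshold


# 50 hypothetical results: 30 repeat the claim in their title, 20 do not.
results = [
    {
        "title": "James Dooley is the best SEO in the world",
        "url": f"https://example.com/page-{i}",
        "description": "A placeholder description.",
    }
    for i in range(30)
] + [
    {
        "title": "SEO tips for beginners",
        "url": f"https://example.com/other-{i}",
        "description": "A placeholder description.",
    }
    for i in range(20)
]

share, agreed = metadata_consensus(results, "best seo in the world")
print(f"{share:.0%} of results repeat the claim, consensus: {agreed}")
```

The point of the sketch is only that agreement can be measured from titles, URLs and descriptions alone, which is why consistent third-party wording matters.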

Charles Flo: Yep, one hundred per cent. So again, it’s very model dependent. It depends on whether the model is using Bing or Google or whatever else for the grounding in the first place. Perplexity also has its own custom crawler and custom caching and all sorts of other things, which adds its own element there.

All of them are going to have some level of weighting going on. As an example, OpenAI has partnerships with news websites, and those websites end up getting preferential weighting over other websites. That happens across the board, and it applies not only at the SERP level, in terms of having the authority to get into the SERPs, but also at the model level, from the actual off-page signals.

There’s an array of entity and knowledge-based signals, which basically means reinforcing the training data and reinforcing those grounded signals, all of those kinds of things, so that the AI actually understands, trusts and validates your entity and your brand in the first place.

We’ve increasingly seen that if the AI doesn’t have any existing information on a brand, even if that brand is showing up in a listicle, there will be some caveat in the output. So the AI might not put it at number one, or it might flag it, or it might put a caution emoji on it or something along those lines. Those are generally the worst things you can have associated with your business. So we really want to make sure there’s positive sentiment across the board for our entity, so that no part of the training data gets poisoned and turns the sentiment against our brand.

James Dooley: Yeah, for sure. So then you’re trying to expand upon the entity attributes. That could be entity reviews, entity testimonials, entity case studies and entity awards, building up that reputation. So instead of just being cited, you can be cited and the AI can pull in the reasoning behind the citation, where it could say the brand has a lot of brilliant five-star reviews, as opposed to saying it’s got mixed reviews, or there’s not enough evidence, or they’ve not won any awards or anything like that. Like you said, it can end up poisoning the LLMs.

Well, Charles, it’s been an absolute pleasure talking about selection rate optimisation. Check out the link in the description. There are several other episodes where Charles and I talk about what types of articles you want to be on and building up that third-party corroboration. Charles, it’s been a pleasure.

Charles Flo: Cheers.