Postgres FM

Nikolay and Michael have a high-level discussion on all things search — touching on full-text search, semantic search, and faceted search. They discuss what comes in Postgres core, what is possible via extensions, and some thoughts on performance vs implementation complexity vs user experience.

 
What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Creators & Guests

Host: Michael Christofides, founder of pgMustard
Host: Nikolay Samokhvalov, founder of Postgres AI

What is Postgres FM?

A weekly podcast about all things PostgreSQL

Michael: Hello and welcome to PostgresFM,
a weekly show about

all things PostgreSQL.

I am Michael, founder of pgMustard.

This is my co-host Nikolay, founder
of Postgres.AI.

Hello Nikolay, what are we talking
about today?

Nikolay: Hi Michael, let's talk
about search at a high level.

But before we proceed, let me share a few thoughts about the news we got just a couple of hours ago, about Simon Riggs.

We are recording this on Wednesday,
March 27.

And Simon Riggs just passed away.

I remember him as a very bright
mind.

I remember he was not an easy person
to deal with, obviously.

I remember it took something like 100 emails to convince him to come to Moscow to speak at a conference.

Many people were involved, but eventually
he did, he came and it was

a great talk. But the work he did, and in general... yeah, it's a big loss, obviously, for the Postgres community.

So yeah, condolences to family,
friends, and co-workers, ex-co-workers,

and so on.

And Simon built a lot of things, and he was quite brave to attack very complex topics in the PostgreSQL system in general, right, in the core of PostgreSQL, in the engine itself.

For example, point-in-time recovery,
things related to replication.

Many achievements were made by
Simon or involving Simon.

So it's a big loss, definitely.

Michael: Yeah, over many years
as well, right?

I actually had the opportunity to meet him a couple of times at London events and heard him speak.

And not only was he a great contributor
to the code base, but

I was amazed at how he was able
to communicate and educate and

also engage in community building, right?

Like he was involved in organizing
a lot of events, especially

in the UK, growing companies and
a lot more around the ecosystem

as well.

Nikolay: I must say, I remember very well this look in Simon's eyes, which had some sparkles in it.

And I remember the very first conference on the American continent I attended, in 2007, speaking with Peter Eisentraut.

I was a baby, actually; I was involved in implementing some parts of the XML functions and type in PostgreSQL.

I remember Simon looking directly at me with those sparks and asking: what's your next thing to build in PostgreSQL?

I was caught off guard and didn't answer anything, actually.

And yeah, so this is what I remember
about Simon, this, this look,

and courage. Do I pronounce it right?

Courage, yeah.

Courage, yes, he obviously had big
courage, huge.

Michael: So.

And the ability to silence Nikolay,
that's quite the...

Nikolay: Well, yeah, yeah, yeah.

It's interesting.

Yeah.

Yeah, so it's sad, very sad. So yeah.

Michael: Yeah, absolutely. Condolences
to everybody who knew him,

and worked with him. I don't really
know how to move on from that,

actually, but we were going to
talk

Nikolay: about search.

Let's, after this small break, return to the search topic.

And it's a very wide topic, and I guess we just want to touch on it a little bit today, at a very high level, right?

Michael: Yeah, well, it's amazing
if you consider all the things

Postgres is used for.

Search is one of the top use cases, but looking back at our episodes, we've only touched on it a few times, like when we've looked at using external databases, or, I've forgotten what somebody called them, actually, like a second type of database, like a partner database or something like that.

So we've touched on it a few times, but we haven't done anything on full-text search, we haven't done anything recently on semantic search; we've done a pgvector episode and a few related-ish subjects, but it crossed my mind that we hadn't really covered this as a topic.

And obviously, it's one of those subjects
that the more you learn,

the more you realize you didn't
know, or the more complicated

it gets.

So I can imagine us doing quite
a few follow-ups on the more

specifics or, you know, implementation
details or the tricky

parts.

But yeah.

Nikolay: Don't forget faceted search.

Michael: Faceted, yeah.

I could be saying it wrong, but I think I've heard it called faceted, with, like, Roaring bitmaps and things like that.

Nikolay: Well, usually we start from the UI; in my head, this starts from the UI.

We have a big form consisting of multiple, very different selectors and filters and so on.

This is very common in various
marketplaces.

For example, imagine Airbnb, you
want to limit price and location,

and various categories and properties
and so on.

And let me put this on the table
right away.

Ideally, we should always have just a single index scan or index-only scan.

This is our ultimate goal always,
but unfortunately, it's not

always possible.

Why do we need it?

Because it's the best plan, and I think you can... like, do you agree or not?

Because you deal with plans

Michael: all the

Nikolay: time, with pgMustard, EXPLAIN plans and so on.

Single index scan is the best.

Michael: One of the things I love
about databases and performance

in general is when it gets to the
point where you have to trade

one thing off against another.

And I think search is one of those
topics where often we're trading

off user experience versus complexity
of the back end.

Some of the nicest search features
are just a search bar.

But without getting any input from
the user, you have to do a

lot of work on the back end to be able to serve that in any kind of performant way.

So you've got the complexity
of matching

the results to the intent, with
the additional caveat that you

want it to give some, at least
some results, good results quickly.

And that's a trade-off.

Like it's really easy to get, well
not easy, but you can give

great results if you can search
through everything you have and

score everything and if you've
got forever to return them, but

if you've given yourself a budget
of returning within a few hundred

milliseconds, suddenly that becomes
a more difficult problem.

So I love that it's a trade-off
and we're trading off user experience

with resources, with technical
complexity on the back end.

So I think it's one of those topics
that there are trade-offs and

yes, one of those is performance,
but sometimes I think you are

willing to pay a little bit of
performance for a better result.

Nikolay: Yeah, this is difficult,
I guess.

What are better results, and what is high-quality search, right?

Like, I remember a definition that
users should be happy, which

is very broad.

What makes users happy?

Maybe we return good results but
UI is very bad, so they are

not happy, right?

Like, it's quite an interesting
topic.

And I think you're right, but also
like I just dove into the

very bottom of performance.

Performance matters a lot, right?

If the search is very slow, the users
won't be happy and it means

poor quality of the search, right?

So we do care about performance,
but also we do care about things

like if it's a full-text search,
we want stop words to be removed

and ignored, we want some dictionaries
to be used, maybe synonyms

to be applied, and so on and so
forth, right?

This matters a lot, and of course this also moves us to

the performance part, because if
these steps are slow, it's also

bad.

Why was I mentioning faceted search?

I just see a common pattern.

Postgres is huge in terms of capabilities
and extensibility and

various index types, extensions.

But we have simple problems unsolved.

For example, take full text search
and order by timestamp or

ID.

Instead of the old-school, regular approach of returning the most relevant documents to me, I want fresh documents to go first, because it's social media, and this is the number one pattern.

But they also need to match some full-text search query I used.

I just need to see the latest entries matching some text patterns.

And this problem is unsolved in
Postgres, unfortunately.

And the best attempt to solve it is called the RUM index, which is an extension, like a new generation of the GIN index.

But why isn't it in the core?

Because it has issues.

It's huge, it's slow, and so on.
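
For context, a rough sketch of what RUM enables (the table and column names here are hypothetical; consult the RUM documentation for the exact options):

~~~sql
-- Hypothetical schema: a RUM index storing a timestamp alongside the
-- tsvector, so "match this query, newest first" can be one index scan.
CREATE EXTENSION rum;

CREATE INDEX messages_fts_rum ON messages
    USING rum (fts rum_tsvector_addon_ops, created_at)
    WITH (attach = 'created_at', to = 'fts');

-- <=> is RUM's distance operator; ordering by distance to a timestamp
-- far in the future returns the freshest matching rows first.
SELECT id, created_at
FROM messages
WHERE fts @@ to_tsquery('english', 'subtransaction')
ORDER BY created_at <=> '3000-01-01'::timestamptz
LIMIT 20;
~~~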

And I now observe similar things; not only observe them, I touch them.

For example, just before we started
recording, you showed me

the Supabase blog post about
how to combine full-text search

and semantic search based on embeddings.

I don't like the word embeddings,
I like the word vectors.

Because embedding, in my opinion, as a term in the database, doesn't settle in my mind at all.

Embedding is what we embed into our
prompt.

This is content.

But a vector is a vector.

Maybe I missed something, but why
do we call vectors embeddings?

Honestly, in our database, this
column is called embeddings just

because OpenAI dictated it.

But I also see OpenAI's Assistants API naming state machine states as statuses, which is also, like, what's happening there?

A status is a state, a state machine state, like in progress, function call, etc.

So, but it's off-topic.

So these vectors, they provide
us a great capability to have semantic

search and we have text search
and the Supabase article describes

how to combine them.

But basically we perform 2 searches and then, like, merge the results.
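
As a rough illustration of that two-search merge (hypothetical schema; reciprocal rank fusion is one common way to combine the two ranked lists, not necessarily exactly what the article does):

~~~sql
-- Run a full-text search and a vector search separately, then merge
-- the two ranked lists with reciprocal rank fusion (RRF).
WITH lexical AS (
    SELECT id, row_number() OVER (ORDER BY score DESC) AS r
    FROM (
        SELECT id,
               ts_rank(fts, websearch_to_tsquery('english', 'postgres search')) AS score
        FROM documents
        WHERE fts @@ websearch_to_tsquery('english', 'postgres search')
        ORDER BY score DESC
        LIMIT 50
    ) top_fts
),
semantic AS (
    SELECT id, row_number() OVER (ORDER BY dist) AS r
    FROM (
        SELECT id,
               embedding <=> '[0.01, 0.02, 0.03]'::vector AS dist  -- toy query vector
        FROM documents
        ORDER BY dist
        LIMIT 50
    ) top_vec
)
SELECT id,
       coalesce(1.0 / (60 + lexical.r), 0) +
       coalesce(1.0 / (60 + semantic.r), 0) AS rrf_score
FROM lexical FULL OUTER JOIN semantic USING (id)
ORDER BY rrf_score DESC
LIMIT 20;
~~~

Note that the merged list has to be recomputed for every page, which is exactly where the pagination pain below comes from.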

It means we cannot do pagination.

Pagination is very important.

Maybe a user needs to go to the second
page, third page.

In quality search engines, they
do need it.

And in this case, it means that
it's similar to the offset problem

we described.

Michael: That's the only solution,
I guess,

Nikolay: is pagination through
offset.

Michael: So far, yes,

Nikolay: but maybe it's possible
to combine something, right?

Honestly, GiST is also about multidimensional things, and, like, I don't know, I don't know.

It also has KNN, I don't know; I know only parts of things here.

But, like, I don't like that we do separate index scans and then combine things and we lose pagination.

I mean, we can have pagination, but if we want to go to page number 100, it's insane how much data we need to fetch, and BUFFERS will show very bad numbers in EXPLAIN (ANALYZE, BUFFERS).

It means it's not working well.

And a different example, sorry, I will finish my complaining speech.

So, a different example is what we
have right now.

In our bot we imported more than
900,000 emails from some 6 mailing

lists, 25 years of them.

So we have more than 1 million
documents.

And of course, immediately... Before, we had imported only documentation, source code, and blog posts into the bot's knowledge base.

And almost all of them were relatively fresh.

But when we imported 25 years of
mailing list archives, I'm

asking, hey bot, what can you explain
to me about sub-transactions?

Okay, this is documentation, my
article, but also this is a very

good email from Bruce Momjian from
2002.

And it went to first place.

It's not good.

We need to...

Basically we need to take into
account the age of the data here,

right?

How to do that?

There's no good way.

If you work with pgvector, there's no good way to deprioritize

old documents, to take into account
the age of the data.

So what we did: usually we need to find like 10, 15, 20 entries, maximum like 100 entries, and embed them into the prompt.

So what we do is find a thousand entries, and then, just in memory, Postgres recalculates an adjusted similarity, an adjusted distance, based on a logarithm of the age.

And this is how we do it.
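
Roughly, as a sketch (hypothetical names; the 0.05 weight and the exact formula are arbitrary, whatever suits your data):

~~~sql
-- Fetch a larger candidate set by plain vector distance, then re-rank
-- it in the same query with a logarithmic age penalty.
WITH candidates AS (
    SELECT id, title, created_at,
           embedding <=> '[0.01, 0.02, 0.03]'::vector AS distance  -- toy query vector
    FROM documents
    ORDER BY embedding <=> '[0.01, 0.02, 0.03]'::vector
    LIMIT 1000
)
SELECT id, title,
       -- age in days, dampened by ln(); 0.05 is an arbitrary weight
       distance * (1 + 0.05 * ln(1 + extract(epoch FROM now() - created_at) / 86400))
           AS adjusted_distance
FROM candidates
ORDER BY adjusted_distance
LIMIT 20;
~~~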

If nothing new, okay, we are satisfied
with old documents.

But if there are...

So we take into account the age, right?

But again, this doesn't scale well.

Like, if we have a lot of documents, like 10 million, it will be worse.

And also we cannot have pagination
if we talk about search here.

Kind of a similar problem as well.

And this makes me think, great
that we have extensibility, but

these types of searches are so, like,
so different.

We have, like, what is the name
of when different things are

combined?

So it means that it's hard to build a good system which works...

Heterogeneous.

Yes, heterogeneous.

This word, I know how to spell it, but I cannot pronounce it, because I've seen it many times in papers, in scientific and technical papers and so on, but yeah.

I'm not

Michael: even sure I know how to
pronounce it.

Heterogeneous or something like
that?

Nikolay: I cannot pronounce it
in Russian, sorry.

So, what I'm trying to say: we are kind of at the early Linux stage. You need to compile a lot of drivers, like, and deal with it to make the system work as you want, like, as a good product, right?

Compared to something like Elasticsearch, where you take it and things work together very well because it's a single product. What do you think about it?

This is a problem.

Extensibility has a negative side here.

Michael: I think you've jumped
straight to, like, where are the

limits of Postgres' search capabilities
right now?

And that’s a really interesting
topic and quite deep already.

But it skips over all the things
you can do already in Postgres.

And there are a ton of different
inbuilt things or add-on modules

or extensions that mean that those
limits are being pushed further

and further.

And I think a lot of people come
from an assumption that Postgres

won't be able to handle search
super well because products like

Elasticsearch exist and are successful,
and therefore probably people

aren't doing this in the database,
but I see a lot of use cases

that can be served adequately with
good results in acceptable

response times for users without
touching any external services.

So I think you're right that there
are edges and there are limits

that can be better served by other
products, but those limits

are quite far down the road for
a lot of use cases.

You can build pretty good search
features for a lot of different

use cases, especially if you're
willing to learn exactly how

it works and factor in your own product's or service's requirements.

If you're not just searching every
field for every word or like

I'm assuming like a text search
type field, it can be really powerful

already.

Nikolay: Yeah, I agree, I agree,
but yeah, well.

Michael: Can we talk about some
of them quickly, like just to

cover?

Nikolay: Let's talk about them.

I agree, and you're like basically
echoing the usual problem I

have.

I've had cases where people listening to me said I'm a Postgres hater, right?

So again, of course, this criticism
goes quite deep, and of course

I don't like the idea of having Elasticsearch for full-text search, and the need to constantly synchronize, or maybe some...

What's the name of these new vector
database systems?

Pinecone or something like that?

So you basically need to synchronize
data from your main OLTP

database all the time and you have
a lag and then you bring some

regular data there and you think
how to combine and search that

data because obviously for Elastic
you need to not only bring

textual data but you need to bring
categories to have the same

faceted search.

Sometimes people want, like, I
want to do full text search, but

again, limit price, right?

Range, some range, and this usually
is stored in a regular column

in the relational database.

And of course, we have good capabilities to combine it with full-text search and achieve it in a single index scan.

For example, if you use a GiST index. Well, GiST is slower; it works well for smaller datasets.

But you combine it with btree_gist or btree_gin, right?

Or...

Michael: B-tree/GiST, I think.

Yeah.

Nikolay: Right.

So, and then you have a capability
to combine both full-text

search and numeric range filter
and have a single index scan.

This is perfect.
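
A minimal sketch of that combination, assuming the btree_gist extension (the schema is hypothetical):

~~~sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE listings (
    id     bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    title  text NOT NULL,
    price  numeric NOT NULL,
    fts    tsvector GENERATED ALWAYS AS (to_tsvector('english', title)) STORED
);

-- One composite GiST index over the tsvector and the numeric column...
CREATE INDEX listings_fts_price ON listings USING gist (fts, price);

-- ...so both conditions can be served by a single index scan:
SELECT id, title, price
FROM listings
WHERE fts @@ websearch_to_tsquery('english', 'cozy cabin')
  AND price BETWEEN 50 AND 150;
~~~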

Again, I'm staying on the same
point.

Single index scan is the best.

But unfortunately in many cases
we cannot achieve it.

Ideally, the user types something, chooses something on the form, presses search, or maybe it's automated; I don't like automated, I like to press search explicitly.

Anyway, we have a request, and this request translates to a single index scan, and we return results.

And this is an ideal case in terms
of performance.

Otherwise, for bigger data sets,
you will have very bad performance.

Michael: Well in terms of performance
but also in terms of system

resources, right?

Like, we're also not having to
use a lot of system resources

to satisfy quite a lot of searches.

Whereas a lot of the alternatives
require, well because they're

not just one index scan, require
more resources as well.

So I think it's efficient from
a couple of angles but it very

much limits what the user can search
for.

If it has to be indexable that way, some other searches wouldn't be possible.

Like, I don't know about you, but, to give people a bit of an insight into how we do this, we agreed on a topic about 24 hours ago, and in every product I've used since, I've been thinking: how does search work here, exactly?

And it's really interesting how
different products implement

it.

And not everyone does it the same.

And we've been somewhat spoiled as users by Google, in my opinion, Google and Gmail, both of which have had incredibly good search features for quite a long time.

And most people have experienced
those.

But it isn't the same in every
product.

Not every product is firstly capable
of doing that.

But also it is not quite the right
trade-off for a lot of products

either.

So like a lot of products you use,
a lot of products I use, things

like Slack, for example, or Stripe,
they will encourage you to

use filters.

They let you type whatever you
want in, and they will perform

a wide search, depending on whatever
you type.

But they encourage the use of, for
example, in Slack, search within

just 1 channel, or just from 1
person, or things that filter

it right down to make those searches
much more efficient.

So it's interesting that they're doing that partly, I guess, to put the results you're looking for high up, but also, I guess, to reduce system use; they don't have to do as much work if you filter it down for them.

Nikolay: All right.

Michael: So I think there are a few things that beginners, or at least me when I was a beginner and didn't know quite how this stuff worked... I don't think I fully appreciated the complexity of doing search well.

So there's the basics.

When we say full-text search, by
the way, I never really understood

what the word full is doing in
there.

It's basically just text search.

Does this document or does this
sentence or something contain

this word and this word or this
word or this word and so basic

kind of text-based searches.

I don't know why it's called full,
do you know?

Nikolay: No, good question.

I think, right, so I understand the difference.

You can compare the whole value, which in some cases is reasonable, but you'll be dealing with the problem of the size of this value, right?

And then you can...

Michael: But there are, like, a million complexities even if we only consider that, like: should you care about case, do capital letters matter?

Nikolay: So, full-text search is
a very, like, well-established

area already.

Yes.

And where, instead of comparing the whole value or using some mask or regular expressions, right, which is also an interesting and related topic, like GIN can be used, trigram search can be used, right?

Michael: Yes.

Nikolay: Instead of that, we consider
words, like, first of all,

we, as usual, have this problem
with First Normal Form, but it's

off topic, right?

Because this value is not atomic
anymore, right?

We consider each value as...

We have atoms in this molecule,
right?

Yeah.

So, first of all, we usually either normalize words using a stemmer, the snowball stemmer, right? Or we use some dictionary to find kind of...

Michael: Synonyms or...

Nikolay: No, No, synonyms is one
thing, it's also a synonym dictionary

that can be used, but I'm talking
about Ispell dictionaries,

for example, when you take...

Oh, yeah.

When you take a word and...

Stemmers are very dumb; they just cut the ending, that's it.

You can feed them some new word and they will cut it according to some rule.

But Ispell is a dictionary: it knows the language, it knows the set of words, and it can transform words in different forms to some normalized form; it normalizes every word, right?
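
A quick illustration of that normalization (the snowball stemmer strips suffixes by rule, whereas an Ispell-style dictionary knows actual word forms):

~~~sql
-- to_tsvector runs the full pipeline: tokenize, drop stop words, stem.
SELECT to_tsvector('english', 'Running quickly through the databases');
-- => 'databas':5 'quick':2 'run':1

-- ts_lexize shows what a single dictionary does to a single word:
SELECT ts_lexize('english_stem', 'running');  -- {run}
SELECT ts_lexize('english_stem', 'stars');    -- {star}: a rule, not knowledge
~~~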

And then, basically, we can build a tree and use GiST, Generalized Search Tree, which can be used even for B-tree or R-tree.

B-tree is one dimension, R-tree is 2 or more dimensions, and the R-tree is based on GiST in Postgres, because the implementation based on GiST was better than the original implementation, which was not the case for B-tree.

B-tree remained a native implementation, but there is a GiST implementation too; that's why I already mentioned it, right?

And then, so, tree, right?

Great.

So you can find the entries which
have your words, but also you

can...

There's an inverted index in this, GIN, right?

And GiST actually...

Oh, GiST, I didn't mention.

So, B-tree is one dimension, so just
one axis, R-tree is two or more dimensions,

and you can build trees, for example,
rectangles in two dimensions.

But what to do with text?

Because it has a lot of words.

We can consider words as kind of
array or set, right?

And then we can say, this set contains
that set, right?

Or is contained.

So we define operators: intersects, is contained, contains.

And in this case, we talk about
sets.

And we can still build the tree
based on GiST.

There are 7 support functions you need to implement to define the operations, and, like, basically, for sets we can build a tree, and it's actually called an RD-tree, Russian Doll tree.

That's its official name from the Berkeley paper.

We can attach the link to it.

And this is how full-text search was originally implemented, based on GiST.

But later GIN was also implemented, which is the Generalized Inverted Index, which works much better for very large volumes of data, and this is what search engines use.

So it's basically a list of terms, with links to the documents in which each term is mentioned, and then there are internal B-trees to find each term faster.

I think there are 2 kinds of B-tree inside GIN, but that's implementation details.

So, in general, it means that we
can say, okay, these words are

present in this document.

And we can find them very fast, and we can also order by, like, rank.

Rank is an interesting thing.

It's calculated based on relevance; like, for example, words being mentioned more in this document, right?

Do we have phrase search?

Can we do double quotes?

Yeah, we have also some...

And or some...

We can write some formulas, right?

There's like

Michael: Followed by, yeah, you can do, like, followed-by for phrase search; but also, we have some data types and loads of functions that are really helpful for doing this without...

Nikolay: And 4 categories, right?

A, B, C, D, right?

We can, you can... like weighting.

Michael: What do you mean by categories?

Nikolay: I don't remember.

I remember that when you define a GIN index, you need to convert data using to_tsvector, so you convert to the special text search vector type, tsvector, and you can say that some parts are considered one category, some parts different categories; there is a maximum of four categories.

And when you search, you can say,
I'm searching within only a

specific category.

It means you can build one tsvector, but, if you're indexing emails, for example, you can take words from the subject and mark them as category A, right?

But the body is B.

And then you have the freedom and flexibility
to search globally,

not taking into account the origin
of the words.

Or you can limit it, inside the same GIN index, the same search, saying: I'm searching only inside the subject.

So you can mark these parts of the tsvector, which is good, but four is not enough in many cases.
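
In SQL this looks roughly like the following (hypothetical emails table; the labels are called weights A through D):

~~~sql
-- Build one tsvector with subject words labeled A and body words B:
UPDATE emails
SET fts = setweight(to_tsvector('english', subject), 'A') ||
          setweight(to_tsvector('english', body), 'B');

-- Search everywhere:
SELECT id FROM emails WHERE fts @@ to_tsquery('english', 'vacuum');

-- Or restrict the same GIN-indexed search to the subject only,
-- by pinning the query term to weight A:
SELECT id FROM emails WHERE fts @@ to_tsquery('english', 'vacuum:A');
~~~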

So there are many capabilities
in Postgres systems.

Michael: Yes, and you don't have
to build that much around it

to get something pretty powerful
out.

And one thing I learned about relatively
recently from a blog post

was websearch_to_tsquery.

So this tsquery is the query representation, and the tsvector is the vector representation, like, normalized; once you've taken each word and normalized it, like, taken plurals out and things like that.

Yeah, websearch_to_tsquery means you can
achieve a query that's a bit like

you might imagine a fairly complex
search engine search, like

taking the not operator and saying
I don't want documents that

include this word, or using AND and OR type operators as well.

So it could let you build something
that has some basic search

engine-like features built into
the text search field without

much work at all.
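
For example (the output shown is approximate):

~~~sql
-- websearch_to_tsquery (in core since Postgres 11) accepts quoted
-- phrases, OR, and a leading minus for negation:
SELECT websearch_to_tsquery('english', '"index scan" -elasticsearch');
-- => 'index' <-> 'scan' & !'elasticsearch'
~~~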

And these all come in Postgres core.

You don't even need an extension for this lot.

Which is pretty cool.

But yeah, I was going to move on.

I think obviously that contains
loads of complexities and helps

you solve some of the trickier
things immediately.

But there are also relatively built-in modules for fuzzy search.

So handling typos elegantly.

The kind of things that just start
to get a bit more complicated.

Do you want to be able to match
on people nearly getting the

name right?

And not all products do, but it's
pretty common that we do want

to.

And when I first came across trigram search, or the pg_trgm extension, I was blown away by how elegant a solution it is to typos.
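
A small sketch of that (hypothetical table; pg_trgm splits strings into three-character chunks and compares the sets):

~~~sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- A trigram GIN index accelerates similarity search (and LIKE/regex):
CREATE INDEX people_name_trgm ON people USING gin (name gin_trgm_ops);

-- % is the similarity operator, <-> is trigram distance, so a typo
-- like 'Cristofides' still finds the closest real names:
SELECT name, similarity(name, 'Cristofides') AS sim
FROM people
WHERE name % 'Cristofides'
ORDER BY name <-> 'Cristofides'
LIMIT 10;
~~~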

So were you not as impressed?

Like I was thinking it's so simple
and it works so well.

Nikolay: Trigrams.

Well, let me disagree with you.

I use it many times.

Michael: Go on.

Nikolay: For many, many years,
and I cannot say it's elegant,

because it requires, first of all,
it requires a lot of effort.

It's not just saying, I want regex here, that's it.

No, there are things you need to do.

And also at really high volumes,
it doesn't work well in terms

of performance.

Michael: Sure.

Nikolay: And if you have a lot of updates coming, you will bump into this dilemma: GIN with fastupdate or without it, and the pending list, which is by default 4 megabytes, right?

And during a regular SELECT, Postgres decides the pending list needs to be processed, the SELECT times out, and then you need to tune it.
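
The knobs involved look roughly like this (the index name is hypothetical; defaults are worth double-checking for your version):

~~~sql
-- Disable the pending list entirely, trading slower writes for
-- more predictable reads:
ALTER INDEX messages_fts_gin SET (fastupdate = off);

-- Or keep it but change the threshold (default 4MB):
SET gin_pending_list_limit = '16MB';
~~~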

Well, I still have a feeling it
could be better.

Michael: No doubt.

And I guess this is one of the examples
of can work really well

for a while and once you hit
a certain scale, maybe an external

system is beneficial.

But yeah.

Nikolay: Yeah, because it's more polished.

But also, I think there is an opportunity for newer Postgres companies to improve this and offer a more polished version, when you develop some product on top of Postgres, right?

Michael: Yeah, and there are some
startups, right, that are quite

focused on search, or at least
what, is it ParadeDB that are

looking into this stuff?

And then there's also one other worth mentioning; I would be ashamed to finish this episode without mentioning them.

Yes, the ones who do synchronization, who take the hard parts out of synchronizing Postgres with Elasticsearch.

Nikolay: Yeah, that's a good tagline.

Also, I wanted to mention, sorry,
I was trying to find the book

I recently bought.

For specialists, it should be a super important book, I guess, and I need to thank Andrei Borodin, as usual.

This guy helps me a lot with Postgres
in general.

So this book is called Introduction
to Information Retrieval

by Christopher Manning, Prabhakar
Raghavan, I'm sorry, and Hinrich

Schütze.

So this book is interesting, and this is exactly where I saw this definition: what is good quality? Users are happy.

And then a lot of formulas, right?

It's interesting.

Michael: Well, and it's a moving
target.

One of the good blog posts I re-read
as part of doing research

for this was by Justin Searls
about a Ruby implementation and

he made a really good point at
the end about it being a moving

goalpost.

So users might be happy with waiting
a few seconds 1 year and

then 5 or 10 years later that may
not be considered good enough

anymore because they can use other
search engines that are much

faster.

Or your data volumes grow.

You talked about, you know, if
your implementation relies on,

well, volume might change, but
also patterns might change.

And you might find it's harder
to provide as good results as

your data changes or as your users
change in their expectations.

So it's kind of a moving goalpost
constantly as well.

So not only might the same results not be good enough 10 years later, but also, yeah, it's a tricky one; I think user happiness is one good metric, but Google also uses lots of metrics like...

Nikolay: Yeah.

I know what I think about.

Michael: Go on.

Nikolay: So, since we're talking about semantic search, which supports some kinds of AI systems, like these embeddings, I'm thinking not about user happiness, but LLM happiness, so to speak.

And I think that usually we deal with very large documents, and when we generate vectors there are certain limits; for example, the OpenAI limit is 8,192 tokens, roughly 15-16 thousand characters.

And, for example, my article about sub-transactions exceeds 30,000 characters.

So it was hard to vectorize it, right?

And we needed to summarize it first using an LLM and then vectorize only the summary, unfortunately, right?

It works, okay, but what I'm trying
to say, when we talk about

traditional search engines, search
results are not whole documents,

they are snippets.

Michael: True.

Nikolay: And part of this happiness, or quality, is also how we present results.

For example, we can provide snippets
and highlight the words

which are present in the query,
right?

It's good.

It's good for user experience.

The user immediately sees the words they typed being mentioned.

If it's a synonym, it will be different,
but anyway, it's a good

practice.
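
Postgres can produce such snippets itself with ts_headline; a small sketch:

~~~sql
SELECT ts_headline('english',
    'GIN indexes make full-text search fast even on large datasets',
    websearch_to_tsquery('english', 'fast search'),
    'StartSel=<b>, StopSel=</b>, MaxWords=12');
-- => highlights the matched words, e.g. ... <b>search</b> <b>fast</b> ...
~~~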

But if you think about Google,
how it works, you see some results,

first page, second page and so
on, but then it's good when very

relevant results are on the very
first page, right?

Maybe right at the top, included in the first page, and you're satisfied.

And here there is a new topic: LLM happiness.

They should be satisfied.

But what does it mean?

It means that what we decided to include, to embed into the prompt and use in answers, should be very relevant, right?

And this is a very big topic which is not yet explored properly.

In my opinion, what I have in my
mind, I'm just sharing and if

some guys are interested, I would
love to have a discussion around

it.

So here is what I think; this is what we are going to do. We're not doing it yet, but we are going to.

We are going to return many snippets
or summaries or something

like that, not whole documents.

And then ask LLM internally during
the same cycle of requests.

We just say, LLM, evaluate, is
it relevant?

Do you think based on this snippet,
is it relevant to your original

request?

And we score it on a scale from 1 to 10 and limit the result; and then, and this is important, this is how humans behave.

We open each document and inspect
it fully and think again, is

it relevant?

And only then... and maybe we go to the second page, I don't know; maybe the second page is not needed, our first page can just be bigger, because everything is automated here.

But we inspect the full document and
decide, is it worth keeping

and using in the answer or no?

Maybe not.

Maybe in the end of this process,
we will have 0 documents left.

Maybe we need to think maybe to
provide a different query.

And what I see from LLMs: sometimes, if we do this, like you ask something, there is this RAG system which performs a search and you are not satisfied; if you consider the LLM as a junior engineer, you just say, I'm not satisfied, you can do better. Invent some new searches and try them.

You don't say exactly what searches
to use.

In all cases where I did this, I was satisfied at the second step.

So I mean the internal process
of searching can be very complex,

it might take longer, much longer,
but the result will be much

better.

And the question of quality here is shifting.

This semantic search should be
somehow different to be considered

as good quality for this type of
process, for AI systems.

So I'm reading this book, understanding that it was written for search engines targeting humans, and I'm very interested in how this topic will change.

I'm sure a lot of things will be
inherited.

There's a lot of science there, huge science.

Search engines, like information retrieval in general, are a huge topic, right?

So a lot of things will be inherited,
but there will be changes

as well, because of the high level of automation, and I'm very curious how quality will change.

So this is a very interesting topic.

I have many more questions than answers.

Yeah.

Thank you for listening to this
long speech.

It just sits in my head right now,
like what will happen

Michael: with search.

Don't you consider search engines to be semantic search too?

Nikolay: Yeah, well, Google and all the top search engines have been doing semantic search for many, many years already.

I know that.

But they didn't provide... like, do you know of an API to Google?

I don't know.

I only know some workaround solutions
to it, right?

But now we are building small Googles
for our small knowledge

bases, and it's interesting.

pgvector is a very basic thing. Of course, there's a lot of work there, the 2 types of indexes it provides, and so on, performance improvements; but what to do with age? What to do with this embedding process?

Yeah, these questions are big right
now in my head.

Michael: Cool.

I'm looking forward to some in-depth
ones on this.

Nikolay: I hope we will have some
follow-up episodes, maybe about

full-text search as well, and semantic
search as well, and some

faceted search as well.

Yeah.

Michael: Sounds good.

All right, thanks so much Nikolay,
take care.