A weekly podcast about all things PostgreSQL
Michael: Hello and welcome to PostgresFM,
a weekly show about
all things PostgreSQL.
I am Michael, founder of pgMustard.
This is my co-host Nikolay, founder
of Postgres.AI.
Hello Nikolay, what are we talking
about today?
Nikolay: Hi Michael, let's talk
about search at a high level.
But before we proceed, let me express
a few words, a few thoughts
about the news we've got a few
hours ago, a couple of hours ago,
about Simon Riggs.
We are recording this on Wednesday,
March 27.
And Simon Riggs just passed away.
I remember him as a very bright
mind.
I remember he was not an easy person
to deal with, obviously.
I remember like 100 emails, even
100 emails, to convince him
to come to Moscow to speak at a conference.
Many people were involved, but eventually
he did, he came and it was
a great talk, but the work he did
and like in general, yeah, it's
a big loss obviously for the Postgres
community.
So yeah, condolences to family,
friends, and co-workers, ex-co-workers,
and so on.
And Simon built a lot of things
and he was quite brave to attack
very complex topics in PostgreSQL
system in general, right, in
the core of PostgreSQL, in the engine
itself.
For example, point-in-time recovery,
things related to replication.
Many achievements were made by
Simon or involving Simon.
So it's a big loss, definitely.
Michael: Yeah, over many years
as well, right?
I have only, I actually had the
opportunity to meet him a couple
of times at a couple of London
events and heard him speak.
And not only was he a great contributor
to the code base, but
I was amazed at how he was able
to communicate and educate and
also engage in community building, right?
Like he was involved in organizing
a lot of events, especially
in the UK, growing companies and
a lot more around the ecosystem
as well.
Nikolay: I must say, I remember
very well this look in Simon's
eyes, which was like, had some
sparkles.
And I remember the conference,
the first, the very first conference
in the American continent I attended
in 2007, speaking with Peter
Eisentraut.
I was a baby actually, I was, like,
involved in implementing some parts
of XML implementation functions
and type in PostgreSQL.
I remember Simon looking directly
to me with those sparks and
asking, what's your next thing
to build in PostgreSQL.
I was like, I was caught off-guard
didn't answer anything, actually.
And yeah, so this is what I remember
about Simon, this, this look,
and courage. Do I pronounce it right?
Courage, yeah.
Courage, yes, he obviously had big
courage, huge.
Michael: So.
And the ability to silence Nikolay,
that's quite the...
Nikolay: Well, yeah, yeah, yeah.
It's interesting.
Yeah.
Yeah, so it's sad, very sad. So yeah.
Michael: Yeah, absolutely. Condolences
to everybody who knew him,
and worked with him. I don't really
know how to move on from that,
actually, but we were going to
talk
Nikolay: about search.
Let's just after this small break,
let's return to the search
topic.
And it's wide, it's very wide topic,
And I guess we just want
to touch it a little bit today,
at a very high level, right?
Michael: Yeah, well, it's amazing
if you consider all the things
Postgres is used for.
Search is one of the top use cases,
but looking back at our episodes,
we've touched on it a few times,
like when we've looked at using
external databases or forgotten
what somebody called them, actually,
like a second type of database,
like a partner database or something
like that.
So, we touched on them a few times,
but never, we haven't done
anything on full-text search, we
haven't done anything more recently
on semantic search, we've done
a pgvector episode and a few
related-ish subjects, but no, it
crossed my mind that we hadn't
touched on this really as a topic.
And obviously, it's one of those subjects
that the more you learn,
the more you realize you didn't
know, or the more complicated
it gets.
So I can imagine us doing quite
a few follow-ups on the more
specifics or, you know, implementation
details or the tricky
parts.
But yeah.
Nikolay: Don't forget faceted search.
Michael: Faceted, yeah.
I could be saying it wrong, but
I think I've heard it called
faceted, like roaring bitmaps and
things like that.
Nikolay: Well, usually we start
from the UI.
In my head, this starts from the UI.
We have some things in big form
consisting of multiple, very
different selectors and filters
and so on.
This is very common in various
marketplaces.
For example, imagine Airbnb, you
want to limit price and location,
and various categories and properties
and so on.
And let me put this on the table
right away.
Ideally, we always should have
just a single index scan or index
only scan.
This is our ultimate goal always,
but unfortunately, it's not
always possible.
Why do we need it?
Because it's the best plan, and
I think you can, like, do you
agree or not?
Because you deal with plans
all the time, with pgMustard,
EXPLAIN plans and so on.
Single index scan is the best.
Michael: One of the things I love
about databases and performance
in general is when it gets to the
point where you have to trade
one thing off against another.
And I think search is one of those
topics where often we're trading
off user experience versus complexity
of the back end.
Some of the nicest search features
are just a search bar.
But without getting any input from
the user, you have to do a
lot of work on the back end to
be able to serve that in any kind
of performant way.
So you've got the complexity
of matching
the results to the intent, with
the additional caveat that you
want it to give some, at least
some results, good results quickly.
And that's a trade-off.
Like it's really easy to get, well
not easy, but you can give
great results if you can search
through everything you have and
score everything and if you've
got forever to return them, but
if you've given yourself a budget
of returning within a few hundred
milliseconds, suddenly that becomes
a more difficult problem.
So I love that it's a trade-off
and we're trading off user experience
with resources, with technical
complexity on the back end.
So I think it's one of those topics
that there are trade-offs and
yes, one of those is performance,
but sometimes I think you are
willing to pay a little bit of
performance for a better result.
Nikolay: Yeah, this is difficult,
I guess.
What are better results and what
is high quality of search, right?
Like, I remember a definition that
users should be happy, which
is very broad.
What makes users happy?
Maybe we return good results but
UI is very bad, so they are
not happy, right?
Like, it's quite an interesting
topic.
And I think you're right, but also
like I just dove into the
very bottom of performance.
Performance matters a lot, right?
If the search is very slow, the users
won't be happy and it means
poor quality of the search, right?
So we do care about performance,
but also we do care about things
like if it's a full-text search,
we want stop words to be removed
and ignored, we want some dictionaries
to be used, maybe synonyms
to be applied, and so on and so
forth, right?
This matters a lot, and of course,
but this also moves us to
the performance part, because if
these steps are slow, it's also
bad.
Why was I mentioning faceted search?
I just see a common pattern.
Postgres is huge in terms of capabilities
and extensibility and
various index types, extensions.
But we have simple problems unsolved.
For example, take full text search
and order by timestamp or
ID.
I want the very, like, instead
of old school, regular approach,
return like most relevant documents
to me, I want fresh documents
to go first because it's social
media, and this is number 1 pattern.
But also they need to follow some
full-text search query I used.
I just need to see the latest but
following some text patterns.
And this problem is unsolved in
Postgres, unfortunately.
And there is a good, the best attempt
to solve it, it's called
RUM index, which is an extension,
like a new generation of GIN
index.
But why isn't it in the core?
Because it has issues.
It's huge, it's slow, and so on.
And similar things I now observe,
not only observe, I touch them.
For example, just before we started
recording, you showed me
the Supabase blog post about
how to combine full-text search
and semantic search based on embeddings.
I don't like the word embeddings,
I like the word vectors.
Because embedding, in my opinion,
in the database, it doesn't
settle in my mind at all.
Embedding is what we embed into our
prompt.
This is content.
But a vector is a vector.
Maybe I missed something, but why
do we call vectors embeddings?
Honestly, in our database, this
column is called embeddings just
because OpenAI dictated it.
But I also see OpenAI's Assistant
APIs name state machines' states
as statuses, which also like, what's
happening there?
A status is a state, state machine,
state, like in progress,
function call, etc.
So, but it's off-topic.
So these vectors, they provide
us a great capability to have semantic
search and we have text search
and the Supabase article describes
how to combine them.
But basically we perform 2 searches
and then like merge results.
It means we cannot do pagination.
Pagination is very important.
Maybe a user needs to go to the second
page, third page.
In quality search engines, they
do need it.
And in this case, it means that
it's similar to the offset problem
we described.
Michael: That's the only solution,
I guess,
Nikolay: is pagination through
offset.
Michael: So far, yes,
Nikolay: but maybe it's possible
to combine something, right?
Honestly, GiST is also about multidimensional
things and, like,
I don't know, I don't know.
It also has KNN. I don't know,
I know only parts of
things here, but what I don't
like is that we do separate index scans, and then
we combine things and we lose pagination.
I mean, we can have pagination, but
if we want to go to page number
100, it's insane how much data we
need to fetch, and EXPLAIN (ANALYZE, BUFFERS)
will show very bad buffer
numbers.
It means it's not working well.
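
A minimal sketch of why merging two searches behaves like OFFSET pagination, in Python. It assumes reciprocal rank fusion as the merge step, which is one common choice and not necessarily what the Supabase article does; `text_search` and `vector_search` are hypothetical callables standing in for the two index scans.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one ranked list."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id to keep output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def hybrid_page(text_search, vector_search, page, page_size):
    """Return (ids on the requested page, number of rows requested overall).

    text_search and vector_search take a limit and return that many
    document ids, best first (hypothetical stand-ins for the two scans).
    """
    needed = page * page_size  # rows we must read from EACH search
    fused = reciprocal_rank_fusion([text_search(needed), vector_search(needed)])
    start = (page - 1) * page_size
    return fused[start:start + page_size], 2 * needed
```

With a page size of 10, page 100 asks each search for at least 1,000 rows just to return 10, which is the same cost shape as a large OFFSET.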
And a different example, sorry, I
will finish my complaining
speech.
So, a different example is what we
have right now.
In our bot we imported more than
900,000 emails from some 6 mailing
lists, 25 years of them.
So we have more than 1 million
documents.
And before that, we had
imported into the bot's knowledge base
only documentation, source code, and
blog posts.
And all of them were relatively
fresh.
But when we imported 25 years of
mailing list archives, I'm
asking, hey bot, what can you explain
to me about sub-transactions?
Okay, this is documentation, my
article, but also this is a very
good email from Bruce Momjian from
2002.
And it went to first place.
It's not good.
We need to...
Basically we need to take into
account the age of the data here,
right?
How to do that?
There's no good way.
If you work with pgvector, there's
no good way to deprioritize
old documents, to take into account
the age of the data.
So what we did: usually we need
to find 10, 15, or 20 entries,
at most 100, and embed
them into the prompt.
So what we do is find a thousand
entries, and then, just in memory,
Postgres recalculates an adjusted
similarity, an adjusted distance,
based on the logarithm of age.
And this is how we do it.
If nothing new, okay, we are satisfied
with old documents.
But if there are...
So we take into account the age, right?
But again, this doesn't scale well.
Like, if we will have a lot of,
like, 10 million documents, it
will be worse.
And also we cannot have pagination
if we talk about search here.
Kind of a similar problem as well.
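
A rough Python sketch of the re-ranking step described above: over-fetch candidates by vector distance, then re-sort them by a distance inflated with the logarithm of age. The exact formula and the `weight` parameter are assumptions for illustration; the transcript only says the adjustment is based on a logarithm of age.

```python
import math
from datetime import datetime, timezone


def age_adjusted_distance(distance, created_at, now, weight=0.1):
    """Inflate a vector distance by a factor that grows with log(age in days).

    The formula is a hypothetical choice, not the one used by the bot.
    """
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return distance * (1.0 + weight * math.log1p(age_days))


def rerank_by_age(candidates, now, top_n=10, weight=0.1):
    """candidates: iterable of (doc_id, distance, created_at) tuples,
    for example the 1,000 nearest neighbours returned by the index."""
    scored = [
        (age_adjusted_distance(distance, created_at, now, weight), doc_id)
        for doc_id, distance, created_at in candidates
    ]
    scored.sort()  # smallest adjusted distance first
    return [doc_id for _, doc_id in scored[:top_n]]
```

Because the adjustment runs only over the over-fetched candidates, a fresh document that falls outside that candidate set can never be promoted, which is part of why this does not scale well.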
And this makes me think, great
that we have extensibility, but
these types of searches are so, like,
so different.
We have, like, what is the name
of when different things are
combined?
So it means that it's hard to build
good system which works...
Heterogeneous.
Yes, heterogeneous.
This word, I know how to spell
it but I cannot pronounce it because
I saw it many times in papers,
but in scientific papers and so
on, in technical papers, but yeah.
I'm not
Michael: even sure I know how to
pronounce it.
Heterogeneous or something like
that?
Nikolay: I cannot pronounce it
in Russian, sorry.
So, what I'm trying to say, we
are kind of in the Linux early stage.
You need to compile a lot of drivers,
like, and deal with it
to make the system work as you want,
like, as a good product, right?
Compared to some things like Elasticsearch,
when you take it and things
work together very well because
it's a single product. What do
you think about it?
This is a problem.
Extensibility has a negative side
here.
Michael: I think you've jumped
straight to, like, where are the
limits of Postgres' search capabilities
right now?
And that’s a really interesting
topic and quite deep already.
But it skips over all the things
you can do already in Postgres.
And there are a ton of different
inbuilt things or add-on modules
or extensions that mean that those
limits are being pushed further
and further.
And I think a lot of people come
from an assumption that Postgres
won't be able to handle search
super well because products like
Elasticsearch exist and are successful,
and therefore probably people
aren't doing this in the database,
but I see a lot of use cases
that can be served adequately with
good results in acceptable
response times for users without
touching any external services.
So I think you're right that there
are edges and there are limits
that can be better served by other
products, but those limits
are quite far down the road for
a lot of use cases.
You can build pretty good search
features for a lot of different
use cases, especially if you're
willing to learn exactly how
it works and factor in your own
product or services requirements.
If you're not just searching every
field for every word or like
I'm assuming like a text search
type field, it can be really powerful
already.
Nikolay: Yeah, I agree, I agree,
but yeah, well.
Michael: Can we talk about some
of them quickly, like just to
cover?
Nikolay: Let's talk about them.
I agree, and you're like basically
echoing the usual problem I
have.
I had the cases when people listening
to me said I'm a Postgres
hater, right?
So again, of course, this criticism
goes quite deep, and of course
I don't like the idea of having Elasticsearch
for full-text search
and the need to constantly synchronize,
or maybe some...
What's the name of these new vector
database systems?
Pinecone or something like that?
So you basically need to synchronize
data from your main OLTP
database all the time and you have
a lag and then you bring some
regular data there and you think
how to combine and search that
data because obviously for Elastic
you need to not only bring
textual data but you need to bring
categories to have the same
faceted search.
Sometimes people want, like, I
want to do full text search, but
again, limit price, right?
Range, some range, and this usually
is stored in a regular column
in the relational database.
And of course, we have good capabilities
to combine it with full-text
search and achieve in the single
index scan.
For example, if you use a GiST index,
well, GiST is slower.
It works well for smaller datasets.
But you combine it with btree_gist
or btree_gin, right?
Or...
Michael: btree_gist, I think.
Yeah.
Nikolay: Right.
So, and then you have a capability
to combine both full-text
search and numeric range filter
and have a single index scan.
This is perfect.
Again, I'm staying on the same
point.
Single index scan is the best.
But unfortunately in many cases
we cannot achieve it.
Ideally, user types something,
chooses something on the form,
presses search, or maybe it's automated,
like I don't like automated,
I like to press search explicitly.
Anyway, we have a request and this
request translates to a single
index scan and we return.
And this is an ideal case in terms
of performance.
Otherwise, for bigger data sets,
you will have very bad performance.
Michael: Well in terms of performance
but also in terms of system
resources, right?
Like, we're also not having to
use a lot of system resources
to satisfy quite a lot of searches.
Whereas a lot of the alternatives
require, well because they're
not just one index scan, require
more resources as well.
So I think it's efficient from
a couple of angles but it very
much limits what the user can search
for.
If you can, if it has to be indexable
that way some other searches
wouldn't be possible.
Like I, I don't know about you,
but since, to give people a bit
of an insight into how we do this,
we agreed on a topic about 24
hours ago and every, every product
I've used since I've been
like thinking, how does search
work here exactly?
And it's really interesting how
different products implement
it.
And not, not everyone does it the
same.
And we've been somewhat spoiled
as users by Google, in my opinion,
Google and Gmail, both of which
have had incredibly good search
features for quite a long time.
And most people have experienced
those.
But it isn't the same in every
product.
Not every product is firstly capable
of doing that.
But also it is not quite the right
trade-off for a lot of products
either.
So like a lot of products you use,
a lot of products I use, things
like Slack, for example, or Stripe,
they will encourage you to
use filters.
They let you type whatever you
want in, and they will perform
a wide search, depending on whatever
you type.
But they encourage the use of, for
example, in Slack, search within
just 1 channel, or just from 1
person, or things that filter
it right down to make those searches
much more efficient.
So it's interesting that they're
doing that partly, I guess,
to put the results you're
looking for high up, but
also, I guess, to reduce system
load: they don't
have to do as much work if you
filter it down for them.
Nikolay: All right.
Michael: So I think there are a few
things that beginners miss.
When I was a beginner
and didn't quite know how
this stuff worked,
I don't think I fully appreciated
the complexity of doing search
well.
So there's the basics.
When we say full-text search, by
the way, I never really understood
what the word full is doing in
there.
It's basically just text search.
Does this document or does this
sentence or something contain
this word and this word or this
word or this word and so basic
kind of text-based searches.
I don't know why it's called full,
do you know?
Nikolay: No, good question.
No?
No.
Good question.
I think, right, so I understand
the difference.
You can compare the whole value,
and in this case a B-tree is reasonable,
but you'll be dealing with the
problem of the size of this value,
right?
And then you can...
Michael: But there's also, there's
like a million complexities
just, if we only consider that,
there's a million complexities
to do, like, should you care about
the case of, like, does the,
like, the capital letters matter?
Nikolay: So, full-text search is
a very, like, well-established
area already.
Yes.
And where instead of comparing
whole value or doing some mask,
regular expressions, right, which
is also an interesting topic,
but it's also related, like a GIN
can be, trigram search can
be used, right?
Michael: Yes.
Nikolay: Instead of that, we consider
words, like, first of all,
we, as usual, have this problem
with First Normal Form, but it's
off topic, right?
Because this value is not atomic
anymore, right?
We consider each value as...
We have atoms in this molecule,
right?
Yeah.
So, first of all, words
we usually either normalize
using a stemmer, the Snowball stemmer,
right?
Or we use some dictionary to find
kind of...
Michael: Synonyms or...
Nikolay: No, no, synonyms are one
thing; a synonym dictionary
can also be used, but I'm talking
about Ispell dictionaries,
for example, when you take...
Oh, yeah.
When you take a word and...
A stemmer is very dumb.
It just cuts the ending.
That's it.
You can feed some new word and
it will cut it according to some
rule.
But Ispell, it's a dictionary,
it knows the language, it
knows the set of words and it can
transform words in different
forms to some, like, normalized,
it normalizes every word, right?
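
To make the contrast concrete, here is a toy Python sketch of the two normalization styles. The suffix rules are a made-up list, not the Snowball rules, and the dictionary entries are hypothetical; the point is only that a rule-based stemmer will cut any word it is fed, while a dictionary knows specific word forms.

```python
SUFFIXES = ("ing", "ed", "es", "s")  # toy rule list, NOT the Snowball rules


def toy_stem(word):
    """Cut the first matching ending; no knowledge of the language."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


# Hypothetical dictionary entries mapping inflected forms to a base form.
TOY_DICTIONARY = {
    "mice": "mouse",
    "went": "go",
    "indexes": "index",
    "indices": "index",
}


def normalize(word):
    """Try the dictionary first, fall back to the stemmer for unknown words."""
    word = word.lower()
    return TOY_DICTIONARY.get(word, toy_stem(word))
```

Trying the narrow dictionary first and falling back to the stemmer mirrors how a Postgres text search configuration is usually set up, with a general dictionary such as a Snowball stemmer placed last.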
And then we have, basically, we
have, we can build either a tree
and use GiST, Generalized Search Tree, which can be used
for even B-tree or R-tree.
B-tree is one dimension, R-tree is
2 or more dimensions, and R-tree
is based on GiST in Postgres because
implementation based on
GiST was better than original implementation,
which was not the
case for B-tree.
B-tree remained the native implementation,
but there is GiST
implementation, that's why I already
mentioned it, right?
And then, so, tree, right?
Great.
So you can find the entries which
have your words, but also you
can...
There's an inverted index, GIN,
right?
And GiST actually...
Oh, GiST, I didn't mention.
So, B-tree is one dimension, so just
one axis, R-tree is two or more dimensions,
and you can build trees, for example,
rectangles in two dimensions.
But what to do with text?
Because it has a lot of words.
We can consider words as kind of
array or set, right?
And then we can say, this set contains
that set, right?
Or is contained.
So we define operators, intersects,
is contained, contained.
And in this case, we talk about
sets.
And we can still build the tree
based on GiST.
There are 7 functions you need
to implement to define the operations
and, basically, for sets
we can build a tree, and it's actually called
RD-tree, Russian Doll Tree.
That's the official name, from a Berkeley
paper.
We can attach the link to it.
And This is how originally full-text
search was implemented based
on GiST.
But later it was also implemented with
GIN, the Generalized Inverted
Index, which works much better
for very large volumes of data,
and this is what search engines
use.
So it's basically a list of terms
with links to the documents where
the terms are mentioned, and then there
are internal B-trees to
find each term faster.
I think there are 2 kinds of B-tree
inside GIN, but those are
implementation details.
So, in general, it means that we
can say, okay, these words are
present in this document.
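
The idea of an inverted index can be sketched in a few lines of Python. This is a conceptual toy, not how GIN is implemented: it uses in-memory sets instead of B-trees and posting trees, and it assumes the terms have already been normalized.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """documents: dict of doc_id -> list of already-normalized terms."""
    index = defaultdict(set)
    for doc_id, terms in documents.items():
        for term in terms:
            index[term].add(doc_id)
    return index


def search_all(index, terms):
    """Return ids of documents that contain every term (an AND query)."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    # Start from the shortest posting list to keep the intersections small.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result
```

The lookup cost depends on how many documents mention the searched terms, not on the total number of documents, which is why this structure suits large volumes.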
And we can find them very fast,
and we can also order by
rank. Rank is an interesting thing.
It's calculated so that the most
relevant documents come first. For
example, I don't know, the words
are mentioned more often in this
document, right?
Do we have phrase search?
Can we do double quotes?
Yeah, we have also some...
And or some...
We can write some formulas, right?
There's like
Michael: Followed by, yeah, you
can do, like, followed by for phrase
search, but there's also a, so
there's, we have some data types
and loads of functions that are
really helpful for doing this
without...
Nikolay: And 4 categories, right?
ABCD, right?
We can, you can...
Like weighting.
Michael: What do you mean by categories?
Nikolay: I don't remember.
I remember when you define GIN
index, you need to convert data
using tsvector, so you convert
to special text search vector,
tsvector type, and you can say
that some parts can be considered
one category, some parts different
categories, there are a maximum
of four categories.
And when you search, you can say,
I'm searching within only a
specific category.
It means that, for example, you
can build one tsvector, but take,
for example, if you're indexing,
for example, emails, you can
take words from the subject and mark
them as category A, for example,
right?
But the body is B.
And then you have the freedom and flexibility
to search globally,
not taking into account the origin
of the words.
Or, inside the same
GIN index, the same search, you can
limit it, saying I'm searching only
inside the subject.
So you can mark these parts of
the tsvector, which is good, but
four is not enough in many cases.
So there are many capabilities
in Postgres systems.
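
A small Python sketch of the labelling idea from the email example: terms from the subject get label A, terms from the body get label B, and a search can either ignore labels or be restricted to one of them. The data structure and `toy_score` are illustrative assumptions, not the tsvector format or the real ts_rank formula; only the default weights for D, C, B, A (0.1, 0.2, 0.4, 1.0) come from the PostgreSQL documentation.

```python
# PostgreSQL's default ts_rank weights for labels D, C, B, A.
DEFAULT_WEIGHTS = {"A": 1.0, "B": 0.4, "C": 0.2, "D": 0.1}


def build_vector(parts):
    """parts: dict of label -> list of normalized terms.
    Returns a dict of term -> set of labels the term appears under."""
    vector = {}
    for label, terms in parts.items():
        for term in terms:
            vector.setdefault(term, set()).add(label)
    return vector


def matches(vector, term, labels=None):
    """True if the term occurs; if labels is given, only under those labels."""
    found = vector.get(term, set())
    if labels is None:
        return bool(found)
    return bool(found & set(labels))


def toy_score(vector, terms, weights=DEFAULT_WEIGHTS):
    """Sum the best label weight of each matching term (a toy scoring rule)."""
    return sum(
        max(weights[label] for label in vector[term])
        for term in terms
        if term in vector
    )
```

One vector serves both kinds of query, which is the flexibility described above; the limit of four labels is what makes it tight for documents with more than four distinct parts.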
Michael: Yes, and you don't have
to build that much around it
to get something pretty powerful
out.
And one thing I learned about relatively
recently from a blog post
was websearch_to_tsquery.
So tsquery is the query
representation and tsvector
is the vector representation,
like, normalized:
you've taken each
word and normalized it, taken
plurals out of it and things like
that.
Yeah, websearch_to_tsquery means you can
achieve a query that's a bit like
you might imagine a fairly complex
search engine search, like
taking the not operator and saying
I don't want documents that
include this word or using and
and or type operators as well.
So it could let you build something
that has some basic search
engine-like features built into
the text search field without
much work at all.
And these all come in Postgres
core.
You don't even need an extension
for any of this.
Which is pretty cool.
But yeah, I was going to move on.
I think obviously that contains
loads of complexities and helps
you solve some of the trickier
things immediately.
But there's also relatively built-in
modules for fuzzy search.
So handling typos elegantly.
The kind of things that just start
to get a bit more complicated.
Do you want to be able to match
on people nearly getting the
name right?
And not all products do, but it's
pretty common that we do want
to.
And when I first came across trigram
search or pg_trgm the
extension, I was blown away by
like how elegant a solution that
is to typos.
So were you not as impressed?
Like I was thinking it's so simple
and it works so well.
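
The trigram idea is simple enough to sketch in Python. This follows the behaviour described in the pg_trgm documentation (lowercase, split into words, pad each word with two leading spaces and one trailing space, compare the sets of three-character sequences), but it is a simplified illustration that only handles ASCII letters and digits, not the extension's actual code.

```python
import re


def trigrams(text):
    """Return the set of trigrams for a string, pg_trgm style (ASCII only)."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two spaces before, one after
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

A typo such as "postgers" still shares most of its leading trigrams with "postgres", so the score stays above pg_trgm's default similarity threshold of 0.3 and the match survives.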
Nikolay: Trigrams.
Well, let me disagree with you.
I've used it many times.
Michael: Go on.
Nikolay: For many, many years,
and I cannot say it's elegant,
because it requires, first of all,
it requires a lot of effort.
You can't just say, I want regex
here, that's it.
No, there are things
you need to do.
And also at really high volumes,
it doesn't work well in terms
of performance.
Michael: Sure.
Nikolay: And if you have a lot
of updates coming, you will bump
into this dilemma: GIN with fastupdate
or without fastupdate, and the
pending list, which is by default
4 megabytes, right?
And during a regular SELECT, it decides
the pending list needs to be processed,
the SELECT times out, and then you
need to tune it.
Well, I still have a feeling it
could be better.
Michael: No doubt.
And I guess this is one of the examples
of something that can work really well
for a while, and once you hit
a certain scale, maybe an external
system is beneficial.
But yeah.
Nikolay: Yeah, because it's more
polished.
But I also think there is an opportunity
for new
Postgres companies to improve this and
benefit from a more polished
version, when you develop some product
on top of Postgres, right?
Michael: Yeah, and there are some
startups, right, that are quite
focused on search, or at least
what, is it ParadeDB that are
looking into this stuff?
And then there's also one other that's
worth mentioning; I would be ashamed
to finish this episode without
mentioning them:
ZomboDB, who do synchronization, who
take the hard parts out of synchronizing
Postgres with Elasticsearch.
Nikolay: Yeah, that's a good tagline.
Also, I wanted to mention, sorry,
I was trying to find the book
I recently bought.
For specialists, it should be a super
important book, I
guess, and I need to thank Andrei
Borodin, as usual.
This guy helps me a lot with Postgres
in general.
So this book is called Introduction
to Information Retrieval
by Christopher Manning, Prabhakar
Raghavan, I'm sorry, and Hinrich
Schütze.
So this book is interesting and
this is exactly where I saw this
like, what is good quality?
Users are happy.
And then a lot of formulas, right?
It's interesting.
Michael: Well, and it's a moving
target.
One of the good blog posts I re-read
as part of doing research
for this was by Justin Searls
about a Ruby implementation and
he made a really good point at
the end about it being a moving
goalpost.
So users might be happy with waiting
a few seconds 1 year and
then 5 or 10 years later that may
not be considered good enough
anymore because they can use other
search engines that are much
faster.
Or your data volumes grow.
You talked about, you know, if
your implementation relies on,
well, volume might change, but
also patterns might change.
And you might find it's harder
to provide as good results as
your data changes or as your users
change in their expectations.
So it's kind of a moving goalpost
constantly as well.
So not only might the same results
10 years later not be good
enough, but also like, yeah, it's
a tricky one, but I think, I
think user happiness is one good
one, but also Google uses lots of
metrics like...
Nikolay: Yeah.
I know what I think about.
Michael: Go on.
Nikolay: So since we talk about
semantic search, which supports
some kind of AI systems, like these
embeddings, I'm thinking
about not user happiness, but LLM
happiness, so to speak.
So, I think that usually we
deal with very large documents,
and when we generate vectors there
are certain limits. For example, the
OpenAI limit is 8,192 tokens, roughly,
like, 15-16 thousand characters.
And, for example, my article about
sub-transactions exceeds
30,000 characters.
So it was hard to vectorize it,
right?
And we needed to summarize it first
using an LLM and then vectorize
only the summary, unfortunately,
right?
It works, okay, but what I'm trying
to say, when we talk about
traditional search engines, search
results are not whole documents,
they are snippets.
Michael: True.
Nikolay: And it's also a part of
this happiness or quality is
how we present results.
For example, we can provide snippets
and highlight the words
which are present in the query,
right?
It's good.
It's good for user experience.
The user immediately sees the words
they typed.
If it's a synonym, it will be different,
but anyway, it's a good
practice.
But if you think about Google,
how it works, you see some results,
first page, second page and so
on, but then it's good when very
relevant results are on the very
first page, right?
Maybe at the very top of
the first page, and
you're satisfied.
Now there is a new topic, LLM happiness.
LLMs should be satisfied.
But what does it mean?
It means that what we decide to
include, to embed into the prompt
and use in answers, should be very
relevant, right?
And this is a very big topic which
is not yet explored properly.
In my opinion, what I have in my
mind, I'm just sharing and if
some guys are interested, I would
love to have a discussion around
it.
So what I think, this is what we
are going to do.
Not yet, we are going to do it.
We are going to return many snippets
or summaries or something
like that, not whole documents.
And then ask LLM internally during
the same cycle of requests.
We just say, LLM, evaluate, is
it relevant?
Do you think based on this snippet,
is it relevant to your original
request?
And it answers on a scale from 1
to 10, and we limit the results by that. And
then, and this is important, this is
how humans behave:
We open each document and inspect
it fully and think again, is
it relevant?
And only then, and maybe we go
to the second page, I don't know,
like maybe the second page is not needed,
we just, our first page
can be bigger because everything
is automated here.
But we inspect the full document and
decide, is it worth keeping
and using in the answer or no?
Maybe not.
Maybe in the end of this process,
we will have 0 documents left.
Maybe we need to think maybe to
provide a different query.
And what I see with LLMs: sometimes
you ask
something, there is this RAG system
which performs a search,
and you are not satisfied. If
you consider the
LLM a junior engineer, you just
say: I'm not satisfied, you can
do better. Invent some new searches
and try them.
You don't say exactly what searches
to use.
In all cases I did this, I was
satisfied on the second step.
So I mean the internal process
of searching can be very complex,
it might take longer, much longer,
but the result will be much
better.
And the question about quality
here is very, it's shifting.
This semantic search should be
somehow different to be considered
as good quality for this type of
process, for AI systems.
So I'm reading this book, understanding
that it was written for
search engines targeting humans,
and I'm very interested in how this
topic will change.
I'm sure a lot of things will be
inherited.
A lot of science is there, like
huge science.
Search engine, like information
search, is a huge topic, right?
So a lot of things will be inherited,
but there will be changes
as well, because of a high level
of automation, and I'm very curious
how quality will be changed.
So this is a very interesting topic.
I have many more questions than answers.
Yeah.
Thank you for listening to this
long speech.
It just sits in my head right now,
like what will happen
Michael: with search.
Don't you consider search engines
semantic search too?
Nikolay: Yeah, well, Google and all top search engines,
they do semantic search for many,
many years already.
I know that.
But they didn't provide,
like, do you know of an API
to Google?
I don't know.
I only know some workaround solutions
to it, right?
But now we are building small Googles
for our small knowledge
bases, and it's interesting.
Like, pgvector is a very
basic thing.
Of course, there is a lot of work there,
a lot: 2 types of indexes
it provides, and so on, performance
improvements. But what to
do with age? What to do with this
embedding process?
Yeah, these questions are big right
now in my head.
Michael: Cool.
I'm looking forward to some in-depth
ones on this.
Nikolay: I hope we will have some
follow-up episodes, maybe about
full-text search as well, and semantic
search as well, and some
faceted search as well.
Yeah.
Michael: Sounds good.
All right, thanks so much Nikolay,
take care.