Recsperts - Recommender Systems Experts

In episode 15 of Recsperts, we delve into podcast recommendations with senior data scientist, Mirza Klimenta. Mirza discusses his work on the ARD Audiothek, a public broadcaster of audio-on-demand content, where he is part of pub. Public Value Technologies, a subsidiary of the two regional public broadcasters BR and SWR.

We explore the use and potency of simple algorithms and ways to mitigate popularity bias in data and recommendations. We also cover collaborative filtering and various approaches for content-based podcast recommendations, drawing on Mirza's expertise in multidimensional scaling for graph drawings. Additionally, Mirza sheds light on the responsibility of a public broadcaster in providing diversified content recommendations.

Towards the end of the episode, Mirza shares personal insights on his side project of becoming a novelist. Tune in for an informative and engaging conversation.

Enjoy this enriching episode of RECSPERTS - Recommender Systems Experts.

  • (00:00) - Episode Overview
  • (01:43) - Introduction Mirza Klimenta
  • (08:06) - About ARD Audiothek
  • (21:16) - Recommenders for the ARD Audiothek
  • (30:03) - User Engagement and Feedback Signals
  • (46:05) - Optimization beyond Accuracy
  • (51:39) - Next RecSys Steps for the Audiothek
  • (57:16) - Underserved User Groups
  • (01:04:16) - Cold-Start Mitigation
  • (01:05:06) - Diversity in Recommendations
  • (01:07:50) - Further Challenges in RecSys
  • (01:10:03) - Being a Novelist
  • (01:16:07) - Closing Remarks

What is Recsperts - Recommender Systems Experts?

Recommender Systems are the most challenging, powerful and ubiquitous area of machine learning and artificial intelligence. This podcast hosts the experts in recommender systems research and application. From understanding what users really want to driving large-scale content discovery - from delivering personalized online experiences to catering to multi-stakeholder goals. Guests from industry and academia share how they tackle these and many more challenges. With Recsperts coming from universities all around the globe or from various industries like streaming, ecommerce, news, or social media, this podcast provides depth and insights. We go far beyond your 101 on RecSys and the shallowness of another matrix factorization based rating prediction blogpost! The motto is: be relevant or become irrelevant!
Expect a brand-new interview each month and follow Recsperts on your favorite podcast player.

Note: This transcript has been generated automatically using OpenAI's Whisper and may contain inaccuracies or errors. We recommend listening to the audio for a better understanding of the content. Please feel free to reach out if you spot any corrections that need to be made. Thank you for your understanding.

I'm a big fan of the very simple approaches.
Until we squeeze the last drop of potential out of the very simple method, I'm not a big fan of going immediately with the state of the art, because the state of the art is industry-specific.
Like if it comes from the big companies, they have a very specific problem and they optimize for it.
I mean, as a public broadcaster, our goal is to provide a diversified content.
So it's not like only about the precision.
So the intersection basically between the set of items that we recommend and the set of items the user clicks, but we want the content to be diversified.
This is important for us.
So what we do is we literally take the USE embeddings and we calculate the intra-list similarity.
The lower the similarity, the more diversified the content.
So for me, the whole problem of matrix factorization was link prediction, where these inner products represent the edge weights.
Hello and welcome to this new episode of Recsperts - Recommender Systems Experts.
This time we actually look at a public service media provider and how that public service media provider is recommending audio content to its users.
To be a bit more precise, I have invited Mirza Klimenta to the show.
Hello Mirza.
Hello Marcel, thanks for having me.
Yeah, thanks for joining.
So Mirza is a senior data scientist and also a Recspert.
And what is very interesting, he's also a novelist, which we will hopefully have the chance to talk about later a bit.
Mirza obtained his PhD in computer science at the University of Konstanz.
And I actually found that he was the youngest PhD student in computer science to obtain his PhD from the university there.
He is working for pub, Public Value Technologies, which is a German company and a subsidiary of two regional public broadcasters, Bayerischer Rundfunk and Südwestrundfunk.
The company is actually focusing on developing digital products, and especially digital products for public broadcasting.
And in this case, and for today, we talk about a specific one, the ARD Audiothek, which Mirza will give us some more details about in this episode.
So Mirza, can you introduce yourself to our listeners?
Sure.
So I did my basic study.
So the bachelor studies I did at the University Sarajevo School of Science and Technology in Sarajevo, Bosnia and Herzegovina.
After having a four-year bachelor degree, I did immediately a PhD.
So I did transition immediately without having a master.
And this was like a not really usual case back then in Germany.
So I was put on a fast track and they kind of experimented with me.
But I managed to do it in less than three years.
I obtained a PhD and it was in the field of multidimensional scaling applied for graph drawing.
So basically graph embedding here, it was limited to two dimensions, also with three dimensions we experimented.
Yes.
But my passion was designing and implementing efficient algorithms.
So algorithms that are efficient in terms of time complexity, efficient in terms of space complexity.
And so one of the chapters from my dissertation actually got the Best Paper Award at the world's most renowned conference in the field of graph drawing.
And this was basically the graph embedding in two-dimensional case where we managed to reduce the complexity per iteration from quadratic to linear.
It was like an approximation, but a pretty good one.
Yes.
So after that, I took a year off because it was intense.
It sounds impressive to get a PhD at 25, but you really have to put some effort.
Yes.
So then I worked on cars, in the automotive industry, a bit.
But when I saw that it actually kind of deviates from the data science work that I expected, then I said, OK, let me look at some opportunities, like if I can move to data science or even to academia.
And it was actually at that point that I got the Best Paper Award with my colleague, Mark Ortmann, who did all the experiments.
And actually, thanks to him, it was his initiative because I kind of left academia.
But interesting development because back then getting the Best Paper Award was kind of a passport for me for a postdoc.
So I got an offer from the University of Rome, Italy, to do a postdoc in graph drawing there.
And I did it for around nine months.
It was kind of a theoretical contribution.
It was like a problem of morphing one graph drawing into another, like a two-dimensional morph, such that all the drawings in between are planar, such that there are no edge intersections, which is more of a theoretical contribution.
So I then decided, OK, let me go now to collect data science experience from the real world.
So this is how I landed in Sarajevo, where it all started.
So there I worked as a data scientist for an American-based company.
So that means after having finished the PhD in Konstanz, you joined the university in Rome for a postdoc.
And from there, you went back to Sarajevo.
Exactly. It was kind of back and forth.
I have to admit. So when I have discussions with recruiters, they always ask me, what is this? Is it chronologically ordered or did you make a mistake?
Everything is fine. I will explain.
Yeah. But yeah, so let's continue.
So data science was in Sarajevo.
I worked for a San Francisco based company.
What we did there was actually very cool.
It was stock market prediction based on graph series analysis.
So basically graphs were built out of news articles; literally out of news articles, we would extract the keywords.
We call them concepts.
We will build the graphs.
We will have for each day a separate graph and then we will study the graphs, study like centrality scores and so on.
And then we will kind of feature engineer out of the graphs and we will then use it for the stock market prediction.
And it was then after some time that I said, OK, let me go back to Germany.
So let's go. Let's go.
Because I had a very good experience.
My friends were in Karlsruhe.
But this time I said, OK, let me see if there is some kind of narrow data science role. Usually when you look at data science roles, they are very broad.
You know, they ask you, you know, pandas, numpy, you know, classification, regression.
But I said, OK, it's like very broad.
But is there something very specific?
You know, OK, let's do like pricing engines, like regressions.
But this one was very interesting because it was in recommender systems and it had recommender systems in the title.
And prior to that time, I had never heard of Bayerischer Rundfunk, which is the mother company, and this was like a daughter company kind of idea.
And I said, yes.
So I sent them the application and they liked the application.
They have given me a test.
I did the test and everything worked smoothly.
I started working as a data scientist in the field of recommender systems.
And interestingly enough, the author that was most frequently cited in my dissertation was Yehuda Koren, and also Yifan Hu; they were the authors of the so-called implicit alternating least squares.
So this is the weighted matrix factorization approach from back in 2008, I think.
And it was actually the system that I found in production in the Audiothek.
So I tell you this because it's kind of closing the loop.
You know, everything was kind of coming back together.
Yes.
So let me just give you some background for the listeners.
So probably some of them will be from Germany.
They are definitely familiar with the Audiothek.
We actually really do have many listeners worldwide.
And most of them are actually according to my statistics coming from the United States.
So many worldwide have heard, of course, about the BBC, about the German public broadcasters, which are actually the biggest ones, at least in terms of their budget.
But there's also a very big one in Japan.
But maybe we can elaborate a bit more on them to give our listeners worldwide a better image of what they are actually doing in Germany.
Exactly.
So I think that we can draw a parallel between the BBC and ARD, right?
Okay, so if we draw a parallel between the BBC in Great Britain and ARD in Germany, then people will understand the importance of the platform.
It's an audio-on-demand platform where you can listen to podcasts.
Actually, a couple of days ago, I had an exchange with one man and he said that the ARD Audiothek was the only content that he was actually consuming on the internet, because he said it's very sophisticated by his standards.
Like he said that the editorial team takes care of letting out content that is kind of scientifically proven, you know, so this is my feeling.
I have to be honest that, okay, my German is not as good as my English.
So I don't spend that much time listening to the Audiothek, except when I'm testing the algorithms.
But I would say that the ARD Audiothek content is pretty good quality content.
It's definitely good quality content, and it is coming from various sources.
As I said, and as you already mentioned.
So there are these various houses that are under the roof of ARD, like Südwestrundfunk, which covers the southwest part of Germany.
And then there is Bavaria, where I am, and so on and so forth.
So the content is pretty good.
We have like a couple of thousand items at our disposal.
They are with daily updates.
Yes.
So some content is evergreen and other content is obviously updated.
And yes, so the recommender system plays an important role for the Audiothek.
But I have to say, and this should actually be a little slap to the management: I really don't have the feeling that they understand the potential of the role that the recommender, and actually the personalization,
so I'm not talking only about recommendation, could play in user engagement.
I guess there are many different areas in Germany where we have definitely room for more digital mindset, one could say.
But before we dive into the specifics of the recommender systems with regards to the ARD Audiothek, let's take a step back.
So the ARD is actually a broadcast network, so a public broadcast network.
And within that network of regional broadcasters, there are two of them.
So the Bayerischer Rundfunk, BR, and the Südwestrundfunk, SWR.
And these two, if I'm right, but please correct me there.
They basically decided to found their own digital company, the one that you are working for.
And is this also the company that you joined when you started as a data scientist?
Or is this something that evolved over time?
What was actually the context when you joined the company?
Have you joined Bayerischer Rundfunk and then transitioned to that new company?
And when was it actually so since when have you been dealing with recommender systems?
There is not only Bayerischer Rundfunk and SWR under the hood of ARD; there are many others, like Hessischer Rundfunk.
I think there is Deutschlandradio, and many other broadcasters in Germany.
It's like from every German state, there is a participant in the ARD club.
So this is one thing.
I started with Bayerischer Rundfunk in August 2021.
But in April, back then it was like towards the end of the year, it was decided to form a company.
So that both Bayerischer Rundfunk and SWR invest in a company that will basically be handling the software development for the whole ARD team, but specifically for Bayerischer Rundfunk and for SWR.
And the company is Public Value Technologies.
When people ask me, I still say that I work for Bayerischer Rundfunk because pub is relatively new and it's still a daughter company of Bayerischer Rundfunk.
All the projects that we have are actually related to Bayerischer Rundfunk.
Now we will start in a couple of weeks with a very specific project with SWR.
So yeah, this is the story.
Okay.
Yeah, definitely makes sense.
I mean, the Audiothek is the product that you are responsible for in terms of recommendations, as far as I understand.
So how was the shift actually?
So I have understood so far that you have been a long time expert for everything that is related to graph approaches in data science.
And I mean, recommender systems are not unrelated to graphs.
So you could resemble the interactions in a recommender system between users and items as a bipartite graph.
There are many graph approaches, graph neural networks, and so on and so forth.
Have you seen these similarities from the very beginning or what is it to join this field that you haven't worked in before from the perspective of a, let's say, graph expert?
So how was that shift?
Actually, it was a very smooth shift.
And this is actually how I expected it because during my dissertation I did.
Okay.
It was graph drawing.
So it was literally graph embedding in two dimensional space.
But a part of my dissertation, and this is basically the part that I'm kind of most proud of, is a numerical optimization approach, which I proved to converge.
So it was like a mathematical approach which iteratively converges to the local minimum of a specific function.
So I was exposed during my PhD studies to the embeddings, like to the vectors.
And I'm someone who enjoys working with geometry, like with vectors.
I like exploring, and obviously graph drawing is all about geometry, exploring the proximities of the items, clusters, and so on.
And the first point of interaction between me and recommender systems was actually the paper by Yehuda Koren and Yifan Hu.
This is the approach behind the famous implicit library that many people know about.
And there, they give you a function, and then they elaborate on the numerical optimization.
It was, at least knowing the works of Yehuda Koren, backwards engineered.
So it was literally the constraints and the function were given such that the optimization is done very fast.
So it was like a reverse engineer, it was not top down, but it's pretty good.
So it was like a very smooth transition for me to understand how it all operates.
So there is a function that needs to be optimized, there are weights inserted into this function, then in the end, there are inner products.
And then the only thing that I had to learn is the cosine similarity, and if the vectors are normalized, then it is only the inner product.
So if you multiply everything, then you only extract the top ones, and these are the recommendations.
So everything made sense.
And as you said, it also made sense because I consider this matrix as the adjacency matrix of a bipartite graph.
So for me, the whole problem of matrix factorization was link prediction, where these inner products represent the edge weights.
This is how I understood it.
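
A minimal sketch of the scoring step described here, assuming user and item factors are already available (in the episode they come from weighted matrix factorization; the toy vectors and item names below are hypothetical). Once vectors are scaled to unit length, cosine similarity reduces to the inner product, and the recommendations are simply the items with the highest scores.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def top_k(user_vec, item_vecs, k=2):
    """Score every item by inner product with the user and keep the k best."""
    scores = {item: dot(user_vec, vec) for item, vec in item_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy factors, not taken from any real model.
user = normalize([1.0, 0.2])
items = {
    "comedy_ep": normalize([0.9, 0.1]),
    "sports_ep": normalize([0.1, 1.0]),
    "crime_ep": normalize([0.7, 0.7]),
}
recommendations = top_k(user, items, k=2)
```

In the graph reading given above, each score can be seen as the predicted weight of a user-item edge in the bipartite graph.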
And we will talk later about how I actually utilize this idea to improve the accuracy of one of our models.
Yes.
Okay, okay, great.
So that makes sense for me.
Talking a bit more about the ARD Audiothek, you already touched a bit on the items.
So actually, what is the content of the ARD Audiothek?
So far, we said it's audio related content.
So what does it mean in specific?
Can you elaborate a bit on that?
So what is it actually that you are recommending there?
There are various categories that we're recommending.
There are categories that are literally audio books, then there are categories like comedy.
Like one of the most famous is the one from Bastian Pastewka, who, you probably know him, for other listeners, is one of the most famous comedians in Germany.
His podcast, when we see it in the preview, is immediately a red flag because it's the most popular content.
So this is how we notice that the results are popularity biased.
Yes.
Okay.
So there is very interesting content on, I call it like philosophy category on religion also.
There is also kids content.
Since I mentioned kids content, there is also like, okay, we have some business rules.
Like if the user is showing interest in the kids content, then let's say more content or sexual content should not be recommended.
Right.
So there are some business rules that you apply in the end.
But yes, so there is also a sport category.
Like I call them reports related to sports.
I'm not really looking into sports.
This is what I would say about the content.
But there is also, at least in my, in my opinion, it is like a documentary style, very nicely organized and with a very high, how we say, so when you listen to it, it's like very high confidence that this is really based on science.
And even if there is a history category as a public broadcaster, we try to be, and the content is obviously so that we don't go into the polarities.
Right.
So to provide an objective view, whatever the topic is, right.
I would assume you go into the polarities, but you also try to balance them off to show different perspectives regarding a certain topic and then cover the different opinions that there are in your, in your audience, I would assume with regards to the, I would refer to them as genres that you were already elaborating on.
So kids, documentary, comedy, or sports.
What are the types of content items that you have?
So you have already mentioned that there are audio books.
I also remember you saying that there are podcasts.
So does this also mean that you have to disambiguate between different levels of items, something where we might relate to something as a full show and then that we have the episode level of certain stuff.
Right now the recommender system is only based on the episodes, right?
So every item is treated as an episode, the series of episodes, we call them programs.
So this is basically a podcast and we will talk later about how we adjusted the system such that it operates with the series of episodes organized into podcasts.
We basically have these two levels, episodes and the podcasts.
Okay.
And program is a synonymous term for podcast?
Yes.
Yes.
So you have these two levels like you would also encounter in other audio services that you have to deal with.
So you were already talking about the slap in the face of the management and I was kind of expecting something, but I want to let you go.
What is the slap in the face that you were referring to?
Yeah.
So we were basically proposing new ideas of how we can improve user engagement.
I have to be, I'm walking on thin ice, but I have to be honest that it's not only with recommendations, but with all other proposals that you bring to the management: it takes some time with the public broadcasters, like in big companies, it takes some time until it is discussed, until it's evaluated.
And until you get the green light. We were pretty fast.
My team was pretty fast in developing prototypes and showcasing them, and still, to this day, some very basic ideas are kind of not implemented.
I will mention an idea.
So practically every user is shown a fixed, editorially curated list of items.
So we proposed, okay, let's... I have the website open right here and I do not only see a list of items.
I basically see that row concept that you see in many.
So I see Bastian Pastewka.
It's at the very top, to the very left, but it's actually a crime podcast.
So it seems it's not like the standard Pastewka, more comedy-like content.
So you have these rows and then each row is basically, I would say a ranked list of items.
So can you go from there?
It is fixed list of items for every user.
Okay.
So we proposed, let us personalize the rankings.
I gave an example.
So let's say that the 10th item is towards the end, where the user actually doesn't see it.
And this is the sports item.
And the user is a heavy sport user.
Like he listens to sports-related podcasts.
So I gave a very simple idea because I wanted their attention.
And I said, if we sort them, we will push it towards the front.
So there is a higher probability that he will click on it.
This was a showcase and we literally used the embeddings that we have from our matrix factorization.
We just do inner products, we sort and that's it.
So this was one of the proposals, but yes.
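
The proposal can be sketched in a few lines, assuming one embedding per user and per item is available (the vectors, item names, and the function name `personalize_row` are hypothetical). The editorial team still decides which items are in the row; only the order becomes user-specific.

```python
def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))


def personalize_row(editorial_row, user_vec, item_vecs):
    """Reorder a fixed editorial row by user-item inner product.

    sorted() is stable, so items with equal scores keep their editorial order.
    """
    return sorted(
        editorial_row,
        key=lambda item: dot(user_vec, item_vecs[item]),
        reverse=True,
    )


# Hypothetical example: a heavy sports listener and a row ending in sports.
item_vecs = {
    "news_ep": [0.6, 0.2],
    "comedy_ep": [0.9, 0.1],
    "sports_ep": [0.1, 0.9],
}
sports_fan = [0.1, 0.9]
editorial_row = ["news_ep", "comedy_ep", "sports_ep"]
personal_row = personalize_row(editorial_row, sports_fan, item_vecs)
```

The sports episode moves from the last position to the first, which is the effect described in the showcase.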
But there are, there are a couple of more, I mean, that's not too bad.
So maybe let's start from that.
And I mean, there are definitely recommenders in place, I assume, that you have already brought into production.
So Mirza, can you walk us through the history and share with our listeners what there is actually in place?
So what kinds of general recommenders you are using and what use cases you use them for?
Okay.
So I will start with the very, very basic idea that was implemented first.
And that was there when I arrived: this is the very simple content-to-content recommender.
It means that it doesn't consider the user browsing history.
So it only depends on the last item the user clicked on.
And this is based on the vectors obtained by the Universal Sentence Encoder, where the input was the actual summaries.
So the summaries that you can actually see in the Audiothek for a particular episode.
Okay.
So when I click on a certain item, then I see, of course the title.
I can also see which actual broadcaster this was generated from or displayed at.
And I definitely see that summary.
Exactly.
Just a couple of seconds.
Exactly.
So summaries were first utilized.
So then was the idea.
Okay, let's go with transcripts.
So let's see if we can extend everything with transcripts and then see if it improves the results.
So we did this with an A/B test and the results actually improved.
So that is when we subject the transcripts to it. Another idea that we have, it's not in production, is to utilize teaser images.
So literally it means that we will concatenate the vectors obtained by the USE with the vectors obtained with the image embedder.
We will probably like down weight the importance of the vectors obtained for the images, but the idea was, okay, let's see if this actually boosts the results.
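
As a rough illustration of this idea (which, as stated, is not in production), the sketch below concatenates a text vector and an image vector after scaling each to unit length and multiplying the image part by a weight below one. The weight of 0.5, the tiny vectors, and the function name `combine_embeddings` are assumptions made for the example only.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine_embeddings(text_vec, image_vec, image_weight=0.5):
    """Concatenate unit-length text and image vectors, down-weighting the image."""
    text_part = normalize(text_vec)
    image_part = [image_weight * x for x in normalize(image_vec)]
    return text_part + image_part


# Hypothetical 2-dimensional stand-ins for a text and an image embedding.
combined = combine_embeddings([3.0, 4.0], [0.0, 2.0], image_weight=0.5)
```

Normalizing both parts first makes the weight meaningful: with unit-length inputs, the image part contributes at most `image_weight` squared to any inner product between two combined vectors.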
So for you, this is under the section Ähnliche Inhalte, which means similar content.
Is this actually the use case?
So basically when I look at a certain item, so I click a certain true crime podcast, then in the very first row I can see a set of recommendations, but consisting of two rows, and then I can also click to see more.
It's actually in my current view two times six items where I see, as you said, similar content or Ähnliche Inhalte.
So this is basically stemming from the recommendations exploiting the USE embeddings.
Isn't it? Exactly, USE embeddings.
They obviously tried some other embeddings like Sentence-BERT and so on, but USE proved to give the best results.
So this was, yeah, there are basically three things that you tried.
So one is using the summaries only to embed those.
Then you created transcripts for the corresponding episode and embedded the transcript.
And the third one is actually to concatenate the embedding stemming from the summary with the image embedding, or did you use something else?
That's it.
So either, either the summary or the transcripts with the image embedding.
So there was then of course, we will probably do some dimension reduction on the image embeddings and see how it, how it all works.
And the first one turned out to work best.
Transcripts.
Yes.
We did an A-B test.
It showed a better CTR.
So we decided to go for it.
And are you using those embeddings only for this use case to recommend similar items given a seed item or is it that you also use them, for example, to compose user embeddings or user representations?
There are two things that I will discuss in this regard.
That's a good question.
We actually did try this and I feel embarrassed that I didn't think about it.
So we actually did this.
We calculated user embeddings based on the user browsing history and we actually incorporated the timestamps.
It was not like a simple average, but it was a weighted average where the weights were associated with timestamps and we obtained the user embeddings.
And then we have like a user to content.
We call them U2C.
So user to content embedder.
So this we actually tried, but the results were not that good.
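
A minimal sketch of such a user-to-content embedding: a weighted average of the item embeddings in the browsing history, with weights derived from the timestamps. The episode does not say which weighting was used, so the exponential decay with a seven-day half-life below is purely an assumption, as are the toy vectors.

```python
def user_embedding(history, now, half_life_days=7.0):
    """Time-weighted average of item vectors.

    history: list of (item_vector, timestamp_in_days) pairs.
    Older interactions get exponentially smaller weights (assumed scheme).
    """
    if not history:
        raise ValueError("history must contain at least one interaction")
    dim = len(history[0][0])
    total = [0.0] * dim
    weight_sum = 0.0
    for vec, timestamp in history:
        age_days = now - timestamp
        weight = 0.5 ** (age_days / half_life_days)
        for i in range(dim):
            total[i] += weight * vec[i]
        weight_sum += weight
    return [x / weight_sum for x in total]


# Hypothetical history: one item a week ago, one item today.
history = [([1.0, 0.0], 0.0), ([0.0, 1.0], 7.0)]
embedding = user_embedding(history, now=7.0)
```

With these numbers the week-old item gets weight 0.5 and today's item weight 1.0, so the user vector leans two to one towards the recent item.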
This was basically taking the history of a user and then applying some decay by exploiting the timestamps of the corresponding interactions, which I guess we will also talk about what engagement and interactions are actually or how you define certain thresholds there.
But you were using the embeddings based on the transcript.
So the embeddings for this.
Okay.
I see.
I see.
And this turned out not to work or what is it that you compared it with?
We compared it with weighted matrix factorization.
So this is the famous implicit library, which was kind of tweaked, adjusted and so on.
It's a very flexible thing, powerful thing.
Offline results were pretty, pretty bad.
So we decided it's not a good idea.
We also did some other combination like going with the nearest neighbors there with each of the items that the user has selected, then concatenating the list and so on.
But it didn't work well in the end.
So any ideas why this didn't turn out to work or at least let's say sufficiently.
So I somehow get from you that it was pretty bad.
Yes, it was.
It was pretty bad.
But the thing is, I mean, we compare, I mean, it's always relative to something.
So we compared it to a model that comes from the user item interactions.
So there, this model is literally like the mathematics within the model is trying to match the similar users to the same side, right?
This is why the user item interactions.
And here, okay, it's kind of doing a similar thing, but the spread of the items based on the content is different than it is in the matrix factorization, because there you don't utilize the content.
So there is literally no, it's a bipartite graph, as you said at the beginning.
So there is no link between the items.
So this is why I think, but obviously it was like we were trying to utilize one scenario for a totally different objective.
It was a very wild attempt.
I mean, I would be very surprised that it actually provided some good results, but in the end, it didn't.
And I mean, the comparison might also be a bit unfair because you are somehow comparing, let's say content based user embeddings that are basically weighted averages of item embeddings with collaborative filtering approach where you used weighted matrix factorization. Exactly.
Exactly. But there is also another thing.
Yeah, go ahead.
This is the thing where so where do we utilize the embeddings also?
So we actually utilize them for the offline evaluation to calculate content diversity.
So let's say we evaluate the U2C system.
Let's say we have factorization machines, whatever.
So we produce a list of 10 items for each user and we want to see how this recommender behaves. We have two recommenders, let's say EASE and factorization machines.
Okay, so now we are more in the collaborative filtering domain where we want to create personalized recommendations, not something that is similar to an item, but something that is similar to a user. Yes.
So, I mean, as a public broadcaster, our goal is to provide diversified content.
So it's not like only about the precision.
So the intersection basically between the set of items that we recommend and the set of items the user clicks, but we want the content to be diversified.
This is important for us.
So what we do is we literally take the USE embeddings and we calculate the intra-list similarity.
The lower the similarity, the more diversified the content, obviously.
So it's and this is the metric that we use.
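
Intra-list similarity is commonly computed as the average pairwise cosine similarity of the items in one recommendation list; under that usual definition, a lower value indicates a more diverse list. The sketch below follows this definition with hypothetical content vectors standing in for the sentence embeddings; the exact variant used for the Audiothek is not specified in the episode.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def intra_list_similarity(rec_list, item_vecs):
    """Average cosine similarity over all unordered item pairs in the list."""
    pairs = list(combinations(rec_list, 2))
    if not pairs:
        return 0.0
    return sum(cosine(item_vecs[a], item_vecs[b]) for a, b in pairs) / len(pairs)


# Hypothetical content vectors: two near-identical crime episodes, one sports.
item_vecs = {"crime_1": [1.0, 0.0], "crime_2": [2.0, 0.0], "sports_1": [0.0, 1.0]}
narrow = intra_list_similarity(["crime_1", "crime_2"], item_vecs)
mixed = intra_list_similarity(["crime_1", "sports_1"], item_vecs)
```

The list with two crime episodes scores 1.0 and the mixed list 0.0, so the mixed list counts as the more diversified one.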
And maybe it's now time to say what are the metrics that we base our decisions on, at least in terms of the user to content recommenders.
It is precision.
And then we also always look at the diversity and novelty.
So diversity I already discussed; novelty is the inverse of popularity.
So we just count the numbers of occurrences.
We do the inverse and then we average everything.
We have the number and then we look at the three numbers. We could have come up with a single number that combines the three.
But we look at these numbers and we say, OK, if accuracy is compromised a bit, but we have a better diversity, then we go for it.
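
The other two numbers can be sketched as follows, under the reading given in the conversation: precision as the overlap between recommended and clicked items, and novelty as the average inverse occurrence count of the recommended items. Counting occurrences in an interaction log and treating unseen items as having a count of one are assumptions of this sketch.

```python
from collections import Counter


def precision(recommended, clicked):
    """Share of recommended items that the user actually clicked."""
    if not recommended:
        return 0.0
    return len(set(recommended) & set(clicked)) / len(recommended)


def novelty(recommended, interaction_log):
    """Average inverse popularity of the recommended items.

    Items missing from the log are treated as count 1 (assumption) to
    avoid division by zero.
    """
    if not recommended:
        return 0.0
    counts = Counter(interaction_log)
    return sum(1.0 / max(counts[item], 1) for item in recommended) / len(recommended)


# Hypothetical log: one very popular item and one niche item.
log = ["hit", "hit", "hit", "hit", "niche"]
recs = ["hit", "niche"]
p = precision(recs, clicked=["niche", "other"])
n = novelty(recs, log)
```

Here the precision is 0.5 and the novelty is (1/4 + 1/1) / 2 = 0.625; a list made only of the popular item would score 0.25 on novelty.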
There are several objectives that are important for you as a public provider.
However, I would also say that nowadays they are also pretty important for private platforms, and also for e-commerce players who say, OK, we want to diversify our results because we want to provide some option for discovery for our users, because discovery then also improves the retention of those users because they discover new content.
They keep engaging with the platform.
But talking about accuracy.
So you have already mentioned before that magic word of engagement.
How do you actually assess engagement or what are the signals of users?
We train on the historical data.
So we literally take all the events that happen.
We train the model on the past week or past 10 days.
So we literally collect the user IDs.
We collect the item IDs.
And because it's about audio, we collect the duration, the actual duration of consumption.
What do you mean when you talk about events, duration and consumption?
The events are literally the user interactions with the app.
Right. Whenever the user clicks play, pause, skip, whatever.
So it's all logged.
We have the information.
Right. And then we have an algorithm that calculates the actual time that the user has spent on this particular item.
Right. I see.
And it is calculated over the past seven days; over whatever number of sessions, it is aggregated.
It shows the user's interest in this particular item.
Right.
Okay.
Okay.
So I'm not fully understanding right now.
But if you listen to something today, 50 percent, and tomorrow you listen to the remaining 50 percent,
but you also proceed to listen once again, for some reason, to the first 50 percent, then I will have like 1.5.
Okay, I see.
I'm obviously now describing the system that was there in production.
It was the weighted matrix factorization from the implicit library, which, I have to be honest, is a very powerful, very flexible thing.
And we will discuss later how we actually kind of managed to tweak it and to beat some state of the art models.
To summarize up to that point: when you talk about engagement, you basically look at the progress that users make on an episode level, meaning that there is an episode and you have a certain time window within which you look at how much I consumed from that episode, saying, for example, within two days, let's just assume I fully watched it.
And then there is one point zero for that pair of my user ID and the corresponding episode ID. Right.
And what about the other events that users create?
So, for example, a user might click something; there is add to queue, download, share.
It's all taken into account.
So we do a kind of aggregation.
So a weighted sum of all of these events.
So we did offline testing where we tried to linearly combine them with various weights. And we found a good combination of all of these events where, for instance, share, add to queue, and download play a role.
So this is the final number.
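To make the aggregation concrete, here is a small sketch of how listening progress and the other event types could be combined into one number per user-item pair. The weights in `EVENT_WEIGHTS` are hypothetical placeholders; the episode does not disclose the values that were found in offline testing.

```python
from collections import defaultdict

# Hypothetical weights; the real values were tuned offline and are not given in the episode.
EVENT_WEIGHTS = {"share": 3.0, "add_to_queue": 2.0, "download": 2.0}

def confidence_scores(events, progress_weight=1.0):
    """events: iterable of (user, item, kind, value).

    kind == "progress": value is the fraction of the episode listened to in one session.
    other kinds: value is ignored and the event weight is added once per event.
    """
    scores = defaultdict(float)
    for user, item, kind, value in events:
        if kind == "progress":
            scores[(user, item)] += progress_weight * value
        else:
            scores[(user, item)] += EVENT_WEIGHTS.get(kind, 0.0)
    return dict(scores)

log = [
    ("u1", "ep1", "progress", 0.5),       # today: first half
    ("u1", "ep1", "progress", 0.5),       # tomorrow: second half
    ("u1", "ep1", "progress", 0.5),       # first half once again
    ("u1", "ep1", "add_to_queue", None),  # one additional interaction event
]
scores = confidence_scores(log)
print(scores)
```

The three progress entries reproduce the 1.5 from the example above; the queue event is added on top with its assumed weight.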
And it was very interesting for us.
For me, it was surprising because, at the very beginning, I was not a heavy user of the Audiothek.
But add to queue is actually the event that is most frequently fired.
Maybe for you it is not a surprise.
But for me, it was like, OK, this was very interesting.
For me, the question is a bit more of isn't this a bit redundant and isn't it also a bit problematic?
Because, for example, if you take into account single clicks, this might be too noisy or too weak as a signal.
Or actually, I do understand that you take into account several different event types, but also progress that I make with a certain episode and then you look at them jointly.
But what is actually the target?
So do you use them to predict whether a user is fully or to a certain extent watching an episode?
So how actually do you determine these weights to assess how important these individual different signals are in predicting a certain user behavior like, for example, watching an episode? So what is actually the goal?
So, what I forgot to say is that the whole infrastructure was devised and built around the model that we used, and that really proved quite good in comparison with all other collaborative filtering models: this implicit library.
Right. So there, in the implicit library, we have to have a confidence value, which is specific to a user-item pair.
Right. This confidence value literally tells us the user's interest in a specific item.
So for this, we devised this weighting scheme over all the combinations.
How we then find the optimal weighting scheme is within offline evaluation: we looked for the highest accuracy across the various combinations.
So we have the seven past days as the training, and the following three days as the testing.
And then we literally look at the accuracy.
So how well you are actually doing to predict the interactions on the day after this seven day period?
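The offline protocol outlined here (train on seven days, test on the following days, then look at accuracy) can be sketched like this. The helper names, the inclusive window boundaries, and precision at k as the accuracy measure are assumptions made for illustration.

```python
from datetime import date, timedelta

def temporal_split(events, train_end, train_days=7, test_days=3):
    """events: (user, item, day). Train on the train_days up to and including
    train_end, test on the test_days that follow."""
    train_start = train_end - timedelta(days=train_days - 1)
    test_end = train_end + timedelta(days=test_days)
    train = [e for e in events if train_start <= e[2] <= train_end]
    test = [e for e in events if train_end < e[2] <= test_end]
    return train, test

def precision_at_k(recommended, consumed, k=10):
    """Share of the top-k recommended items that the user actually consumed."""
    top_k = recommended[:k]
    return len(set(top_k) & set(consumed)) / k if k else 0.0

events = [
    ("u1", "a", date(2022, 12, 31)),  # before the training window
    ("u1", "b", date(2023, 1, 1)),    # first training day
    ("u2", "c", date(2023, 1, 7)),    # last training day
    ("u1", "d", date(2023, 1, 8)),    # first test day
    ("u2", "b", date(2023, 1, 10)),   # last test day
    ("u2", "e", date(2023, 1, 11)),   # after the test window
]
train, test = temporal_split(events, train_end=date(2023, 1, 7))
print(len(train), len(test))
print(precision_at_k(["a", "b", "c", "d"], ["b", "d", "x"], k=4))
```

Each candidate weighting scheme would be trained on `train` and scored on `test`, and the scheme with the best score kept.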
So the model, if this is the question, is a batch system, and it's trained every, I think, half an hour.
But there is one very important thing for the listeners.
I mean, I'm discussing very basic stuff.
So they will ask, OK, why are we limiting ourselves to matrix factorization and so on?
So the thing is, we already have this row which is content based, right?
It's only specific content specific English related content.
And then there is this second row, which is "Empfehlungen für dich", recommendations for you.
So we were actually constrained by the existence of the first row, which is content based, not to use any information from the actual content to improve the recommendations that come only from the user-item interactions.
There is now a question, and this is what I posed.
Wouldn't it actually make more sense to have only a single row, a single option, recommendations for you,
where we would combine the two, as it is with big companies? We would do some fancy algorithm.
Exactly.
We will do some fancy stuff and then we will merge the two worlds and see how it behaves.
So I just want to tell the listeners what the reason is, because when I started proposing these ideas and actually working on them, I started with factorization machines and incorporating the vectors and categories and so on.
The response was: we already have this, you know.
Why would we pollute this particular row?
So, OK, so I said, OK.
Yeah, I guess it definitely makes sense to mix different kinds of recommenders and not only look at content-based but also at collaborative-filtering-based things and then maybe also join forces to create something hybrid.
However, I'm having difficulties finding actually the personalized row.
But this, I guess, stems from the fact that you are not logged in.
That I'm not logged in.
OK, I see.
So this is also another specificity of the system.
Right.
That means that for not logged in users, you won't serve any personalized recommendations.
It's another slap to the management.
So OK, but definitely something that I assume you are fighting for.
So I do understand better right now which different signals you incorporate, and also that you merge different signals that you then use as something to optimize for by, for example, using a weighted MF.
However, it sounded a bit like this is not where you stopped.
So you developed the system further from that point.
Yes, of course.
The thing is that with the confidence values, the values that you compute in the end, you can literally inject the user's preference for this particular item.
And what you can utilize there, just to give a very basic example, is the actual timestamp, such that the items that the user has consumed more recently have a higher impact on the whole factorization process.
And another thing that I will mention now is this kind of merge of the two worlds, content based and item based. For instance, if you have information on the categories of particular items, you literally inject a dummy user representing a category, and you fill in the matrix at the positions of the items belonging to this category with ones or with some other constant, and all the others with zero.
So that means, for example, you introduce genre as a user.
So for example, the comedy user and then every episode that is holding that genre of being comedy gets a one in the matrix.
And not necessarily, it doesn't have to be a constant, but you get the idea.
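Both tweaks discussed here, recency weighting and category dummy users, can be sketched on a sparse dictionary representation of the user-item matrix. The exponential half-life decay, the `category::` key prefix, and all names are illustrative assumptions; the episode only states that timestamps and dummy category rows are used.

```python
def recency_weight(age_days, half_life_days=3.0):
    """Exponential decay (assumed form): an interaction half_life_days old counts half as much."""
    return 0.5 ** (age_days / half_life_days)

def add_category_users(matrix, item_categories, value=1.0):
    """matrix: {user: {item: confidence}}. Returns a copy with one dummy user per
    category whose row holds `value` for every item of that category; zeros stay implicit."""
    extended = {user: dict(row) for user, row in matrix.items()}
    for item, categories in item_categories.items():
        for category in categories:
            extended.setdefault(f"category::{category}", {})[item] = value
    return extended

# u1 consumed ep1 today and ep2 three days ago, both with a base confidence of 1.0.
interactions = {"u1": {"ep1": 1.0 * recency_weight(0), "ep2": 1.0 * recency_weight(3)}}
genres = {"ep1": ["comedy"], "ep2": ["comedy", "news"], "ep3": ["news"]}
extended = add_category_users(interactions, genres)
print(extended)
```

The extended matrix can then be factorized like any other interaction matrix; the dummy rows pull items of the same category closer together in the embedding space.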
But the thing is, it's an extremely powerful model.
And we have tried state of the art, but it failed in our case.
But I was very pleasantly surprised by last year's RecSys paper, it was also by Steffen Rendle and Yehuda Koren, and they argued about the unfairness of comparison with the original model.
They said that their model is not fairly compared to the state of the art.
And they showed that with careful tweaking of the model, you can actually obtain much better results.
Their main example was the increased dimensionality.
Like, they go into thousands of dimensions, like 1000, while we go below 100, until we squeeze the last drop of potential out of the very simple method.
I'm not a big fan of going immediately with the state of the art, because state of the art is industry specific.
Like if it comes from the big companies, they have very specific problem and they optimize for it.
Nowadays, I wouldn't call the two-tower pattern or something like that very industry specific.
However, I share your thought there, and this also links a bit back to the history that we have seen with neural matrix factorization and the reproducibility of certain papers and then also seeing that neural MF could be beaten by properly tweaked baselines.
And I guess this is also a bit the direction where this argument comes from.
I agree with it that I mean, you should not compare against something that is not properly tweaked because then your comparison is somehow a bit unjustified.
And in that regard, I would really say, OK, look at traditional standard algorithms.
I guess that's not quite the right term to call it.
But if you look at something like implicit ALS, then it definitely makes sense to properly tweak it first.
And as you said, squeeze out the last bit.
However, at a certain point, you might also want to discover new models and also get proper signals into them that are not so easy to get into iALS or something like that.
So maybe there are arguments for both directions.
Or what is what is your thought on this?
Definitely. So we did not actually stop right there with the weighted matrix factorization. We moved further.
So I have to mention Harald Steck, who is the inventor of EASE,
the embarrassingly shallow autoencoder.
So we exchanged several emails with him and collaborated with him on how to utilize EASE.
Also, we implemented a scikit-network PageRank approach for recommendations.
And we tried several models and compared them with the weighted matrix factorization.
The thing is that we found out that, for instance, EASE in accuracy was below weighted matrix factorization.
It comes with a high popularity bias in our case.
But with episodes, for instance, I was disappointed to see that PageRank was behaving pretty badly.
Why I decided to go with the graph-based approach:
I also tried an approach where I considered the user item interaction matrix as a graph.
And then I do like a node2vec.
So you control the amount of breadth-first search and depth-first search.
You do the embedding and then you do the inner products, cosine similarity and so on.
The thing that I was kind of disappointed by: I come from the graph world and I wanted to see how the very, very basic graph algorithm behaves in this case.
And this was PageRank.
And then we ran PageRank.
The code is online.
You just need a little tweak.
And it was behaving pretty badly.
But then we studied the actual graph that was formed out of the user item interactions.
And we had like in the period of 10 days, we had thousands of disconnected components.
And obviously this was the reason.
But the game changed when we turned to recommending the podcasts.
So the program sets, because now you have the same graph, but each episode belongs to one of our podcasts, and the number of podcasts is so much lower than the number of episodes.
And then the graph became very connected.
Like, we had a reduction from 1000 to 12.
So it's a huge reduction.
And then a graph based approach worked like a charm.
This was what we saw, at least in offline evaluation.
So the ranking of the algorithms by accuracy
in the episode case and in the program set case, so series versus episode, was inverse.
So this was our very interesting finding.
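The connectivity effect described here can be checked with a simple component count on the bipartite user-item graph. The sketch below uses a small union-find structure and toy data; the function name and the mapping from episodes to podcasts are illustrative.

```python
def count_components(interactions):
    """Count connected components of the bipartite user-item graph via union-find."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for user, item in interactions:
        root_user, root_item = find(("user", user)), find(("item", item))
        if root_user != root_item:
            parent[root_user] = root_item
    return len({find(node) for node in list(parent)})

episode_events = [("u1", "ep1"), ("u2", "ep2"), ("u3", "ep3"), ("u3", "ep1")]
episode_to_podcast = {"ep1": "podA", "ep2": "podA", "ep3": "podB"}
# Replace every episode ID with the ID of the podcast it belongs to.
podcast_events = [(user, episode_to_podcast[ep]) for user, ep in episode_events]

print(count_components(episode_events))  # episode-level graph
print(count_components(podcast_events))  # podcast-level graph
```

In the toy data the episode-level graph falls into two components, while the podcast-level graph is fully connected, which is the situation in which a random-walk method such as PageRank can reach every node.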
Maybe if we talk a bit more about the several use cases.
So, I have understood so far that you add this similar content, but also the personalized recommendations, on, let's refer to it as, a detail page of a certain episode or something like that.
Is there also more of that, that you are incorporating in the landing page of the ARD Audiothek?
So the very first page that I see when I go to the Audiothek, is there some ordering of the rows in place, or will I be shown several different personalized rows that might have different seeds or something like that?
Or what is it you are doing on there?
On the landing or first page, we haven't actually done any particular work in personalizing it.
But as I said, this is the huge goldmine for us, for the personalization team, as we are called here at pub.
So as I said previously, there would be this personalized sorting: not only items in the rows, but a personalized sorting of the rows, like you have in Netflix. But it's the detail page that contains actually the three sections: the content based, the personalized, and there is also the content based where the embeddings are coming from matrix factorization.
Okay.
So basically some item-to-item recommendations, but based on CF.
Exactly. A couple of weeks ago, there was German Data Science Days and a lady was presenting a recommender for when you invest in some assets, and then the recommender gives you a recommendation of what other companies you should invest in, or whatever stocks.
And she actually used exactly this.
She used the model and she used the embeddings, which is a clever idea because you kind of pull out the similarity of the items based on the similarity of the users consuming the items.
So the similarity is stemming from the user-item interactions, which makes sense in this particular case.
Okay.
Okay.
So it's like some kind of people that invested in company a also invested in company B.
Okay.
I see.
So Mirza, you touched on this topic a bit: that you do not only optimize for accuracy, but that you also take into account diversity, operationalized by taking the intra-list similarity, and beyond that you have also popularity.
So what are the approaches that you use there to make things more diversified or to increase the coverage or to decrease the popularity bias?
So you probably noticed by now that I'm a big fan of the very simple approaches: simple yet effective.
This is the main drive.
I mean, everything is driven by the numbers that I get in the end.
So what we do is inverse propensity weighting by the popularity scores.
I mean, we apply a logarithm or some other monotonically increasing function to the popularities, and then we re-rank the items.
We don't want to have a very high deviation from the original results.
It's adjusted, but this is how we proceed.
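A post-processing re-ranker of this kind might look like the following sketch. The exact penalty, dividing the model score by a power of one plus the log popularity, and the `strength` parameter are assumptions for illustration; the episode only says that a logarithmic or other monotonically increasing function of popularity is used and that the deviation from the original ranking is kept moderate.

```python
import math

def rerank_by_popularity(scored_items, popularity, strength=0.5):
    """scored_items: list of (item, score) with non-negative scores.

    Each score is divided by (1 + log(1 + popularity)) ** strength and the list is
    re-sorted; strength=0 leaves the original order, larger values penalize more."""
    def adjusted(pair):
        item, score = pair
        penalty = (1.0 + math.log1p(popularity.get(item, 0))) ** strength
        return score / penalty

    return [item for item, _score in sorted(scored_items, key=adjusted, reverse=True)]

candidates = [("hit", 0.90), ("niche", 0.80), ("mid", 0.50)]
plays = {"hit": 10000, "niche": 10, "mid": 500}
print(rerank_by_popularity(candidates, plays))
print(rerank_by_popularity(candidates, plays, strength=0.0))
```

Tuning `strength` is one way to control how far the re-ranked list may drift from the original one.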
I haven't mentioned the business rules.
In the end, we go with the business rules.
For example, three consecutive items cannot belong to the same genre, right?
So only the first one or the first two are picked and so on.
This is where it comes into play and kind of diversifies the content.
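The genre rule mentioned here can be sketched as a single pass over the ranked list. Dropping the offending item (rather than moving it further down) is an assumption; the episode only states that no three consecutive items may share a genre.

```python
def limit_genre_runs(ranked_items, genre_of, max_run=2):
    """Keep the ranking order, but drop an item if it would extend a run of the
    same genre beyond max_run consecutive items."""
    kept = []
    for item in ranked_items:
        tail = kept[-max_run:]
        if len(tail) == max_run and all(genre_of[t] == genre_of[item] for t in tail):
            continue  # would be the third comedy (or news, ...) item in a row
        kept.append(item)
    return kept

genre_of = {"c1": "comedy", "c2": "comedy", "c3": "comedy", "n1": "news", "c4": "comedy"}
ranked = ["c1", "c2", "c3", "n1", "c4"]
print(limit_genre_runs(ranked, genre_of))
```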
But inverse propensity scoring is what we implemented.
Obviously, from the computational perspective, it's the easiest to do re-ranking in the end.
But of course you can multiply your rows, but it will take you some time.
So you use inverse propensity weighting then as a re-ranker after getting the recommendation lists for the users.
Exactly.
There are also some approaches where you might be using it within the processing, within the training of your recommender.
Have you also played around with these several stages where to use inverse propensity weighting?
Because in general, if you look at debiasing, you have that differentiation into pre-processing, in-processing and post-processing.
In your case, it's a post-processing example, but what are your experiences there?
Have you tried different steps to incorporate IPS or what are your experiences there?
We have experimented only with the post-processing, but we are aware of the approach that optimizes for novelty from the very beginning.
But we haven't experimented with it.
I think, and I'm not sure if it was this year or the past year, there was a very interesting paper at RecSys where they actually, I think it even came with the code, discussed this approach of injecting the inverse into the objective function.
So they minimized the objective function in the end.
But as I said, we did not utilize it there.
One very interesting thing with the page-rank approach.
Okay, with the page-rank.
So there was actually a contribution from your company posted a couple of months ago on LinkedIn and I found it.
And this was from one of the, I think, bachelor or master students who did address the popularity bias within the page-rank.
Yeah, actually, we played around a bit with the Twitter data set and her name was IFA Engel and she was actually my bachelor thesis student. So I had the pleasure to supervise her during her thesis and we were actually playing around with fairness-aware PageRank and also with extensions of it to apply it to the Twitter data that was provided a couple of years ago alongside the RecSys Challenge. So was it something that you could reuse or?
Yes, the idea.
So how were you able to use it?
It would be interesting.
So the basic idea is that the degrees play a significant role.
We kind of utilized this within the PageRank walk, where the edge weights were normalized by the degrees.
Then we had an algorithm in the end that was kind of the popularity-debiased algorithm.
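One possible reading of "edge weights normalized by the degrees" is a personalized PageRank in which a step towards a node is weighted by the inverse of that node's degree, so that heavily connected (popular) items attract less of the walk. The sketch below implements that reading with plain power iteration; it is an assumption for illustration and not necessarily the variant from the thesis or the one run on the Audiothek data.

```python
def pagerank(edges, seed, degree_debias=False, damping=0.85, iterations=100):
    """Personalized PageRank by power iteration on an undirected graph.

    With degree_debias=True a step from u to v is weighted by 1 / degree(v)."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    degree = {node: len(adj) for node, adj in neighbors.items()}

    def weight(v):
        return 1.0 / degree[v] if degree_debias else 1.0

    rank = {node: 1.0 if node == seed else 0.0 for node in neighbors}
    for _ in range(iterations):
        # Restart mass goes back to the seed user, the rest follows the edges.
        nxt = {node: (1.0 - damping) if node == seed else 0.0 for node in neighbors}
        for u, adj in neighbors.items():
            total = sum(weight(v) for v in adj)
            for v in adj:
                nxt[v] += damping * rank[u] * weight(v) / total
        rank = nxt
    return rank

# "hit" is consumed by four users, "niche" only by two.
edges = [("u1", "hit"), ("u2", "hit"), ("u3", "hit"), ("u4", "hit"),
         ("u1", "niche"), ("u2", "niche")]
plain = pagerank(edges, seed="u1")
debiased = pagerank(edges, seed="u1", degree_debias=True)
print(plain["hit"], plain["niche"])
print(debiased["hit"], debiased["niche"])
```

On this toy graph the plain walk ranks the popular item above the niche one, and the degree-normalized walk reverses that order.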
And this is something that you also used on your data, on your domain to compare it with the inverse propensity scoring as a re-ranker to the MF approach when you compared both of them in terms of popularity debiasing.
So what was the result there?
So with the episodes, I have to say with the episodes, in all cases, the MF behaves best. PageRank behaves best with the program sets for the reasons I discussed.
Yeah. Can you again specify what program sets are actually?
It's a series. You mentioned Bastian Pastewka, right? So let's say he has his podcast.
This is a series of his episodes, and all of his episodes belong to the Bastian Pastewka podcast.
So when we convert the actual data set, we literally replace every episode ID belonging to the Bastian Pastewka podcast with the podcast ID.
This is the only change that we do in the algorithm.
And then in the end, it works perfectly.
We do aggregations, obviously.
I mean, the confidence values are aggregated across the episodes.
Yeah, in the end, this is how it works.
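The conversion from episode level to program-set level can be sketched as a simple re-keying of the per-pair values. Summation is assumed as the aggregation here; the episode only says that the values are aggregated across episodes.

```python
from collections import defaultdict

def to_podcast_level(episode_scores, episode_to_podcast):
    """Sum per-episode confidence values into one value per (user, podcast) pair."""
    podcast_scores = defaultdict(float)
    for (user, episode), score in episode_scores.items():
        podcast_scores[(user, episode_to_podcast[episode])] += score
    return dict(podcast_scores)

episode_scores = {("u1", "ep1"): 1.0, ("u1", "ep2"): 0.5, ("u2", "ep3"): 2.0}
episode_to_podcast = {"ep1": "podA", "ep2": "podA", "ep3": "podB"}
print(to_podcast_level(episode_scores, episode_to_podcast))
```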
Can you also let us know what are your next steps or the stuff that is ahead on your road map?
So I've already got the impression that sometimes you would be more satisfied with things being a bit quicker.
But I guess you might have plans and you and your team, you're also trying out different things.
So maybe connecting it a bit to the main problems.
So you do have recommenders in place that are able to anticipate users' taste and provide corresponding recommendations that are not only relevant, but also diversified and to a certain degree popularity debiased.
But are you already satisfied with the diversity with the popularity debiasing and also with the accuracy or are there other goals that are getting more and more important for you?
So what is it that you actually deal with currently or want to deal with further in the future?
So this is, for instance, we did a study on the underserved user groups.
This was motivated by the publications from the past RecSys.
And we identified that there are some underserved users.
And by the way, we could talk about the cold start approach, the problem afterwards.
So we wanted to see if there are patterns, like if we could really identify the underserved users.
I mean, obviously, with the matrix factorization, the more you use it, the better recommendations you get.
And you dominate all the others and so on.
So there are some underserved users.
So what could we do for the underserved user groups?
This is one area.
Another area is that the data analyst team actually came to the conclusion that during daytime, so from eight to five, items of short duration are consumed preferably,
but after work, items of longer duration.
Okay, so you can obviously do some filtering there to accommodate, but can we do better there?
And yes, so these are the approaches that we are exploring.
As I said at the very beginning, first we also need to get the green light to have a single section that incorporates the user-item interactions with the content data.
So we have a single section providing the recommendations to the user.
On the detail page or on the landing page?
Good question.
That's a good question.
Yes, I would be satisfied if it's a detail page at the beginning.
So but let's let's see.
Yes.
So obviously there is the state-of-the-art development
that we also wanted to tackle, and a couple of weeks ago I finished a course from Rishabh Mehrotra.
You also did this one, on state-of-the-art RecSys,
so very, very good ideas.
One thing also, this is a very, very good one, actually.
A PhD candidate will join us.
She did her PhD at the University of Bolzano in user simulation for recommender evaluation.
So the initial project will actually be to build this simulation system such that we save time on the offline evaluation, because we want to have a filtering system where all the ideas can be tested and where we have a direct correlation of our offline evaluation,
so to speak, this user simulation approach, with the A/B testing, because there is a good correlation between these two approaches, according to her research.
So we are investing in an evaluation methodology that will save us time, and the promise of the method is a good correlation between the offline evaluation results and the A/B test results.
And as you know, this is sometimes quite the inverse.
I was already about to bring up that point because you said to speed up the offline evaluation process and I would have expected something differently because I haven't asked you so far how big of a problem offline online discrepancy is in your case.
So I would have assumed that the simulator is rather there to circumvent that discrepancy, but this is something that you don't observe to such a great extent, so that you can rely pretty much on the offline results and that they do correlate well with your online results? Or how predictive are your offline evaluation results of how a recommender performs online?
I cannot tell the actual numbers, but I can tell that it's pretty good.
It gives a very good indication, but a few times I was very surprised by what was going on there.
And then we try to track down what's happening, and there are various reasons in the offline evaluation and so on.
But in the end, it is the online results that go to production.
Have you also considered investigating off policy evaluation and learning that are gaining very high attention in the community?
No, no, we haven't.
Okay, not so far.
Okay.
So first thing is extending more into the simulation aspects of it to evaluate certain ideas more quickly and then pick the right one to test in an online setting.
Okay.
Maybe going a bit back to the aspect that you mentioned a couple of minutes before about also the future challenges.
So you mentioned, of course, the cold-start problem, the incorporation of contextual signals, like for example time of day, and more, to serve longer or shorter content.
The business rules.
I want to ask you a bit more about these underserved user groups because it's, I would say, somewhat of a tricky field, because in order to understand that certain groups are underserved, you need to know something about them.
And sometimes, of course, you don't have access to, let's say, protected attributes.
So you, I'm not sure because I haven't gone through the registration yet, but I'm not sure whether you have access to, let's say, the gender of your users or something like that.
But how then do you detect that certain groups are underserved?
So as I said, this is something that we will explore, but we have a very educated guess, so to speak, that there are definitely some such users.
And based on the model that we currently have, we can say, okay, if this model works quite well, then let's see if we can do better for the users that do not interact that much.
So here is an approach that we tried.
Okay.
We, for instance, call it the take-minimum approach.
And this is a famous one in the literature.
So first we take the last seven days for the training, right?
And then we freeze the number of users.
And then we identify the users whose number of interactions is pretty low.
And for these users, we go into the past.
Instead of seven days, we go back even three months for them and we collect their interactions.
We pull these interactions into the game and in the end we provide recommendations for them.
And actually we A/B tested this approach and it proved pretty good.
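The take-minimum idea described in these lines can be sketched as follows. The threshold `min_interactions`, the use of integer day indices, and the function name are assumptions; the seven-day and roughly three-month windows come from the conversation.

```python
from collections import Counter

def build_training_events(events, today, window=7, long_window=90, min_interactions=5):
    """events: (user, item, day_index).

    The user set is fixed by the short window. Users with fewer than min_interactions
    events in that window additionally get their events from the long window."""
    recent = [e for e in events if today - e[2] < window]
    counts = Counter(user for user, _item, _day in recent)
    sparse_users = {user for user, n in counts.items() if n < min_interactions}
    older = [e for e in events
             if e[0] in sparse_users and window <= today - e[2] < long_window]
    return recent + older

events = (
    [("u1", f"ep{i}", 99) for i in range(5)]                # active user: 5 recent events
    + [("u1", "old", 50)]                                   # not pulled in, u1 is active enough
    + [("u2", "x", 98), ("u2", "y", 60), ("u2", "z", 5)]    # sparse user with older history
    + [("u3", "w", 60)]                                     # not seen in the last 7 days
)
training = build_training_events(events, today=100)
print(len(training))
```

Only the sparse user `u2` is extended into the past, and only within the long window, so the most active users keep their short-term profile.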
Okay.
So the assumption behind this approach, I would say, is that for your highly active users who frequently interact with the platform, you want to take into account more of their short-term feedback to provide more dynamically changing recommendations, and for those for which the seven-day window doesn't provide enough evidence to infer the preferences, you go back further into the past. But then it's really either seven days or three months of interactions that you take into account?
Yes.
Only for these underserved users, we study the past.
Which means that when talking about underserved groups, this is more about their activity.
So kind of frequently active users and let's say less active users.
Exactly.
But then kind of non active users are those who are not getting the recommendations they deserve, so to speak.
This is the nature of the algorithm, right?
So this is the nature of the algorithm.
It's a known thing.
So this is why we said, okay, if we are using it and it works well in the majority of cases, let's see how we can actually tackle it
to work well in all other cases.
That definitely makes sense to me.
And also many parts of our conversation also remind me of a paper that was presented at CIKM that was about a simulation and controlling popularity bias and also about how user profiles in terms of genre probability distributions might converge over time due to a feedback loop that you are getting.
Is this something that you also look into?
You have just mentioned, you also participated in the personalized recommendations at scale course.
And there of course, two tower patterns were also presented and touched on.
Also something you might want to think about in terms of coming up with hybrid approaches is creating dedicated user profiles.
So is this something that you want to deal with in the future?
Because in its easiest manner, you could say that user interacts with certain items and thereby interacts with certain genres.
So creating genre preferences for users, like you also mentioned in the very first part to detect or let's say distinguish the sports from the comedy user, but also directly using them with certain approaches.
Is this something that you are considering?
Yes, we will be considering this definitely.
So this was the actual aim of me participating in the course: to collect all the ideas, then present the ideas to the management, get the green light, and finally try to beat the weighted matrix factorization.
Because we were working very, very much to find a system that will beat it, but we have to prepare the ground to beat it.
I mean, if we are given like only a single section, you know, to provide the recommendations, then it would be a perfect scenario to incorporate all the ideas.
You were also touching on business rules and how you incorporate business rules in filtering recommendations.
Is it really that you filter recommendations at the very end?
So once you get them as the ranked list of items from a certain model, or do you also already use or try certain approaches that rather incorporate certain business rules?
Because I found it always a bit difficult to constrain a list of recommendations afterwards, because then I might be ending up with too few items.
If I would, for example, constrain the output to kids content, is this something that is kind of a real problem for you?
It is actually.
So we do the post processing in the end.
So we do the filtering in the end.
But obviously, depending on the business rule, it might happen that sometimes, if we are to recommend 10 items, we are only left with two.
There are programmatic ways of how we handle this.
We handle these cases.
So there are statements that take care of these cases.
But the point is we do business rules towards the end.
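The episode does not say how the "too few items left" case is handled. One common option, shown here purely as an assumed illustration, is to top up the filtered list from a fallback list (for example popular items) that satisfies the same rule.

```python
def apply_rule_with_backfill(ranked, allowed, fallback, k=10):
    """Filter `ranked` with the predicate `allowed`, then top up from `fallback`
    (also filtered, without duplicates) until k items are reached."""
    result = [item for item in ranked if allowed(item)][:k]
    for item in fallback:
        if len(result) >= k:
            break
        if allowed(item) and item not in result:
            result.append(item)
    return result

def is_kids_content(item):
    # Hypothetical rule: IDs starting with "k" stand for kids content.
    return item.startswith("k")

ranked = ["a1", "k1", "a2", "k2", "a3"]
fallback = ["k2", "k3", "k4", "a9"]
print(apply_rule_with_backfill(ranked, is_kids_content, fallback, k=3))
```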
There was one very interesting business rule, kind of a request: when a user gets recommended an episode from a podcast and the user has never seen this podcast, then you should recommend them the first episode of this podcast.
This was the requirement, and it was an easy one to do, but the thing is, it makes sense only in some cases.
In some podcasts the episodes come from the same genre, but the episodes among themselves are each totally a world of their own.
So it doesn't actually make sense there.
So this is how we argued.
So they were like, yeah.
I remember there was once a summary, I guess it was performed by Spotify, on a broad set of podcasts, looking into their length and so on and so forth.
And they also distinguish between these two kinds of podcasts that there are.
And I guess Apple Podcasts also does this, where you say, okay, you have episodic content and there's the other one, I can't recall its name right now.
But basically, what you put there is a very reasonable counter argument: there are also podcasts that can be consumed in any order.
You also said cold start.
So one of the main foundational problems in recommender systems: for cold-start users, what are your ideas there to use or to improve the Audiothek?
This is definitely an area of improvement.
So this section, the content based section will always be provided.
But the personalized section is where the problem is.
If the user is completely new, if they have one or two interactions, we filter them out from the data set.
Here we are cruel.
But as I said previously, the take-minimum approach is for all other users: we scan the past history to see if there are some more interactions that we can pull from the past and integrate with the initial data set.
These are the current approaches.
But obviously, as you have identified, there is a large area to be explored with regard to the cold user problem.
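The filtering and backfilling described above could look roughly like the following Python sketch. The names, the `TARGET_INTERACTIONS` value, and the order of the two steps (backfill first, then filter) are assumptions made for illustration; only the "one or two interactions get filtered out" threshold comes from the conversation.

```python
from collections import defaultdict

MIN_INTERACTIONS = 3      # users with only one or two interactions are dropped
TARGET_INTERACTIONS = 5   # hypothetical top-up target, not from the source


def build_training_set(recent, older):
    """Build per-user training data from (user_id, item_id, timestamp) tuples.

    recent: interactions inside the training window.
    older:  interactions from before the window, used only for backfilling.
    """
    by_user = defaultdict(list)
    for row in recent:
        by_user[row[0]].append(row)

    older_by_user = defaultdict(list)
    for row in older:
        older_by_user[row[0]].append(row)

    for user, rows in by_user.items():
        missing = TARGET_INTERACTIONS - len(rows)
        if missing > 0:
            # Pull the newest interactions from the past history first.
            backfill = sorted(older_by_user.get(user, []), key=lambda r: r[2], reverse=True)
            rows.extend(backfill[:missing])

    # Users who still have too few interactions are filtered out.
    return {user: rows for user, rows in by_user.items() if len(rows) >= MIN_INTERACTIONS}
```

Users who are dropped here would still receive the content-based section, which does not depend on interaction history.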
In terms of diversity, I remember having had that discussion with Rishabh in one of our previous episodes that was around discovery and diversity so that you always want to provide users with the opportunity to discover new content.
But this doesn't necessarily translate into providing diversified content.
This is something that may also resonate with you: you could, based on the users and their histories, distinguish between users with narrower interests and users with broader interests, and you might want to adapt the diversification approach to the personal diversity traits of a user.
So, for example, users with narrow taste profiles don't really want to see very diverse content, and you might need to take this into account.
I guess there was a paper by Spotify where they were looking into this aspect.
So the generalist versus the specialist.
Is this something that you also want to explore more or is it more of a let's first have a general solution for diversification?
Definitely, and it's good that you brought it up.
Actually, the idea has been with us for about six months: that we build a kind of user classes, user profiles, for users that are narrow and stay within their comfort zone and users that explore the content widely.
This is definitely something that we will be looking at.
I don't know if we will need some, how do you say, editorial expert knowledge on how to identify these.
I mean, probably we can do it with the embeddings, looking at the actual browsing history and the embeddings to group the users.
But this is definitely something that we will be looking at.
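One simple way to group users by embeddings, sketched here in Python as an assumption rather than as the team's method, is to measure how tightly the items in a user's history cluster around their centroid. The function names and the `threshold` value are hypothetical and would need tuning on real data.

```python
import math


def _cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def focus_score(history_embeddings):
    """Mean cosine similarity between each consumed item and the user's centroid.

    Values near 1 mean the history is concentrated (narrow interests);
    lower values mean the history is spread out (broad interests).
    """
    n = len(history_embeddings)
    dim = len(history_embeddings[0])
    centroid = [sum(vec[d] for vec in history_embeddings) / n for d in range(dim)]
    return sum(_cosine(vec, centroid) for vec in history_embeddings) / n


def classify_user(history_embeddings, threshold=0.8):
    # threshold is a hypothetical cut-off, not a value from the source
    return "narrow" if focus_score(history_embeddings) >= threshold else "explorer"
```

The resulting label could then steer how strongly the recommendations for a given user are diversified.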
And something else comes into play here, though right now I don't know how to combine or separate the two: the underserved user groups.
So it's also related to the model.
So the narrow users, or the users that are exploring the content and are out of their comfort zone, are they underserved users?
And what can we do in that direction?
So these two objectives are kind of merging into one.
But it is definitely something that we will be looking at.
Yeah, interesting stuff.
Seems like there is still a lot of work to be done and you won't get bored with it.
I say let's get the green light from the management first.
Let's get the green lights first.
What is your broader perspective on the field of recommender systems?
I mean, there's no day where any of us is not, let's say, fully overrun by ChatGPT posts online.
But if you look at the field of recommender systems, what is your take?
In which direction is the field developing, or where do you see major challenges?
The very interesting thing for me is recommender systems in e-commerce.
There one can basically study the user behavior, the user's interaction with the web page.
I mean, there is something like user hesitation.
If you study the events that the user has with a particular item: if he views the item several times across X number of minutes, if he puts it in the cart and then removes it from the cart, what should you do?
What kind of signal is it?
If he interacts with the search box, can we capture the speed of typing?
If he scrolls down the page, that is a sign that he's interested in this particular item, that he reads.
These are all signals to the recommender, but the question is: signals for what, what can we do with them?
Like if this is a hesitant user, let's give him a gift.
If there are several items in his cart, let's give a related item or some discount or, you know, a push.
I mean, it's a little manipulative, I know.
But, you know, these things are very interesting to me.
And I mean, the same ideas we could definitely adjust and then use in our case.
But I think that the big questions, at least in my understanding, are answered, and utilizing these signals is now where recommender systems will go.
So this is very user-behavior-centered, like psychology, studying the actual behavior, maybe even utilizing cross-platform tastes or something, if there is access to them.
So this is something very interesting to me.
I would love to see this applied if we have all these listeners, which would not be that easy to have with the public broadcasters.
But you get the idea.
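The hesitation signals mentioned above (repeated views, adding an item to the cart and removing it again) could be extracted from a session log roughly as follows. This is a minimal sketch under assumptions: the event names, the tuple layout, and the `min_views` threshold are hypothetical, and the time window ("X number of minutes") is left out for brevity.

```python
from collections import Counter


def hesitation_signals(events, min_views=3):
    """Extract simple hesitation signals from one session.

    events: chronologically ordered list of (event_type, item_id) tuples,
            with event_type in {"view", "add_to_cart", "remove_from_cart"}.
    min_views: hypothetical number of views that counts as a repeated view.
    """
    views = Counter(item for kind, item in events if kind == "view")
    in_cart = set()
    removed = set()
    for kind, item in events:
        if kind == "add_to_cart":
            in_cart.add(item)
        elif kind == "remove_from_cart" and item in in_cart:
            in_cart.discard(item)
            removed.add(item)
    return {
        "repeat_viewed": {item for item, n in views.items() if n >= min_views},
        "carted_then_removed": removed,
        "still_in_cart": in_cart,
    }
```

What to do with these signals (a discount, a related item, or nothing at all) is a product and ethics decision that the code deliberately leaves open.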
Yeah. Yeah. Interesting and worth exploring.
Besides being a data scientist and an expert, there was also a third category that you put yourself into.
And this is actually that you are a novelist.
Tell us a bit about your expertise in writing novels.
OK, it's an expertise in writing novels.
It doesn't really sound, you know... it's just that I always liked writing fiction.
And I grew up in Europe, actually, in Serbia.
So it's former Yugoslavia, which was a communist country.
And we were heavily influenced by the Russian classics.
And when I came to Germany, I was reading German classics.
And one of my favorite writers, actually the very top of them, is Hermann Hesse.
And yes, so when it comes to whether I'm going to do something or not, I look from the perspective of an 80-year-old guy who looks at his life and says, OK, would I regret it if I hadn't done this?
And for me, it was: OK, I would regret it if I hadn't written a novel.
And back then, in 2015, I was finished with my PhD and I had the capacity and, I hope, the talent to write a novel.
And I said, OK, let's write a novel.
I was still under the experience of writing the PhD.
It is a completely different business, but still.
So, yes, I've written a novel.
It was written in English.
It took me some six years to do it.
I collaborated with literary consultancy agencies from London.
It took five drafts, I think.
So it was pretty intensive work, intellectually and emotionally.
Yes. So if you're interested in what it's about, I can provide a few sentences.
Yeah. Just tell us a bit about it and when it will be available.
Yes, it is.
I'm a Bosniak coming from Serbia.
So practically since my birth, I was in the war zones.
So I wrote a novel about a Bosniak who is growing up in an area in Serbia during the Bosnian War.
So this was away from the mother country.
And he witnesses how the Serbian police murder his father.
And then as an eight-year-old boy, of course, he forms a prejudice about the Serbian nation and he has hatred for all the Serbs.
And this he then carries his whole life, he and his mother.
But obviously, the consequence of such a thought is that this person restricts himself in life.
So when he's 27, 28, so 20 years after, he's a student in Sarajevo, the capital of Bosnia.
And his next-door neighbor is a Serb who has the same name as his father's killer.
So then there is a dynamic between the two.
But at some point, and it's like 300 pages, it's very dynamic, he notices, OK, he's a bit different, and then he explores a bit more.
This Serb is actually a good one.
So it goes like he moves him from one category to the other until he has a big problem.
And the Serb actually saves him from suicide.
And then they form a very strong bond.
So they become very good friends.
And the story ends with Marko, which is the name of the Serb, visiting Ismet, which is the name of the Bosniak, and his mother in the hometown where the father was murdered.
So overall, it is a story about reconciliation of the two nations.
And I see because I was in Sarajevo that the war is unfortunately still not over.
It's still in people's heads.
And unfortunately, I think it will remain so for at least the following 20 years.
So I wrote it in English because I kind of knew that writing it in Bosnian would be a big failure, at least for the time being.
But yeah, obviously, it's relevant nowadays.
As for the status, it is with an American agent.
So I managed to find an agent who represents it with the publishers.
So he's contacting the publishers, and we will also be contacting German or European publishers, because the story is written in English.
But the American market is obviously the biggest.
But the story is closer to Europe. Right.
So I hope that we will have a positive response from one of the publishers.
If not, then, because I want to write more novels and I want to publish this one, I will put it on Kindle on Amazon.
I hope it will treat me well with the cold start and stuff.
Their recommender, I hope, will treat me well.
But yeah, it's actually an adventure.
You know, it's a process of self-discovery.
It's emotionally intense, obviously.
And writing is work that structures your emotion and your intellect.
It's like the merging of the two worlds for me, like the mother and the father, which would be the philosophy, the ratio, and the emotion.
It's the perfect vocation, at least the writing, at least for me as a personality.
OK, what is the title of the book?
Waters of the Green.
The Green is the river where Ismet swam as a boy.
So it is in this town where he lived at the very beginning.
And in the end, both of them are actually at the Green.
So this is how it begins. This is how it ends.
So I hope you haven't spoiled too much, but it sounds definitely like a touching story.
And to be honest, you gave me goosebumps several times when talking about it.
So watch out for Waters of the Green once it's published.
I hope for the best and definitely sounds great and like a nice counterweight.
So on the one side, being very theoretical and writing scientific stuff.
And on the other side, going into that very different direction of writing.
If you could draft not only your scientific or your novelistic work, but also the future of this podcast, where I talk to people from the field of RecSys:
Is there some specific person that you might be interested to hear more about?
Have you done a podcast with Harald Steck?
Not yet. But OK, if this is a person, then I will take a note.
Yes, I would definitely love to hear.
So Harald Steck, Yehuda Koren and Steffen Rendle, have you done an episode with any of them?
Not yet. But three names.
Maybe they are also already on my list.
Yes, that would be very, very good to have them here also.
They are definitely like leaders.
Yeah, definitely.
OK, so it was really a pleasure talking to you.
I guess we learned a lot about your work on podcasts and also many interesting things about all the recommendations that you provide within the ARD Audiothek.
And it's also nice to have a German product on this podcast, which itself comes from Germany.
So thanks for participating in the show.
It was really a pleasure.
The pleasure is all mine.
And thanks for having me.
And yes, I really enjoyed it.
I really, really enjoyed our time.
Thanks. And the last question, will you be going to RecSys this year?
Not this year.
OK, not this, but maybe next.
Next one, we are back to Europe.
Yes, this is this.
Thank you. Have a nice day and talk to you soon.
Thanks. Bye.
Thank you so much for listening to this episode of Recsperts - Recommender Systems Experts, the podcast that brings you the experts in recommender systems.
If you enjoy this podcast, please subscribe to it on your favorite podcast player and please share it with anybody you think might benefit from it.
Please also leave a review on Podchaser.
And last but not least, if you have questions, a recommendation for an interesting expert you want to have on my show or any other suggestions, drop me a message on Twitter or send me an email to marcel at recsperts dot com.
Thank you again for listening and sharing and make sure not to miss the next episode because people who listen to this also listen to the next episode.
See you. Goodbye.