Cause & Effect

In this episode of Cause & Effect, Johannes Schickling had a conversation with David Golightly, Staff Engineer at MasterClass, to explore how his team built Cortex – a real-time voice AI orchestration layer that powers personalized conversations with celebrity instructors like Gordon Ramsay and Mark Cuban.

Effect is an ecosystem of tools to build production-grade software in TypeScript.

#Effect #TypeScript #MasterClass #softwareDevelopment

Song: Dosi & Aisake - Cruising [NCS Release]
Music provided by NoCopyrightSounds
Free Download/Stream: http://ncs.io/Cruising
Watch: http://ncs.lnk.to/CruisingAT/youtube
  • (00:00) - Intro & David's background
  • (04:56) - Discovering Effect & early impressions
  • (08:32) - Why RxJS wasn’t enough for Cortex
  • (16:15) - MasterClass On Call
  • (19:10) - Building the orchestration layer
  • (25:30) - Incremental adoption of Effect at MasterClass
  • (31:43) - Text-to-speech component
  • (40:08) - Error handling, observability, OpenTelemetry
  • (01:01:20) - Looking ahead: Effect 4.0 & the future
  • (01:08:00) - Closing thoughts

What is Cause & Effect?

Explore how software engineers use Effect to build reliable, production-ready software in TypeScript.

When I started looking

into Effect, I started seeing,

Well, this has a lot of promise and

potential if you use

it in the right way.

What I mean by that is you don't have to

necessarily go from 0%

to 100% overnight.

You don't have to completely

redesign the entire application

to take advantage of some of the benefits

that Effect has to offer.

That's how I started, was incrementally

adopting Effect in

parts of the code base.

And this was really

what enabled it to work.

Welcome to Cause & Effect,

a podcast about the TypeScript library

and ecosystem called Effect,

helping engineers to build

production-ready software.

I'm your host, Johannes Schickling, and

I've been building with

Effect for over four years.

With this podcast, I want to help others

understand the powers and

benefits of using Effect.

In this episode, I'm talking to David

Golightly, who's a Staff

Engineer at MasterClass.

In this conversation, we explore how

David has built Cortex,

MasterClass's AI voice chat system,

leveraging Effect streams heavily

to build a cutting-edge AI

experience. Let's get into it.

Hey David, so great to have you on the

podcast. How are you doing?

Doing great. Thanks for having me.

I'm really excited. The two of us had the

pleasure to now meet in person twice in

the course of the last half a year.

The first time we've met in person was at

the Effect TypeScript

Meetup in San Francisco.

And then a couple of weeks ago, we had

the pleasure to meet again in beautiful

Italy in Livorno, where you've also given

a talk about Cortex and what you're

building at MasterClass.

And it was such a pleasure to spend a bunch of time together and talk about all things TypeScript related and beyond.

You also have some beautiful music gear

in the background. But today we're going

to talk about how you're

using Effect at MasterClass.

So would you mind introducing yourself

and sharing your background?

Yeah, sure. My name is David Golightly. I

am a Staff Engineer at MasterClass.

And I've been in the industry building

mostly web applications for

about almost 18 years now.

I've also dabbled in some mobile

applications, some back end, some other

embedded applications over the years.

But my main focus has been first

in the JavaScript world and then more

recently in the TypeScript

world for the last several years.

My actual background, though, I don't

have a Computer Science degree. My

education is in music.

And I grew up

surrounded by pianos.

My father is a piano technician and also

a piano professor who plays

classical, you know, everything you could

think of classical music.

He has played it. You know, he's played

with orchestras and he's, you know,

played solo concerts and chamber music.

And so forth. So just steeped in

classical music growing up in pianos and

also in repairing pianos, which I found

somehow translated

into programming sort of.

You're doing a lot of really repetitive

tasks. You have 88 keys. You have 230

some strings on a regular piano.

So there's a lot of repetitive tasks and

you quickly figure out ways to work more

efficiently when

you're doing piano repair.

So later on, after I got my degree in music, I found that, well, music is a passion of mine and it's something that I still do, but it's not something I want to try to make a career in. And so I quickly discovered

that, well,

programming is just right there.

And I actually enjoy doing it somewhat.

And so I devoted some energy into

learning how to do that professionally.

And, you know, all these

years later, here I am.

That is awesome. And such a funny

parallel since we never talked about

this, but my mother also happens to be a professor of piano.

And so I likewise also grew up surrounded by pianos and played the piano for quite a while, but then also found myself more attracted to technology.

But I think this is a fairly common

overlap of people who had their start in

music or a different kind of art and then

gravitated also towards programming.

And I think it's just a certain way of how someone's brain works. There are a couple of people who come to mind who have a professional music background and who are just absolutely brilliant in the engineering field.

So that's a very funny parallel. But

keeping it a bit more focused on like

Effect, what has brought you to Effect?

How did you first learn about it? And

what did you think about it when you

saw it the first time?

Yeah, so I was not necessarily going to

adopt Effect for the work I was doing,

say, maybe over a year ago,

which was more React focused.

It wasn't something that I was

necessarily considering.

You know, I think React is a

great framework and it totally

transformed how we do

front end development.

But it's not without its problems in

terms of state management and so forth.

But I was mostly happy with it for the

front end and we try to

keep things lightweight there.

But then I got asked to work on this new

project Cortex, which I'll

talk about a little bit more.

And that was going to be a server side

application that was also not a

conventional API server.

Instead, it was managing a lot of async events, a lot of open WebSockets, like several open WebSocket connections that it needed to then coordinate.

So I was looking at a proof of

concept that somebody else at our company

had built that was really, you

know, what we call async hell.

That is just lots of callbacks, lots of

event listeners passing

around references to random things.

And we're starting to try to build into

it observability and error handling and,

you know, more advanced interruption type

features that involve coordinating the

state of multiple WebSockets.

And it just wasn't going to work.

It was not going to scale.

The implementation was just going to get

increasingly tangled.

And so that's when I started looking

around for, well, what is

the state of the art here?

I had used RxJS in the past.

And I think that, you know, for people

who've used RxJS, when they see Effect,

that's what their first thought is.

Oh, this is RxJS.

I've seen this before.

You know, this is this is familiar to me.

It's just another RxJS kind of knockoff.

I didn't really find it to be the case,

though, for several reasons.

But I think that what really caught my

eye with Effect was the

really active community.

Because when I'm looking to adopt a new

framework that is going to form the

backbone of how we build an application,

I don't want to have to

be inventing stuff out of whole cloth.

I don't want to have to

resurrect things from a couple of

examples that don't

really apply to my application.

I don't want to be in

the dark and be on my own.

And I especially don't want to ask my

teammates to accept a framework where

they don't have the proper

documentation or the guidance.

They don't know where to find help.

And so really, I found that the community

seems to be Effect's biggest asset.

And just how helpful everybody is and how

enthusiastic everybody is

really eased the adoption process.

That is awesome to hear.

And that definitely also reflects my own

perspective and my own reality.

This is what drew me

to Effect very early on.

The community was much smaller at this

point and I tried to play a positive role

in like growing and

forming that community.

But it's also something I'm super excited about and super proud of, that we together brought Effect to the point where it is today.

And it attracts more

like brilliant minds.

And it's also interesting that Effect attracts a lot of experienced engineers like yourself, but also new engineers that are drawn to building something great.

And that community is such a compelling

aspect of that as well.

So maybe to linger on the RxJS part for a

little bit since yes, quite a few people

are kind of comparing Effect with RxJS.

For folks who are familiar with RxJS but

not yet with Effect, where would you say

there are parallels and the commonalities

and where would you say

things are much more different?

Well, in RxJS, the idea is it's very stream-oriented. So essentially everything you create is an observable; it can emit events. And this is really useful, actually.

I think a lot of programming situations

can be modeled in this way.

You have especially the asynchronous

stuff like really anything side-effectey

is happening asynchronously.

You can't necessarily expect it or

predict it, but then you have to react to

it and do something in response to it.

You have user input,

user loads the web page.

You didn't know when or why

they were going to do that.

They click on a link or they decide to

watch a video or fill out

a form or what have you.

This is ultimately the timing and

coordination of these things

and these inputs are driven by humans.

And likewise, when you have systems where

you have connections coming from third

party services or you have other kinds of

asynchronous behaviors where you have to

connect users to each other.

A lot of the hardest parts of programming

happen when we have asynchronous behavior

that we have to model.

And this is why I think you look at

paradigms like HTTP or more specifically

like a model view controller that you see

in Rails that is still so successful.

And one of the things that that has

largely done is allow us to architect

applications around kind of one shot

actions where you get essentially a

function call via an HTTP request.

And your application now has to implement

the function call and respond with it.

But everything that happens between the request and the response is largely, typically, serialized.

It's sequential. It's

not in parallel, right?

Which is great. It makes things a lot

more easy to reason about.

You know, things happen in a sequence A,

B, C, D, E and then you're done.

But most of what we have to deal with,

especially in building web interfaces or

building applications like

Cortex, is not like that.

Because you have a lot of things that can

happen and you might be in the middle of

one thing and then something else happens

that you didn't expect.

And now you have to deal with that. And a

lot of this also involves kind of keeping

state updates going.

This is honestly why React superseded jQuery: because it's really good at this.

But getting back to your question about

RxJS, RxJS is kind of an even more next

level approach at this where you can

model everything as an observable.

You don't know when it's going to happen,

but you know that you might receive one

or more different kind of events.

And then how do you combine events from

one observable with events from another

observable when they need to interact to

produce some sort of output.

And RxJS is really built

around this sort of model.

The thing is though, RxJS is also trying

to be not just RxJS, but it's a reactive

framework that is cross-platform.

So you have the same APIs more or less

that are designed for JavaScript or

TypeScript that are then also ported over

to Java or ported over to Swift or ported

over to Kotlin or ported over to

whatever other language or framework.

And so I think there was an intentional desire on the part of the designers there to keep things language-agnostic in their designs, which in my opinion is kind of a flaw. Because yes, it means you can transfer that knowledge from one language to another pretty easily.

If you know reactive programming, you can

do it in any language.

But it means that you're passing on a lot of the strengths of each language that you're in, in order to make things as similar as possible between every language that you're working in.

Effect, in my mind, is taking kind of a very different approach, even though it's based on ZIO, which is the Scala equivalent.

It was the inspiration for Effect.

My understanding is that now, Effect's

design is pretty detached from needing to

keep parity with ZIO.

Effect is basically about TypeScript and

getting really good at TypeScript, making

TypeScript the best it can be.

And there is no, as far as I know, I

can't speak for the core team, but as far

as I know, there's no attempt at saying,

"Well, what if we make Effect

for Java or Swift or Kotlin?"

It's not about that. It's just about what

is TypeScript good at and how can we be

really good at that.

So that's a really big point, I think,

because with RxJS in TypeScript, it never

felt like they really got how to make a

type-safe reactive stream application really work. The types kind of sort of worked.

As far as I recall, there wasn't really a

great story around requirements

management or even really error handling

the way there is with Effect.

And so it failed to take advantage of a lot of the potential that TypeScript has to offer in the way that Effect does.

Does that answer your question?

No, it does. And it

reflects also my perspective.

In fact, I've been using ReactiveCocoa

back in the days, and I believe that was

early on in Swift, and it was quite

pleasant to use, but it was very

deliberate for the purposes of streams.

And there was sort of a peak moment where I got really into it.

But then I felt like a lot of impedance

mismatch where everything wanted to be

modeled as a stream, but

not everything is a stream.

Some things are just a one-off thing: you do an HTTP request and you get it back. Sure, that might fail, and eventually you're just interested in one value, but you end up modeling your retries, etc. as a stream.

So this was my first exposure

really to this mindset, but it felt like

the gravity was too hard on

this very specific primitive.

And this is where I would rather like

create more distance between Effect and

RxJS, because Effect is not trying to

tell you, "Hey, everything is a stream."

No. Effect gives you a stream abstraction

that lets you use the stream abstraction

when you have a stream.

For example, when you use a WebSocket or

when you use something else, maybe you

want to read a file and stream through the bytes, etc.

This is where streams are a great use

case, but if you just want to do a

one-off thing that happens to be

asynchronous, then you don't need to be forced into the stream mindset; this is where Effect gives you different primitives.

And I think this is kind of the huge leap

beyond what RxJS gives you, and I think

this was always like the big caveat for

RxJS that not everyone buys into the

"everything is a stream" mindset.

And this is where Effect is much more applicable to any sort of circumstance that a programming language is applicable to.

Very interesting to hear your perspective

on this. You've been

mentioning Cortex now a few times.

So before we dive into how it's built

specifically and how it leverages Effect,

can you motivate what Cortex is and what

role it plays within MasterClass?

Right, yes. So MasterClass has been

developing over the last year plus our

own in-house voice AI chat that has got

kind of a MasterClass twist to it,

in that you're not talking with fictional, invented characters or

anonymous service bots, but we got

authorization from many of our

instructors who are working closely with

us to essentially clone

them into an AI persona.

This includes an LLM that is trained on

their ideas, their writings, their public

speaking, and it also

includes a voice clone of them.

And in some cases, also, we have some chefs; Gordon Ramsay, for example, is being released soon, and we have a lot of his recipes included, and he can walk you through making one of these recipes.

So it's a really interesting, interactive

way that I think enables MasterClass

instructors to make themselves really

available to a much broader audience than

they would be able to on

their own as individuals.

And so this has also presented an

enormous technical challenge because I

think when you are talking to an

anonymous AI, well, you might

be a little bit more forgiving.

You know that it's a bot that you're

speaking to on some level.

But with these real people who are very

invested in the integrity of their image

and their public persona, the bar is much

higher, I think, in terms

of really representing that.

And this is also like if you sign up and

you try to interact with Gordon Ramsay,

well, if you're a fan, how many times

have you seen him on TV?

You can watch the MasterClass courses,

you know, it's also

like more than an hour of content.

You know what he sounds like, you

know how he speaks, you know what he's

passionate about, what he doesn't like.

You know, this personality is something

that you've probably learned already. And

now we have to put that into an AI agent

or an AI bot that you can talk to.

And so I feel like the verisimilitude has

to really be there. And so that's been

really challenging just to kind of get

through the uncanny valley stage of that.

And so this is the background of what

we're building. If you want to try it

out, you can go to

oncall.masterclass.com, all one word

oncall, O-N-C-A-L-L, dot masterclass.com.

And you can sign up and try it out.

So this is the general product that we're

building. And we have a whole LLM team,

you know, a machine learning team that is

working on getting the voice models and

the LLM models really trained up to

represent each instructor.

And what I got brought on to do was

essentially to create an orchestration

layer because we have several different

components in the pipeline in order to

build this experience.

First, we have a speech-to-text component where the end user's voice, the microphone audio stream, is sent to a speech-to-text service that then listens to the user's speech and derives transcripts from it of what they've said.

And then at certain intervals when we

believe that the user's done speaking and

is ready to hear what the AI has to say,

which is not a trivial problem, by the

way, we then send that transcript to the

LLM to generate a response.

And then the LLM generates a text response,

and then we send that off to a text-to-speech

service to generate the audio for

that response in the instructor's voice

and then send that back to the user.

So we have several different

asynchronous services, the

speech-to-text and the text-to-speech

components of which are WebSockets.

And then the LLM is an HTTP streaming

connection essentially wrapped in an

async iterable in the JavaScript world.

So this is coordinating

between these different elements.

But as you can see, there's kind of a

step one, step two,

step three to this process.

What makes it kind of interesting is, as

I mentioned, we don't know if

the user is done talking yet.

The user can kind of start

talking again at any time.

What we want to do is shut down anything

the bot is doing at that point because

the user is supreme.

The user drives the conversation.

We don't want the bot talking over the

user when the user is

trying to say something.

And so it's really crucial that we are

very responsive to that and stop whatever

is in progress downstream, whether it's

at the LLM stage or

the text-to-speech stage.

And so this is

essentially what Cortex is doing.

Cortex is a Node.js application that is connected to by WebSockets from our two clients, Web and iOS, which connect to Cortex over WebSocket.

And Cortex, in turn, provides a WebSocket interface that abstracts these multiple

services and receives the microphone

audio stream from the user

and emits back bot audio events.

So from the client's perspective, the

contract is simply send the user's

microphone audio to Cortex and receive

back the bot audio and then the clients

have to actually play

it through the speakers.

And that's kind of, you know, the whole contract.

There's a couple of other details in

terms of, you know, if a tool call

happens or, you know, if state change

happens or something like that.

But that's the main gist of what that

WebSocket contract to

Cortex looks like under the hood.

However, Cortex has to orchestrate the

various services that I talked about,

execute tool calls and do a whole

bunch of other, you know, state

management under the hood.

And so that's what we built with

Effect. It's pretty much 100 percent; every piece of it is built in Effect or using Effect in some way.
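
To make the shape of that pipeline concrete, here is a minimal sketch of the three stages as Effect stream transformations. The type and function names are hypothetical placeholders for illustration, not the actual Cortex code.

```typescript
import { Stream } from "effect"

// Hypothetical shapes for the three stages described above.
type AudioChunk = Uint8Array
interface Transcript { readonly text: string; readonly isFinal: boolean }
interface Thought { readonly text: string }

// Each stage turns one asynchronous stream into another.
declare const speechToText: (
  mic: Stream.Stream<AudioChunk, Error>
) => Stream.Stream<Transcript, Error>

declare const generateResponse: (
  transcript: Transcript
) => Stream.Stream<Thought, Error> // LLM chunks, accumulated into coherent thoughts

declare const textToSpeech: (
  thoughts: Stream.Stream<Thought, Error>
) => Stream.Stream<AudioChunk, Error> // bot audio sent back to the client
```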

That's awesome. So you've mentioned that initially, when you got into this project, you got a first proof of concept, which really exposed you to all of the complexity of implementing this actual product. This complexity needed to be tamed. Did you work off that initial proof of concept, or did you rather roll something from scratch? And how did you go about that?

That's an interesting question, because I'm always trying to figure out how to re-architect a piece of software, you know. It's something you don't want to have to do very often. Also, I don't know, actually, if it's me or just the stage of career I'm at and the kind of problems that I get brought in to help with, but I often find myself looking at a piece of software that is in need of re-architecting. I think it's largely because I was specifically asked to re-architect it in this situation.

But the proof of concept was interesting because, you know, it was built by our project lead on On Call, who is now our VP of Architecture, who's just a brilliant person who has no experience really with JavaScript or TypeScript, and asked ChatGPT, I think it was ChatGPT, to help with building it, and essentially just pieced that all together from ChatGPT prompts.

There were a couple of things that came out of that, for example a dependency on a private library that isn't really maintained. There wasn't really a top-level architecture. You had individual pieces that kind of worked in isolation, but then the glue between them became really convoluted and hard to follow. So that kind of became a problem.

But it worked. You know, ultimately, this

proof of concept kind

of worked for a demo.

It didn't handle interruptions very well.

It didn't have great latency.

I think in reality most production-grade systems, actually, let me take that back, probably not production-grade, but JavaScript applications that are running in production, are probably in that sort of state, where things work well enough on the happy path. But all of those higher-level requirements that really make an end-user experience superb, like being able to interrupt an AI model when it's talking to you, or being more resilient to errors, etc., this is where you need to go beyond that convoluted, everything-pieced-together state. And I think this is a perfect point to start applying Effect, to go beyond that sort of messy state.

Yeah, this is why I wanted to

tame the asynchronous

behaviors in in Cortex.

And I was looking for a package that

would help me do that. And

this is when I found Effect.

And then at that point, like I said, I had seen RxJS, and I had good experiences at first building with RxJS.

But like kind of one of the common

complaints is that, well, now, like you

said, everything has to be a stream.

Now you have to model

everything in this way.

And it really completely changes how you

architect your software and you need a

lot of buy in from your team if you're

going to start doing this, because now

everybody has to learn this.

They can't just sit down and, you know,

start coding on whatever part, you know,

feature they're trying to do.

They have to first learn this framework

and that can be a big ask for folks.

And so that's something that I wanted to

be conscientious about.

However, when I started

looking into Effect, I started seeing,

well, you know, this has a lot of promise

and potential if you

use it in the right way.

And what I mean by that is like, you

know, you don't have to necessarily go

from zero percent to

100 percent overnight.

You don't have to completely redesign the

entire application to take advantage of

some of the benefits

that Effect has to offer.

That's kind of how I started, was

incrementally adopting

Effect in parts of the code base.

And this was really what

enabled it to work at all.

If I had to say, hold on, I need to take

this for a month and just completely

build it from scratch and nothing's going

to work and I can't deliver any new

features until I'm done.

That was not going to fly.

I wasn't in a position to ask for permission to go live in a cave for a month, or however long it was going to take. And I always try to avoid being in that situation.

I don't think that's a good place for us

to be in as software engineers.

We need to be shipping like every day,

shipping features, shipping bug fixes,

shipping products constantly.

And like we never have the luxury to just

say, well, hold on, I'm just going to

like spend a month not shipping, you

know, to do this technical thing that I

can't really explain

why I have to do this.

Right. Nobody, nobody likes that.

So fortunately, Effect is not

a framework that is like that.

That's not what was

asked of us to adopt Effect.

Instead, it was you can start plugging in

Effect judiciously here and there.

And I started at the top level of the

application around really around the

WebSocket connection layer, around the

HTTP layer and the application runtime.

And that allowed us to kind of have a top

level of error handling that immediately

provided benefits for recovery and so on.

But we didn't stop there.

That was just the first step.

We started kind of rebuilding different

parts of the application.

Some of them were like isolated behind a

run promise or, you know, like a stream

that was, yeah, emitting

events into an event emitter.

But it allowed us to kind of rebuild

isolated parts of the application on

their own where like selectively where we

thought they had the most value to be

converted into Effect.
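
As a rough illustration of that incremental style, here is a minimal sketch (not the actual Cortex code; the names and URL are made up) of an Effect program isolated behind Effect.runPromise, and of a stream feeding an existing EventEmitter, so the rest of a non-Effect codebase can keep calling it as before.

```typescript
import { EventEmitter } from "node:events"
import { Effect, Stream } from "effect"

// An isolated piece rebuilt in Effect...
const fetchGreeting = (name: string) =>
  Effect.tryPromise(() =>
    fetch(`https://example.com/greet/${name}`).then((response) => response.text())
  )

// ...exposed to the rest of the (non-Effect) codebase as a plain Promise.
export const getGreeting = (name: string): Promise<string> =>
  Effect.runPromise(fetchGreeting(name))

// An Effect stream bridged into an existing EventEmitter-based consumer.
export const emitTicks = (emitter: EventEmitter): Promise<void> =>
  Effect.runPromise(
    Stream.make(1, 2, 3).pipe(
      Stream.runForEach((n) => Effect.sync(() => emitter.emit("tick", n)))
    )
  )
```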

And often driven by product requirements.

So when I wanted to implement

interruptions, that was the reason to now

rebuild the speech-to-text component in

Effect, because it was going to be a lot more work to do that if I had to do it the old way. It wasn't a choice between rebuilding the whole thing, the entire application, in Effect, or just keeping on working the old way. It was really, for each isolated piece, a question of: do I want to add more spaghetti to this component, or can I just rebuild this piece in Effect and make it much more expressive and elegant while building the feature that I'm being asked to build? So it was much less of a refactoring process and more of incrementally adopting it while also building and delivering features.

I love that. And I think you've hit on a couple of really nice points here. One is that with Effect, you get a lot of benefits once you have Effect-ified more and more of your application: you're going to find that you can delete a lot of code, and everything just fits nicely together in the same way.

If you incrementally adopt React, let's

say we're just in this transition of like

transitioning a larger code base from

jQuery, something else to React.

You don't need to do that in one like

long night, but you

can do that step by step.

And once you've reached a point where

everything is in React, then you can

delete a whole bunch of like glue code, a

whole bunch of like things that bridge

from the old way to the new way.

But it is also totally fine that you go

through this like transitionary phase.

But the other thing that you've mentioned

that I think is like a super great

insight, which is like, how do you

prioritize what to refactor with Effect

when or when to rethink

something and apply Effect?

And this is so elegant that you can like

go top down from like, hey, what is the

thing that we want to improve for users?

What is like the business outcome that we

want to affect here?

And then like looking at those, Effect

typically has

something in store for that.

If you want to improve performance, if

you want to improve reliability,

resilience, error handling, whatever it

might be, Effect typically

has something in store for that.

And then you can prioritize what to work

on depending on your end user needs.

And hearing here about the speech to text

aspect and applying Effect for that,

that sounds super interesting.

So I want to hear a little bit more like

zooming into from macro into a bit more

micro into this component.

Can you give a rough overview of like how

that thing was built before?

What you diagnosed the biggest things to

improve where and which primitives of

Effect did you use to

ultimately implement it?

Well, actually what I prefer to do is

zoom in on the text-to-speech component.

Oh, I'm sorry.

I remember that wrong.

But yes, no, it's okay. You remembered it right; it was, you know, I said speech-to-text earlier. But I feel like the text-to-speech component, this is the component that I also talked about in my Effect Days talk and did a deep dive on.

I feel like the text-to-speech component

was kind of like a real unlock, really an

aha moment of like, wow, this is this is

what Effect can do in

terms of like saving

a lot of complexity from your code base.

As I mentioned, this is a

component where like it's a WebSocket and

we stream LLM thoughts to it.

So an LLM response, as you might be

familiar, it sends what are called chunks

in a chat completion response.

And the chunks are usually a token or two

tokens and they're not

complete words a lot of the time.

So then we're like accumulating the

chunks into basically what we call

coherent thoughts and the coherent

thoughts can then be sent to the text-to-

speech component to

generate the bot audio.
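
As a hedged sketch of that accumulation step (the real boundary logic is surely more involved), the chunks could be buffered until a sentence boundary appears, for example with Stream.mapAccum:

```typescript
import { Option, Stream } from "effect"

// Buffer LLM chat-completion chunks and emit a "coherent thought" whenever a
// sentence boundary shows up; keep the remainder for the next emission.
const toThoughts = <E>(chunks: Stream.Stream<string, E>) =>
  chunks.pipe(
    Stream.mapAccum("", (buffer, chunk) => {
      const text = buffer + chunk
      const boundary = text.search(/[.!?]\s/)
      return boundary === -1
        ? [text, Option.none<string>()]                               // still buffering
        : [text.slice(boundary + 2), Option.some(text.slice(0, boundary + 1))]
    }),
    Stream.filterMap((thought) => thought)                            // drop the "still buffering" emissions
  )
```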

However, if there's an interruption, we

need to shut down the LLM and we also

need to shut down the

text-to-speech component so that we don't

continue to generate more thoughts based

on the previous thing that the user said

before they continued talking.

And now we want to start over and respond

to the most recent thing

that that user has said.

So the text-to-speech component now it's

a WebSocket connection.

When you send it a

coherent thought, that connection

will then respond asynchronously with one

or more different

events that you might get.

And we're basically just

streaming those up to the client.

But when there's

an interruption, we need to actually shut

down the WebSocket

connection, close the connection.

Abruptly, so we don't get any more

messages from it and then reopen it.

And then in that period of time, and it's

not usually very long, it can just be

like a hundred milliseconds or two

hundred milliseconds where we're waiting

for that WebSocket connection to open.

We've created a connection, but we've not yet received the open event from the connection. And it's in that time that we were often getting errors: the LLM was trying to send messages to it, but it was erroring because we were sending WebSocket messages out to a WebSocket that was not yet open.

So we had to now queue those messages to

wait for the open event from the

WebSocket connections and then flush the

queue when it was open.

So as you can imagine, this created some

code complexity in the pre-Effect

implementation and it was something that

Effect turned out to be actually very

good at because these are the kinds of

things that Effect has out of the box.

In Effect we were able to replace the

WebSocket handling code with the

WebSocket utility from

Effect from effect/platform APIs

And that has kind of a magical property

to it that you don't really ever have to

think about when the WebSocket is closing

and opening and you don't

have to wait for an open event.

What it essentially gives you is what is

called in Effect a channel.

And this became a primitive

that I became curious about.

It's something that I wish was a little

bit more first class in the effect world.

It's certainly used behind the scenes in

a lot of things like stream and other

APIs in the Effect world.

But this is what you get essentially when

you create a WebSocket connection using

the effect/platform API.

But then if you use the Effect stream

operator pipe through channel, you now

have a duplex stream, which is one where

you can start listening to other streams.

And then instead of doing a runForEach or runCollect or whatever you typically do with a stream, you are now piping the events, or the items in that stream, out through the channel, through the WebSocket connection.

And then downstream from that pipe

through channel, you are getting incoming

events that are coming from that

WebSocket connection that you can then

emit further on down your application.

So this is great. But this is also what

it is also doing is this is abstracting

the WebSocket lifecycle

into the stream lifecycle.

So if you emit an error

upstream, it will close

the WebSocket for you.

And then if you have a stream.retry, it

will reopen the WebSocket for you

in the event of an error.

And because the streams are pull based,

you don't have to rely on queuing

explicitly. You aren't going

to miss any of your events.

When the stream is reopened, it will

start pulling those events

and behaving as expected.
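
A minimal sketch of that pattern is below, with the caveat that the exact effect/platform names and signatures (Socket.makeWebSocket, Socket.toChannel) are recalled from memory, may differ between versions, and would still need the platform's WebSocket constructor provided; the URL is made up.

```typescript
import { Socket } from "@effect/platform"
import { Effect, Schedule, Stream } from "effect"

// Hypothetical: outgoing "coherent thoughts" serialized for the TTS service.
declare const thoughts: Stream.Stream<string, Error>

const botAudio = Effect.gen(function* () {
  // The socket's open/close is folded into the stream's lifecycle.
  const socket = yield* Socket.makeWebSocket("wss://tts.example.com")
  return thoughts.pipe(
    Stream.encodeText,                                   // string -> Uint8Array frames
    Stream.pipeThroughChannel(Socket.toChannel(socket)), // duplex: write out, read incoming events
    // A failure closes the WebSocket; retry reopens it and resumes pulling,
    // so no outgoing messages need to be queued by hand.
    Stream.retry(Schedule.exponential("100 millis"))
  )
})
```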

So this really allowed us to abstract away all of that tangle. We had, I think, a reference to a promise that was kept around and was awaited in different places, and it was a mess before. And now we had a single, linear stream, where you had the outgoing stuff going in the top and the stuff coming from the stream coming out the bottom.

And it became very easy to reason about what was happening. And you didn't really even have to think about the lifecycle at all: you just emit an error when you want it to close, and then just retry. The WebSocket connection is now handled by the stream lifecycle, so you can use Stream.retry; the stream will be shut down when the scope of the stream ends, and the WebSocket will automatically be closed.

We also have a flush event that we send

out to the text-to-speech

service, which essentially says we're

done sending you new speech for now.

So send us everything you got. And the

peculiarity of this particular service is

that they will then accept your flush

event and they will

promptly close the stream on you.

That's not really what we wanted, but I

don't know. They designed it this way.

And I don't really have, you know,

leverage to get them to redesign their

entire API. I just have to

work with what we have. Right.

This is a lot of application development.

You don't have the liberty to redesign

every API that you're working with. You

have to abstract it in some way. And so

this is what we're having to do here. But

the Effect stream

primitives make it really easy.

That sounds brilliant so far. And I think what is so nice about this is that it is very clear and very intuitive from a user's perspective what the system should do.

And as users, we're all familiar with when this goes wrong and how frustrating it is. Like if I'm talking to the AI and the AI goes off in a wrong direction that I don't want, and I want to interrupt it, and it doesn't act on this interruption, I need to listen to it for another 20 seconds until I can finally repeat what I've just said.

And all of those things, they need to be

handled. And all of that is a lot of

complexity. And if you leave that

complexity unchecked, like you said, it

was a mess. And I think that nicely

reflects the majority of JavaScript

applications that are out there or

TypeScript applications.

And I think it's really this fine line to walk, where you capture all of the essential complexity that your use case requires, while shaving off all the accidental complexity and isolating it nicely.

And even to a degree where you say like,

okay, those services that you need to

call, they're not in the ideal shape that

you would like it to be. But then you can

just like wrap it and create your own

world that you're happy in and where you

can model your application in a nice way.

And yeah, I'm very happy to hear that Effect streams and the various primitives that you've been using, the WebSocket abstraction that Effect gives you, and I suppose also queues, etc., that all of this has been fitting together so nicely to model your use case.

So going a little bit beyond streams,

which other aspects of Effect have you

been using or which other kind of

superpowers have you been getting out of

Effect that have played a

meaningful role in the application?

Well, the error handling has been huge.

Honestly, we modeled all of our possible

errors. We have, I think, maybe up to 30

or so errors that the system can emit

that are domain specific tagged errors.

And those are decorated with a specific

error code and an error source, because

one of the things that will often happen

or that was happening originally, I

think, was, oh, we got an error. The

connection went down.

Well, we got an error and something weird

happened, and I don't know why, and now

I'm in a weird state. Oh, we got an

error, and the whole service just crashed

or something like this, right?

And even if you can just wrap everything

in a try catch, best case scenario, you

have some unrecoverable error, your

connection goes down, and you don't know

why. You just know, oops,

bye, and then all of a sudden.

And so it's frustrating for the user, and

it's also frustrating for the rest of the

team when they're trying to diagnose

what's going on, which we spent a lot of

time doing in our

development workflow internally.

And, you know, I'm a big fan of passing

the buck, so I don't like things to be my

fault. And I think

we're all in that domain.

I say this to joke, actually, I'm fine

with taking responsibility if it's my

fault, but I would rather things not go

wrong because of the

decisions that I made.

A lot of times early on, it was like,

oh, Cortex is down. Oh, Cortex emitted an

error that I don't understand.

And, you know, fair enough from a

client's perspective or from a test

engineer's perspective,

that's what it seems like.

But that doesn't really give you enough

information to troubleshoot because most

of the time it's not, you know, Cortex is

just a hub. Cortex is just passing events

from one service to another.

It's not Cortex really as the source of

the errors. Instead, what we see a lot of

the time is, oh, one of our

backend services went down.

Oh, a backend service emitted something

that we didn't expect. Oh, it's being

slow or something like this.

And now we're able to like create

targeted pinpoint errors whenever a

specific thing goes wrong

somewhere in the system.

And then those propagate up to our top

level error handling. And so if we have

something that's unrecoverable that

happens, we can now close

the connection like before.

But now we can send up detailed

information that allows our people to

diagnose what's the

problem. So it's not Cortex.

It's like, oh, our speech-to-text service

crashed or is out of memory or something.

And so now we aren't able to create calls

until we fix that

piece of infrastructure.

So that gives us a lot more information

that kind of actually saves a lot of time

debugging the system. It points you

directly to where the source of the

problem is instead of making

you go on a debugging hunt.
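
A minimal sketch of that error modeling using Effect's tagged errors follows; the error names, codes, and sources here are made up for illustration, not the actual Cortex error catalogue.

```typescript
import { Data, Effect } from "effect"

// Domain-specific tagged errors, decorated with a code and a source so the
// top-level handler can report exactly which backend service misbehaved.
class SpeechToTextError extends Data.TaggedError("SpeechToTextError")<{
  readonly code: string
  readonly source: "speech-to-text"
  readonly cause?: unknown
}> {}

class TextToSpeechError extends Data.TaggedError("TextToSpeechError")<{
  readonly code: string
  readonly source: "text-to-speech"
  readonly cause?: unknown
}> {}

// Top-level handling: recover (log and continue) where possible, otherwise
// propagate with enough detail to point at the real source of the problem.
declare const handleConnection: Effect.Effect<void, SpeechToTextError | TextToSpeechError>

const program = handleConnection.pipe(
  Effect.catchTags({
    SpeechToTextError: (e) =>
      Effect.logError(`speech-to-text failed (${e.code})`).pipe(
        Effect.zipRight(Effect.fail(e)) // unrecoverable: report, then fail the connection
      ),
    TextToSpeechError: (e) =>
      Effect.logWarning(`text-to-speech failed (${e.code}), recovering`)
  })
)
```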

So that has been huge for us. The error

handling has been extremely valuable.

There are a lot of other errors that are

recoverable, but we want to log and report them.

So the whole error handling story in

Effect is fantastic, just surfacing when

things can go wrong and

forcing you to deal with it.

It has also meant, interestingly, that within Cortex, not every function is an effect. Not every single line of code is a yield*. There's a fair amount of just plain old data manipulation happening throughout the code: data manipulation using functions that are synchronous and aren't going to throw.

Right. You can have very high confidence that if you're trying to get an index of a thing from a string, or whatever, you're not going to throw. You can do a lot of just conventional programming in areas that are safe and sandboxed.

And it doesn't mean that every single

calculation needs to be done in an

effect. It just gives you a safe place to

do that kind of work without having to

worry, you know, oh, if this throws or

oh, if this does something asynchronous

and I, you know, don't handle it, you

know, don't await it or whatever.

You know, usually those kinds of cases get a lot of attention because we have to think about them so much. But that's not usually most of the work that we're doing, ideally, right? We're thinking about purely functional transformations of data from one state into another.

We're taking the input from some kind of asynchronous effect and sending it out to some asynchronous effect. But the actual program, the business logic, is usually something that is pretty step by step, just logic, usually when we're not interfacing with an external service or some kind of side effect.

Then we can just write code like normal.

You know, we don't have to model

everything as a stream just to add up

some numbers or something.

Right. And I think the super plain way you put it, just write code like normal, is kind of, in a nutshell, the difference between Effect and RxJS, where in RxJS you need to do everything as a stream, and in Effect you can write code like normal. And another aspect of writing code like normal is treating errors as values. We're all super used to just passing around and manipulating data, and somehow we're kind of brainwashed into thinking that errors need to be something special. We're almost paralyzed about how we should deal with errors. But if we're just treating errors as values as well, errors as data, and passing them around, Effect just makes that easy by giving us a separate channel to deal with that error data. And then, like you say, if you'd like to pass it along, it's just data that you pass along. So I think that just writing code like normal is one of Effect's superpowers.

And closely related to errors is having visibility into when errors happen, when something doesn't go as expected. And I think, if I remember correctly, the telemetry part, the observability part, has also been a key aspect of building Cortex and operating it. So maybe you can speak a little bit more to how you usually do observability and telemetry within MasterClass, particularly within JavaScript applications, and how Effect has helped you to maybe improve that even further.

Right. Yeah. So I sort of have to admit that we don't have an excellent story for the most part, or before Cortex we didn't have an excellent story about observability at MasterClass in the JavaScript world. We have a number of services: we have Raygun, we have New Relic, and we have Coralogix, which we send our logs to.

So we have a bunch of observability services. For things like video, we have a dedicated video monitoring service that we integrate, and a few high-value business applications like that we want to keep an eye on, you know, error rates for people visiting our home page, or things that really indicate business traffic, business being impacted by some technical problem. However, usually that amounts to something that's easily reproducible and usually easily fixable: there's either some infrastructure that needs to be restarted or a code change that needs to be rolled back, or something like that.

Cortex really represents a new level of complexity when it comes to understanding what's going on internally. And I think that a big reason for that is that it is not a one-shot HTTP server type of application, but is instead, you know, a stream of streams, handling all of these asynchronous events that are passing through it. It's not directly doing much of any work; it's literally just in between all these services, handing events from one service to another.

So, as I mentioned before, when things go wrong, they're mostly not going wrong in Cortex. And likewise with observability: where the system is spending time doing work, it's mostly not spending that time inside of Cortex. Cortex is instead waiting for events from other services. And so what we're interested in measuring is not really the conventional telemetry story, I think, when it comes to building tracing and spans.

I think this is the story that is also baked into the default Effect telemetry story, right? When you have an Effect.withSpan with a name and you wrap that around an effect, well, that's great: that ensures that the time that effect spends executing is going to be represented by a span in your trace. And for most one-shot type actions that you might perform, which is most of the time, that works great.
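
For reference, the conventional pattern being described looks roughly like this (generateResponse is a placeholder, not real Cortex code):

```typescript
import { Effect } from "effect"

// Placeholder for some one-shot piece of work.
declare const generateResponse: (prompt: string) => Effect.Effect<string, Error>

// The span covers exactly the time the wrapped effect spends executing.
const traced = (prompt: string) =>
  generateResponse(prompt).pipe(Effect.withSpan("llm.generate_response"))
```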

If you're doing actual work within an

effect, within a one shot effect, then

that is the conventional way that you do

telemetry. We're not really doing that at

all in our telemetry

implementation in Cortex.

Instead, we're interested in measuring

time between events that are coming from

outside services. Cortex is the only

place where we can really gather this

information because it's the hub.

But Cortex isn't sitting around. We don't

have an effect that is, you know, sleeping until it gets a certain notification or an event from another stream.

It wouldn't make any sense

to build the application that way.

And so it doesn't really make a lot of

sense to build our telemetry that way.

I suppose what we're doing with OpenTelemetry is a little unconventional, in that a span doesn't represent Cortex doing the work. Instead, it usually really represents that the LLM is working, or the text-to-speech service is working, and we're just waiting for that. But it's measured from Cortex, not from these other services. And it's really all streams.

What we have to go on isn't an effect

that we're measuring. It is literally the

time between two events in those streams

that we're measuring. And so we really

had to roll a lot of our

own telemetry handling.

But, you know, we were going to have to do this anyway, ultimately. Because let's say we're not using Effect; we're using the original approach, the non-Effect approach, which is event emitters everywhere, WebSocket event handlers and so forth. You get a transcription from the speech-to-text service, and you want to start the span of time that you're measuring, the time it takes to generate a response to that transcription.

Well, you can have a central hub where maybe you're collecting all of these events. But you're starting that span in one event handler, and then you're ending it in a different event handler for a different service. And so you need a place where you're holding on to those references, maybe a centralized location that is listening to all of these events. But then it becomes kind of tangled up, because you're having to keep these references around and keep them alive from one event handler to a completely different event handler. And this is an area where, yeah, we had to roll some of our own code to do it this way in Effect. But I feel like Effect kind of made it easier to do it this way anyway, allowing us to have a kind of connection-level, long-lived reference to these spans, and then just manipulate spans in what is essentially a Stream.tap, where we are listening to all of the incoming events and then just starting and stopping spans based on which events are occurring.
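
A minimal sketch of that idea follows, measuring the time between two events flowing through the hub with a long-lived Ref and a Stream.tap. The event names are hypothetical, and the real Cortex implementation manages actual OpenTelemetry spans rather than a single span annotation as shown here.

```typescript
import { Clock, Effect, Ref, Stream } from "effect"

// Hypothetical events passing through the hub.
type CortexEvent =
  | { readonly _tag: "TranscriptFinal" } // user finished speaking
  | { readonly _tag: "LlmFirstChunk" }   // first token back from the LLM

const instrument = <E>(events: Stream.Stream<CortexEvent, E>) =>
  Effect.gen(function* () {
    // Connection-level, long-lived reference that outlives any single handler.
    const startedAt = yield* Ref.make<number | null>(null)
    return events.pipe(
      // Observe events as they pass through without consuming the stream.
      Stream.tap((event) =>
        Effect.gen(function* () {
          const now = yield* Clock.currentTimeMillis
          if (event._tag === "TranscriptFinal") {
            yield* Ref.set(startedAt, now)
          } else if (event._tag === "LlmFirstChunk") {
            const t0 = yield* Ref.get(startedAt)
            if (t0 !== null) {
              // How long the user waited for the LLM, measured from the hub.
              yield* Effect.annotateCurrentSpan("llm_first_chunk_ms", now - t0)
            }
          }
        })
      )
    )
  })
```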

It's not been perfect, honestly.

It has been a little error prone

sometimes and we've had to go in and kind

of tweak things when we

have unexpected behaviors.

It's an area that has provided immense

value for us, however.

It's given us a lot of insight into what people are experiencing: if people are experiencing really slow behaviors, slow responses, if people are experiencing the bot talking over them, this sort of thing. If we have errors somewhere in the system, we can see exactly where and when that happened, in what service, and in what part of that service's work it happened.

And we're able to trace what the sequence of chunks emitted by the LLM was, how long it took for us to get that first chunk out of the LLM, comparing the user message and the bot response, and, if the user interrupted the bot, how much of the bot's speech did the user hear.

And so a lot of these questions that are

really of interest to the business and

also for us technically.

When it comes to how On Call is being used and how people are experiencing it, these are really answered by the telemetry that we've built using OpenTelemetry and Effect. But it's a very custom system that I don't know has reached its optimal form yet.

And I also don't know that is necessarily

going to apply to

everybody in the same way.

I don't know. Like I said, it's a very custom system that is built for our use case and will not apply in most conventional applications.

But I think that's OK.

There's always special cases.

This makes sense. And OpenTelemetry, or the previous technologies it is based on, OpenCensus and OpenTracing, I believe those were the two predecessor technologies that merged into OpenTelemetry.

Where they're historically coming from is distributed tracing, and that is typically a microservice kind of architecture: you have one service calling into another one, request-response style, calling into another one. So I think where those systems, or this technology, shine historically, at least, is the request-response pattern, where you get that kind of chart that you know from a performance profiler in a single-threaded or multi-threaded environment, but now you get it over a network boundary.

So this is where those shine. But going beyond that, for different modalities, like long-running streams: I've also been experimenting with using OpenTelemetry in a front-end setting, where you don't have that request-response, but you have a front-end session. For example, think about how you use social media. You might be doomscrolling for a very long time. So is your entire session the trace, with possibly thousands of spans? Or where do you make the cut, basically? How do you design your traces and your spans? I think that is still something that we're figuring out as an industry.

And it's been cool to hear about your

usage of it. And I think this also speaks

to the flexibility of Effect. Yes,

they're like default patterns make it

really easy and kind of trivial to

instrument your app out of the box.

But if you want to instrument it in a way

that's a little bit more specific to how

you would like to give your trace

meaning, that's possible as well.

Maybe taking a step back here for folks

who are curious about OpenTelemetry and

like generally see the value in

observability, but maybe haven't taken

that leap yet themselves instrumenting

their app, which sort of guidance would

you offer to people what to focus on,

maybe what to initially leave out of the

picture just to get going?

Oh, that's a really good question. I feel

like the answers are going to be really

specific to your use case. And in the

case of an application like Cortex,

extremely custom. And we have spent a lot

of time refining and iterating on our

telemetry implementation. But most

applications probably

don't need that, honestly.

I think, especially in the

JavaScript world, there's both browser

and Node based auto instrumentations that

are available that do a lot out of the

box. So I feel like a lot of what I would

want to start with are the ends of the

application when your code is calling out

to another service or when you receive

events from the user.

Because that kind of determines the shape

of that user session or that interaction

that you might be

interested in measuring.

And then it will identify kind of

hotspots of like, oh, the user waited a

long time for this response or whatever,

what's up with that? And then you can

start to drill further.

And then the other thing that I think is

really super valuable is distributed

tracing where you are propagating a trace

across service boundaries. And sometimes this is just, you know, you're instrumenting your browser application or your iOS application, you're making calls out to your API service, and you want to see what's going on in the API service as well during that time period. You're propagating the trace from your client to your server, so that when you see them all together, you can piece it together:

Oh, the client called out to the server

and then the server made these database

calls and you can see that all in one

distributed trace. That's really super

valuable. So just focusing on the ends,

either incoming events or outgoing

service calls, and then making sure you

have your distributed trace propagation

set up correctly. Those would be the top

things I would recommend.

Right. I agree. And I think the benefits

of having a good observability story

for your application and for your system

is so manifold. Like it helps you with

correctness to kind of understand like,

oh, like something is not going well.

That you're not just like

completely in the dark and looking for

the needle in the haystack, but that you

actually have a great foundation to

figure out what went wrong.

That is, I think, the foundation where

people start leaning on observability

beyond just console.log. But it doesn't

just help you with making

things correct, or with diagnosing when

something was not correct;

it also helps with making things faster.

Like otherwise you might just know like,

okay, that API request has taken two

seconds, but why? And sometimes there are

really counterintuitive situations, and then

it's very simple and very

easy to realize, oh,

this is why it's so slow.

And also, speaking of AI, this will

be a perfect foundation to give an AI

even more context on what is going wrong and

let the AI iterate further and help you

make your app more

reliable and more performant.

One of the frontiers that we're going to

be exploring, I think, now that we've

cracked the seal on observability, is

integrating it with our existing data

analytics or our end user analytics that

we are already collecting.

At MasterClass we have a really good, robust data

team that, you know, while respecting

anonymity and user privacy, is still

really invested in

understanding the user journey.

What are people doing? You know, why are

they there? What is enticing

them? What's driving them away? And these

sorts of questions that are really

foundational to understanding the best

way that we can

deliver value to our users.

Integrating this with OpenTelemetry

will give us even more insight

into things like, oh, did a user try to

load a page, but then bounce

because it took too long to load?

Things like this will give us a level of

integration between the end-user metrics

that we've been using and the technical

implementations behind the scenes that

are underpinning that user experience.

I'm really looking forward to being able

to explore that further. And I feel like

it has tremendous

potential to offer value.

That sounds awesome. So, speaking of a bit

more forward-looking perspectives:

I'm curious, now that you've been part of

the Effect community, I think at this

point well beyond a year already, and it's super

impressive what you've been able to build

with Effect in that short period of time.

What are the things that you're looking

forward to most when it

comes to the Effect ecosystem?

I'm really looking forward to seeing

Effect 4.0 come out. That looks awesome.

A lot of the improvements they've made to

the bundle size and to the implementation

complexity look really promising.

And, you know, I really admire how

responsive the Effect team has been to

the community, to the needs of the

community and really listening to

feedback, incorporating it, iterating.

That's really, I think, been vital for any

platform like this to get any traction.

It's a very ambitious platform, it just

has to be said: the design of it

encapsulates so much information about

what's going on at an atomic level in

your application, but then extends that

out into really

building whole frameworks.

The HTTP API, the Schema, the tracing

integration, the database abstractions,

the networking abstractions like

WebSockets, the file

system, the Node runtime,

there are React integrations.

Really, you name it, there's just tons

and tons of it, and there's a whole bunch of

stuff coming down the pipeline:

Cluster and AI integrations.

The paradigm is really extensible and it

has proven itself really robust and able

to handle all these different scenarios.

But now making all of those things

work, you know, that's

an incredible challenge,

and something that I hope they

succeed at. It's just so much.

But I think that with this large and

growing community, I think there will be

people using every aspect of that.

Probably not everybody is going to use

every piece of the Effect

ecosystem. I am not, certainly.

And most people will only use a small

sliver of it. And that's fine. That's all

you really need to do.

But when you have hundreds or thousands

of engineers all building different kinds

of applications with it, now you start to

get a lot more signal of:

for this specific use case,

here's what I'm trying to do,

here's how I've approached it.

And feeding that

information into the design of the

platform is going to be tremendously

valuable to making it more extensible.

But anyway, I'm really interested in the

Effect 4.0 developments that I think have

come out of this exact kind of feedback

integration and iteration.

And also, I'm really excited about things

like the demo at Effect

Days that I think Mattia did,

about the error reporting

that was integrated into the editor

when there are type errors in Effect.

Sometimes those can be cryptic and it's

nice to see that they're working on

making those a lot more

human readable and friendly.

It's nice to see the dev tools getting so

much love. It's nice to see the

documentation getting so much

improvement; that work is so often thankless.

Honestly, more than any specific

technical feature of the API, I just love

to see the developer

experience getting some attention.

And it makes things so much easier for

everybody to start building really cool,

ambitious things if you get really good,

clear, understandable errors at build

time or in your editor;

if you have really good debugging

tools that let you understand what's

happening; and if you have really good

documentation that has use cases and

examples of all kinds of different

patterns that you might

want to take advantage of.

This is the sort of, I think often unsexy

work of building a framework like this

that is so fundamental to end

end users being able

to get value out of it.

And it seems like the Effect team is

taking this seriously. And that to me,

more than almost anything, is what gives

me excitement and hope about

the future of this framework.

100%. I mean, I couldn't agree more.

They're all super ambitious efforts on

their own. And luckily, we have a small

but super talented and absolutely

brilliant core team of folks who are

working on the various pieces.

You've mentioned the docs. We have

Giulio working more or less full time on

the docs as well as on Effect Schema.

We could probably also fill an

entire show just talking about

Effect Schema, which I

think you're also using.

Oh, yeah. We're using Effect Schema all

over the place. I didn't

even talk about that. Yeah.
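
For context, Effect Schema is the part of the ecosystem for describing, validating, and decoding data. A tiny, hypothetical example of decoding an untrusted provider payload into a typed value:

```typescript
import { Schema } from "effect"

// Hypothetical shape of an incoming transcript event from a voice provider.
const TranscriptEvent = Schema.Struct({
  sessionId: Schema.String,
  text: Schema.String,
  timestampMs: Schema.Number,
})

// Decoding fails with a descriptive ParseError instead of letting a
// malformed payload flow silently through the pipeline.
const decodeTranscriptEvent = Schema.decodeUnknownSync(TranscriptEvent)

const event = decodeTranscriptEvent({
  sessionId: "abc-123",
  text: "Hello there",
  timestampMs: 1712345678901,
})
```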

Yeah. But then also the dev tools.

I'm super excited that

Mattia will be joining.

Well, at the time of this recording, that's a

couple of days from now, but by the

time the recording airs, I

think Mattia will have already started

working full time on the dev tools and

other pieces and improving

the developer experience.

And then at the core of all of it, Effect

4.0 and what it will enable. That's

really been the distillation

of all the feedback that we've gathered

over the last couple of years.

And I think it really tackles some of the

most common points of feedback or

frustration head-on. And I think it's

really like a huge improvement over

what's already really great and useful.

So I'm fully on board with

everything you've shared. And

yeah, I can't thank you enough

for sharing all of what you've been

building with Cortex at MasterClass.

It's really, really impressive.

What you've been able to build in such a

short period of time speaks to

experience, but also speaks to just like

how capable the building blocks are. And

when experience and ambition

come together with great materials,

I think this is how we're going to get

great experiences and great applications.

So thank you so much for doing that work

and sharing it.

And thank you so much for coming on the show today.

Well, thank you so much for having me.

And I just, you know, I have to say that

so far, I think the On Call product has

been a huge success and Cortex has been

a very, very crucial part of that.

And I feel like it would not have been

possible without the Effect ecosystem and

also the support of the Effect core team

and the community in helping us get

there. I think that has played a fundamental

role. So thank you. I mean, to you,

Johannes, and thank you to the broader

Effect Team and the community for all of

the support and assistance and enthusiasm

and for building this incredible framework.

It has really been a game changer.

That is awesome to

hear. Thank you so much.

Thank you.

Thank you for listening to the

Cause & Effect Podcast.

If you've enjoyed this episode, please

subscribe, leave a review

and share it with your friends.

If you haven't done so already, you can

join our Discord community.

And if you have any questions, feedback

or suggestions about this episode or

about Effect in general,

don't hesitate to get in touch.

See you in the next episode.