A weekly podcast about all things PostgreSQL
Michael: Hello and welcome to PostgresFM, a weekly show about
all things PostgreSQL.
I am Michael, founder of pgMustard.
As usual, I'm joined by my cohost,
Nikolay, founder of
Postgres.ai.
Hello Nikolay.
Nikolay: Hi Mike.
Michael: And today we are also
joined by a special guest, Andy
Atkinson, Andrew Atkinson, as we
were saying earlier, who has
been a software engineer for 15
years at the likes of Groupon,
Microsoft, and Fountain, a user
of PostgreSQL for the past 10,
speaker at several conferences,
and now a published author with
the brand new book, High Performance
PostgreSQL for Rails.
It's a pleasure to have you on,
Andy, welcome.
Andrew: It's a pleasure to be here,
thanks for having me, guys.
Really fun.
Michael: Nice, well, I'm gonna
let you in on a little secret
here that I didn't tell you beforehand.
You are actually our first guest
who has been requested by a
listener.
So somebody specifically asked
for you to come on, which is awesome.
Andrew: Wow.
Michael: And with the topic Postgres
plus Rails.
So that's what we're going to be
talking about.
Andrew: I guess asking my mom to
do that really worked out.
No, that's great to hear.
Michael: Awesome.
Well, yeah.
Who better at the moment, you know,
you've got this stuff fresh
in your mind.
So yeah, in terms of where to start,
I thought it'd be really
interesting to hear from your side,
how popular a choice is Postgres
for Rails?
Like what's it competing with and
yeah, how's that been changing
over time?
Andrew: Yeah, good questions.
Well, the ORM in Ruby on Rails
is called Active Record and it
supports Postgres and MySQL/MariaDB, and SQLite.
So whenever you're generating a
new Rails app, that's one of the
first decisions you make, I believe
by default it's SQLite.
But most folks that have apps in
production, I still think it's
usually, they're using Ruby on
Rails, they're usually working
with MySQL or Postgres.
In kind of prepping for this, there
are a couple of surveys where
folks have been responding to surveys
about how they deploy their
apps.
And there was a 2022 survey from
Planet Argon, which I had
pulled up here.
And there were about 2,600 responses.
And from 2014 onward, Postgres
has been the most popular relational
database with Rails apps.
And I guess what I've seen is kind
of this shift, starting in
maybe the early 2010s, where I
think in large part, thanks to
Heroku having Postgres support
and Heroku being a really popular
choice, easy place, as you guys
have talked about it before,
made deploying your Rails app onto
the internet much easier,
pretty much a git push command,
and took away a lot of operational
toil that a lot of folks might
otherwise take on themselves.
And I think a lot of folks may
have switched to Postgres just
because it was part of the package
there, and maybe they weren't
deeply using relational database
features.
And so I think there was a lot
that had to do with Heroku.
And then of course, as also, as
you guys have talked about, Postgres
gaining a lot of features, a lot
of performance over the last
10 years.
So I kind of noticed this general
setup where a lot of the big
Ruby on Rails, the companies that
famously use Ruby on Rails,
like GitHub, Shopify, Basecamp,
that kind of started in the 2000s,
they tended to choose MySQL at
the time and they've stuck with
it for the most part.
They maybe are using, you know,
clustering solutions and things,
but companies that started more
in the 2010s or mid-2010s, I
feel like a lot of times they're,
they're running Postgres.
And so I'm certainly, you know,
now I'm doing independent consulting
with Rails teams that use Rails
and Postgres.
And I'm certainly, you know, I'm,
clients are coming to me that
are using Postgres.
So for whatever that's worth, there's
definitely companies out
there using it.
Michael: Yeah, I was going to ask
Nikolay as well and then I
realized that all of us have a
super biased sample but that does
that definitely does make sense.
That data you mentioned that survey
is really cool as well.
I saw the question below was
about which one would you like
to be using in production?
And it changes just 2 years earlier.
So you can see that there's that
desire and people wanted to
switch or wanted to use it in production.
And then it did eventually take
over a couple of years later
and hasn't lost first spot since,
which is nice.
You mentioned the ORM already.
Should we dive into that?
How are folks generally sending
queries across to Postgres from
Rails?
Nikolay: And where is Postgres?
If Ruby on Rails, then Postgres
is under Rails or where?
Andrew: Yeah, so with Active Record.
Oh yeah, where is it?
Let's see.
Where is
Nikolay: the place for it?
Michael: On again, isn't it?
Ruby on Rails, on Postgres.
Andrew: Yep.
Nikolay: The basis for Rails, okay.
Modweb.
Andrew: Yeah.
As a Postgres enthusiast, I tend
to think a lot about the database
operations and the relational data
model and almost like Rails
is kind of wrapping, developers
are writing Ruby code that's
doing database interaction and
other, other things.
But I almost tend to think about
things now more from a database
first perspective.
So it's almost like Postgres with
Ruby on Rails or something.
But I think, you know, depending
on how you tend to, most developers,
I think though, it's the other
way around.
They're mostly, most Rails teams
I've worked on, they're writing
Active Record code.
In the early days, I've been around
in Ruby on Rails for a long
time, almost 15 years with some
divergences.
But in the early days, there actually
was a little more competition
around the ORM.
Some of these projects are still
around, but Sequel, SEQUEL, and
Datamapper, some other ORMs that
were written in Ruby and allowed
developers to create queries and
evolve their schema with an
alternative to Active Record.
Different pros and cons, performance
benefits, maybe, that kind
of thing.
But Active Record sort of steamrolled
everything or consolidated
everything over time, as can happen
with open source projects.
So most developers then, in my
experience, they're thinking,
they're working more with objects
and interactions and algorithms
and they're building background
jobs and working with message
queues and things like that.
So they're not necessarily writing
a lot of SQL.
And as I was saying, most Rails
teams, they're going to write
Active Record.
So Active Record then generates
SQL and it can do things for
us as queries are generated, like
annotate them and say where
they're coming from in the app,
which is nice.
If we look at queries within Postgres
and we wanna kind of go
backwards and say, well, where
did this come from in the app?
And then another big way Active
Record is used of course, is
it is the de facto schema evolution
tool as well, which is interesting
because I think Rails developers
just kind of take that for granted.
Like, of course I control the schema
And of course I could ruin
everything with a bad migration
or whatever, you know, like an
incremental schema change.
But that's not always the case
when I've worked on other teams.
Like I've worked at, when I was
at Groupon, actually, the main
applications I was working on were
Java and there were DBAs and
we used Postgres.
But sometimes due to the scale
there, I mean, usually developers,
unless you're working on a small
microservice maybe that you
might have ownership of, if you're
working on any of the core
services there because of the scale
of the operation or how the
company was set up, usually there
were DBAs that would do riskier
database changes and that sort
of thing.
So you might actually just kind
of provide them an example of
the change that you want, and then
it might just be done and
they let you know, maybe via a
ticketing system or something.
So I think, you know, it's interesting,
like if you come to,
you know, your background might
be where as a Rails developer,
you may at a large company, you
may not do as much of the direct
schema control, but certainly for
a lot of small to medium companies
that is what developers do as well.
So they do need to be informed
about good schema design, of course
creating indexes, constraints,
all those sorts of things.
And that's where it kind of, that's
where then the ORM can start
to be limited in its scope, you
know, and you got to kind of
go beyond and learn like, okay,
well, what are the capabilities
I have at my disposal within Postgres?
Nikolay: And at some point you
need to switch from schema rb
to structure SQL, right?
Andrew: Yeah, Michael asked about
that.
Nikolay: I saw several companies
switching.
Why do we have schema rb by default
at all?
Because everyone is switching at
some point while growing, right?
Andrew: Yeah, well, and I kind
of agree with you, but I didn't
always feel that way.
It's actually an interesting little
like microcosm of the whole
spectrum, I think, of being more
of an application developer,
working within your programming
language, which is Ruby in this
case, and kind of thinking of the
relational database as just
a thing that you don't really directly
work with, but it's just
there to like store your data and
access your data.
And then on the other extreme,
kind of like I was saying, like
you're thinking of completely in
the database operations, you're
thinking about the schema, the
queries, the indexes, you're running
query plans in your head.
You're thinking about, you know,
how do we, we have this high
growth table.
Should we use partitioning?
Should we split it out?
Like that's the opposite end of
the spectrum.
And yeah, I think so the, to briefly,
for anyone that's not familiar
with the schema rb and the structure
file by default in Ruby
on Rails.
As you make changes, let's say
you add a table or you add an
index, et cetera.
You generate what Rails calls a
migration, and that would be
that incremental change.
So here's the new table definition.
Here's the index definition.
And it's expressed in Ruby, but
of course, it generates SQL statements
to run and they run against your
local database.
And then what happens is your local
database is then dumped,
its schema definition or its
database definition is then dumped
into a file.
And by default, that's a Ruby file.
So it kind of like translates it
back to Ruby and represents
it as Ruby.
And it kind of insulates you from
the SQL.
But really in Postgres, what's
happening is it's just running,
well, I should say, if you move
to the SQL form, what it's doing
is it's running pg_dump.
And it's basically just taking
the raw pg_dump output and putting
that into a SQL form of the file
with a little bit of extra stuff
at the end, which are those migrations
that you're creating.
Each of them have a version.
So it dumps those versions as insert
statements into the end.
But otherwise, it's basically just
the pg_dump output.
So yeah, what happens a lot of
times for teams is they start
out with the Ruby schema file,
they start to use things that
are beyond what it covers.
And so they might use like a materialized
view or like triggers,
triggers,
Nikolay: maybe, right?
Andrew: Yeah.
So stuff like that.
And, and then what can happen too,
is the open source community
can, can spring in and can say,
"Oh, we can fix this, we can,
we can actually extend the Ruby
form of the file.
So you can keep using it, but you
gotta also run this Ruby gem
with your app.
And now we can express triggers
in the Ruby file.
But I kind of tend to just encourage,
okay, at that point just
switch away or maybe even just
start that way.
But just switch to the SQL file
because it's going to give you
the highest fidelity information.
It's essentially what pg_dump is.
Nikolay: And if some database guy
is performing code review,
it's much easier to see the changes
in Structured SQL than in
this language you don't fully understand.
Basically, if you do it in the project
with SchemaRB, you end
up asking, provide me a full log
or a full dump of everything.
Andrew: Right.
Not to get too philosophical, but
as we're on a team building
an application, the code is kind
of our method of communicating
with the computer and with the
team members as well, you know,
as we express the domain concept.
Or in this case, as we're expressing
the database design.
So if your team has people that
are more, you know, database
only people then giving them a
Ruby file is, I mean, they, they
would have to learn it and it doesn't
actually have all of the
information in it often.
So yeah, it's a little, it's a
little weird.
However, as I was writing the book,
I was thinking about a lot
of these kinds of things too because
I have been on a lot of
different teams.
And I was kind of trying to not
be too preachy about one thing
or the other, just kind of saying
like, well, if your team is
mostly Rails developers, you know,
just know that there are limitations
to the Ruby schema dump, that there
may be some information that
you're not getting in here.
However, you know, and then I sort
of made the pitch for the
structure file maybe.
But yeah, that's a good point.
Like if you're working with a DBA,
you know, you can hand them
a dump file and they'd have a clear
understanding of what's in
the database.
Nikolay: Yeah, I'm curious if things
would be changed with LLM
and so on, which can easily translate
or something.
But it's a different story.
Andrew: Yeah, that's an interesting
idea.
Nikolay: Do you have an understanding
why do we need to keep
two tracks of changes?
So first is each migration is kind
of different, and if we keep
all of them, we always can build
from zero, from ground, we can
build our final schema, right?
Just in steps.
But also we keep this structure
SQL, or schema RB, doesn't matter,
which is a snapshot of the latest
version.
Why do we need both?
I was very curious, I was trying
to understand this all the time.
Why do we need both?
Andrew: Yeah, well some people
even say, so as your incremental
changes contribute to the single
file representation, schema
RB, and like if we're talking about
the Ruby one, some people say
that like the schema RB, it is
basically the same as the incremental
changes.
It's just everything at once.
It's kind of this intermediary
Ruby representation of your database's
structure.
And you don't actually need both,
really.
The incremental changes do serve
as the log, which you'd otherwise
have to...
You could get through Git.
Like if you looked at, you know,
if you pulled changes as a developer
on a team and you notice that the
schema would be changed, but
there wasn't the incremental file
that represented the change,
you could look into the Git history
and try to discern what happened,
but I guess the file kind of makes
it easy.
And also when Ruby on Rails was
started, we didn't really have
Git, so we were stuck with like
subversion or something else.
So, you know, I think Git made
doing that kind of version to
version investigation easier in
my experience anyways, like being
a, I think a better version control
tool.
But yeah, I mean, technically you
don't need it.
And so what some teams do after
a while is they just throw away
the incremental ones.
And so Ruby on Rails also allows
you to just load the schema.
If you're, if you're setting up
a brand new machine, or a new
developer on the team, you can
just load directly from that single
file, the schema.rb.
So that would be like a structure
load command.
And that would work the same regardless
of which type.
So Michael, you were asking about
like a beef.
I think there's, it's like not
a real beef.
It's like a, it's like a faux beef
that programmers invented,
I think.
But it's kind of like, do you want
to preserve the beauty or
elegance of the Ruby code, or do
you want to just like, you know,
throw it away for the ugliness
of an SQL file, maybe what some
Ruby programmers might say, Or
what I might say is like, I actually
think the SQL file is pretty.
It's elegant.
You know, it's got, like I said,
it's like high fidelity.
It's all of the information, you
know, and you can even customize
it.
Active Record allows you to pass
flags through to pg_dump.
So if you need even more information
or you want to change the
output, actually I did think of,
there is one reason maybe the
beef emerged is if you have two developers
running slightly different
versions of Postgres, so maybe
the same minor major version,
but slightly different minor versions.
Postgres changes the pg_dump format
over time, different, like
it could be the ordering or I think
it's usually the ordering.
And what can happen is you get
these annoying diffs as a developer
where two developers are essentially
it's noise.
It's not a real meaningful change,
but it's like, Oh, the triggers
we added now they're on top of
the constraints.
And before they were after the
constraints or something like
that.
I'm just making that up, but I
have a solution for that though
too.
Lucas Fiddle created, this is several
years old, but it's a Ruby
gem that basically does some post
processing on that dump process
and does like an explicit ordering
of all the content.
So that if you have a team that
uses that tool, then you should
have consistent ordering amongst
each other.
Michael: And this was for the .sql?
Andrew: Yep, that's for the SQL
version.
Michael: Makes sense.
Yeah.
Cool.
I could see Nikolay exploding as
you were talking about how beautiful
the Ruby was.
And so for our podcast listeners,
I felt like that couldn't go
unsaid.
Nikolay: There are 2 of the ugliest
languages in the world, JavaScript
and SQL.
They also happen to be the most
popular ones.
Andrew: That is interesting, isn't
it?
Or even C, right?
Like C is still, you know.
Nikolay: Another thing is that
Ruby was created or else in 1995,
same as Java, JavaScript, and PostgreSQL
95 and what else, right?
So many things same year, just
a random fact.
Yeah.
True.
Michael: Yeah.
So you mentioned way back about
some limitations of the ORM.
I think it's worth talking about
like ORMs have a bad reputation
if you talk to database folk.
Is it worth talking about some
of the like more common issues
there or like ways around those
issues or limitations when you
need to break out that kind of
thing?
Andrew: Sure yeah.
Well that was actually a pretty
big premise I wanted to cover
in the book is to show people Show
readers like there are these
other things in PostgreSQL that you
may not be aware of if you've
limited your kind of research
area to just what's supported
in ActiveRecord.
Michael: Mm-hmm.
I
Andrew: mean one that comes to mind
right away is table partitioning.
I mean, there's not really any
support in ActiveRecord for table
partitioning.
It doesn't mean you can't do it
with a Rails app, but you might
run into a couple of small issues,
especially prior to recently,
composite primary keys became supported
in ActiveRecord.
But I was performing a table partitioning
project on a Rails, older
Rails project about a year ago,
and there were some issues with
assumptions code would make about
primary keys, for example,
just like there's only one column
that is the primary key definition.
But yeah, so it's worth noting
that, you know, if you have a
high growth table that you wanna
look at table partitioning for,
you would be likely doing that
a bit on your own with writing
SQL commands and or kind of maybe
researching like a Ruby gem
that you would add into your Rails
project that has done some
of that work for you around creating
the table structure or making
sure your queries have the partition
key column or things like
that.
Yeah, I mean, definitely ActiveRecord.
I think ORMs generally try to bring
some of what the database
does into the application.
I think that's fair to say, like
for example, triggers in Postgres
that have, you know, that can trigger
different trigger types that fire
at different times and that have
different scopes.
There's a whole set of things called
Active Record lifecycle
callbacks that are similar in their
purpose where, you know,
you might want to persist an object,
which would be taking an
in-memory Ruby object, you know,
turning it into an insert or
update statement, basically.
But you could intercept that kind
of before that event happens
and do something within the application
code with an Active Record
callback.
So you could have like a before
save or a before commit.
And some developers might not then,
they might try to design
things within that scope where
maybe a trigger could be a solution
where like if two applications were
sharing the same database,
which is not a great idea sometimes,
but maybe you'd want to
put the trigger in instead of having
it at the application level
so you don't have to duplicate
code or that kind of thing.
Nikolay: What about transaction
control in this case?
Is it inside one transaction and
Rails controls this or can you
do like trigger outside of transaction
so it's not guaranteed
that it will be consistent?
Andrew: Yeah, yeah.
Active Record supports a transaction
concept that maps pretty
much straight up to Postgres, you
know, begin, commit or rollback
transaction.
And actually recently learned you
can pass in, you can do transaction
control.
Like if you want to change the
transaction.
Yeah, the isolation level.
Yes, thanks.
If you want to change the isolation
level, I actually just recently
realized ActiveRecord lets you,
supports that, you can pass it
in as an option.
And then I kind of verified it
myself and made sure that the
SQL statements were generating
those things.
But yeah, I guess if you do want
to have some kind of nested
transactions or if you want a little
more control, then I think
part of what, you know, you certainly
might want to take that
on yourself.
And, But I wanted to actually then
tie that into, I think, part
of why ActiveRecord has been successful
is it doesn't, at a certain
point, you might just say, well,
I just want to write SQL, but
I still want maybe some ActiveRecord
objects to work with.
And ActiveRecord doesn't prevent
you from just writing SQL within
your ActiveRecord code.
And so that can be beneficial if
you want to just say, well,
I just want to write my own SQL
statement here for a query, or
I want to even just use it as kind
of an interface to run some
commands, like maybe, you know,
opening a transaction, although
that is supported directly.
But you can write SQL commands
within ActiveRecord as a string
that then get invoked or get sent
through the ActiveRecord connection
pool, etc.
And then what you can do is you
can leverage then taking a result
set and taking advantage of mapping
all of those database types
into your Ruby object types and
have an object to work with.
Or you can even use primitive types
like having a simple lists
and strings and that sort of thing.
So it's kind of like, it allows
you to, has these nice helpers
to do things like, you know, perform
joins and limit fields and
things like that.
But it also doesn't prevent you
from just saying like, I wanna,
I'm gonna take over here.
I'm just gonna kind of write my
Nikolay: own SQL within this.
Huge recursive CTE I want to write.
Andrew: Yeah, you could do that.
Yeah, there's.
Michael: What proportion of the
time do you tend to see, for
yourself and for others, do you
find that you're, Yeah, you're
using one versus the other.
Andrew: Yeah, it's interesting
because like 10 years ago, I did
see more of, I'd say, writing plain
SQL within ActiveRecord.
And I think it was because a lot
of folks that were using Ruby
on Rails then, they had more that
were more senior engineers,
they had experienced, or they had
experience with working with
SQL and other databases, and it's
just how they worked, you know?
And ActiveRecord was also more
limited in its capabilities.
And then I think there's been this
blend maybe over time, or
this interesting parallel tracks
of things happening.
One could be ActiveRecord has gained
more, it's continually adding
more and more helpers.
The documentation refers to it
as helpers, but like, you know,
as I was mentioning before, recently
common table expressions
or CTEs gained support, which I would
have thought maybe they would
have been around for a lot earlier,
but ActiveRecord has a first-class
method, a helper method for CTEs
now, and then composite primary
keys I mentioned.
So ActiveRecord has gained more
support even when you move beyond
one database.
If you want to work with multiple
databases, you can configure
that in your application.
There's even the ability to take
advantage of automatically sending
read-only queries identified by
the HTTP verb for your web app
to a read replica.
If you have that configured, Active
Record lets you set up a,
what they call a writer and a reader
role, and then they even
added that to the sharding capabilities
too, where if you have,
if you take advantage of horizontal
sharding in Active Record,
you can do some automatic shard
distribution.
And so I think because there are
more capabilities that ActiveRecord
supports, there also, I tend to
see less writing of SQL, but
again, I think it kind of comes
back to the team's composition
too.
If the team's very familiar with
SQL and also myself as I've
gotten more experienced with SQL.
I think my patience for like, okay,
is this supported in ActiveRecord?
record?
Like, okay, fine, I'll do it in
Active Record because that will
work well for the rest of the team
or whatever.
But if I can't find it pretty quickly,
I'll just write it as
an SQL statement.
It's no big deal to me.
Michael: Makes sense.
Nikolay: I worked with Rails projects
a lot, and one of them is
GitLab.
Shameless plug-in of ads for their
migration helpers.
.rb is great, and documentation
is open and great.
But someone in the Rails community
should finally rename disable_ddl_transaction!, disabled DDL transaction, because
people, just because of this weird,
I don't want to say word
or literal, a lot of people tend
to say we are going to deploy
this without transactions, in non-transactional
mode.
Postgres cannot work without transactions.
So it's just becoming like a single
transaction, each statement
becomes a single transaction.
Michael: It's not transactional, right?
And I don't like this part of Rails
at all, for many, many years,
and it should be renamed.
But at the same point, you know,
levels of understanding.
I also understand at some point,
like, I understood, creating this
concurrently, and you need disable_ddl_transaction to run create index
concurrently.
It's not transactional, right?
Because it can leave leftovers.
So things are difficult.
But I see people like Ruby developers,
they say we will execute
this without transactions.
It just breaks my ear, Postgres
ear.
It's not right.
Is it possible to change at all,
to create pull requests or something?
Andrew: Oh, you definitely could.
I've actually seen...
I mean, yeah, we could after this
call, we could create a pool
request to Rails and try to get
that approved, but it'd be an
uphill battle, I think.
Nikolay: It's probably a huge battle.
But I already have 1 battle won,
and I remember the 37signals
team, or Basecamp, or who is that?
Yeah.
They added support of the logic
when we have replicas, asynchronous
replicas, and a write happened
in a session.
So you need to stick to the primary
for some period of time,
because otherwise if a replica
is lagging, you won't see your
own write immediately after.
So they implemented this.
Before that, many companies, and
I also implemented this logic
in several languages, in PHP and
Java, in my past.
But I see this path finally Rails
is going through, and it's
obvious to me that they implemented
this lag as a constant written
in code.
It should be configurable.
So there was a big battle, and
DHH supported my proposal.
I was happy to see, like, let's
make this configurable.
So maybe we can win 1 more battle,
right?
Maybe.
Maybe.
It's not a question, sorry, it's
just my story with Rails.
Andrew: Yeah, well, it is actually
a good, I like that the design
of, so this replication lag, there's
like a resolver class concept
and the documentation talks about,
actually I wanna get back
to your question too, Michael,
about are there any other benefits
we didn't really cover?
I like how the documentation says,
hey, we purposefully have
this simple resolver class.
And if you have greater needs,
then they've built in an extension
point.
They've said like, okay, just create
your own resolver class,
implement this method, and then
you can do whatever you want.
And if I remember correctly, that's
how the automatic replica
switching and the shard switching
works, is they both have kind
of an official extension point so
that you do kind of get some
default behavior.
They took a guess at what a reasonable
replica lag would be,
I guess.
And then, but they said like, if
you want to do something else
or omit some certain, you don't
want to make this type automatic
or you do want to make this type
automatic, etc., you could
do that in your own implementation
class.
Just so we could cover this quickly
too, Michael, you were asking
before about other benefits, like
if you do stay kind of within
the Rails world.
I think like, because I wanted
to mention that a lot of these,
what I would call enhancements
for like higher scale operations
are coming from developers at GitHub
and Shopify and companies
that do have, you know, internet
scale operations, and they're
still using Ruby on Rails with
their relational database.
And they're building a lot of things
into ActiveRecord.
And then I think those are, you
know, great gifts that we receive
as users then of the framework,
even if we don't have that scale,
we have some pathways that we could
grow into that if we need.
And we also get them the benefit
of the ORM there where it's
like, well, multiple database adapters
are supported by the framework.
So you need to make sure that this
works in both MySQL, even
though they don’t use Postgres
at GitHub, from my understanding.
The Active Record capabilities
support multiple relational databases.
So we kind of get that, even though
they're not using Postgres
within the framework.
Michael: Yeah, that's really cool.
And it used to be like, I'm old
enough to remember when people
used to ask does Rails scale and
think, you know, like that,
those old questions.
And it just feels like we have
so many good examples now that
doesn't seem to come up as often
anymore, which is nice and refreshing.
I did actually almost want to ask
from the other perspective
quickly, though, what, like, as
a Postgres community, what can
we keep doing or what can we do
better to make sure Postgres
stays as the number 1 choice
for Rails or how can we make
things easier for developers at
both large and small organizations?
Andrew: Yeah, that's a good question.
I do think that there's a little
bit of a recent popularity with
SQLite within Rails.
It's gained some features that
I think make it more scalable.
And then also, depending on your
deployment setup, it may offer
you a simpler deployment configuration.
And then on the MySQL side, I think
that there's, you know, there's
certainly I hear a lot about PlanetScale.
They're, they've been doing great
with, they're the only, I actually
tweeted this a while ago, but like
they somehow made foreign
key constraints look cool.
I don't know if you saw that video
or not.
They have this well-produced video
where they're like, boom,
we added foreign key constraints.
And I was like, it was actually
kind of cool.
And I mean, I think that databases
are, it's hard to make, to
generate a lot of enthusiasm maybe
around it.
But I would say that like, you
know, some of the companies offering
Postgres as a service these days,
there’s a number of new startup
companies.
They're kind of either advertising
more like we can help you
with multi-tenancy or we can help
you with full-text search or
we can help you with these other
sorts of like capabilities or,
You know, usages that Postgres
supports.
It just might be a lot that you
need to kind of build yourself.
And so I do think Postgres has
like the raw ingredients for a
lot of stuff, but I do think that
if you're not going to use
a managed service and you want
to take this on, you got to do
a lot of work to build skills and,
you know, like if you want
to build in your own full-text
search at a good scale with good
performance, Postgres has a lot
of built-in capabilities and
is extensible.
You can add extensions.
But I think like, you know, just
continuing to maybe think about
the developer experience, I guess,
like at least for Postgres
users that are web applications,
anywhere the Postgres community
can contribute, you know, guides
and tutorials or if companies
that use it successfully with some
of these use cases, if they
can publish on their engineering
blogs, like I always love to
read those.
Like DoorDash is a company I know
that is a Rails and Postgres
company and has a lot of great
engineering blog posts.
And I think those are ways that
show pathways to leverage some
of these capabilities.
Yeah, and then I guess I've been
thinking about, I think the
open source solution still around
end to end query observability
is still like, it can sometimes
be limited in my experience where
I'll like, you know, we, we know
of tools like pg_stat_statements,
but then we want to collect samples
and it can be a little bit
difficult.
You kind of go back and forth between
relying on logs using system
catalogs, but Postgres keeps investing
in that there's new catalogs
and there's new things.
And so I think like anytime, any
of those sorts of things that
make it easier to see what my query
workload is like, where are
the costs coming from, especially
if they're connected to things
I can take action on.
Like, Nikolay, you mentioned recently
how pg_stat_statements shows
the number of rows returned.
You could go through and look at
queries where you could say,
well, we should really probably
add limits to these queries.
Things like that can make really
meaningful...
You can draw a real clear line
between, like, here's information
that's being provided to you within
Postgres that helps you out
on as you're building and scaling
and operating your system.
Nikolay: But this approach is reactive.
If we see pg_stat_statements on production,
it's already happening.
It would be great to try to guess
based on Rails code.
It's not about migrations already,
just regular some code, Rb
file, which serves some page or
API endpoint.
And we can just look at it, we
can say, oh, the limit is missing
here, right?
Yeah.
People at GitHub, for example,
they build a lot of additional
things in CI which help them.
Andrew: Yeah, that's right.
Nikolay: I think something can
be found in public.
So basically, for example, a select
plus 1 thing, which is very
common for ORM, it can be automatically
detected before you deploy.
And even if you test it on a very
small database, not on a full-size
database, or probably workload
generated is not yet in production,
but we already can see something
is dangerous here.
I'm trying to say this is probably
not Postgres's job.
Andrew: Yeah, it's probably not
Postgres's job.
I agree with you.
I was kind of thinking there's
the Ruby open source community,
there are some great command-line
tools.
I'll shout out rails-pg-extras
is one.
And a common interception point
where you might want to take
a more of a proactive approach
would be when you're creating
a migration.
Like, oh, do you realize now you've
created an inconsistency
between your schema and your application?
Like maybe you were checking over
here it was a string type and
over here you've created an integer
type, that kind of thing.
Or that's not the most exciting
example, but that command-line
time, like when the developer is
working, is kind of one touch
point.
Or like you said on CI, you shipped
off some code to a system
running tests and it starts to
detect things and gives the developer
feedback before it goes to production,
that's always a good thing.
Nikolay: Yeah.
Well, I'm trying to reflect on my
experience with Rails.
I can show you another one from 2017
I found.
So Shopify is on MySQL, right?
Yep.
You mentioned this.
So let me describe it.
There's a gem or library, I don't
know, called Delayed Job.
Delayed Job, just a single one.
And in 2017, I saw the issue.
I had a client in 2017 which was
on Rails and on RDS already.
It was maybe around the first time
I had a good production experience
with RDS, which I actually found
liking because it provided some
good automation for cloning, right?
For experiments.
Experiments matter a lot.
So I found this problem.
It was not performing well, delayed
jobs, under scale.
So you have a lot of jobs, a lot
of entries, also like high insert
rate. You need to have a high processing
rate.
It was super easy for me to find
what was the problem and I also
found this issue. It was discussed
on GitHub since 2013, saying
with half a million jobs it gets
really slow.
Half a million is not a huge number
at all for Postgres, it's
a small number.
So I popped up saying it's super
easy, like you just create an additional
index, this 1, and you have already
selected for an update, just
add 2 words, keep locked.
And index plus this, you have a massive
boost for your library.
You don't need to migrate to Sidekiq
or Kafka or something.
That's it.
You don't need to learn PGQ from
Skype.
And they didn't do it.
Now I understand, because it's
from Shopify, and they're from
MySQL world.
They started discussing some Postgres
versions, and I see that
I got a bunch of likes, and I keep
having email notifications,
people thanking me because this
saved their life, because this
improved.
But it looks like not only we talk
about the connection of Ruby and
Postgres worlds and the problem
of how to connect better, but
also this idea of ORM, let's be
abstract, let's keep agnostic
from the database.
This makes life worse actually.
We cannot use what Postgres can
offer.
What do you think about this?
So again, not a question, a reflection
of my experience, but maybe
it's a good discussion I think.
Michael: Well, I don't think you'd
be able to use Postgres if
there was only, if Rails only supported
one relational database
it would not be Postgres I don't
think, based on who runs the
Rails project.
Andrew: Yeah it certainly wasn't
the most popular in 2005 when
Ruby on Rails was kind of really
getting out there for the first
time, which now is a long time ago.
And yeah, I mean, so it does provide
you that indirection point
where you can say, well, we have
a generic adapter and then we
have the MySQL adapter.
And yeah, and maybe it doesn't
support skip locked.
So maybe we can't do skip locked
at all because it's not supported
across all three of our databases,
which is a trade-off.
Nikolay: You can write an if.
If it's Postgres, add two words.
Andrew: Yeah, and actually that's,
there's been, I should have
had a couple of examples ready
to go, but there's actually a
couple of things that are actually
only supported in Postgres
that ActiveRecord supports, which
is pretty exciting because
it's like a little bit of a philosophical
switch to me where
you say like, okay, well we have
these really useful capabilities,
but we're just going to support
them in Postgres only.
You know, sorry, MySQL users.
And that way then you get the best
of both of those categories.
But of course, like, then we aren't
really getting into this,
but there's a whole discussion
about SQL standards and stuff
like that.
So I know like, for example, the
returning clause is not supported
in, my understanding is not supported
in MySQL.
Nikolay: I just checked the docs
for MySQL, like latest version,
and they already started to support
skip locked, so.
Okay, yeah.
Andrew: Well, and it could be that
a framework that chooses to
offer conditional support, as long
as it's, of course, like,
you know, well tested and supported,
then that could actually
apply pressure to another open
source database to add support
for a SQL standard command, right?
It's like if this gains support,
probably not one framework like
Rails, but if everything Rails
and Laravel and Prisma and all
the language communities are all
clamoring for additional support.
Nikolay: You mentioned SQL standard,
but the problem here is
that SQL standard doesn't care
about performance at all.
Nothing about indexes and SKIP LOCKED is not there.
They don't care about performance.
And here we talk about performance,
right?
So it's not easy sometimes.
There should be some addition to
standard talking about performance.
But I don't see how it can happen.
Andrew: Yeah.
I think it's great that you've
provided that feedback, Nikolay,
into the community.
I think there is, to the earlier
question too, I think like Postgres
is going to, you know, has its core
set of objectives with each
release, you know, where the development
effort investment is
going and that sort of thing, which
is going to be a completely
different track than Ruby on Rails.
But like if we have some overlap,
you know, the folks that tend
to maintain the Postgres adapter
code are Postgres fans in the
Ruby on Rails community.
They happen to also write Ruby.
And it's great to have those kind
of overlapping folks.
There's kind of this discussion
around, I know we don't have,
we're out of time to get into this,
but there's an example that
comes to mind too of an in-clause
SQL query that has a large
list of values.
And then some folks notice that
from Ruby on Rails and on Postgres.
And that was one where I felt like,
I think Nikolay, we might have
both tweeted about this at some
point, but like there was kind
of this crossing the streams for
me where I saw people, Postgres
people I follow and Ruby on Rails
people, were like, you know,
we need to do something about this
and improve query performance
for these types of queries.
Nikolay: Right.
Also, .plug, or how is it called?
Plug, right?
This is like when you retrieve
everything, then you do something
like Postgres can do it much more
efficiently.
Don't do work instead of database
on application side, right?
But I guess it's a small thing,
any application language can
be used for this not efficient
approach.
What do you think about the future
of Ruby in general?
Andrew: I think it's achieved that
kind of cockroach status where
it just never goes away now.
I don't know.
There just seems to be enough,
not cockroach DB, but the actual
insect.
There's such a large amount of
companies that there's just a
lot of work to maintain the applications
if they're successful
that use Ruby on Rails.
Of course, the growth is not what
it was in the early days.
But there's a huge community of
Rails developers and there are
new people entering it.
So if someone enters it because
maybe it's like their bootcamp
program or they joined a company
that's using Rails, despite
being 20 years old as a framework
for Rails anyways.
First of all, I think Ruby is a
great language outside of Rails,
but despite being a 20 year old
framework, like there's a very
steady development clip.
There's new features being added
that are exciting.
I just think they tend to be more
front end oriented.
Like Ruby on Rails these days is
trying to capture more attention
from what might have been JavaScript
only apps before and offer
a lot of capabilities on as you're
building your web application
screens, like giving you a lot
of interactivity and low latency,
but doing it within Rails and kind
of getting back to some more
of the full stack kind of application
building capabilities.
But there are things happening
too on every release related to
Postgres and I certainly track
those and it still feels active,
just not where everyone's flocking
to.
Michael: I thought the active might
be a Rails joke.
Andrew: Oh yeah, it could be, yeah.
Michael: Yeah.
On a serious note though, and thanks
so much for all your time,
where can folks go to learn more?
Andrew: Yeah, If anyone's interested
in this topic area and they'd
like to explore the book that I
wrote, it's at pragprog.com.
That's the publishers Pragmatic
Programmers.
And I do a fair amount of blogging
on Postgres and Rails topics.
My blog is andyatkinson.com.
And I'm also on most of the social
media apps.
This year I'm going to be at Sin
City Ruby and PG Day Chicago,
as well, so probably more for Postgres
people.
If anyone's at PG Day Chicago,
I hope to make it to some more
Postgres events, but I love the
in-person events too and being
Able to meet more community members
and learn from other people.
Nikolay: And the book is called
High Performance PostgreSQL for
Rails, right?
Andrew: Yep, that's right.
Michael: Awesome work, awesome
publisher as well. And thanks
again.
Andrew: Thanks for having me.
A lot of fun, guys.
Thank you.