1
00:00:00,120 --> 00:00:03,270
Almost always testing some frontier model for these labs.

2
00:00:03,300 --> 00:00:04,320
Uh, using our evals.

3
00:00:04,320 --> 00:00:05,940
We kind of have the, one of the best evals in

4
00:00:05,940 --> 00:00:07,770
the industry when it comes to reasoning models.

5
00:00:07,770 --> 00:00:08,220
Given that

6
00:00:08,520 --> 00:00:11,340
code reviews are a reasoning-heavy use case, and also

7
00:00:11,340 --> 00:00:13,770
around the agentic flows because as you go deeper

8
00:00:13,770 --> 00:00:15,989
and deeper into agentic flows, your errors compound.

9
00:00:16,320 --> 00:00:19,560
So these long-horizon tasks, if you, like, go off track on the

10
00:00:19,560 --> 00:00:23,280
first step, you're gonna be way off in the like 10th step, right?

11
00:00:23,729 --> 00:00:25,860
So, so yeah, it's all about like evaluating.

12
00:00:25,860 --> 00:00:30,479
So we have a team that spends a lot of time looking at open source usage that we

13
00:00:30,479 --> 00:00:34,589
have, bringing those examples in to make sure we have the, the right evaluation

14
00:00:34,589 --> 00:00:37,830
framework to understand the behavior of these new models as they come out.

15
00:00:43,065 --> 00:00:44,625
Welcome to Screaming in the Cloud.

16
00:00:44,895 --> 00:00:45,855
I'm Corey Quinn.

17
00:00:46,125 --> 00:00:51,495
Today I'm joined by Harjot Gill, CEO and co-founder of Code Rabbit. Harjot,

18
00:00:51,555 --> 00:00:53,925
uh, you are now a three-time entrepreneur

19
00:00:54,195 --> 00:00:56,685
who went from Nutanix senior director

20
00:00:56,685 --> 00:00:59,295
to building what is now apparently the most

21
00:00:59,295 --> 00:01:02,145
installed AI application on GitHub and GitLab.

22
00:01:02,145 --> 00:01:04,245
If I'm reading this correctly. Uh, what

23
00:01:04,605 --> 00:01:07,455
made you leave a big, comfortable tech job to

24
00:01:07,455 --> 00:01:09,345
decide, you know what I really want to do next?

25
00:01:09,435 --> 00:01:10,095
Code review.

26
00:01:11,895 --> 00:01:13,185
Thanks for having me here, Corey.

27
00:01:13,185 --> 00:01:15,285
I mean, it's been a very interesting journey for me

28
00:01:15,435 --> 00:01:18,795
coding faster with AI tools, you need faster, better

29
00:01:18,795 --> 00:01:23,025
code reviews to keep up and to keep out the AI slop. Code

30
00:01:23,025 --> 00:01:27,375
Rabbit delivers senior-engineer-level code reviews on pull requests and right

31
00:01:27,375 --> 00:01:33,795
in your IDE, with less LGTM. Get context-aware feedback for every commit.

32
00:01:33,825 --> 00:01:37,305
No configuration needed and all programming languages you

33
00:01:37,305 --> 00:01:41,325
care to use are supported from individuals to enterprises.

34
00:01:41,325 --> 00:01:43,455
Join the millions who trust Code Rabbit to

35
00:01:43,455 --> 00:01:46,245
banish bugs and help ship better code faster.

36
00:01:46,515 --> 00:01:49,965
Get started at coderabbit.ai and instantly cut

37
00:01:49,965 --> 00:01:53,085
your code review time and bugs in half.

38
00:01:53,145 --> 00:01:54,135
So I've done these startups

39
00:01:54,135 --> 00:01:54,675
for a while.

40
00:01:54,675 --> 00:01:55,335
I actually,

41
00:01:56,075 --> 00:02:00,995
uh, started my first company back in 2015, um, out of my research that I was

42
00:02:00,995 --> 00:02:04,085
doing at University of Pennsylvania at that time, and that was called Netsil.

43
00:02:04,835 --> 00:02:06,455
And it was in the early time.

44
00:02:06,515 --> 00:02:09,005
I mean, you would remember that because this was the time of

45
00:02:09,065 --> 00:02:12,815
Docker, Kubernetes, and all these like microservices taking off, and

46
00:02:13,470 --> 00:02:17,010
we built like this product that was like, listen back backward net, like

47
00:02:17,010 --> 00:02:21,720
so we could understand the API calls and understand, uh, the performance.

48
00:02:21,780 --> 00:02:23,399
That was the first startup, which was acquired

49
00:02:23,399 --> 00:02:25,649
by Nutanix, and I was there for like a few years.

50
00:02:26,160 --> 00:02:27,840
Uh, never been like a big company person.

51
00:02:27,840 --> 00:02:30,209
So yeah, I mean, so we had to go there, integrate the product.

52
00:02:30,810 --> 00:02:32,670
Um, then I quit to start another startup,

53
00:02:32,670 --> 00:02:34,890
which was Flux Ninja, uh, which was in the

54
00:02:35,375 --> 00:02:36,635
reliability management space.

55
00:02:36,635 --> 00:02:39,935
And the idea was like, how do we go beyond observability

56
00:02:39,935 --> 00:02:42,635
into more controllability, prevent cascading failures

57
00:02:42,635 --> 00:02:44,795
and all these Black Friday outages and so on.

58
00:02:45,275 --> 00:02:45,455
Yeah.

59
00:02:45,455 --> 00:02:49,805
So been like a very interesting like, rate limiting and, um, load

60
00:02:49,805 --> 00:02:52,475
shedding kind of a solution where you could prioritize API calls.

61
00:02:52,475 --> 00:02:54,935
Now, unfortunately for us, I mean, that didn't go well.

62
00:02:54,935 --> 00:02:58,835
I mean, we were betting a lot on, um, service meshes to take off, and

63
00:02:58,835 --> 00:03:02,375
they never became like a mainstream tech, uh, from that point of view.

64
00:03:02,375 --> 00:03:04,115
And very interestingly, around that time,

65
00:03:04,850 --> 00:03:09,170
large language models took off, like generative AI, and

66
00:03:09,170 --> 00:03:11,510
like it started with like GitHub Copilot, and then ChatGPT

67
00:03:11,510 --> 00:03:17,510
came along, and this was even before, uh, there was GPT-4, and I was

68
00:03:17,510 --> 00:03:22,820
running this like team of like around 15 remote employees during COVID, and

69
00:03:23,095 --> 00:03:25,525
we were struggling to like ship code faster.

70
00:03:25,525 --> 00:03:28,765
Like one of the big bottlenecks was like, even though Copilot had come and

71
00:03:28,765 --> 00:03:32,695
people were coding, uh, even doing like small stacked pull requests and so

72
00:03:32,695 --> 00:03:36,475
on, but still the code reviews were like a very massive bottleneck, right?

73
00:03:36,834 --> 00:03:40,614
So I did this like weekend hackathon like project where we

74
00:03:40,614 --> 00:03:43,614
started using some of these language models to automate the

75
00:03:43,614 --> 00:03:47,095
code review process to find low-hanging issues that can go

76
00:03:47,095 --> 00:03:50,394
beyond simple linting issues that you find with existing tools.

77
00:03:50,775 --> 00:03:52,335
That was a very interesting outcome.

78
00:03:52,335 --> 00:03:55,005
Like we saw that this is a really good fit, and

79
00:03:55,005 --> 00:03:57,015
then Code Rabbit started as a separate company.

80
00:03:57,015 --> 00:03:57,855
It just took off.

81
00:03:58,665 --> 00:04:02,865
And it was very clear that, uh, we, we have to like just

82
00:04:02,865 --> 00:04:05,055
focus on that problem statement, and then Flux Ninja,

83
00:04:05,055 --> 00:04:08,445
my second startup, we basically folded into Code Rabbit, and

84
00:04:08,445 --> 00:04:12,195
that's what I then continued to go full time on, Code Rabbit.

85
00:04:12,435 --> 00:04:12,915
Yeah, go on.

86
00:04:13,785 --> 00:04:13,965
Yeah.

87
00:04:13,965 --> 00:04:17,985
Co Code Rabbit is, is interesting to me from the perspective of

88
00:04:18,015 --> 00:04:21,285
the, you're tackling something that historically has felt very

89
00:04:21,285 --> 00:04:24,435
boring and things that people tend to more or less phone in.

90
00:04:24,645 --> 00:04:28,095
I mean, code reviews have been around forever, but what did you

91
00:04:28,095 --> 00:04:32,085
see around its fundamental broken nature that made you think,

92
00:04:32,085 --> 00:04:34,545
ah, we could, we could pour some AI on this and make 'em better?

93
00:04:35,565 --> 00:04:37,305
It's broken in many, many ways.

94
00:04:37,305 --> 00:04:38,025
You won't believe it.

95
00:04:38,025 --> 00:04:39,000
Like not every company.

96
00:04:39,910 --> 00:04:42,190
I mean, everyone wants to do a code review process because

97
00:04:42,190 --> 00:04:44,770
of compliance reasons, and just because they want to prevent

98
00:04:44,770 --> 00:04:47,625
massive failures downstream, because each issue you catch

99
00:04:48,315 --> 00:04:50,115
early in the SDLC is much cheaper, right,

100
00:04:50,115 --> 00:04:52,155
than to catch it later in production.

101
00:04:52,155 --> 00:04:52,335
And

102
00:04:52,635 --> 00:04:55,905
I mean, even on my solo side projects where I'm developing things and merging

103
00:04:55,905 --> 00:04:59,565
them in, I still feel obligated to do some form of code review just because

104
00:04:59,565 --> 00:05:04,155
I style myself as kind of a grownup and I probably got everything right.

105
00:05:04,155 --> 00:05:06,405
Why would I even read this thing I'm about to merge in?

106
00:05:06,825 --> 00:05:08,325
I still feel obligated to do it.

107
00:05:08,325 --> 00:05:10,755
That doesn't mean I do it, but I feel bad when I don't.

108
00:05:11,625 --> 00:05:11,715
Right.

109
00:05:11,715 --> 00:05:14,895
I mean, and, and a lot of managers in these companies have that, like,

110
00:05:15,480 --> 00:05:17,550
guilty, uh, conscience, like, because of

111
00:05:17,550 --> 00:05:19,020
the, because they're not doing a good job.

112
00:05:19,020 --> 00:05:20,005
Because if you look at really,

113
00:05:21,000 --> 00:05:21,989
uh, good code reviews,

114
00:05:21,989 --> 00:05:24,840
they take as much time as it takes to actually write the code.

115
00:05:25,440 --> 00:05:28,020
But I mean, you actually have to really understand

116
00:05:28,020 --> 00:05:30,900
the context of these changes because software is complex.

117
00:05:30,900 --> 00:05:32,729
It breaks in very, very interesting ways.

118
00:05:33,000 --> 00:05:36,060
And code review is like the, the kind of the first kind of defense.

119
00:05:36,060 --> 00:05:39,270
I mean, of course you're doing more validation downstream, QA environments and

120
00:05:39,270 --> 00:05:42,570
so on, but code review is really necessary, but most companies don't do it well.

121
00:05:42,690 --> 00:05:42,929
Oh yeah.

122
00:05:42,929 --> 00:05:46,109
And AI seems to be way faster at not reading the code,

123
00:05:46,109 --> 00:05:50,140
typing LGTM as a comment, and then clicking the merge button.

124
00:05:50,140 --> 00:05:54,490
It feels like it could speed up that iterative cycle, which frankly is

125
00:05:54,490 --> 00:05:57,550
sort of the way that it seems like all large code reviews tend to go.

126
00:05:57,760 --> 00:06:00,910
I've, I've submitted three line diffs that have been nibbled

127
00:06:00,910 --> 00:06:04,540
to death by ducks as, as people talk about it, but that

128
00:06:04,540 --> 00:06:08,200
10,000-line diff of a refactor or a reformat or something gets

129
00:06:08,200 --> 00:06:11,710
the "looks good to me, ship, ship, squirrel," and off it goes.

130
00:06:11,740 --> 00:06:13,930
It's, there's a human nature element to play here.

131
00:06:13,990 --> 00:06:14,410
You are right.

132
00:06:14,410 --> 00:06:17,440
I mean, when the PRs are small, yes, people can still have a

133
00:06:17,440 --> 00:06:20,800
cognitive, like, can, can, can cognitively look at them, but

134
00:06:20,800 --> 00:06:23,290
when they become like beyond a certain threshold, that's the

135
00:06:23,290 --> 00:06:25,840
point where you say, okay, rubber stamp it, like ship it.

136
00:06:26,140 --> 00:06:26,590
I don't care.

137
00:06:26,620 --> 00:06:27,700
I can't just go through it.

138
00:06:28,120 --> 00:06:29,260
And then like in some

139
00:06:29,325 --> 00:06:31,965
companies, you have like the ego clashes, like people actually do a very

140
00:06:31,965 --> 00:06:34,604
thorough job or too thorough and many times, and you have like so much back

141
00:06:34,604 --> 00:06:38,145
and forth for days and weeks at a stretch, and things don't move at all.

142
00:06:38,655 --> 00:06:38,895
Right.

143
00:06:38,895 --> 00:06:41,835
So, so yeah, I mean the code reviews can get ugly in many ways.

144
00:06:41,835 --> 00:06:44,145
Um, most of the times it's rubber stamps, as you said, but in some

145
00:06:44,145 --> 00:06:47,145
cases they can also be very toxic and not pleasant experience.

146
00:06:47,145 --> 00:06:47,294
Right.

147
00:06:47,294 --> 00:06:48,015
In many ways.

148
00:06:48,855 --> 00:06:50,594
Show me how your company reviews code.

149
00:06:50,594 --> 00:06:52,215
I can show you aspects of its culture

150
00:06:53,805 --> 00:06:53,895
then.

151
00:06:53,895 --> 00:06:54,344
That's right.

152
00:06:54,495 --> 00:06:54,914
That's right.

153
00:06:54,914 --> 00:06:55,065
Yeah.

154
00:06:55,950 --> 00:06:59,400
So, uh, I guess a, a question I have is around what, I guess I'll term

155
00:06:59,400 --> 00:07:03,060
a second order effect of a lot of this AI generated code proliferation.

156
00:07:03,330 --> 00:07:06,750
Uh, now it seems like forget merging code in that I haven't

157
00:07:06,750 --> 00:07:09,719
read, in some cases I'm merging code in, I haven't written.

158
00:07:09,870 --> 00:07:11,059
It feels like that is, that is, that is

159
00:07:11,059 --> 00:07:13,590
increasing the burden on the code review process.

160
00:07:14,099 --> 00:07:14,310
Right?

161
00:07:14,310 --> 00:07:18,780
So one of the biggest killer applications for generative AI has been coding.

162
00:07:19,859 --> 00:07:21,270
I mean, that's probably the only thing that's

163
00:07:21,270 --> 00:07:22,830
working if you really think about it, right?

164
00:07:23,969 --> 00:07:27,570
Oh, it, it is the standout breakout success of generative AI.

165
00:07:27,570 --> 00:07:31,859
Everything else is trailing behind, but it turns out that there's

166
00:07:31,859 --> 00:07:37,080
a lot of replacement of Stack Overflow that AI code generation can do.

167
00:07:37,890 --> 00:07:38,460
That's right.

168
00:07:38,490 --> 00:07:39,030
That is right.

169
00:07:39,030 --> 00:07:41,549
I mean, and it's getting more and more sophisticated.

170
00:07:41,549 --> 00:07:42,780
Now you have all these coding agents

171
00:07:43,755 --> 00:07:46,034
which have come on the scene. Like Claude Code, for instance,

172
00:07:46,034 --> 00:07:48,164
you're starting with a prompt and you're getting large scale

173
00:07:48,164 --> 00:07:51,765
coding, coding changes done, um, in a few minutes, right?

174
00:07:52,890 --> 00:07:55,680
A lot of these vibe coders or the junior coders in, in every

175
00:07:55,680 --> 00:07:57,990
organization, they tend to like throw this code over the

176
00:07:57,990 --> 00:08:00,570
wall and then it's someone else's headache to review it.

177
00:08:00,570 --> 00:08:03,090
Especially the senior developers and now they

178
00:08:03,090 --> 00:08:06,600
have like a huge bottleneck of pull requests.

179
00:08:07,145 --> 00:08:08,525
They have to now piece together their

180
00:08:08,525 --> 00:08:10,025
puzzle like on what actually happened there.

181
00:08:11,044 --> 00:08:12,184
And, and it's a nightmare.

182
00:08:12,244 --> 00:08:14,854
I mean, the backlogs are now increasing because of generative AI.

183
00:08:14,854 --> 00:08:17,974
A lot of vibe-coded PRs are being opened against open source projects.

184
00:08:17,974 --> 00:08:19,354
It's also a maintenance nightmare.

185
00:08:19,684 --> 00:08:22,385
I don't know whether you've seen a lot of tweets around, uh, open source

186
00:08:22,385 --> 00:08:26,599
projects where they're getting these contributions from random developers across

187
00:08:26,599 --> 00:08:31,205
the world with like 10, 10,000-line PRs, hundreds of files changed and so on.

188
00:08:31,684 --> 00:08:34,174
They seem like good features on the surface, but once

189
00:08:34,174 --> 00:08:36,454
you start digging deeper, they're like rife with issues.

190
00:08:36,944 --> 00:08:39,974
And that's where I think it's becoming unsustainable.

191
00:08:39,974 --> 00:08:43,304
Like, and you're getting AI, uh, it's like an air battle.

192
00:08:43,304 --> 00:08:45,015
Like earlier you were fighting a tank battle, but

193
00:08:45,015 --> 00:08:46,875
now you're like in a very different battleground.

194
00:08:46,875 --> 00:08:50,204
Like you have to bring AI to fight AI in, in many ways.

195
00:08:50,505 --> 00:08:51,599
Or AI to review AI.

196
00:08:51,704 --> 00:08:52,005
Right?

197
00:08:53,025 --> 00:08:54,344
Very much so.

198
00:08:54,375 --> 00:08:59,084
It's, it's one of these areas where it just seems that there's so much.

199
00:08:59,745 --> 00:09:02,835
I guess nonsense code being thrown out.

200
00:09:02,835 --> 00:09:04,545
I look at the stuff in my own code base

201
00:09:04,545 --> 00:09:06,405
after I let AI take a few passes through it.

202
00:09:06,465 --> 00:09:08,415
And like the amount of dead code that'll never

203
00:09:08,415 --> 00:09:11,325
be reached through any code path is astonishing.

204
00:09:11,565 --> 00:09:15,075
Uh, five different versions of documentation, all self-contradictory.

205
00:09:15,315 --> 00:09:17,925
It becomes an absolute mess.

206
00:09:18,195 --> 00:09:18,285
Uh.

207
00:09:19,035 --> 00:09:21,314
The counterpoint though that I have is that this is not

208
00:09:21,314 --> 00:09:24,285
the first attempt at solving the code review problem.

209
00:09:24,285 --> 00:09:27,135
There have been a bunch of traditional tools aimed at this before the rise

210
00:09:27,135 --> 00:09:30,165
of ai, and the biggest problem that you had there was false positives.

211
00:09:31,305 --> 00:09:31,785
That's right.

212
00:09:31,785 --> 00:09:34,155
I mean, if you look at the traditional tools, like pre-gen

213
00:09:34,155 --> 00:09:38,385
AI, they were like mostly like based on, um, static rules.

214
00:09:38,385 --> 00:09:41,175
Some sort of like, uh, like if you look at the security

215
00:09:41,175 --> 00:09:43,574
scanning market, the SAST market, like you probably

216
00:09:44,204 --> 00:09:48,194
know some of those companies, uh, they are looking at like OWASP vulnerabilities.

217
00:09:48,194 --> 00:09:52,844
They have signatures to discover those, uh, deficiencies.

218
00:09:52,844 --> 00:09:53,145
Right?

219
00:09:53,685 --> 00:09:57,074
And every company, like when they enable these tools, they end up in

220
00:09:57,074 --> 00:10:00,735
the, in the, a lot of alerts, like alert fatigue is a problem and you

221
00:10:00,735 --> 00:10:03,704
end up like switching off a lot of these linting rules so that you can still

222
00:10:03,704 --> 00:10:07,005
move faster without like making sure you're covering all kind of like.

223
00:10:07,160 --> 00:10:10,010
All these like clean code guidelines that you could have, right?

224
00:10:10,579 --> 00:10:11,089
Um, yeah.

225
00:10:11,089 --> 00:10:14,120
So that has been like a traditional problem with the tools in this space.

226
00:10:14,120 --> 00:10:16,280
I mean, all the way from SonarSource and

227
00:10:16,670 --> 00:10:18,949
um, uh, Semgreps of the world and so on.

228
00:10:19,430 --> 00:10:22,339
And when it comes to ai, like one of the nice things is you can tune it.

229
00:10:22,339 --> 00:10:23,689
There are like very interesting knobs

230
00:10:23,775 --> 00:10:24,555
possible here.

231
00:10:24,555 --> 00:10:27,824
And also the kind of insights you're getting are more human-like feedback.

232
00:10:28,245 --> 00:10:30,824
You're not going and nitpicking on some signature.

233
00:10:30,824 --> 00:10:32,475
You're talking about the best practices.

234
00:10:32,475 --> 00:10:33,885
As you said, it's a Stack Overflow.

235
00:10:33,885 --> 00:10:37,095
You're taking this, all these best practice examples that these models have

236
00:10:37,095 --> 00:10:41,805
been trained on and you're bringing in to the, to the more of an average

237
00:10:41,805 --> 00:10:44,925
engineer, and anyone, like, anywhere in the world now has access to the

238
00:10:44,925 --> 00:10:48,340
best practices, uh, which they otherwise don't without proper mentorship.

239
00:10:49,335 --> 00:10:52,545
I I, I wanna dive a little bit into, I, I guess

240
00:10:52,545 --> 00:10:55,095
the cloud architecture piece of a lot of this.

241
00:10:55,095 --> 00:10:56,655
You've been fairly open on your engineering

242
00:10:56,655 --> 00:10:58,245
blog about how your infrastructure works.

243
00:10:58,250 --> 00:11:02,115
You, you use one of my favorite Google Cloud services of all things, uh,

244
00:11:02,115 --> 00:11:08,775
Google Cloud Run to effectively execute what amounts to untrusted code at

245
00:11:08,775 --> 00:11:13,665
significant scale, and that is both brilliant and terrifying at the same time.

246
00:11:14,115 --> 00:11:15,340
Uh, can you talk me through that decision?

247
00:11:16,410 --> 00:11:20,520
Yeah, we were like one of the first companies to build agent workflows and

248
00:11:20,520 --> 00:11:23,910
also like engineer it to a point where it's very cost effectively done.

249
00:11:24,449 --> 00:11:27,959
So one of the things we found with building these agents is that

250
00:11:28,350 --> 00:11:32,219
no matter how much context engineering you do, it's never enough.

251
00:11:32,219 --> 00:11:34,350
So a lot of the companies started with RAG lookups.

252
00:11:34,350 --> 00:11:37,319
They will index the code base and they'll bring in all

253
00:11:37,319 --> 00:11:39,750
this context and then ask the model a question around

254
00:11:39,750 --> 00:11:41,939
whether this code that you're reviewing looks good or not.

255
00:11:42,689 --> 00:11:43,800
But we found that,

256
00:11:44,610 --> 00:11:46,560
very often, this context is never enough.

257
00:11:46,560 --> 00:11:50,400
It can never be enough given the complex code bases in the real world. That's

258
00:11:50,400 --> 00:11:54,805
where you wanna like give this model some sense of agency to go and explore.

259
00:11:55,830 --> 00:11:58,950
For example, Code Rabbit has like a multipass system.

260
00:11:58,950 --> 00:12:02,730
The first pass of the review, it does raise some concerns, and now,

261
00:12:02,730 --> 00:12:05,370
a lot of the times, these are false positives.

262
00:12:05,370 --> 00:12:06,480
They're not valid findings.

263
00:12:06,900 --> 00:12:09,150
So what we do in the sandboxes that we create

264
00:12:09,150 --> 00:12:10,950
in Google Cloud, we use Google Cloud Run.

265
00:12:10,950 --> 00:12:12,930
It's a really great service for serverless.

266
00:12:13,380 --> 00:12:16,140
So we create this like ephemeral environments where we kind of,

267
00:12:16,439 --> 00:12:18,930
the pull request we are reviewing, we not just look at that

268
00:12:18,930 --> 00:12:21,870
pull request, we bring in the entire code into the sandbox,

269
00:12:22,290 --> 00:12:24,810
and then we are letting the models actually navigate the code.

270
00:12:25,215 --> 00:12:27,555
Like a human does, but using CLI commands.

271
00:12:27,555 --> 00:12:30,225
And that was the other innovation, like we're, uh, generating a lot

272
00:12:30,225 --> 00:12:34,215
of shell scripts, like we are letting the models like run queries.

273
00:12:34,515 --> 00:12:36,945
Uh, we are letting the models do cat commands to read

274
00:12:36,945 --> 00:12:39,555
files based on the concerns they see in the code.

275
00:12:39,555 --> 00:12:41,475
And they will, they navigate the code and once they bring

276
00:12:41,475 --> 00:12:44,025
in this additional context, that's where they're either

277
00:12:44,505 --> 00:12:46,845
able to suppress some of these like false positives.

278
00:12:46,845 --> 00:12:49,125
And in many cases we are able to find issues which are

279
00:12:49,125 --> 00:12:52,275
ripple effects in the call graph across multiple files.

280
00:12:52,950 --> 00:12:54,750
That's what makes Code Rabbit really good, by the way.

281
00:12:54,900 --> 00:12:55,050
Yeah.

282
00:12:55,650 --> 00:12:55,890
Yeah.

283
00:12:55,980 --> 00:12:58,260
How do you wind up, I guess, drawing the line?

284
00:12:58,350 --> 00:13:01,590
Because I, I found that one of the big challenges you have with a

285
00:13:01,590 --> 00:13:07,025
lot of these LLM powered tools is they are anxious to please the, uh.

286
00:13:07,620 --> 00:13:09,990
When you say, great, find issues in this code.

287
00:13:09,990 --> 00:13:12,480
Look, they're, they're not gonna say, Nope, looks good to me.

288
00:13:12,660 --> 00:13:16,500
They're going to find something to quibble about, like some obnoxious

289
00:13:16,500 --> 00:13:19,380
senior engineers we've all worked with in the course of our careers.

290
00:13:19,680 --> 00:13:23,459
Uh, how do you at some point say, okay, anything, uh, nothing's ever perfect.

291
00:13:23,459 --> 00:13:27,150
At some point you're just quibbling over, uh, stylistic choices.

292
00:13:27,510 --> 00:13:28,199
This is good.

293
00:13:28,199 --> 00:13:28,650
Ship it.

294
00:13:28,680 --> 00:13:30,599
How do you, how do you make it draw that line?

295
00:13:31,110 --> 00:13:32,010
That was the hardest

296
00:13:32,010 --> 00:13:32,310
part.

297
00:13:32,314 --> 00:13:32,574
I'll tell you.

298
00:13:33,310 --> 00:13:35,290
So when we started, like, I mean, you are right.

299
00:13:35,319 --> 00:13:36,969
I mean, if you ask a model to find something

300
00:13:36,969 --> 00:13:38,860
wrong, it will find something wrong in it.

301
00:13:38,860 --> 00:13:40,449
Like 10 things, 15 things.

302
00:13:40,479 --> 00:13:42,069
It'll almost always please you, right?

303
00:13:42,610 --> 00:13:46,870
And the hard part is like how do you drive, draw the right balance?

304
00:13:46,870 --> 00:13:49,540
Like in classifying a lot of this output and

305
00:13:49,540 --> 00:13:51,939
seeing what's important enough for the user.

306
00:13:52,185 --> 00:13:53,415
To pay attention to.

307
00:13:53,415 --> 00:13:53,685
Right.

308
00:13:54,135 --> 00:13:57,015
Took a lot of trial and error, like early days when we launched the product.

309
00:13:57,015 --> 00:14:01,604
We still had like a lot of cancellations because of noise and, uh, feedback

310
00:14:01,604 --> 00:14:04,844
that was too nitpicky and it took a while to like learn and figure out the

311
00:14:04,844 --> 00:14:08,775
right balance when it comes to the quality of feedback and what's acceptable.

312
00:14:09,330 --> 00:14:10,860
Um, and, and it was a long battle.

313
00:14:10,860 --> 00:14:14,640
I can tell you like a lot of our engineering actually went into taming these

314
00:14:14,640 --> 00:14:19,020
models to a, to point, to a point where we can make majority of the users happy.

315
00:14:19,020 --> 00:14:20,100
We can't make everyone happy.

316
00:14:20,100 --> 00:14:22,650
This is the nature of this product, that it's non-deterministic.

317
00:14:23,175 --> 00:14:25,275
Um, um, they're not like the previous generation of

318
00:14:25,275 --> 00:14:27,585
the systems where you can deterministically define the

319
00:14:27,585 --> 00:14:30,975
rules, but this one's like, like very vibe feedback.

320
00:14:31,275 --> 00:14:33,825
We are vibing, uh, as they say, right?

321
00:14:34,095 --> 00:14:34,545
Vibe check.

322
00:14:34,695 --> 00:14:34,935
Yeah.

323
00:14:35,835 --> 00:14:36,105
Yeah.

324
00:14:36,165 --> 00:14:38,985
There's a, there's a lot of.

325
00:14:40,350 --> 00:14:42,570
I guess nuance in a lot of these things and the space is

326
00:14:42,570 --> 00:14:45,750
moving so quickly that it's basically impossible to keep up.

327
00:14:46,080 --> 00:14:49,410
Uh, you have, I believe, standardized around Claude's models

328
00:14:49,410 --> 00:14:53,010
for a lot of this, uh, 20 minutes before this recording.

329
00:14:53,010 --> 00:14:55,800
If people wanna pinpoint the, pinpoint this in time, uh,

330
00:14:56,100 --> 00:14:59,130
uh, Anthropic came out with a surprise release of Opus 4.1.

331
00:14:59,310 --> 00:15:02,610
So if we had recorded this yesterday and said Opus 4 was their premier

332
00:15:02,610 --> 00:15:06,450
model, that would now be inaccurate even in a short timeline like this.

333
00:15:06,840 --> 00:15:10,230
How do you, I guess, continue to hit the, to

334
00:15:10,230 --> 00:15:12,270
hit the moving target that is state of the art?

335
00:15:12,960 --> 00:15:14,040
Um, that's a great question.

336
00:15:14,040 --> 00:15:16,800
First of all, we use both OpenAI and Anthropic models.

337
00:15:16,800 --> 00:15:20,670
In fact, like our OpenAI token usage might be even more than that

338
00:15:21,090 --> 00:15:23,640
on the Anthropic side. We use like six or seven

339
00:15:23,640 --> 00:15:25,980
models under the hood. Like one of the nice things about

340
00:15:25,980 --> 00:15:28,560
Code Rabbit as a product has been, it's not a chat-based product.

341
00:15:28,560 --> 00:15:32,310
Every product in the AI space starts with some sort of a user input.

342
00:15:32,910 --> 00:15:35,100
Code Rabbit is like zero activation energy.

343
00:15:35,100 --> 00:15:38,550
You open up a pull request, it kicks off a workflow, and it's a long-horizon

344
00:15:38,550 --> 00:15:40,860
workflow, takes like a few minutes to run,

345
00:15:40,950 --> 00:15:41,730
which is genius.

346
00:15:41,730 --> 00:15:43,920
Chatbots are a lazy interface to be direct.

347
00:15:43,950 --> 00:15:46,920
It's everyone tends to go for that 'cause it's the low hanging fruit.

348
00:15:46,920 --> 00:15:49,050
But if I have to interface with a chatbot, it's

349
00:15:49,050 --> 00:15:51,090
not discoverable what it's capable of doing.

350
00:15:51,300 --> 00:15:53,880
And if I look at it in a traditional website, that already

351
00:15:53,880 --> 00:15:56,670
means on some level your website has failed in all likelihood.

352
00:15:56,970 --> 00:15:57,270
Yeah.

353
00:15:57,390 --> 00:15:57,600
Yeah.

354
00:15:57,600 --> 00:15:58,860
In a way like it's like, um.

355
00:15:59,715 --> 00:16:01,845
Zero activation energy kind of a system.

356
00:16:01,845 --> 00:16:03,555
Like you don't have to remember to use it, right?

357
00:16:03,825 --> 00:16:06,405
But that brings in like very interesting thing, like first of all, it's a long

358
00:16:06,405 --> 00:16:11,085
running workflow with an ensemble of multiple models, and evaluations become like

359
00:16:11,085 --> 00:16:14,985
the key thing, like the nature of these products is it's all about evaluations.

360
00:16:14,985 --> 00:16:16,875
We are not like training our own foundational models.

361
00:16:17,025 --> 00:16:17,925
It's not in our budget.

362
00:16:18,465 --> 00:16:18,675
Right?

363
00:16:18,975 --> 00:16:19,740
So what we do is like.

364
00:16:20,100 --> 00:16:23,580
Make sure that at least we have some sort of a sense into

365
00:16:23,580 --> 00:16:27,360
understanding these models and the behavior and tracking, um,

366
00:16:27,390 --> 00:16:30,720
their progress across different generations that we are seeing.

367
00:16:30,720 --> 00:16:30,930
Right?

368
00:16:30,930 --> 00:16:34,110
So we are actually, uh, almost always testing some

369
00:16:34,110 --> 00:16:36,750
frontier model for these labs, uh, using our evals.

370
00:16:36,750 --> 00:16:38,460
We kind of have the, one of the best evals in the

371
00:16:38,490 --> 00:16:40,680
industry when it comes to reasoning models, given that

372
00:16:40,935 --> 00:16:43,785
code reviews are a reasoning-heavy use case, and also

373
00:16:43,785 --> 00:16:46,155
around the agentic flows because as you go deeper

374
00:16:46,155 --> 00:16:48,405
and deeper into agentic flows, your errors compound.

375
00:16:48,735 --> 00:16:51,975
So these long horizon tasks, if you, like, go off track on the

376
00:16:51,975 --> 00:16:55,695
first step, you're gonna be way off in the like 10th step, right?

377
00:16:56,145 --> 00:16:58,275
So, so yeah, it's all about like evaluating.

378
00:16:58,275 --> 00:17:02,895
So we have a team that spends a lot of time looking at open source usage that we

379
00:17:02,895 --> 00:17:07,005
have, bringing those examples in to make sure we have the, the right evaluation

380
00:17:07,005 --> 00:17:10,245
framework to understand the behavior of these new models as they come out.

381
00:17:10,645 --> 00:17:12,415
Stuck in code review limbo?

382
00:17:12,505 --> 00:17:16,105
Get Code Rabbit and never wait on some guy named Mike again.

383
00:17:16,405 --> 00:17:20,724
Swap the drama for multilayered Context Engineering that suggests

384
00:17:20,724 --> 00:17:25,105
one click fixes and explains the reasoning behind the suggestions.

385
00:17:25,375 --> 00:17:29,065
Code Rabbit integrates with your Git workflows and favorite toolchains.

386
00:17:29,125 --> 00:17:31,255
Supports all programming languages you're

387
00:17:31,255 --> 00:17:34,465
likely to use and doesn't need configuration.

388
00:17:34,765 --> 00:17:38,275
Get code reviews when you need them at coderabbit.ai.

389
00:17:38,600 --> 00:17:44,090
And get them for free on open source projects and on code reviews in your IDE.

390
00:17:44,360 --> 00:17:47,240
So this might be a politically charged

391
00:17:47,240 --> 00:17:48,980
question, but we're gonna run with it anyway.

392
00:17:49,100 --> 00:17:54,500
Why did you pick Google Cloud as your infrastructure provider of choice?

393
00:17:54,500 --> 00:17:55,879
I mean, well, why not Azure?

394
00:17:55,879 --> 00:17:59,090
I, I can answer that easily, but AWS is, is a viable,

395
00:17:59,240 --> 00:18:01,520
is at least a viable contender for this sort of thing.

396
00:18:01,909 --> 00:18:03,740
Uh, I, I have my own suspicions, but I'm

397
00:18:03,740 --> 00:18:05,389
curious to hear what your reasoning was.

398
00:18:05,945 --> 00:18:07,294
We love the Cloud Run product.

399
00:18:07,294 --> 00:18:08,825
I think that was one of the big drivers.

400
00:18:08,825 --> 00:18:10,294
The whole Cloud Run thing is amazing.

401
00:18:10,294 --> 00:18:13,504
Like that was one of the reasons, it saved us so much time

402
00:18:13,534 --> 00:18:17,225
and costs in operating, like this whole serverless thing.

403
00:18:17,225 --> 00:18:17,465
Right.

404
00:18:18,065 --> 00:18:20,405
And also like in the previous startups we have gone with Google

405
00:18:20,405 --> 00:18:24,274
Cloud, like the interface, the, it, it, it's like, uh, um, like

406
00:18:24,274 --> 00:18:26,675
Amazon is great, but that's like one of the first cloud services.

407
00:18:26,705 --> 00:18:28,235
It's, it can get very overwhelming

408
00:18:28,485 --> 00:18:30,735
to a lot of people, but GCP is like much

409
00:18:30,735 --> 00:18:32,655
cleaner in our opinion. Cost-wise as well.

410
00:18:32,655 --> 00:18:33,764
It's been very effective.

411
00:18:34,215 --> 00:18:37,064
Uh, in terms of, um, yeah, for start, I think

412
00:18:37,064 --> 00:18:38,594
a lot of startups do build on Google Cloud.

413
00:18:39,104 --> 00:18:42,314
Yeah, it's, it's one of those areas where if I were, and I've said this

414
00:18:42,314 --> 00:18:45,135
before, that if I were starting from zero today, I would pick Google

415
00:18:45,135 --> 00:18:48,675
Cloud over AWS just because the developer experience is superior.

416
00:18:49,004 --> 00:18:51,375
Cloud Run is no exception to this.

417
00:18:51,375 --> 00:18:52,514
It's one of those.

418
00:18:52,910 --> 00:18:57,980
Dead simple services that basically works magic, as best I can tell.

419
00:18:58,129 --> 00:18:59,540
It feels like it's what Lambda should have been.

420
00:18:59,960 --> 00:19:01,160
Yeah, I mean it's amazing, right?

421
00:19:01,160 --> 00:19:03,410
I mean you can, it's all container based, which we love.

422
00:19:03,410 --> 00:19:05,660
Like when scaling is so straightforward once you understand

423
00:19:05,660 --> 00:19:07,910
the model there, like it's not like based on just resources,

424
00:19:07,940 --> 00:19:10,245
like the knobs there are very like makes so much logical sense.

425
00:19:10,985 --> 00:19:12,995
Once you get to understand them, they're much simpler.

426
00:19:13,265 --> 00:19:16,385
How do you handle the, the sandboxing approach to this?

427
00:19:16,385 --> 00:19:21,125
By, by which I mean that it has become increasingly apparent that it is almost

428
00:19:21,125 --> 00:19:25,925
impossible to separate out the prompt from the rest of the context in some cases.

429
00:19:26,195 --> 00:19:30,185
So it, it seems like it would not be that, uh, out of the realm of

430
00:19:30,185 --> 00:19:33,575
possibility for someone to say, yeah, disregard previous instructions, do

431
00:19:33,575 --> 00:19:37,190
this other thing instead, especially in a sandbox that's able to run code.

432
00:19:38,195 --> 00:19:38,885
Yeah, you're right.

433
00:19:38,945 --> 00:19:44,555
We are running like, um, untrusted code and some of this like chat interface.

434
00:19:44,555 --> 00:19:46,925
You could actually steer the system to generate any

435
00:19:46,925 --> 00:19:50,405
malicious shell scripts or Python code in that environment.

436
00:19:50,405 --> 00:19:50,645
Right.

437
00:19:51,365 --> 00:19:53,855
Um, and we do have internet access enabled as well, right?

438
00:19:53,855 --> 00:19:56,765
So it's all about like locking down, making sure you have

439
00:19:56,765 --> 00:19:58,620
cgroups and all set up so they, you're not like

440
00:19:59,230 --> 00:20:02,110
Um, escaping the sandbox that we have created.

441
00:20:02,530 --> 00:20:05,590
And the other part is like locking down the access to internal services.

442
00:20:05,590 --> 00:20:07,540
You don't want that sandbox to access the

443
00:20:07,540 --> 00:20:09,760
metadata service of these cloud providers, right?

444
00:20:10,030 --> 00:20:12,520
So yeah, certain like standard stuff comes around like network,

445
00:20:12,970 --> 00:20:16,750
um, network zoning and, um, the, the, uh, cgroups and all.

446
00:20:16,810 --> 00:20:19,780
Um, and the other part is like we allow internet access.

447
00:20:19,780 --> 00:20:20,620
I think that's something we.

448
00:20:21,480 --> 00:20:22,860
We disallow everything which is in the, in

449
00:20:22,860 --> 00:20:25,110
the GCP VPC, but allow internet access.

450
00:20:25,620 --> 00:20:28,919
Um, at the same time, we have protections around resource utilization and

451
00:20:28,919 --> 00:20:32,845
killing those malicious processes, uh, uh, scripts that could be running.

452
00:20:33,870 --> 00:20:37,800
It's that that is always one of those weird challenges.

453
00:20:37,830 --> 00:20:42,090
Uh, taking a more, a more mundane challenge that I have is often code bases

454
00:20:42,090 --> 00:20:45,690
tend to sprawl as soon as they become capable of doing non-trivial things.

455
00:20:45,960 --> 00:20:50,250
And some of us don't bound our pull requests in reasonable ways.

456
00:20:50,580 --> 00:20:54,180
How do you wind up fitting, getting meaningful code review?

457
00:20:54,725 --> 00:20:58,595
Either in a giant monolithic repo that will far exceed the context

458
00:20:58,595 --> 00:21:03,365
window, or, counterpoint, within a microservice where 90% of the

459
00:21:03,365 --> 00:21:07,054
application is other microservices that are out of scope for this.

460
00:21:07,445 --> 00:21:09,905
How do you effectively get code review on something like that?

461
00:21:10,115 --> 00:21:10,955
That's a good question.

462
00:21:10,955 --> 00:21:13,205
There's a term for this called context engineering and, and.

463
00:21:13,695 --> 00:21:15,855
What we do actually, that's the best way to describe it,

464
00:21:15,855 --> 00:21:19,485
like it all starts with building some sort of code intelligence.

465
00:21:19,485 --> 00:21:22,514
Let's say you are like reviewing five or 10 files, but you

466
00:21:22,514 --> 00:21:25,125
have a like large code base where those files got changed.

467
00:21:25,665 --> 00:21:29,024
The first part of the process is like building some sort of a code graph

468
00:21:29,534 --> 00:21:33,405
because, unlike Cursor, which kind of uses, in the code completion

469
00:21:33,405 --> 00:21:36,764
tools, they use like code index, which is more on similarity search.

470
00:21:37,034 --> 00:21:40,065
And that works great for their use case because they mainly need

471
00:21:40,065 --> 00:21:42,705
to follow the style of existing code when generating new code.

472
00:21:43,304 --> 00:21:47,115
In our case, like the code review is a very reasoning intensive workflow.

473
00:21:47,115 --> 00:21:51,314
Like we, if we are bringing in these definitions from the other

474
00:21:51,314 --> 00:21:54,254
part of the code, they have to be in the call path, right?

475
00:21:54,254 --> 00:21:56,294
So that's why we build a relationship graph, which

476
00:21:56,294 --> 00:21:58,304
is a very different technology that we invested in.

477
00:21:59,250 --> 00:22:02,100
That brings in the right context as a starting point.

478
00:22:02,100 --> 00:22:04,290
As I said, it's still a starting point, like you still have to do the

479
00:22:04,290 --> 00:22:07,620
agentic loop after that, but the starting point has to be good so that

480
00:22:07,620 --> 00:22:12,060
your first pass of the review has some bearing on where to poke holes.

481
00:22:12,450 --> 00:22:15,300
Yeah, you're gonna like first raise 20, 30 concerns

482
00:22:15,300 --> 00:22:17,370
and you're only gonna start digging deeper on those.

483
00:22:17,430 --> 00:22:20,910
Choose your own adventure kind of, um, um, routes.

484
00:22:20,970 --> 00:22:22,920
Uh, and some of these will lead to real bugs.

485
00:22:23,130 --> 00:22:24,180
It's not deterministic.

486
00:22:24,180 --> 00:22:25,770
I mean, each run would look different.

487
00:22:26,635 --> 00:22:28,795
It's like, just like humans. Humans

488
00:22:28,795 --> 00:22:30,235
will like miss stuff a lot of the times.

489
00:22:30,235 --> 00:22:34,735
But now in this case, AI is still doing a really good job in poking holes

490
00:22:34,795 --> 00:22:38,035
at where it feels the deficiencies could be there given the code changes.

491
00:22:38,365 --> 00:22:40,255
But it starts with the initial context you

492
00:22:40,255 --> 00:22:42,145
have to bring in, like code graph, learnings.

493
00:22:42,475 --> 00:22:43,795
So we have a long-term memories feature.

494
00:22:43,795 --> 00:22:46,885
So each time a developer can teach Code Rabbit, it learns and

495
00:22:46,885 --> 00:22:49,435
it gets better over time because those learnings are then used

496
00:22:49,765 --> 00:22:52,165
to improve reviews for all the other developers in the company.

497
00:22:52,165 --> 00:22:53,335
It's a multiplayer system.

498
00:22:53,855 --> 00:22:54,095
Right.

499
00:22:54,095 --> 00:22:56,885
So we are also bringing in context from existing

500
00:22:56,885 --> 00:22:59,885
comments on a PR, some of the linting tools, CI/CD.

501
00:23:00,305 --> 00:23:03,125
So there are like 10 or 15 places you're bringing the context from.

502
00:23:03,125 --> 00:23:05,555
The most impactful is usually the code graph.

503
00:23:06,065 --> 00:23:08,885
I, I, I want to explore a bit of the business piece of it.

504
00:23:09,090 --> 00:23:12,510
If that's all right with you, uh, you've taken a somewhat familiar

505
00:23:12,540 --> 00:23:16,380
GitHub style model of free for public repos, paid for private ones.

506
00:23:16,380 --> 00:23:17,070
By and large.

507
00:23:17,070 --> 00:23:19,410
There's a, there's a significant free tier that you offer,

508
00:23:19,710 --> 00:23:22,530
uh, and you're also, to my understanding, free in VS

509
00:23:22,530 --> 00:23:25,680
Code, Cursor, Windsurf, as long as that lasts, et cetera.

510
00:23:26,160 --> 00:23:27,600
How do the economics of this work?

511
00:23:27,990 --> 00:23:28,980
That's a really great question.

512
00:23:28,980 --> 00:23:30,930
Like when we started this business, like one of

513
00:23:30,930 --> 00:23:32,940
the things we realized is that it's a habit change.

514
00:23:33,000 --> 00:23:34,020
We are trying to make people.

515
00:23:34,340 --> 00:23:36,290
Adopt this new habit, like this AI thing.

516
00:23:36,320 --> 00:23:37,879
Most people don't want to use it.

517
00:23:38,449 --> 00:23:41,060
Like, I mean, you are trying to bring AI experiences

518
00:23:41,060 --> 00:23:43,340
into existing workflow and universally everyone hates it.

519
00:23:43,340 --> 00:23:43,490
Now.

520
00:23:43,490 --> 00:23:46,340
Code Rabbit has been lucky in that regard, that we brought in a very

521
00:23:46,340 --> 00:23:50,629
meaningful experience that people love and we wanted to make sure that we.

522
00:23:51,420 --> 00:23:55,560
Spread it and, and make it like democratize it to some extent.

523
00:23:55,560 --> 00:23:57,420
Like that's where the open source makes sense.

524
00:23:57,420 --> 00:23:59,820
I mean, first of all, I mean I, we love open source, like

525
00:23:59,909 --> 00:24:02,430
I'm a big believer and we sponsor a lot of these open

526
00:24:02,430 --> 00:24:05,370
source projects and, and that became also testing ground.

527
00:24:05,370 --> 00:24:07,740
So that was the other thing, like, because it's not just a go

528
00:24:07,740 --> 00:24:10,649
to market, but also the learnings and the public usage.

529
00:24:11,294 --> 00:24:13,784
We used that as a feedback loop to improve the AI

530
00:24:13,784 --> 00:24:16,004
product so that we can serve the paid customers.

531
00:24:16,695 --> 00:24:19,334
So from the economics point of view, yeah, it's hard given

532
00:24:19,334 --> 00:24:21,735
that it's one of the agentic systems and you know, publicly, like

533
00:24:21,735 --> 00:24:24,794
even Cursor and Claude Code have had issues with their pricing.

534
00:24:25,004 --> 00:24:26,685
Massively negative gross margins.

535
00:24:27,254 --> 00:24:30,375
Like it's like we are still able to offer this service at

536
00:24:30,375 --> 00:24:33,445
like a flat price point of per developer per month, which is.

537
00:24:33,835 --> 00:24:36,355
Affordable enough that we can go mass market with it

538
00:24:37,105 --> 00:24:39,774
and predictable enough, which is underappreciated.

539
00:24:40,225 --> 00:24:40,645
Exactly.

540
00:24:40,645 --> 00:24:41,754
It's predictable enough.

541
00:24:41,754 --> 00:24:45,355
There are no surprises and we don't have negative gross margins.

542
00:24:45,355 --> 00:24:45,595
Right.

543
00:24:45,595 --> 00:24:47,485
I mean we, we are one of those very few success

544
00:24:47,485 --> 00:24:49,465
stories where we have engineered the system to.

545
00:24:49,889 --> 00:24:51,570
To a point using on, like, we are not like

546
00:24:51,810 --> 00:24:54,180
letting users pick Sonnet and run that in a loop.

547
00:24:54,180 --> 00:24:55,920
I mean, if you look at most products, that's how they are.

548
00:24:55,920 --> 00:24:58,290
You're picking a model and then running with it.

549
00:24:58,710 --> 00:25:00,720
Like we, we are being smart about this, right?

550
00:25:00,720 --> 00:25:02,399
It's like Amazon Prime.

551
00:25:02,399 --> 00:25:04,770
Yes, everyone wants free shipping, but you can't just offer it.

552
00:25:04,770 --> 00:25:06,629
You have to build automation, the infrastructure.

553
00:25:06,629 --> 00:25:07,560
And we invested in that.

554
00:25:08,409 --> 00:25:09,040
That's the trick.

555
00:25:09,040 --> 00:25:10,449
I mean, we are able to support all the open

556
00:25:10,449 --> 00:25:12,490
source users so that we can learn from them a lot.

557
00:25:12,550 --> 00:25:17,050
We can, we are supporting a lot of the, uh, IDE users because we monetize on

558
00:25:17,050 --> 00:25:20,500
the GitHub side, we are a team product and that's the market we care about.

559
00:25:20,889 --> 00:25:24,399
Um, by removing the barrier to entry using the IDE, um, most people

560
00:25:24,399 --> 00:25:27,790
are now getting familiar with Code Rabbit through that form factor.

561
00:25:28,149 --> 00:25:31,389
And once they like it, they're able to bring us into the Git platforms

562
00:25:31,389 --> 00:25:34,570
where they need more permissions and some consensus to adopt it.

563
00:25:35,050 --> 00:25:36,220
Um, and that's working really well.

564
00:25:36,220 --> 00:25:37,360
I mean, go-to-market wise.

565
00:25:37,419 --> 00:25:37,570
Um.

566
00:25:38,505 --> 00:25:38,715
Yeah.

567
00:25:38,715 --> 00:25:42,045
Uh, we are growing like double digit growth, uh,

568
00:25:42,045 --> 00:25:42,615
every month.

569
00:25:43,125 --> 00:25:44,145
And who are these folks?

570
00:25:44,385 --> 00:25:46,215
Are these, are these individual developers?

571
00:25:46,215 --> 00:25:47,475
Are these giant enterprises?

572
00:25:47,475 --> 00:25:48,675
Are they somewhere between the two?

573
00:25:49,215 --> 00:25:49,965
Somewhere between the

574
00:25:49,965 --> 00:25:50,085
two.

575
00:25:50,085 --> 00:25:52,365
Like most of our growth early days had come

576
00:25:52,395 --> 00:25:54,255
completely product led growth inbound like.

577
00:25:54,455 --> 00:25:58,235
All the way from small five developer companies

578
00:25:58,235 --> 00:25:59,675
all the way to hundreds of developers.

579
00:25:59,675 --> 00:26:01,115
So we've seen the whole spectrum of it.

580
00:26:01,115 --> 00:26:02,165
Everyone needs this product.

581
00:26:02,165 --> 00:26:04,145
Like no matter if you're a small company or large, everyone needs

582
00:26:04,145 --> 00:26:07,835
to do code reviews and, uh, the smaller teams tend to move faster

583
00:26:07,835 --> 00:26:11,014
given that it's a fast, like you can build a consensus quickly.

584
00:26:11,014 --> 00:26:14,254
Larger teams need a longer POC, but usually happens in a few weeks.

585
00:26:14,254 --> 00:26:16,625
ROI is very, very clear for this product.

586
00:26:17,284 --> 00:26:20,435
Uh, we have some of the enterprises now also like doing some POCs for

587
00:26:20,435 --> 00:26:23,675
a few weeks, and these are like large, uh, seven figure deals even.

588
00:26:24,465 --> 00:26:25,485
That is significant.

589
00:26:25,544 --> 00:26:29,955
Uh, you've also recently raised a $16 million Series A, uh, led by CRV.

590
00:26:30,225 --> 00:26:33,254
Uh, so I'm sure you've been asked this question before,

591
00:26:33,254 --> 00:26:35,175
so I don't feel too bad about springing it on you.

592
00:26:35,385 --> 00:26:39,945
But what happens when Microsoft just builds this into GitHub natively?

593
00:26:39,975 --> 00:26:41,790
Uh, how do you avoid getting Sherlocked by something like that?

594
00:26:42,915 --> 00:26:43,965
It's already happened.

595
00:26:43,965 --> 00:26:46,815
So we've been competing with GitHub Copilot's,

596
00:26:46,815 --> 00:26:48,975
uh, code review product for the last 10 months

597
00:26:48,975 --> 00:26:49,395
now.

598
00:26:49,485 --> 00:26:51,945
the fact that it automatically does that, and I had no idea that

599
00:26:51,945 --> 00:26:55,125
it did, that tells me a lot about GitHub's marketing strategy.

600
00:26:55,125 --> 00:26:56,295
But please continue.

601
00:26:57,405 --> 00:26:59,925
Yeah, they do have that product, which is built in and

602
00:27:00,254 --> 00:27:02,685
usually, I mean, of course, I mean, it's almost like.

603
00:27:03,180 --> 00:27:04,920
As with everything GitHub, like the best

604
00:27:04,920 --> 00:27:06,900
of breed products, uh, still win, right?

605
00:27:06,900 --> 00:27:10,320
I mean, so that's where like, it hasn't impacted anything on our

606
00:27:10,320 --> 00:27:13,500
growth or churn rates, even despite that product being out there.

607
00:27:13,830 --> 00:27:16,770
I have heard people talk about Code Rabbit in this context.

608
00:27:16,770 --> 00:27:18,810
I have not heard people talk about Copilot in

609
00:27:18,810 --> 00:27:21,150
this context for just a sample size of one here,

610
00:27:21,570 --> 00:27:21,810
right?

611
00:27:21,990 --> 00:27:22,440
That's right.

612
00:27:22,440 --> 00:27:23,070
And we have like.

613
00:27:23,450 --> 00:27:24,560
Innovated in this space.

614
00:27:24,560 --> 00:27:25,970
We actually created this category.

615
00:27:25,970 --> 00:27:28,340
I mean, a bunch of larger players, they're all trying

616
00:27:28,340 --> 00:27:31,010
to copy our concepts, but still there's a lot of tech

617
00:27:31,010 --> 00:27:32,990
under the hood, which is like very hard to replicate.

618
00:27:32,990 --> 00:27:35,450
I mean, it's, I would say, a harder product to build than even code

619
00:27:35,450 --> 00:27:38,720
generation systems, given that it's a very reasoning heavy product.

620
00:27:38,725 --> 00:27:40,850
And, and people are more sensitive to the, to the

621
00:27:40,850 --> 00:27:43,040
inaccuracies when it comes to these kind of products.

622
00:27:43,370 --> 00:27:44,960
And it's a mission critical workflow, right?

623
00:27:44,960 --> 00:27:47,000
I mean, you're in the code review and it's a very serious

624
00:27:47,000 --> 00:27:49,520
workflow, uh, not just something on your developer's

625
00:27:49,520 --> 00:27:52,100
laptop, and you can be more forgiving around the mistakes.

626
00:27:53,110 --> 00:27:55,540
Yeah, that's why we have like seen a lot of clones, but no

627
00:27:55,540 --> 00:27:58,120
one has been able to replicate the magic of Code Rabbit.

628
00:27:58,780 --> 00:28:01,300
We are like probably 10x bigger than the next competitor in the space.

629
00:28:01,979 --> 00:28:04,500
Yeah, I'm not aware of other folks in the space, which probably says

630
00:28:04,500 --> 00:28:08,699
something all its own, and this also has the advantage of, it feels like

631
00:28:08,699 --> 00:28:12,689
it is more well thought out than a shell script that winds up calling a

632
00:28:12,689 --> 00:28:17,010
bunch of APIs and it doesn't seem like it's likely to become irrelevant

633
00:28:17,010 --> 00:28:19,649
with one feature release from one of the model providers themselves.

634
00:28:20,400 --> 00:28:20,880
That's right.

635
00:28:20,940 --> 00:28:21,300
That's right.

636
00:28:21,300 --> 00:28:21,840
I mean, yeah.

637
00:28:21,840 --> 00:28:24,930
I mean, so we kind of bet on the right things early on.

638
00:28:24,930 --> 00:28:27,600
Like going fully agentic, we kind of saw that coming two years

639
00:28:27,600 --> 00:28:30,810
back and investing a lot in the reasoning models before they even

640
00:28:30,810 --> 00:28:33,180
became mainstream because the entire thing was reasoning heavy.

641
00:28:33,180 --> 00:28:35,820
So we've kind of been future proof with the decisions we are making.

642
00:28:36,330 --> 00:28:37,950
And now we could be blindsided by something,

643
00:28:37,950 --> 00:28:39,570
I don't know, like GPT-5, GPT-6.

644
00:28:39,570 --> 00:28:41,850
But so far it looks like each time the new

645
00:28:41,850 --> 00:28:43,860
models come out, they benefit this product a lot.

646
00:28:43,860 --> 00:28:45,660
And it's all about the context we are bringing in.

647
00:28:46,590 --> 00:28:47,820
We engineer the system for cost.

648
00:28:48,270 --> 00:28:50,100
Cost is a big factor in this market, I could say.

649
00:28:50,189 --> 00:28:50,820
I could tell you that.

650
00:28:51,000 --> 00:28:51,120
Yeah.

651
00:28:52,020 --> 00:28:52,290
Yes.

652
00:28:52,290 --> 00:28:52,980
Oh, absolutely.

653
00:28:52,980 --> 00:28:55,080
Especially since it seems like.

654
00:28:55,440 --> 00:28:58,200
When you offer a generous free tier like this, it feels like that, yes, it's a

655
00:28:58,200 --> 00:29:01,770
marketing expense, but that also can be ruinously expensive if it, if it goes

656
00:29:01,770 --> 00:29:05,730
into the wrong direction or that you haven't gotten the economics dialed in.

657
00:29:05,790 --> 00:29:06,600
Exactly right.

658
00:29:07,260 --> 00:29:07,770
That's right.

659
00:29:08,400 --> 00:29:10,290
And uh, that's why you have all these abuse given, that's

660
00:29:10,290 --> 00:29:12,420
where the technology from my second startup came in handy.

661
00:29:12,420 --> 00:29:14,820
Like, so a lot of the, uh, abuse prevention is a

662
00:29:14,820 --> 00:29:17,070
FluxNinja tech that we're still using at Code Rabbit.

663
00:29:17,100 --> 00:29:17,370
So.

664
00:29:17,970 --> 00:29:23,159
So you, you claim to catch a ridiculous percentage of bugs, uh, which is great.

665
00:29:23,159 --> 00:29:24,840
I mean, your marketing says all the things I would expect.

666
00:29:25,050 --> 00:29:27,540
What has the developer feedback been like in reality?

667
00:29:28,590 --> 00:29:30,659
I would say the majority of people love it.

668
00:29:30,659 --> 00:29:32,820
I mean, you could see this on the, uh, social

669
00:29:32,820 --> 00:29:34,770
media, like people are just talking about it.

670
00:29:34,770 --> 00:29:36,240
They, in general love the product.

671
00:29:36,629 --> 00:29:38,790
Um, like a lot of these organizations we talk to,

672
00:29:38,790 --> 00:29:41,100
they say they recovered the investment in two months.

673
00:29:41,100 --> 00:29:43,320
Like, and, and some people are coming back and saying,

674
00:29:43,320 --> 00:29:45,420
if, if you were to charge more, they will still buy it.

675
00:29:46,485 --> 00:29:48,615
I mean, so the, it's been a very overwhelming, I mean,

676
00:29:48,615 --> 00:29:50,445
of course there's always going to be some detractors.

677
00:29:50,445 --> 00:29:54,105
People who don't like AI in general have their own opinions, so those.

678
00:29:54,465 --> 00:29:57,345
Yeah, so, so that crowd will also be there, but if you look at the

679
00:29:57,345 --> 00:30:01,875
majority, it's uh, definitely a step function improvement in their workflow.

680
00:30:02,415 --> 00:30:05,504
Um, and, and, and it's very easy to see if you go and

681
00:30:05,504 --> 00:30:09,855
search social media, LinkedIn or uh, X platform or like,

682
00:30:09,855 --> 00:30:11,745
you will always see like people saying positive things.

683
00:30:11,745 --> 00:30:13,305
And that's where we are getting the growth from.

684
00:30:13,305 --> 00:30:16,155
Like most of our signups are actually word of mouth signups at this point.

685
00:30:16,605 --> 00:30:18,645
So it's like our customers bringing in more customers.

686
00:30:19,514 --> 00:30:20,024
So

687
00:30:20,110 --> 00:30:24,405
I, I, I guess my question now comes down to what the future looks like here.

688
00:30:24,435 --> 00:30:25,274
Where, where does this go?

689
00:30:25,274 --> 00:30:26,504
What is the ultimate end game?

690
00:30:26,564 --> 00:30:30,135
Uh, do human code reviewers wind up going extinct, or is

691
00:30:30,135 --> 00:30:33,284
this more of an augmentation versus replacement story?

692
00:30:34,230 --> 00:30:37,800
I think it's more like now humans will be like fighting a higher order battle.

693
00:30:37,830 --> 00:30:40,980
Like if you look at all this like nitpicking and looking at problems, it's

694
00:30:41,010 --> 00:30:44,820
like the AI still like, is not, doesn't have the non-obvious knowledge.

695
00:30:44,820 --> 00:30:46,290
Like there's knowledge beyond code review

696
00:30:46,290 --> 00:30:48,659
that goes into the decision making, right.

697
00:30:48,659 --> 00:30:49,770
Which we don't have.

698
00:30:49,889 --> 00:30:52,050
And, and the humans have that knowledge, right?

699
00:30:52,050 --> 00:30:54,270
So in a way, like, I don't think humans are going away.

700
00:30:54,270 --> 00:30:55,560
I mean, what we've seen is like.

701
00:30:55,625 --> 00:30:58,655
Usually, instead of having two code reviewers on

702
00:30:58,655 --> 00:31:01,115
each PR, now one is AI, the other still a human.

703
00:31:01,535 --> 00:31:04,205
And on smaller pull requests, they're just trusting Code Rabbit.

704
00:31:04,205 --> 00:31:05,465
They don't even have a human review.

705
00:31:05,465 --> 00:31:07,265
So some of that automation has kicked in.

706
00:31:07,895 --> 00:31:10,385
But when it comes to coding in general and code

707
00:31:10,385 --> 00:31:12,335
review, I think it's gonna be a long journey.

708
00:31:12,335 --> 00:31:15,185
Like a lot of the labs are hoping that we will go completely

709
00:31:15,845 --> 00:31:18,755
automate software development, uh, in the next few years.

710
00:31:18,755 --> 00:31:20,315
But I don't think that's gonna happen.

711
00:31:20,855 --> 00:31:21,215
Um.

712
00:31:21,585 --> 00:31:24,135
What we are now discovering that this whole code

713
00:31:24,195 --> 00:31:26,415
coding market has multiple submarkets in it.

714
00:31:26,415 --> 00:31:27,705
There is like tab completion.

715
00:31:27,705 --> 00:31:30,465
Now there are different agents all the way from terminal IDE

716
00:31:30,885 --> 00:31:35,145
background agents, and they don't like really replace each other.

717
00:31:35,145 --> 00:31:37,815
Like they, they, they're just being used for different reasons.

718
00:31:38,115 --> 00:31:40,215
And then code review, uh, as a guardrail

719
00:31:40,800 --> 00:31:42,510
layer is going to be standardized for these

720
00:31:42,510 --> 00:31:44,820
organizations, uh, as a central layer.

721
00:31:44,820 --> 00:31:47,220
It's almost like Datadog, if I have to give you an analogy, like you

722
00:31:47,220 --> 00:31:50,790
had all this multi-cloud, you had all these Kubernetes, uh, Rancher.

723
00:31:50,880 --> 00:31:54,120
Um, but then Datadog said, okay, to be successful, you need observability.

724
00:31:54,120 --> 00:31:57,420
I'm gonna give you the guardrails, and became massively valuable.

725
00:31:57,420 --> 00:31:59,220
That's where, obviously, Code Rabbit's opportunity is here.

726
00:31:59,920 --> 00:32:03,610
So I, I guess my last question for you on this is that for various

727
00:32:03,610 --> 00:32:07,330
developers who are listening to this, who are drowning in PR reviews going

728
00:32:07,330 --> 00:32:10,690
unattended, what is the one thing that they should know about Code Rabbit?

729
00:32:11,500 --> 00:32:12,760
Yeah, I think Code Rabbit is a friend.

730
00:32:12,760 --> 00:32:15,460
I mean, in a way, like, uh, if they're bringing Code Rabbit

731
00:32:15,460 --> 00:32:17,650
into their workflow, they are going to be like, at least

732
00:32:17,650 --> 00:32:20,170
offloading some of the most boring parts, which is the review.

733
00:32:20,170 --> 00:32:22,330
Like, of course, building software is more fun than review.

734
00:32:22,780 --> 00:32:24,340
Uh, it's, and it's fun to work with.

735
00:32:24,340 --> 00:32:25,585
I mean, of course it's going to be that

736
00:32:26,415 --> 00:32:29,985
AI that's like always watching their back while they're trying to move fast with

737
00:32:29,985 --> 00:32:33,855
AI, making sure they're not like tripping over and, and causing bigger issues.

738
00:32:34,574 --> 00:32:36,764
If people wanna learn more, where's the best place for 'em to find you?

739
00:32:37,365 --> 00:32:39,975
Yeah, I mean, they could just find us on coderabbit.ai and

740
00:32:40,125 --> 00:32:42,465
uh, it just takes a couple of clicks to try out the products.

741
00:32:42,465 --> 00:32:43,304
Really frictionless.

742
00:32:43,304 --> 00:32:46,304
So you could get started in just a few minutes for your entire organization.

743
00:32:47,550 --> 00:32:47,880
Awesome.

744
00:32:47,880 --> 00:32:50,070
And we'll of course put links to that in the show notes.

745
00:32:50,280 --> 00:32:52,470
Thank you so much for taking the time to speak with me today.

746
00:32:52,500 --> 00:32:53,130
I appreciate it.

747
00:32:53,730 --> 00:32:54,270
Thanks, Corey.

748
00:32:54,270 --> 00:32:54,950
Thanks for having

749
00:32:54,950 --> 00:32:55,220
me here.

750
00:32:55,820 --> 00:32:58,950
Harjot Gill, co-founder and CEO of Code Rabbit.

751
00:32:59,310 --> 00:33:02,950
I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.

752
00:33:03,565 --> 00:33:06,115
If you've enjoyed this podcast, please leave a five-star

753
00:33:06,115 --> 00:33:08,815
review on your podcast platform of choice, whereas if you've hated

754
00:33:08,815 --> 00:33:11,875
this podcast, please leave a five-star review on your podcast

755
00:33:11,875 --> 00:33:15,085
platform of choice along with an obnoxious comment that was no

756
00:33:15,085 --> 00:33:18,715
doubt written for you, at least partially by AI configured badly.