1
00:00:00,309 --> 00:00:03,853
Back in 2018, in the wake of the Parkland High School mass shooting,

2
00:00:04,373 --> 00:00:07,817
a group of former Navy SEALs banded together and launched a company

3
00:00:07,937 --> 00:00:11,440
called ZeroEyes. Their mission is

4
00:00:11,580 --> 00:00:15,024
to prevent and mitigate these types of violent

5
00:00:15,084 --> 00:00:18,447
events. And the means that they're going about achieving that mission is

6
00:00:18,547 --> 00:00:22,612
through technology. Here we're talking about artificial intelligence, specifically

7
00:00:23,013 --> 00:00:26,477
object detection algorithms. So they went and they trained

8
00:00:26,978 --> 00:00:30,743
algorithms designed to detect the presence of

9
00:00:30,903 --> 00:00:34,047
guns. Then they combine those algorithms, they

10
00:00:34,128 --> 00:00:37,852
run those algorithms on CCTV cameras to

11
00:00:37,992 --> 00:00:41,675
flag specific instances of a person

12
00:00:42,096 --> 00:00:45,458
with one of those weapons. So my guest

13
00:00:45,518 --> 00:00:49,241
today is Tim Sulzer, who is one of ZeroEyes' co-founders

14
00:00:49,381 --> 00:00:53,665
and the company's CTO. And we talked beyond just

15
00:00:53,825 --> 00:00:57,448
how the models work and how they were made to work and validated to

16
00:00:57,468 --> 00:01:01,310
work, but into the impacts that they're seeing on

17
00:01:01,350 --> 00:01:04,471
the ground, the impact that they're having, the ways in

18
00:01:04,511 --> 00:01:09,194
which this level of automated security is

19
00:01:09,314 --> 00:01:12,575
changing operations on the ground, and where it all could go.

20
00:01:12,975 --> 00:01:16,037
So, if you are new to the pod, I

21
00:01:16,097 --> 00:01:19,539
am your host, Ian Kreitzberg. If you are not new, you

22
00:01:19,579 --> 00:01:22,740
knew that already, and thanks for being here. This is The

23
00:01:22,780 --> 00:01:34,185
Deep View Conversations. Tim,

24
00:01:34,245 --> 00:01:37,647
thanks so much for joining me today. I appreciate you having me on. Yeah. So,

25
00:01:38,807 --> 00:01:41,969
you know, we've connected before. It's been a

26
00:01:42,789 --> 00:01:46,251
couple of years, which is kind of crazy. But ZeroEyes has been around,

27
00:01:47,432 --> 00:01:51,854
you know, even before that. You guys got your start in 2018. Let's

28
00:01:51,874 --> 00:01:55,196
just start there. And then there's plenty of other stuff to

29
00:01:55,236 --> 00:01:59,558
dive into. But with the kind of foundational

30
00:02:03,800 --> 00:02:06,883
So we got started in 2018 shortly after the shooting in

31
00:02:06,923 --> 00:02:10,645
Parkland, Florida. And that shooting was particularly

32
00:02:10,685 --> 00:02:14,368
terrible because the shooter was in a stairwell underneath

33
00:02:14,809 --> 00:02:18,031
a security camera for, I think, three or five minutes before the

34
00:02:18,071 --> 00:02:21,234
first shot was fired. So we looked at that

35
00:02:21,254 --> 00:02:24,456
and we essentially had this idea of, you know, if

36
00:02:24,476 --> 00:02:27,919
somebody was watching that camera, they would have been able to stop that shooting.

37
00:02:28,239 --> 00:02:31,822
And why aren't we watching cameras? It's basically because there's not enough attention

38
00:02:31,862 --> 00:02:35,545
span and people hours to be able to have eyes on every camera everywhere.

39
00:02:36,645 --> 00:02:40,187
And at the same time, AI was

40
00:02:40,267 --> 00:02:43,968
progressing, object detection was becoming a reality. I'd

41
00:02:44,008 --> 00:02:47,329
worked with some computer vision in a previous startup, and

42
00:02:47,349 --> 00:02:50,471
it was a natural progression to be able to say, you know, if we

43
00:02:50,491 --> 00:02:55,072
can detect any number of objects, faces, dogs,

44
00:02:55,152 --> 00:02:58,514
cats, why can't we detect guns on security cameras and use

45
00:02:58,534 --> 00:03:02,392
that as a means to provide proactive

46
00:03:02,432 --> 00:03:05,609
situational awareness during a shooting. Early days of

47
00:03:05,649 --> 00:03:08,911
ZeroEyes, we got started basically just looking for, can

48
00:03:08,951 --> 00:03:13,373
we build a model to detect guns? That was the first MVP.

49
00:03:13,753 --> 00:03:17,515
And we started off by collecting images from Google

50
00:03:17,555 --> 00:03:20,876
Images, web scraping images, the Open Images dataset, and

51
00:03:20,936 --> 00:03:24,518
basically trying to build a model that would

52
00:03:24,538 --> 00:03:28,420
detect guns. And we tested on videos like clips

53
00:03:28,440 --> 00:03:32,082
from The Matrix and, you know, random images that we scraped

54
00:03:32,102 --> 00:03:35,635
from the web. We found out really quickly that those

55
00:03:35,675 --> 00:03:39,318
types of models that are trained off web images

56
00:03:39,418 --> 00:03:43,741
are not easily transferable or generalizable to

57
00:03:43,821 --> 00:03:47,384
security cameras. And so we deployed our first model on a security camera

58
00:03:47,404 --> 00:03:50,606
and the performance was terrible. The images didn't

59
00:03:50,646 --> 00:03:54,189
match up to what the AI was trained on. So the

60
00:03:54,229 --> 00:03:57,732
next step from there was we went out and bought some cheap

61
00:03:57,752 --> 00:04:00,914
security cameras on Amazon and hung them up in our

62
00:04:00,954 --> 00:04:04,317
CEO's backyard, which is where our office was at the time. We were working out

63
00:04:04,337 --> 00:04:07,928
of his basement. And we

64
00:04:07,968 --> 00:04:11,148
found out really quickly that generating our own data in

65
00:04:11,168 --> 00:04:14,529
a realistic environment using a real camera, a

66
00:04:14,569 --> 00:04:17,850
real sensor, was the trick that we needed. And

67
00:04:17,890 --> 00:04:21,331
so we invested most of our early time in the company into

68
00:04:21,371 --> 00:04:24,792
just building a quality model and building an organic data set.

69
00:04:25,063 --> 00:04:28,425
That's the perfect jumping off point, right? I mean, we

70
00:04:28,445 --> 00:04:31,867
hear about this all the time, kind of no matter the application, if you're talking about,

71
00:04:32,408 --> 00:04:36,370
you know, studying whales off the Pacific coast, or in

72
00:04:36,590 --> 00:04:40,092
your case, identifying weapons, it all comes down to data,

73
00:04:40,933 --> 00:04:45,942
the quantity of the data and the quality of the data. So tell

74
00:04:45,982 --> 00:04:49,285
me more about that. How much

75
00:04:49,405 --> 00:04:52,568
data did you have to collect? And how did

76
00:04:52,608 --> 00:04:56,732
you think about coming up with different ways of varying

77
00:04:56,852 --> 00:05:00,135
the types of data that you're collecting to make sure

78
00:05:00,155 --> 00:05:03,418
you're getting different angles of, I guess, as great

79
00:05:03,438 --> 00:05:07,001
a variety of weapons as you could kind of conceive of? Yeah,

80
00:05:09,063 --> 00:05:12,405
Being a co-founder with four former Navy SEALs, we

81
00:05:12,445 --> 00:05:17,747
never had a shortage of weapons to use as examples. But

82
00:05:17,967 --> 00:05:21,288
exactly that, garbage in equals garbage out.

83
00:05:21,589 --> 00:05:24,690
And we view our data set as probably the most important thing in the

84
00:05:24,750 --> 00:05:28,031
company right now. In the early days, we never had enough data.

85
00:05:28,412 --> 00:05:32,093
That was 100% of our problem. And

86
00:05:32,113 --> 00:05:35,775
we spent a lot of time traveling to different environments, using

87
00:05:35,835 --> 00:05:39,313
security cameras to collect data and with

88
00:05:39,353 --> 00:05:43,215
different backgrounds, different lighting conditions. And that was

89
00:05:43,235 --> 00:05:46,916
really the challenge that we identified early on was security

90
00:05:46,936 --> 00:05:50,877
cameras are mounted in all different types of environments. Security

91
00:05:50,917 --> 00:05:54,518
camera quality, video quality from cameras varies greatly from

92
00:05:54,558 --> 00:05:58,160
camera to camera and manufacturer to manufacturer. So

93
00:05:59,280 --> 00:06:03,001
it was really important for us to have representative examples of

94
00:06:03,182 --> 00:06:06,543
all of the types of environments that we really wanted to detect guns in.

95
00:06:08,257 --> 00:06:11,919
It started in Mike's backyard and progressed to local schools.

96
00:06:12,260 --> 00:06:15,502
We spent every weekend for months in

97
00:06:15,522 --> 00:06:18,764
the early days traveling to different schools and using their

98
00:06:18,804 --> 00:06:23,867
camera systems to record weapon data. And

99
00:06:23,887 --> 00:06:27,269
then the next step from there was we really

100
00:06:27,289 --> 00:06:30,891
wanted schools to allow us to record data when there were actual

101
00:06:31,672 --> 00:06:34,934
people in the camera views. But

102
00:06:34,974 --> 00:06:38,176
schools were not very willing to have us walk around during

103
00:06:38,196 --> 00:06:41,897
the daytime with guns, with active students in the hallways.

104
00:06:41,977 --> 00:06:45,298
So we then came up with the idea, well, if we can't

105
00:06:45,338 --> 00:06:48,899
collect this data in real time, how can we use the customer

106
00:06:48,939 --> 00:06:53,360
backgrounds to build in some generalization

107
00:06:53,520 --> 00:06:57,241
into the model? And our solution was to build a green

108
00:06:57,281 --> 00:07:01,763
screen AI lab, essentially. So we

109
00:07:02,303 --> 00:07:05,824
built out a 5,000 square foot AI lab with

110
00:07:05,904 --> 00:07:09,763
full green screen walls. And we hung about

111
00:07:09,843 --> 00:07:13,305
100 security cameras throughout the warehouse.

112
00:07:14,825 --> 00:07:18,227
So we were able to walk around with ZeroEyes employees with guns

113
00:07:18,527 --> 00:07:21,728
and then overlay our customer backgrounds behind them to

114
00:07:21,768 --> 00:07:25,090
give that context in the scene that we weren't able

115
00:07:25,130 --> 00:07:28,552
to record in real life. And that's where

116
00:07:28,592 --> 00:07:31,713
we've been for the last few years. But there's a lot of

117
00:07:31,753 --> 00:07:35,175
trends around synthetic data, synthetic data

118
00:07:35,215 --> 00:07:38,349
generation that will probably obsolete that at

119
00:07:38,389 --> 00:07:42,494
some point. But we've

120
00:07:42,554 --> 00:07:45,798
invested a lot as a company in having a really high

121
00:07:45,838 --> 00:07:49,481
quality data set that represents as many possible scenarios

122
00:07:50,301 --> 00:07:53,942
So beyond that, you talked a little bit about the algorithms, and

123
00:07:54,142 --> 00:07:57,923
I just want to drill a little deeper into

124
00:07:57,963 --> 00:08:01,444
that. You said back in 2018, we were seeing

125
00:08:01,504 --> 00:08:05,505
advancements in object detection, which is basically, that's

126
00:08:05,645 --> 00:08:09,346
the core thing behind what enables your

127
00:08:09,386 --> 00:08:12,807
technology. That's also one

128
00:08:12,827 --> 00:08:17,089
of the major algorithms in self-driving cars. Um, so

129
00:08:17,109 --> 00:08:20,751
talk to me about, beyond the data set,

130
00:08:20,891 --> 00:08:24,272
building the system itself, the algorithms, um,

131
00:08:25,233 --> 00:08:28,915
how do you validate that? What was the process

132
00:08:28,975 --> 00:08:32,456
like? And, uh, you know, you're building from scratch. You're

133
00:08:32,496 --> 00:08:35,778
not piggybacking off of other systems. It sounds like,

134
00:08:35,858 --> 00:08:39,540
so I imagine if the data collection process was

135
00:08:39,620 --> 00:08:43,042
intensive, the algorithm construction process.

136
00:08:45,322 --> 00:08:48,863
Yeah, there's a few different steps to it. Because when you think about the

137
00:08:48,963 --> 00:08:52,564
entire video pipeline, the processes of the data, there's many different steps

138
00:08:52,684 --> 00:08:56,205
from simply just decoding the video, then passing

139
00:08:56,245 --> 00:08:59,865
it through an inference engine, and then object tracking. So

140
00:09:01,546 --> 00:09:05,146
it's changed quite a bit over the years. But in the early days, we started with

141
00:09:06,567 --> 00:09:10,528
probably the best object detection technology at the time was Fast R-CNN or

142
00:09:10,628 --> 00:09:13,831
Faster R-CNN models. We were just starting

143
00:09:13,851 --> 00:09:17,173
to see YOLO models be released, which I think we're on

144
00:09:17,834 --> 00:09:21,776
maybe the eighth or tenth iteration of YOLO models at this point. And

145
00:09:21,996 --> 00:09:25,138
it's kind of diverging quite a bit. But in

146
00:09:25,158 --> 00:09:28,580
the early days, it really affected our hardware processing. I

147
00:09:28,620 --> 00:09:31,922
mean, we were trying to solve this problem of

148
00:09:32,102 --> 00:09:36,164
being able to process real-time video at

149
00:09:36,204 --> 00:09:39,499
the customer's site. Because customers are sensitive to

150
00:09:39,539 --> 00:09:43,763
their security camera data being sent

151
00:09:43,803 --> 00:09:47,187
somewhere else. So we identified really

152
00:09:47,227 --> 00:09:50,510
quickly we have to do this with GPUs. We have to do it with GPUs on

153
00:09:50,550 --> 00:09:54,194
premise. And so the model selection came down to what

154
00:09:54,254 --> 00:09:57,758
model gives us the best balance of accuracy and compute

155
00:09:57,798 --> 00:10:01,697
efficiency. For us, Faster R-CNN

156
00:10:01,737 --> 00:10:05,318
models were the highest performing at that time. They

157
00:10:05,338 --> 00:10:08,779
were definitely more computationally heavy, which

158
00:10:08,839 --> 00:10:12,860
meant that we couldn't load as many cameras per GPU as we wanted to, which

159
00:10:12,920 --> 00:10:16,462
affected our economics. But over time, the

160
00:10:16,502 --> 00:10:19,983
great thing about research and academia is that they're constantly putting

161
00:10:20,023 --> 00:10:23,246
out new stuff. And for the most part, it's open source. So

162
00:10:23,266 --> 00:10:27,230
we're able to use open source off-the-shelf

163
00:10:27,291 --> 00:10:30,514
models, which has progressed to, I think we're

164
00:10:30,594 --> 00:10:35,399
using Ultralytics models today, YOLOv5

165
00:10:35,800 --> 00:10:39,084
or 8. But with

166
00:10:39,244 --> 00:10:42,717
those advancements in model algorithms, that's

167
00:10:42,757 --> 00:10:46,259
brought with it increases in accuracy. As we're increasing the

168
00:10:46,480 --> 00:10:50,062
quality of our data set, the models, the algorithms themselves

169
00:10:50,102 --> 00:10:53,404
are getting better, and the speed is getting better. So we're able

170
00:10:53,444 --> 00:10:56,927
to run models on more cameras, higher

171
00:10:56,967 --> 00:11:00,149
resolutions, higher frame rates, which all of

172
00:11:00,189 --> 00:11:03,511
those things turn into better detection performance for us. And

173
00:11:03,531 --> 00:11:07,416
we expect that trend to continue in the future. Obviously,

174
00:11:07,456 --> 00:11:10,719
we're still talking about object detection models today, which is kind

175
00:11:10,759 --> 00:11:14,162
of like single frame analysis, but with

176
00:11:14,222 --> 00:11:17,705
the potential for large language models and

177
00:11:17,845 --> 00:11:21,028
vision transformers and things like that, we see a lot

178
00:11:21,048 --> 00:11:24,530
of advancement in the future, not just on the detection side, but also on

179
00:11:25,031 --> 00:11:28,314
the context side. What can we communicate about the

180
00:11:28,374 --> 00:11:32,337
image itself to our customers to provide the best situational intelligence?

181
00:11:33,246 --> 00:11:36,850
Hmm. Yeah. I'm glad you brought that up. Cause I was going to ask you about that. Uh,

182
00:11:36,870 --> 00:11:40,413
but the, the, that big area of advancement that

183
00:11:40,453 --> 00:11:43,776
we've seen over the past couple of years, it's all kind of centered around large

184
00:11:43,816 --> 00:11:47,280
language models, uh, transformer architecture, the vision models

185
00:11:47,320 --> 00:11:50,723
that you, that you mentioned. Um, so that, that

186
00:11:50,783 --> 00:11:54,026
is something that you're exploring early

187
00:11:56,027 --> 00:11:59,770
Yeah, it's something that we see as a huge opportunity because today

188
00:11:59,810 --> 00:12:03,132
we're providing this situational awareness of gun

189
00:12:03,212 --> 00:12:06,575
or no gun. There's either a gun in the image or there's not. And

190
00:12:06,595 --> 00:12:11,078
then we have a human in the loop that allows us to maximize

191
00:12:11,118 --> 00:12:14,701
our detection performance while eliminating false positives. So

192
00:12:14,761 --> 00:12:18,864
customers will never receive a false positive. But

193
00:12:18,904 --> 00:12:22,047
the potential with large language models is now you can extract a little

194
00:12:22,067 --> 00:12:25,281
bit more context from the scene. And you can also provide a

195
00:12:25,321 --> 00:12:28,803
little bit more dynamic inputs. So looking for

196
00:12:28,903 --> 00:12:33,005
more dynamic scenes or different objects in different configurations

197
00:12:33,105 --> 00:12:36,606
that wouldn't be realistic with a single model

198
00:12:37,930 --> 00:12:41,232
We'll stick on that side for a second. You mentioned something else that I want to get into in

199
00:12:41,252 --> 00:12:45,836
a minute, but that'll keep. On

200
00:12:45,856 --> 00:12:49,198
the language model side, the idea of greater context on the images that you're talking

201
00:12:49,238 --> 00:12:53,001
about, right? That's really interesting, because like you said, right

202
00:12:53,041 --> 00:12:56,223
now, gun or no gun, that's kind of it.

203
00:12:56,684 --> 00:13:00,386
Other context, I mean, in a security situation

204
00:13:00,407 --> 00:13:03,969
like that, any added piece of information is

205
00:13:04,369 --> 00:13:08,473
probably going to be very nice to have. But

206
00:13:08,553 --> 00:13:11,796
I wonder, as you start experimenting with

207
00:13:11,896 --> 00:13:15,800
deploying or incorporating additional

208
00:13:15,840 --> 00:13:19,703
models of different types, I

209
00:13:19,743 --> 00:13:23,467
wonder how that changes your validation process. Because

210
00:13:23,527 --> 00:13:27,691
the gun or no gun thing is a little more of a straightforward thing,

211
00:13:27,791 --> 00:13:31,334
it's a different type of model. Language models have

212
00:13:31,774 --> 00:13:35,556
reliability issues and

213
00:13:35,697 --> 00:13:39,058
I wonder how much that might be a challenge or if

214
00:13:39,119 --> 00:13:42,360
there's a way maybe through specific data sets,

215
00:13:42,641 --> 00:13:46,623
small language models, other techniques and safeguards to

216
00:13:48,224 --> 00:13:52,555
Well, I think there's two parts of that. One is, just

217
00:13:52,595 --> 00:13:55,796
like all good security comes in layers, I think the same thing is going to

218
00:13:56,036 --> 00:13:59,498
be applicable for AI. In

219
00:13:59,558 --> 00:14:03,139
our case, I think there'll be multiple layers of AI models

220
00:14:03,179 --> 00:14:06,580
in the future that process all of the detections that we generate,

221
00:14:07,021 --> 00:14:11,952
but also kind of highlights the value of that human in the loop. If

222
00:14:12,793 --> 00:14:16,115
we were to send a detection directly through to

223
00:14:16,135 --> 00:14:19,238
the end customer and provide some layer of analysis from a

224
00:14:19,298 --> 00:14:22,720
large language model, it's very possible that that's incorrect and

225
00:14:23,041 --> 00:14:26,163
that it could confuse or cause some sort of

226
00:14:26,203 --> 00:14:30,840
issue in the response, the critical incident response. And

227
00:14:30,940 --> 00:14:34,121
today we have a human in the loop that essentially does the same thing, but

228
00:14:34,141 --> 00:14:38,343
they're also capable of performing other actions. So during

229
00:14:38,403 --> 00:14:42,244
a critical scenario, when we see a gun and click dispatch, that's

230
00:14:42,324 --> 00:14:45,846
also initiating a call to 911 dispatch

231
00:14:46,286 --> 00:14:49,627
that's closest to the camera location. And someone on

232
00:14:49,667 --> 00:14:52,828
our team is also getting on the phone with points of contact at the

233
00:14:52,868 --> 00:14:58,170
customer to be able to verbally communicate these things. I

234
00:14:58,190 --> 00:15:02,436
don't see the value of having a human in the loop ever being fully

235
00:15:03,617 --> 00:15:07,542
automated or deprecated because

236
00:15:08,683 --> 00:15:11,907
the value during a critical scenario is we're able to

237
00:15:12,007 --> 00:15:15,492
communicate directly to that POC and not rely on

238
00:15:17,830 --> 00:15:21,532
And so I, I guess what you were just talking about there, that that's the point that

239
00:15:21,553 --> 00:15:24,755
I wanted to get to and make sure we drill down on. Um, cause we've

240
00:15:24,775 --> 00:15:28,957
been kind of talking around it for anyone who's not familiar, um,

241
00:15:29,358 --> 00:15:33,460
with what you guys do, right. It's, it's object

242
00:15:33,500 --> 00:15:36,923
detection designed for, uh, you know, to, to scan for

243
00:15:37,243 --> 00:15:41,126
weaponry in, uh, CCTV footage. That's

244
00:15:41,346 --> 00:15:44,608
all linked up with warning systems. You have teams of people

245
00:15:44,648 --> 00:15:48,510
that review flags from the system, but I

246
00:15:49,271 --> 00:15:53,012
would love if you could kind of walk me through, if I was a point of contact, if

247
00:15:53,072 --> 00:15:56,854
I was one of your customers and we put in however

248
00:15:56,874 --> 00:16:00,956
many dozens of cameras we had and

249
00:16:02,237 --> 00:16:05,278
something happens, your camera picks up a

250
00:16:05,318 --> 00:16:08,720
flag, can you just walk me through what

251
00:16:08,760 --> 00:16:12,041
that process looks like from kind of inception of the

252
00:16:12,081 --> 00:16:15,403
model says we might have something here and whatever

253
00:16:16,583 --> 00:16:20,246
I'll take a step back and just talk about it from the entire value chain, because

254
00:16:20,266 --> 00:16:24,049
you're absolutely right. We didn't actually cover that in the beginning. We're

255
00:16:24,069 --> 00:16:27,891
connecting to real-time security cameras, all of the customer's existing security

256
00:16:27,911 --> 00:16:31,614
infrastructure, and we're pulling an RTSP

257
00:16:31,634 --> 00:16:35,056
feed, a real-time streaming protocol feed, from that camera

258
00:16:35,237 --> 00:16:38,579
and running an AI on the video feed frame by frame that's

259
00:16:38,639 --> 00:16:41,938
looking for the presence of a gun. At the point at which our

260
00:16:41,978 --> 00:16:45,200
AI says, it's pretty confident that this object is a gun,

261
00:16:45,580 --> 00:16:48,721
it's going to send that detection to a human in the loop. And the

262
00:16:48,761 --> 00:16:52,423
human in the loop is in our ZOC, our ZeroEyes Operations

263
00:16:52,483 --> 00:16:56,065
Center. We have one located just outside of Philadelphia here,

264
00:16:56,305 --> 00:16:59,466
and then another in Honolulu, Hawaii. So we're

265
00:16:59,506 --> 00:17:04,108
able to make use of those time zone differences. The

266
00:17:04,408 --> 00:17:08,408
operators in the ZOC are performing an analysis to

267
00:17:08,448 --> 00:17:11,511
verify whether or not the detection has a real gun in it.

268
00:17:11,871 --> 00:17:15,654
So they see a real gun and they click dispatch. That

269
00:17:15,714 --> 00:17:19,057
dispatch button initiates all of our third-party alerting

270
00:17:19,097 --> 00:17:22,980
methods. So we have a dashboard and

271
00:17:23,000 --> 00:17:26,382
a mobile app ourselves, but we also integrate with

272
00:17:26,562 --> 00:17:30,422
local 911. We integrate with other third-party services

273
00:17:30,482 --> 00:17:33,702
in order to get the detection information to

274
00:17:33,742 --> 00:17:36,943
the customer in the best way that they can utilize it.

275
00:17:37,163 --> 00:17:41,144
So from dispatch, our

276
00:17:41,284 --> 00:17:45,205
ZOC operators will get connected to local 911. They'll

277
00:17:45,225 --> 00:17:48,926
communicate and verify that local 911 has access

278
00:17:48,986 --> 00:17:52,447
to the alert image, and they're able to generate an

279
00:17:52,507 --> 00:17:56,028
incident based on that. But at the same time, we're also calling

280
00:17:56,269 --> 00:17:59,551
points of contact at the customer site. And what we're trying

281
00:17:59,571 --> 00:18:04,795
to communicate is basically what we're visualizing

282
00:18:04,875 --> 00:18:08,178
on our side. So the

283
00:18:08,218 --> 00:18:11,880
ZOC has a specific script that they stick to, but it's essentially something

284
00:18:11,920 --> 00:18:15,223
like we have an alert, a ZeroEyes weapon

285
00:18:15,243 --> 00:18:18,405
detection alert of what appears to be a

286
00:18:18,445 --> 00:18:22,920
person brandishing a rifle in this setting. That

287
00:18:23,040 --> 00:18:26,463
is really the initiation for an incident

288
00:18:26,503 --> 00:18:30,426
on the customer side. Customers have all different standard operating procedures

289
00:18:30,506 --> 00:18:35,170
of how they want to be notified and what they do following a notification. But

290
00:18:35,370 --> 00:18:38,653
I do foresee more automation in the future around that, where

291
00:18:39,974 --> 00:18:43,277
today we're basically handing off situational awareness to

292
00:18:43,317 --> 00:18:46,599
the customer and allowing them to respond. But

293
00:18:46,659 --> 00:18:50,022
I see a lot of benefit in ZeroEyes being involved

294
00:18:50,042 --> 00:18:53,255
in that response in some way, whether it's something as

295
00:18:53,295 --> 00:18:56,456
simple as providing expertise in

296
00:18:56,796 --> 00:19:00,318
how a school or a commercial office building should respond

297
00:19:00,358 --> 00:19:03,880
to that threat, all the way through to initiating other

298
00:19:05,341 --> 00:19:08,822
forms of response, like dispatching alerts

299
00:19:09,062 --> 00:19:12,864
into access control systems, providing

300
00:19:13,024 --> 00:19:17,566
one-click access control capabilities, so

301
00:19:17,586 --> 00:19:21,625
that customers can actually take action from the ZeroEyes alert. Um,

302
00:19:24,268 --> 00:19:27,571
Kind of in line with what you were just mentioning about being more involved in

303
00:19:27,711 --> 00:19:31,594
the action. I wonder how far something like this goes,

304
00:19:32,815 --> 00:19:35,958
you know, and, and right. And thinking about where you are right

305
00:19:35,998 --> 00:19:39,421
now, where you might want to go as you keep growing,

306
00:19:39,942 --> 00:19:45,223
since you're analyzing real-time footage, right?

307
00:19:45,443 --> 00:19:48,646
Like that's kind of the whole fundamental crux of

308
00:19:48,706 --> 00:19:52,268
what you're dealing with. If there was an

309
00:19:52,308 --> 00:19:56,131
attack, you know, a person brandishing the rifle, or maybe the person's in

310
00:19:56,171 --> 00:20:02,277
a hallway, they're locked in a room, they're moving. Would

311
00:20:02,297 --> 00:20:05,980
you be able to track the movement of these

312
00:20:06,060 --> 00:20:10,002
kinds of assailants? And in an action stage,

313
00:20:10,863 --> 00:20:14,165
beyond just saying, there's a person here, do something about it,

314
00:20:14,365 --> 00:20:18,368
could you physically, or do you physically, you

315
00:20:18,388 --> 00:20:21,929
know, speak to local police

316
00:20:21,989 --> 00:20:25,169
officers on the phone and say, okay, this is exactly where the

317
00:20:25,209 --> 00:20:28,370
person is. They're in this hallway. They are doing this right now. They are

318
00:20:28,430 --> 00:20:32,290
moving. They just turned left kind of like an overwatch thing.

319
00:20:32,430 --> 00:20:35,571
Is that, is that a function at all of what you

320
00:20:37,651 --> 00:20:40,712
So yes and no. Uh, we aren't, uh, we

321
00:20:40,752 --> 00:20:44,192
don't have visibility into the customer's camera systems to the extent that

322
00:20:44,232 --> 00:20:47,393
we could actually provide that video or watch a

323
00:20:47,413 --> 00:20:51,081
shooter walk from camera to camera, and we do that for privacy reasons. Basically,

324
00:20:51,141 --> 00:20:55,602
our AI is the only piece that has access to the live video. But

325
00:20:55,622 --> 00:20:59,182
that being said, you're highlighting on a really key point of our value. And

326
00:20:59,222 --> 00:21:02,303
that is, even after the first shot is fired and we know there's a

327
00:21:02,343 --> 00:21:05,564
gunman on site, we don't know where

328
00:21:05,584 --> 00:21:09,264
he is at any specific point, or the first responders

329
00:21:09,304 --> 00:21:12,425
don't know. And so every time that shooter walks in

330
00:21:12,465 --> 00:21:15,526
front of a new camera that's running ZeroEyes, and we're able to get a

331
00:21:15,566 --> 00:21:18,869
new detection, we are dispatching that and updating law

332
00:21:18,889 --> 00:21:22,352
enforcement. We do have some local police

333
00:21:22,372 --> 00:21:25,874
departments that use our mobile app and they get the alerts directly to

334
00:21:25,914 --> 00:21:29,496
the mobile app for each individual officer. But

335
00:21:29,516 --> 00:21:32,779
the general feedback we've gotten is that they would prefer to get that

336
00:21:32,819 --> 00:21:36,041
information over the radio from dispatch. And so

337
00:21:36,141 --> 00:21:39,953
our biggest point of contact is that 911 dispatch center. And

338
00:21:40,233 --> 00:21:43,915
our integration with a company called RapidSOS gives

339
00:21:43,975 --> 00:21:47,597
us the ability to automatically be connected to the

340
00:21:47,617 --> 00:21:50,838
closest 911 dispatch center to where our

341
00:21:50,878 --> 00:21:54,079
customer is located based on their camera location. So that

342
00:21:54,119 --> 00:21:58,261
puts us in direct phone contact with 911 PSAPs,

343
00:21:58,561 --> 00:22:01,723
public safety answering points, and those PSAPs are the ones that

344
00:22:06,024 --> 00:22:09,406
And so because of that integration, any officers on

345
00:22:09,426 --> 00:22:12,747
the ground, as well as points of contact at

346
00:22:13,087 --> 00:22:16,249
whatever building or location you're

347
00:22:16,269 --> 00:22:19,551
dealing with, they know the flag was picked

348
00:22:19,651 --> 00:22:23,392
up by, you know, camera 22B

349
00:22:23,473 --> 00:22:26,594
in hallway seven on the second floor or whatever it is.

350
00:22:26,934 --> 00:22:30,696
So they have a deeper level of situational awareness beyond

351
00:22:33,624 --> 00:22:36,885
Definitely. You also highlight a really important point,

352
00:22:36,925 --> 00:22:40,506
which is a unified mapping interface. Because

353
00:22:41,666 --> 00:22:44,947
if our customer is looking at a map, and we're looking at a different map,

354
00:22:45,287 --> 00:22:49,588
and police responding are looking at a third different map, it

355
00:22:49,988 --> 00:22:54,889
becomes really confusing trying to communicate about landmarks and

356
00:22:54,929 --> 00:22:58,557
where things are located. And so having a unified mapping

357
00:22:58,577 --> 00:23:02,200
interface, which we partner with a company called CRG, Critical

358
00:23:02,240 --> 00:23:05,443
Response Group, and they produce really high

359
00:23:05,503 --> 00:23:08,744
quality interior maps. When

360
00:23:09,145 --> 00:23:12,587
our customer has a CRG map, they're able to provide that to us, and

361
00:23:12,667 --> 00:23:16,210
we are able to overlay that on our dispatch map so that both

362
00:23:16,250 --> 00:23:19,973
the customer and our ZOC operation team are

363
00:23:19,993 --> 00:23:23,375
looking at the same map. And wherever possible, we also include

364
00:23:23,415 --> 00:23:27,238
the local 911 PSAPs in that when we send our alerts, so

365
00:23:31,694 --> 00:23:35,937
Now, you mentioned for the ZOC, for the operators that

366
00:23:36,157 --> 00:23:40,179
review these flags, that no

367
00:23:40,559 --> 00:23:44,482
false positives, I think is what you said. How are these

368
00:23:44,622 --> 00:23:47,984
operators trained? And I guess, what's

369
00:23:48,084 --> 00:23:51,947
their protocol in terms of validating that

370
00:23:52,027 --> 00:23:55,310
whatever has been flagged is real or

371
00:23:55,550 --> 00:23:58,893
is not a concern. And I mean, I guess taking

372
00:23:58,913 --> 00:24:02,055
that a step further, if you get a flag that turns out not to be a

373
00:24:02,115 --> 00:24:05,498
gun, but maybe it's something a little weird, is that

374
00:24:07,760 --> 00:24:11,343
I would say the operators back in the ZOC have really difficult jobs.

375
00:24:11,903 --> 00:24:15,186
They are on a daily basis, they see

376
00:24:15,206 --> 00:24:18,464
guns on a daily basis. A lot of times it's

377
00:24:18,524 --> 00:24:22,567
toy guns or ROTC rifles, but they're constantly required

378
00:24:22,607 --> 00:24:25,808
to make split-second decisions, which

379
00:24:25,848 --> 00:24:30,211
is why we focused on hiring veterans and former

380
00:24:30,271 --> 00:24:34,193
law enforcement into the ZOC, because those individuals

381
00:24:34,233 --> 00:24:37,915
back there both have the training and understanding what

382
00:24:37,955 --> 00:24:41,437
a gun looks like and how to respond to a critical scenario, but

383
00:24:41,697 --> 00:24:44,839
also as a part of their former careers as

384
00:24:45,819 --> 00:24:49,080
military and law enforcement, a lot of times they were doing very

385
00:24:49,120 --> 00:24:52,921
similar jobs. They either had watch posts or

386
00:24:53,001 --> 00:24:56,362
they were performing some sort of surveillance where they had to

387
00:24:56,823 --> 00:25:00,203
do basically the same thing and

388
00:25:00,243 --> 00:25:03,744
be able to, during a critical scenario, remain

389
00:25:03,784 --> 00:25:07,805
calm and calmly communicate critical

390
00:25:07,846 --> 00:25:11,506
pieces of information that could be happening at the same time people

391
00:25:11,567 --> 00:25:15,408
are actively under threat. So we've

392
00:25:15,728 --> 00:25:19,329
had a lot of success hiring former law enforcement,

393
00:25:19,389 --> 00:25:22,851
former military personnel into the ZOC. And like

394
00:25:22,891 --> 00:25:26,452
I said, they have a really difficult job. On a daily basis,

395
00:25:27,012 --> 00:25:30,674
they have to understand unique standard operating

396
00:25:30,714 --> 00:25:33,815
procedures from different customers and be able to communicate to the

397
00:25:33,855 --> 00:25:39,197
customer in the best way that the customer needs that information. And

398
00:25:39,237 --> 00:25:43,258
how that translates into real life, I mean, we've

399
00:25:43,578 --> 00:25:47,701
made dozens of arrests at this point of people

400
00:25:47,721 --> 00:25:51,503
that had guns in areas that they shouldn't have. But

401
00:25:51,523 --> 00:25:56,347
we also communicate with customers on a daily basis about non-lethal

402
00:25:56,527 --> 00:26:04,152
gun threats. And like I said, every customer has their own SOPs. For

403
00:26:04,192 --> 00:26:07,974
instance, some customers want us to disregard any

404
00:26:08,014 --> 00:26:11,336
toy gun detections. Other customers still want to know about them.

405
00:26:13,462 --> 00:26:16,644
And that extends to a wide array of

406
00:26:16,744 --> 00:26:20,267
scenarios that include all different types of guns

407
00:26:20,667 --> 00:26:24,410
being presented from law enforcement or known trainings,

408
00:26:25,090 --> 00:26:28,172
things like that. So as much as possible, we

409
00:26:28,213 --> 00:26:32,040
try to communicate with our customers so that we're aware of any

410
00:26:32,100 --> 00:26:35,481
reasons that there should be a gun on premise, but generally our customers want

411
00:26:35,521 --> 00:26:38,662
to know regardless of what type of gun it is or what the scenario is.

412
00:26:39,303 --> 00:26:42,744
And that's an awesome point of communication for the ZOC.

413
00:26:43,104 --> 00:26:47,265
So even if we see something that we believe is non-threatening, it's

414
00:26:47,646 --> 00:26:51,087
easy enough just to call up the customer and have that conversation and

415
00:26:55,103 --> 00:26:59,506
Yeah, I'd want to know. I'd want to know. You

416
00:26:59,546 --> 00:27:02,709
mentioned that your work, I

417
00:27:02,749 --> 00:27:06,752
guess, has led to a bunch of arrests of

418
00:27:07,092 --> 00:27:10,935
people who had weapons in places that they shouldn't have had them. I'm

419
00:27:10,955 --> 00:27:14,117
wondering what else you can tell me about the

420
00:27:14,157 --> 00:27:18,441
results that you've seen. Like, we've talked about you've launched

421
00:27:18,481 --> 00:27:21,823
in 2018, you've been in operation, you know, and growing

422
00:27:21,863 --> 00:27:26,318
into new places. What

423
00:27:26,398 --> 00:27:29,700
kind of situations have you come across and

424
00:27:33,122 --> 00:27:36,624
Yeah. So since 2018, we've expanded

425
00:27:36,684 --> 00:27:39,986
to, I think we're up to 47 states. We're spread out throughout

426
00:27:40,026 --> 00:27:44,469
the entire country on K-12, um, public

427
00:27:44,509 --> 00:27:48,071
transit, uh, on the commercial side, we're in, um,

428
00:27:48,772 --> 00:27:51,974
big box retail and logistics centers and things like

429
00:27:52,034 --> 00:27:55,630
that. Um, we

430
00:27:55,910 --> 00:27:59,651
haven't had that stereotypical detection

431
00:27:59,711 --> 00:28:03,132
and arrest of somebody that appeared to be, you know, entering a school to

432
00:28:03,232 --> 00:28:07,052
commit a mass shooting. But we've, you

433
00:28:07,072 --> 00:28:10,913
know, in early days, maybe we would see a gun once a month. We're absolutely

434
00:28:10,973 --> 00:28:14,514
seeing guns, dozens of guns on a daily basis in areas that

435
00:28:14,534 --> 00:28:20,135
I never thought we would see guns. And so the

436
00:28:20,175 --> 00:28:23,376
performance of the model itself has proven itself as

437
00:28:23,436 --> 00:28:26,727
we scale. We're generating detections all the time.

438
00:28:29,089 --> 00:28:34,253
And in the scenarios that we've run with local law enforcement and

439
00:28:34,854 --> 00:28:38,117
law enforcement at customer sites, we've been able to identify that the

440
00:28:38,497 --> 00:28:42,080
reduction in response time is considerable.

441
00:28:42,300 --> 00:28:45,723
So comparing response times

442
00:28:46,084 --> 00:28:49,766
without ZeroEyes alerts, first responders

443
00:28:49,846 --> 00:28:53,589
are basically showing up at a school and not knowing where the

444
00:28:53,609 --> 00:28:56,933
shooter is located. So they enter that school, it

445
00:28:56,993 --> 00:29:00,396
could take them five to 15 minutes to clear the school

446
00:29:00,476 --> 00:29:04,100
and locate the actual shooter. And there's been plenty of real life examples,

447
00:29:04,700 --> 00:29:08,304
like Uvalde, where there's been serious

448
00:29:08,344 --> 00:29:11,470
challenges around that. So in

449
00:29:11,650 --> 00:29:15,394
the testing that we've done, we've been able to considerably reduce response

450
00:29:15,434 --> 00:29:18,556
time and direct first responders exactly where in

451
00:29:18,576 --> 00:29:21,719
the building they should be located. It's

452
00:29:21,759 --> 00:29:25,082
something as simple as if first responders show up to the wrong side

453
00:29:25,102 --> 00:29:28,705
of the building and they enter the wrong door, that could mean 15 minutes

454
00:29:28,765 --> 00:29:32,148
in lost response time. Wherever

455
00:29:32,208 --> 00:29:35,512
possible, we're really focused on reducing that response time to get

456
00:29:39,096 --> 00:29:42,700
You mentioned you're seeing dozens of guns every

457
00:29:42,720 --> 00:29:46,904
day in places that you wouldn't expect. Are

458
00:29:46,964 --> 00:29:50,546
these, what are those kinds

459
00:29:50,586 --> 00:29:53,928
of situations? Is that, are people just kind of carrying and

460
00:29:53,948 --> 00:29:57,449
they're walking around and they're carrying? Is there intent to

461
00:29:57,489 --> 00:30:02,312
violence? Do those always lead to kind of reactions in

462
00:30:02,452 --> 00:30:05,793
terms of law enforcement, but in terms of, you know, the customer that

463
00:30:05,833 --> 00:30:09,115
you're securing, is there responses to that or is it, oh no,

464
00:30:09,275 --> 00:30:12,616
he's okay. I mean, obviously each

465
00:30:14,983 --> 00:30:18,624
Yeah, short answer is it totally depends. But

466
00:30:18,644 --> 00:30:22,004
I would say that for the most part, fake guns are

467
00:30:22,024 --> 00:30:25,665
starting to look more and more like real guns. If you pick up an airsoft rifle at

468
00:30:25,785 --> 00:30:29,806
your local Walmart, that airsoft rifle is almost indistinguishable

469
00:30:30,026 --> 00:30:33,666
from an actual AR. And even

470
00:30:33,706 --> 00:30:37,187
more so when students will

471
00:30:37,247 --> 00:30:40,548
paint the tips, paint the orange tips black or remove the orange

472
00:30:40,608 --> 00:30:43,925
tips. So a lot of times we

473
00:30:44,086 --> 00:30:47,427
don't necessarily know if the gun is a fake

474
00:30:47,447 --> 00:30:50,628
gun or a real gun, and we have to essentially treat it as though

475
00:30:50,668 --> 00:30:53,909
it's a real gun. So I would say that's probably

476
00:30:53,949 --> 00:30:57,170
one of the more common scenarios that we see. Also, a lot of times we

477
00:30:57,210 --> 00:31:00,668
see people using objects

478
00:31:00,988 --> 00:31:04,473
like cell phones and pointing them at each other as though they're real guns.

479
00:31:04,533 --> 00:31:07,897
There's a very popular TikTok challenge that's been going around the last few

480
00:31:07,937 --> 00:31:11,060
years called Senior Assassin, which is about the

481
00:31:11,100 --> 00:31:16,386
most insensitive thing that I think you could do in nowadays climate. But

482
00:31:16,586 --> 00:31:20,088
students are bringing airsoft rifles or fake pistols,

483
00:31:20,148 --> 00:31:23,590
or in some cases, real guns to school in order to perform

484
00:31:23,630 --> 00:31:26,932
these mock senior assassinations and post them

485
00:31:26,992 --> 00:31:30,393
online. I couldn't tell you how many detections we've gotten that

486
00:31:30,413 --> 00:31:33,875
were similar to that, where we see students basically filming

487
00:31:33,935 --> 00:31:37,177
themselves pointing guns at each other in

488
00:31:37,237 --> 00:31:41,335
order to fulfill this TikTok challenge. So

489
00:31:41,355 --> 00:31:44,498
in cases like that, we always respond to the customer. We let

490
00:31:44,518 --> 00:31:47,761
them know what we're seeing. We try

491
00:31:47,801 --> 00:31:52,105
to communicate as much detail about the scene as possible, but ultimately

492
00:31:52,165 --> 00:31:55,848
it's on the customer to respond and execute

493
00:31:58,230 --> 00:32:02,163
Senior assassins, huh? Yeah. Oh,

494
00:32:02,223 --> 00:32:05,344
man. But and so you mentioned, I

495
00:32:05,384 --> 00:32:09,546
just want to, you know, clarify, right, like you haven't been

496
00:32:09,586 --> 00:32:13,287
involved in or encountered directly

497
00:32:13,387 --> 00:32:17,069
any of the kind of mass violence

498
00:32:22,222 --> 00:32:25,343
Luckily, no, uh, I think about this sometimes and you

499
00:32:25,643 --> 00:32:28,884
know, obviously if an event does happen, I want to be there. I

500
00:32:28,904 --> 00:32:32,065
want to be on those cameras to be able to detect it. Uh,

501
00:32:32,085 --> 00:32:35,546
but thankfully we've, we've not been involved in a mass shooting

502
00:32:35,586 --> 00:32:38,787
event. Um, I think it's only a matter of time though. The

503
00:32:38,807 --> 00:32:41,948
more cameras we're on, the more guns we're going to see, the more coverage we're

504
00:32:41,968 --> 00:32:45,329
going to have. And, um, I just pray that when we're in

505
00:32:45,349 --> 00:32:48,750
that situation, uh, that we're able to make the detection before

506
00:32:49,331 --> 00:32:52,668
Mm hmm. Now,

507
00:32:52,989 --> 00:32:56,432
you mentioned that you're in 47 states now,

508
00:32:56,832 --> 00:33:00,655
and you have a ZOC in Pennsylvania

509
00:33:01,716 --> 00:33:05,259
and in Hawaii. So you cover the

510
00:33:05,279 --> 00:33:09,222
different time zones. They're operational all the time. How

511
00:33:09,263 --> 00:33:12,865
does the ZOC scale in kind with

512
00:33:12,945 --> 00:33:16,287
the scale of your emplacements and your

513
00:33:16,327 --> 00:33:19,629
model? Like how many more people do you need for each

514
00:33:24,252 --> 00:33:27,794
Today, we're able to monitor the entire United States from just those two operating

515
00:33:27,834 --> 00:33:31,017
centers. I anticipate in

516
00:33:31,037 --> 00:33:34,519
the future we're going to expand and build operating centers in other locations, but

517
00:33:35,219 --> 00:33:38,521
today we're able to just basically follow a model that says when we add

518
00:33:38,681 --> 00:33:42,063
X number of cameras, we're going to expect to see X number

519
00:33:42,103 --> 00:33:46,545
of additional alerts, and that increases our headcount. So

520
00:33:46,565 --> 00:33:49,766
we follow a pretty linear model in that sense. We also try

521
00:33:49,806 --> 00:33:53,448
to staff people based on alert load throughout the day. As

522
00:33:53,468 --> 00:33:56,570
you can imagine, we get the most false positives during the times of

523
00:33:56,590 --> 00:33:59,751
day that are most active in front of cameras. So if

524
00:33:59,771 --> 00:34:04,153
you think about a school, that's the five minutes every

525
00:34:04,193 --> 00:34:07,955
hour when students are walking in between classrooms. And so during

526
00:34:07,975 --> 00:34:11,657
the day, probably 8 AM to 5 PM, we

527
00:34:11,677 --> 00:34:14,938
staff heavier than we need to during the nights and off times and

528
00:34:14,978 --> 00:34:18,800
weekends. But it's been

529
00:34:18,840 --> 00:34:22,103
pretty standard for us and the more scale that

530
00:34:22,143 --> 00:34:25,485
we have, the more different sites that we're on, the

531
00:34:25,505 --> 00:34:28,967
more data that we include from those different sites

532
00:34:29,027 --> 00:34:33,909
into our model to make it better, the more predictable

533
00:34:34,150 --> 00:34:37,371
our scaling model is. So that part

534
00:34:40,493 --> 00:34:43,754
Now on the privacy side, which

535
00:34:43,774 --> 00:34:46,996
you kind of referenced earlier when I was talking about the whole Overwatch-y type

536
00:34:47,016 --> 00:34:51,191
thing, you have

537
00:34:51,311 --> 00:34:54,952
access to these cameras, but you're not watching those camera feeds

538
00:34:55,012 --> 00:34:58,513
necessarily. You get, you'll get like the frame that

539
00:34:58,533 --> 00:35:02,294
was flagged. But you, you train, like

540
00:35:02,314 --> 00:35:05,475
you just said, the data that you gather, you use to

541
00:35:05,875 --> 00:35:09,296
train the models. And so I, and

542
00:35:09,316 --> 00:35:12,957
I'm sure this probably varies customer to customer, but I'm wondering on

543
00:35:13,317 --> 00:35:17,338
the privacy side, you know, if we're thinking

544
00:35:17,358 --> 00:35:20,979
about schools or public places, these big box retailers, for

545
00:35:21,039 --> 00:35:24,420
example, how you work on the data side

546
00:35:24,480 --> 00:35:28,380
to ensure privacy and what specific components,

547
00:35:28,440 --> 00:35:31,721
I guess, of these images that you'll

548
00:35:31,841 --> 00:35:35,622
use to train your models on? Is it anonymized in

549
00:35:35,682 --> 00:35:39,263
some way, faces cut or blurred out or whatever?

550
00:35:43,823 --> 00:35:47,893
For our training data, we try not to obfuscate

551
00:35:48,273 --> 00:35:51,934
as much as possible because we're concerned

552
00:35:51,974 --> 00:35:56,075
that it will affect the context of the model. But

553
00:35:56,796 --> 00:36:00,557
for any data that's coming from a customer, we do remove

554
00:36:00,597 --> 00:36:03,918
faces so that we're not infringing on any privacy of

555
00:36:03,958 --> 00:36:07,279
the customer. Obviously, it's something that we get the customer's permission ahead

556
00:36:07,319 --> 00:36:10,559
of time. We're very close with our customers in

557
00:36:10,579 --> 00:36:14,118
that respect. But on the data training

558
00:36:14,178 --> 00:36:17,380
side, we've, from the very beginning, tried to

559
00:36:17,441 --> 00:36:21,023
distance ourselves from any sort of biometric detection

560
00:36:21,083 --> 00:36:24,806
or recognition, specifically to address that privacy concern.

561
00:36:25,707 --> 00:36:28,830
So instead of detecting a person and

562
00:36:28,870 --> 00:36:32,453
then detecting a gun, we're strictly looking for a visible

563
00:36:32,513 --> 00:36:35,575
firearm. And all of our data that is

564
00:36:35,715 --> 00:36:41,654
annotated is annotated for that specific firearm. Avoiding

565
00:36:41,674 --> 00:36:45,015
the facial recognition piece, avoiding any sort of biometric analysis has

566
00:36:45,055 --> 00:36:48,257
allowed us to distance ourselves from privacy concerns with

567
00:36:48,317 --> 00:36:51,578
our customers. And because of that, I think our customers trust us.

568
00:36:51,998 --> 00:36:55,419
They trust that we're not looking at their live video. They

569
00:36:55,459 --> 00:36:59,761
trust that we're not detecting

570
00:36:59,901 --> 00:37:03,362
or biased in any way that would affect

571
00:37:06,425 --> 00:37:10,107
And there's also a security side as well. Obviously, there's

572
00:37:10,147 --> 00:37:13,809
security to everything we're talking about, you're a security company, but the security

573
00:37:14,049 --> 00:37:17,751
of the model, the security of the cameras and the warning system.

574
00:37:18,671 --> 00:37:22,293
And you talked about on-prem deployments.

575
00:37:23,774 --> 00:37:26,876
But I'm wondering about, and I'm just gonna throw a

576
00:37:26,916 --> 00:37:30,037
kind of stupid, like movie scenario at

577
00:37:30,057 --> 00:37:33,979
you. Would it be possible

578
00:37:34,360 --> 00:37:38,122
or, if possible, how do you mitigate for

579
00:37:38,342 --> 00:37:41,704
someone to somehow gain access to or

580
00:37:41,724 --> 00:37:46,667
hack into your system to trigger a false alarm? I

581
00:37:46,787 --> 00:37:50,189
can't, I don't know exactly how that would work, but if they were to gain access, like,

582
00:37:51,029 --> 00:37:54,391
is that a real security concern on your end? That

583
00:37:54,471 --> 00:37:58,013
your system could become compromised in a way that sends

584
00:38:01,235 --> 00:38:04,658
I would say that's probably lower on my list of concerns. I

585
00:38:04,678 --> 00:38:08,042
mean, getting hacked in general or falling prey

586
00:38:08,102 --> 00:38:11,225
to some sort of social engineering is always in the back of my

587
00:38:11,265 --> 00:38:14,468
mind and probably is for any technology leader or

588
00:38:15,709 --> 00:38:19,213
startup founder. But

589
00:38:19,393 --> 00:38:22,599
that piece of our business, it always runs through the

590
00:38:22,619 --> 00:38:26,565
ZOC. So the only way for an image to be dispatched to our customers is

591
00:38:26,846 --> 00:38:30,352
for somebody in the ZOC to actually click dispatch

592
00:38:30,632 --> 00:38:33,791
and send it out. Like I said, we have

593
00:38:33,811 --> 00:38:37,154
a really tight connection with our customers. So even in the event that

594
00:38:37,174 --> 00:38:40,996
there was an accidental dispatch, we

595
00:38:41,457 --> 00:38:45,580
would have direct line to the customer to be able to deescalate immediately

596
00:38:45,820 --> 00:38:49,142
if that were to happen. So I haven't run any scenarios with

597
00:38:49,162 --> 00:38:53,165
that specific threat in mind. But yeah,

598
00:38:53,405 --> 00:38:56,567
it's always in the back of my mind that we'll get hacked in

599
00:38:56,607 --> 00:39:00,747
some way that will either, you know, expose

600
00:39:00,787 --> 00:39:04,611
some vulnerable information to the world or like

601
00:39:04,651 --> 00:39:08,194
our customer information or, you know, in

602
00:39:08,234 --> 00:39:11,577
this scenario, you know, affect our ability to dispatch or,

603
00:39:11,637 --> 00:39:15,281
you know, send errant information out into the

604
00:39:15,341 --> 00:39:18,784
universe to our customers. So it's always a threat, but

605
00:39:21,926 --> 00:39:25,730
That's good. That's good. I've

606
00:39:25,750 --> 00:39:29,045
got a couple more for you. And the first thing

607
00:39:29,065 --> 00:39:33,608
is something that I've been thinking about, which

608
00:39:33,748 --> 00:39:37,330
is, I wonder how

609
00:39:37,370 --> 00:39:40,832
much you're thinking about or exploring ways that this

610
00:39:41,913 --> 00:39:46,475
work expands beyond gun-specific detection.

611
00:39:47,796 --> 00:39:51,278
You know, ZeroEyes doesn't have anything about guns

612
00:39:51,638 --> 00:39:55,480
right in its name. And the idea of object detection

613
00:39:55,700 --> 00:39:59,401
tied to, uh, warning systems with humans

614
00:39:59,441 --> 00:40:02,703
in the loop connected to first responders seems to

615
00:40:02,743 --> 00:40:05,984
me that it could scale to other situations. I

616
00:40:06,024 --> 00:40:09,125
mean, like you, I even think about a fire, for

617
00:40:09,165 --> 00:40:12,926
instance, in a, in a school building, the same way that, um,

618
00:40:13,807 --> 00:40:17,268
your systems can tell first responders which camera

619
00:40:17,328 --> 00:40:21,747
flags, you know, an alert, it

620
00:40:21,767 --> 00:40:24,970
would seem to be a very advanced fire alarm for you to be able to tell

621
00:40:25,030 --> 00:40:28,333
someone there's a fire and this is exactly where it is. So

622
00:40:28,533 --> 00:40:32,177
evacuate accordingly, right? Like that kind of thing. Other threats

623
00:40:32,257 --> 00:40:35,340
beyond guns. I wonder if that's something that

624
00:40:36,561 --> 00:40:40,284
From the very beginning, we've been exclusively focused on being the

625
00:40:40,324 --> 00:40:44,348
best at one thing instead of mediocre at a bunch of different things. And

626
00:40:44,388 --> 00:40:47,771
that's served us really well up until this point in the company. That

627
00:40:47,791 --> 00:40:51,352
being said, our expertise in AI and

628
00:40:51,432 --> 00:40:55,393
object detection lends itself really well to expanding into other

629
00:40:55,533 --> 00:40:58,754
use cases like this. I

630
00:40:58,814 --> 00:41:02,395
see it generally as we are a very high-value trigger for

631
00:41:02,555 --> 00:41:05,856
an incident to start. And I would love to expand those

632
00:41:05,916 --> 00:41:09,657
triggers to cover more incidents, things like perimeter

633
00:41:09,677 --> 00:41:12,738
security and intrusion detection, health and safety, to

634
00:41:12,778 --> 00:41:16,503
your point. There's

635
00:41:16,523 --> 00:41:33,426
a lot of work being done on the retail side with

636
00:41:33,486 --> 00:41:37,287
loss prevention. So I see there's applications

637
00:41:37,507 --> 00:41:40,788
across many verticals that touch the same customers that we do.

638
00:41:41,408 --> 00:41:44,609
I foresee that we're going to continue to be focused on guns, but start

639
00:41:44,649 --> 00:41:48,150
to expand into some of these other areas that lend themselves

640
00:41:48,190 --> 00:41:52,271
well. And ultimately, moving

641
00:41:52,311 --> 00:41:55,572
into a future where vision transformers and

642
00:41:55,612 --> 00:41:59,823
large language models are more easily accessible in real time, it

643
00:41:59,843 --> 00:42:03,525
opens up a lot of possibilities. We could take this in a lot of different directions

644
00:42:04,986 --> 00:42:09,309
and ultimately try to solve as many customer value problems as possible. But

645
00:42:10,109 --> 00:42:13,511
yeah, very interested in intrusion detection, other types

646
00:42:13,551 --> 00:42:17,253
of weapons like knives and just aggressive behavior in general,

647
00:42:17,293 --> 00:42:20,915
things like that. And then ultimately we

648
00:42:20,935 --> 00:42:25,724
want that trigger to initiate some sort of valuable response. Today,

649
00:42:25,804 --> 00:42:29,206
that is us sending situational awareness to our customers that

650
00:42:29,226 --> 00:42:32,327
they can respond, but we're still sending a person to deal with a

651
00:42:32,347 --> 00:42:36,329
very dangerous situation. And so I would love to initiate

652
00:42:36,429 --> 00:42:39,950
additional response like, you know,

653
00:42:40,170 --> 00:42:43,452
locking down doors or sending a drone to verify the

654
00:42:43,492 --> 00:42:47,535
incident and just have additional eyes on, I think is a

655
00:42:47,575 --> 00:42:50,997
very likely future for us. If you can imagine a first

656
00:42:51,037 --> 00:42:54,538
responder showing up to the scene after there's been a drone there for five minutes,

657
00:42:54,918 --> 00:42:58,680
showing them what exactly is going on, they have the ultimate situational

658
00:42:58,700 --> 00:43:02,141
awareness that they wouldn't have otherwise. And given

659
00:43:02,161 --> 00:43:05,523
the dangerous nature of first responder jobs and security jobs,

660
00:43:06,103 --> 00:43:09,184
I think it's completely natural that at some point in the

661
00:43:09,244 --> 00:43:12,766
future, there will be some sort of response

662
00:43:16,192 --> 00:43:19,596
Yeah, there's a lot of places you can take it, I guess, when you start thinking

663
00:43:19,636 --> 00:43:23,021
about it like that. Sure. You know, drones, automated

664
00:43:23,081 --> 00:43:26,320
security. I

665
00:43:26,340 --> 00:43:31,223
guess you almost see the beginnings of like a RoboCop type thing.

666
00:43:34,005 --> 00:43:37,448
It's definitely hard to differentiate and diverge your thinking between

667
00:43:37,488 --> 00:43:40,970
what is sci-fi and what is real life. And we get questions about

668
00:43:41,010 --> 00:43:45,153
that all the time. Total Recall type analysis

669
00:43:45,193 --> 00:43:50,439
where privacy is a thing of the past. It's

670
00:43:50,459 --> 00:43:53,542
something we're sensitive to, and as much as possible, I want to

671
00:43:53,582 --> 00:43:57,005
avoid privacy issues, because I care about that personally, myself.

672
00:43:58,926 --> 00:44:02,389
But at the same time, there's so much possibility on the response

673
00:44:02,429 --> 00:44:06,373
side to incorporate drones

674
00:44:06,433 --> 00:44:09,776
and robots, autonomous response, into areas where otherwise

675
00:44:12,253 --> 00:44:15,396
Right. I even think specifically for something like

676
00:44:15,756 --> 00:44:20,500
The Hurt Locker, like if that could be done completely with robots, that

677
00:44:20,520 --> 00:44:24,243
would be good. Sending people in the suits to defuse

678
00:44:24,283 --> 00:44:27,465
these things is crazy. But that also raises an

679
00:44:27,485 --> 00:44:30,748
interesting point that I always think

680
00:44:30,788 --> 00:44:34,691
is interesting because like we've mentioned

681
00:44:34,731 --> 00:44:38,234
several times, you guys have been around for several years before

682
00:44:38,294 --> 00:44:41,756
the kind of boom in AI that we're dealing with now.

683
00:44:43,938 --> 00:44:48,141
And I wonder, I guess, how,

684
00:44:49,422 --> 00:44:52,945
if it got easier to talk

685
00:44:52,985 --> 00:44:57,208
to clients about what your offering is, when

686
00:44:57,888 --> 00:45:01,191
AI became much more in

687
00:45:01,211 --> 00:45:04,613
the public vernacular, and if in a weird way

688
00:45:04,653 --> 00:45:07,734
that also made it a little bit more difficult, because a

689
00:45:07,774 --> 00:45:10,975
lot of it is tied up, as you mentioned, right? With, with

690
00:45:11,515 --> 00:45:14,876
people from outside the industry who are, have a hard time sifting through

691
00:45:15,617 --> 00:45:19,058
fiction and, and, and the hype with what's actually happening.

692
00:45:20,238 --> 00:45:23,540
Yeah. Our biggest challenge in the early days was convincing people that

693
00:45:23,560 --> 00:45:28,301
it wasn't snake oil. And so, I mean,

694
00:45:28,341 --> 00:45:31,713
we ran so many demos at that point, just trying

695
00:45:31,733 --> 00:45:34,814
to prove to people that we could actually detect guns. And then

696
00:45:34,854 --> 00:45:38,476
they said, well, you know, you're showing us you being

697
00:45:38,516 --> 00:45:42,057
detected with a gun. I want to hold the gun and be detected to

698
00:45:42,117 --> 00:45:45,519
really prove it out. And so a lot of it has been proving out

699
00:45:46,540 --> 00:45:50,044
that. AI security of

700
00:45:50,084 --> 00:45:53,787
the past was maybe embellished

701
00:45:53,867 --> 00:45:57,690
on the sales side to an extent that it caused customers to

702
00:45:58,631 --> 00:46:02,114
think that AI wasn't capable

703
00:46:02,154 --> 00:46:05,376
of performing in real time on security cameras. And we spent the

704
00:46:05,416 --> 00:46:08,479
last seven, eight years trying to change that opinion a

705
00:46:08,519 --> 00:46:13,523
little bit. And then going

706
00:46:13,603 --> 00:46:17,081
forward with the

707
00:46:17,141 --> 00:46:21,004
emergence of large language models and vision transformers

708
00:46:21,124 --> 00:46:24,306
and some of the

709
00:46:24,567 --> 00:46:28,890
common topics around them like copyright infringement, potential

710
00:46:29,570 --> 00:46:33,853
lawsuits, these large language models using

711
00:46:33,913 --> 00:46:38,356
public data in some way, it's

712
00:46:38,396 --> 00:46:41,638
caused a lot of questions about how we collect our data, and that's been one

713
00:46:41,679 --> 00:46:45,084
of our strong selling points to the customer is that all

714
00:46:45,124 --> 00:46:48,827
of this data is organic to ZeroEyes. There's

715
00:46:48,847 --> 00:46:52,130
no risk of us infringing on

716
00:46:52,190 --> 00:46:55,453
any copyrights or using any public data. It's all stuff

717
00:46:55,473 --> 00:46:58,975
that we've meticulously developed in-house

718
00:46:59,396 --> 00:47:02,819
and have scrubbed to maintain the highest quality possible. So

719
00:47:02,839 --> 00:47:06,161
I would say those are probably the two areas that we see overlaps with

720
00:47:08,062 --> 00:47:11,344
Yeah. Yeah, you operate at

721
00:47:11,664 --> 00:47:15,026
the intersection in an interesting way.

722
00:47:15,106 --> 00:47:19,409
It's different from a lot of other AI companies that

723
00:47:19,489 --> 00:47:23,111
I talk to. But my last

724
00:47:24,171 --> 00:47:28,713
point, I guess, to leave off here, is,

725
00:47:29,414 --> 00:47:32,954
you know, I wish you guys weren't needed. And it's

726
00:47:33,115 --> 00:47:36,775
interesting, you start the company,

727
00:47:37,916 --> 00:47:41,617
this team of veterans, in

728
00:47:41,657 --> 00:47:45,198
the wake of a devastating mass shooting and,

729
00:47:45,858 --> 00:47:49,299
you know, that's a problem and a crisis

730
00:47:49,339 --> 00:47:52,500
that hasn't really abated too

731
00:47:52,540 --> 00:47:55,721
much. And so here, you know,

732
00:47:55,761 --> 00:47:59,382
where you sit with a technological solution, and

733
00:47:59,402 --> 00:48:03,743
this kind of goal to, through

734
00:48:05,063 --> 00:48:09,544
detection and better response times to mitigate violence

735
00:48:09,584 --> 00:48:12,765
on the ground, whether that's someone has

736
00:48:12,805 --> 00:48:16,146
an airsoft gun, or whether that's, you know, a mass shooting

737
00:48:16,186 --> 00:48:21,427
might be about to happen, we have to get over there. You're

738
00:48:22,107 --> 00:48:25,549
the idea of the kind of technological solution to a problem that bleeds

739
00:48:25,609 --> 00:48:29,090
beyond technology. I wonder what you think about that and what

740
00:48:29,130 --> 00:48:33,252
other levers should be considered or

741
00:48:33,512 --> 00:48:36,713
if the reality of the world we live in is one where it's

742
00:48:36,753 --> 00:48:39,995
like technology is

743
00:48:40,035 --> 00:48:43,416
kind of the best thing we have left to

744
00:48:47,217 --> 00:48:50,440
I also wish we weren't needed. I think about that almost

745
00:48:50,500 --> 00:48:53,961
on a daily basis. As a company, we follow gun violence throughout

746
00:48:54,001 --> 00:48:57,102
the country really closely. And so I

747
00:48:57,162 --> 00:49:00,502
see on a daily basis the news reports of shootings and

748
00:49:00,522 --> 00:49:05,563
gun violence that happen all over the country. And yeah,

749
00:49:06,103 --> 00:49:09,184
when people come up to me and they say, how's business doing? I

750
00:49:09,224 --> 00:49:13,065
say, it's good. Unfortunately, people keep

751
00:49:13,185 --> 00:49:16,926
committing gun violence. And there's probably

752
00:49:16,966 --> 00:49:20,379
deeper issues at play there. But

753
00:49:20,419 --> 00:49:24,181
when I think about our position with the customer, we're

754
00:49:24,201 --> 00:49:28,123
providing a layer of security. And so when we go

755
00:49:28,163 --> 00:49:33,066
into a new customer, they trust us based

756
00:49:33,146 --> 00:49:36,826
on our track record and expansion to basically

757
00:49:36,846 --> 00:49:40,227
be an expert to them about how their security should look. And

758
00:49:40,247 --> 00:49:43,788
so we're able to really provide this great feedback where, you know,

759
00:49:43,828 --> 00:49:47,088
maybe a customer is struggling with their camera system and

760
00:49:47,108 --> 00:49:50,549
it doesn't even make sense to buy ZeroEyes until their camera system

761
00:49:50,609 --> 00:49:53,830
is upgraded. And so we're happy to make that recommendation to

762
00:49:53,870 --> 00:49:57,228
them because at the end of the day, our performance won't be as

763
00:49:57,308 --> 00:50:01,574
great on older cameras, lower resolution cameras. And

764
00:50:01,774 --> 00:50:06,197
it matters to us that our customers have the highest security posture

765
00:50:06,237 --> 00:50:09,679
that they possibly could. So we're in this really awesome

766
00:50:10,280 --> 00:50:13,782
kind of expert position for our customers. And all

767
00:50:13,802 --> 00:50:16,884
good security comes in layers. We're just one piece of

768
00:50:16,984 --> 00:50:20,486
it. And we're trying to address that gap of being

769
00:50:20,526 --> 00:50:23,828
that first early warning sign to communicate to

770
00:50:23,889 --> 00:50:27,131
customers when there's a visible gun, when there's a weapon that's brandished on

771
00:50:27,171 --> 00:50:33,121
their physical site. Yeah,

772
00:50:33,141 --> 00:50:36,503
going forward into the future, I see a huge possibility for us to

773
00:50:37,283 --> 00:50:40,465
be really ingrained with the customer to be that expert and to

774
00:50:40,505 --> 00:50:44,928
provide that

775
00:50:45,188 --> 00:50:48,703
knowledge that's needed to understand

776
00:50:48,743 --> 00:50:51,844
how to respond to one of these incidents. Hopefully, the vast majority of our

777
00:50:51,904 --> 00:50:55,165
customers will never experience one of these incidents, but with

778
00:50:55,185 --> 00:50:58,986
the prevalence of gun violence and how it's expanding, that's

779
00:50:59,006 --> 00:51:02,346
becoming more and more likely. And it's not as easy for customers to

780
00:51:02,386 --> 00:51:05,827
just say, you know, that's not going to happen to me. They

781
00:51:05,847 --> 00:51:09,008
have to be prepared in some way, and we're in this awesome position to help

782
00:51:09,048 --> 00:51:12,269
them prepare. So I think we're just one

783
00:51:12,289 --> 00:51:15,390
layer in their broader physical security, but we're a

784
00:51:15,430 --> 00:51:19,323
critical layer at this point. I hope I answered your question. That

785
00:51:21,483 --> 00:51:25,264
Thanks. Yeah. But from that perspective, the future

786
00:51:25,444 --> 00:51:28,885
is exciting. But, you know, again, dark

787
00:51:31,906 --> 00:51:35,307
Yeah, we as a company, we have to do well

788
00:51:35,347 --> 00:51:38,703
in order to do good. That's one of our kind of principles. In

789
00:51:38,723 --> 00:51:42,424
other words, we have to be profitable in order to expand to more cameras

790
00:51:42,684 --> 00:51:45,765
so that we can cover enough cameras in order to fulfill our

791
00:51:45,845 --> 00:51:49,025
mission, which is stopping gun violence. And

792
00:51:49,065 --> 00:51:52,146
so we live and breathe it

793
00:51:52,346 --> 00:51:55,967
on a daily basis here at ZeroEyes. And

794
00:51:56,807 --> 00:52:00,328
when people come in to work every day, they're singularly focused

795
00:52:00,368 --> 00:52:03,648
on that mission of ending gun violence. It's a beautiful place to work

796
00:52:03,708 --> 00:52:07,140
because of that, but it does come with, um,

797
00:52:12,222 --> 00:52:15,364
It's a heavy mission. Well, Tim, I appreciate you, uh,

798
00:52:15,684 --> 00:52:19,005
letting me steal you away from it for a little bit, um,

799
00:52:19,245 --> 00:52:22,547
and walking me through what you do. Uh, so thank you. Thank