Energy 101: We Ask The Dumb Questions So You Don't Have To

Public data in energy is basically the Wild West, a mix of outdated systems, inconsistent reporting standards, and layers of bureaucracy that make it nearly impossible to use efficiently. In this episode of *Energy 101*, Jacob and Julie chat with John Ferrell, the brains behind WellDatabase, to unpack how all this chaos came to be. From dusty microfiche records to confusing state databases that can’t agree on formats, John breaks down why accessing “public” data often feels like solving a puzzle with missing pieces. He also explains how WellDatabase is cleaning up the mess—turning scattered, messy datasets into tools that actually help operators, analysts, and decision-makers get work done. It’s a surprisingly entertaining look at how something as boring-sounding as data can make or break the energy industry.



Join the conversation shaping the future of energy.
Collide is the community where oil & gas professionals connect, share insights, and solve real-world problems together. No noise. No fluff. Just the discussions that move our industry forward.
Apply today at collide.io

00:00 - Intro
01:50 - Starting WellDatabase
03:53 - Public Data Challenges
08:27 - Storage of Public Data
10:15 - Quality Issues in Oil and Gas Data
11:59 - WellDatabase Data Cleanup Process
16:45 - Founding WellDatabase
18:10 - AI in Data Cleaning
23:28 - Personal Learning and Growth
25:40 - Public Data Reporting Gaps
31:44 - Essential Data for E&Ps
35:10 - Public Data's Importance for Investors
36:05 - Significance of Accurate Data
38:20 - Scale of Data Operations
40:30 - Keeping Data Current
42:58 - Selecting the Right Team
44:45 - Creating a Video Game
49:33 - Personal Anecdote
50:50 - Final Thoughts

https://twitter.com/collide_io
https://www.tiktok.com/@collide.io
https://www.facebook.com/collide.io
https://www.instagram.com/collide.io
https://www.youtube.com/@collide_io
https://bsky.app/profile/digitalwildcatters.bsky.social
https://www.linkedin.com/company/collide-digital-wildcatters

What is Energy 101: We Ask The Dumb Questions So You Don't Have To?

Welcome to Energy 101 with Julie McLelland and Jacob Stiller. Join us on our mission to help raise the world's energy IQ.

0:00 Welcome to Energy 101, where we ask the dumb questions, so you don't have to. Today, we're here with John Ferrell. We've had John on, I think, every single Digital Wildcatters podcast. It's

0:12 been a big goal of mine. Yeah. Yeah, we love it. And I'm so glad you're here because you're always a fan favorite. I feel like you have the most downloads. I feel like you're just being nice to

0:21 me, but I'm going to run with it. I'm good. Yeah, that's great. Yeah, that's impressive. But also, yes, he kills it. Yeah, so why do people care about John Ferrell? What are you doing?

0:32 That's so cool that everyone wants to know about. I mean, it's mainly because I'm handsome, but outside of that, no, it's, you know, what, I think, really, it is. It's like what we do at

0:41 WellDatabase. It applies to everything. I mean, it's so funny because people are like, who are your main customers? And it's like, we've got people who are inheriting mineral rights. We've

0:51 got people, CEOs of major companies who are looking for drilling trends. We've got drilling engineers and we've got landmen. Every piece of the industry has got to pay attention to the public data.

1:04 And so I think that's why I'm on all of them because it works, it works everywhere. And there's pieces of what we do that affect everything. And I'm gonna be totally honest with you, my favorite

1:14 part about it. I will get on the phone and I know people tell me I gotta stop spending so much time on the phone with users 'cause I just enjoy it. But I will talk to a truck driver one day and then

1:23 I'll talk to the CEO the next day and they're both interesting to talk to. And so, but yeah, it's really, that's what it is. We go out and we pull all this public data together so that if you're a

1:34 truck driver and you need to find the directions to your drill site, you can get it. And like I said, if you're a landman who needs to understand lease positioning, we got it. If you're a CEO who

1:44 wants to see large-scale financial statistics, we got it. So it's fun, it keeps it interesting. That's awesome, and to take a step back, you're the CEO, right? Yeah. And

1:57 co-founder of WellDatabase. Yes, yeah, I guess that probably was a good idea. Imagine. And so, yeah, yeah. We started WellDatabase. I joke about this. I can't even tell you when we started

2:07 it. It was 10, 15 years ago, somewhere in between. But like a lot of these stories, it just was an evolution over time. So, but yeah, as far as back story on us, we set out to go out and

2:21 automate all of this extraction of this public data from all of these sources. And you know, and that was really born out of frustration because there's only a couple of players who have done it

2:30 historically. And I'm not gonna name names, but honestly, they're expensive and their data's not very good. And the accessibility isn't, you know, their data models suck. Anyway, the whole

2:42 thing was a wreck. And I'm at a service company trying to write applications that ingest this data so that we can utilize it and I'm just hitting brick wall after wall, trying to get it done. And

2:54 so I turned over to Josh Holt, my co-founder, and I said, Let's just go do this ourselves. Was he also working at the same company? Yeah, yeah, he and I go way back. It's really strange

3:03 because he was a 16-year-old intern at the company I was working at. And so I actually worked with his dad. Oh, wow. Anyway, so we go way, way back. And so everywhere I've gone, I've tried to

3:14 have Josh with me because he is like the best, like bar none, the best programmer in the world. I will put him up against anybody. Oh, that's awesome. So anyway, yeah, he was working with me and I

3:24 was like, screw this, this sucks. Let's go do it ourselves. And so we did. And that, "let's go do it ourselves" to "we did," that's a 15-year period of - Overnight success. Oh, yeah.

3:38 That's exactly it. That's it, man. So, but yeah, it's great.

3:43 It is a problem and it has been a problem. I still hear the same complaints about those people I was using back then today. And so like they're not fixing that problem. They have to come to us to

3:52 get that problem fixed. Okay, let's dive into the problem a little bit. So take us into like a day in the life of when you're working at the service company,

4:02 like what data did you need? What did you need it for? And kind of like what obviously like you said, like it was a pain trying to find it, but like what data did you need? No, that's great. So

4:15 where I was, I was at NuTech Energy and we were doing log analysis and field studies and things like that. And when you're doing that work, you'll get a log off the truck and that's cool.

4:24 That'll have all these log responses, tools, and you can do just a blind interpretation on it. And that's fine too. But so much of the successful interpretation is understanding the geology and

4:34 the dynamics of the area you're in. And so you can't just look at that one well in space, you know, just, I mean, you can evaluate it, but you're going to be making a lot of assumptions. However,

4:45 there's wells all around, you know, I don't know if you guys have flown over Midland, like when you fly into Midland Airport, there's like a checkerboard out there. There are so many wells to be

4:53 drawing information from. And so that's what we would do. And we like, we have to analyze one well. And so we grabbed 10 wells around it. And we see what do those wells look like? What, I mean,

5:03 we don't always have all the log data, but what you do have is what's in the public space. So you got dates, when it was drilled, how deep it was drilled, where it was completed, what it

5:11 produced, and then you use all those inputs to help guide your analysis on your target well. And so from a service company standpoint, everybody pulls offsets to understand what's going on around

5:24 them and make sure that the decisions they're making for their well or their analysis, it adds up to everything else. 'Cause you look really dumb if you go out there and you're like, yeah, this

5:33 well, this is a good zone here, you should perforate this zone. It's, yeah, this is gonna be hydrocarbon, and they perf it and it's water. And you would know that if you got the public data

5:46 and you looked around and say, where this was perfed. And there's other things, it's more complicated than that. But at the end of the day, finding out what your neighbors are doing and having

5:54 that kind of baseline intelligence. I mean, that's what I was doing at the service company, but that actually has so many more uses across the industry. So what we would do is we would go grab

6:06 that public data and we'd pull it into our systems that Josh and I were writing and working with, and then the engineers would pull that data into their analysis. And so it was just

6:16 populating these databases. And at the time, we had to have a staff - I think it was just four people, but still, the company had 100 people, and four people were dedicated to going and

6:26 getting that data and manually putting it in the database because there was no great way. It was all in different formats and different sources and all kinds of things. Um, so we were calling the

6:36 provider and like, we want to use an API, the application programming interface API, not the API number, but we're like, we just want to go pull that data in automatically. We can save tens of

6:46 thousands of dollars. If we can do that, we, we don't need all these people wasting their time fixing data and getting it in place. And so they were like, yeah, we don't have that. And we were

6:56 like, this is ridiculous. It makes no sense. And so, um, yeah, that was like the biggest, that combined with, right around that time, we got like a renewal notice. And like I said,

7:08 this company wasn't huge. We were paying like 60K a year for this data. And we got a renewal notice that it was going to like go up 50%. And that's not workable. But of course, we've got these

7:20 processes that were all kind of intertwined with it.

7:24 This one, we got that contract renewal and I took it to the management team and they're like, okay, we got to think about this. And then like a week later, they're like, no, we can't, we just

7:32 can't do it. So I go back and tell them, no, we can't do it. And like, oh, too late, it's already renewed. You got to pay it. Oh my gosh. And it was like, can we get a discount? Like,

7:41 nope, not at all. They lock you in. That's it. And so that's relevant too, because from a business standpoint, I won't do it. I refuse. We won't lock people in like that. And so we do

7:54 auto-renewal terms, but that's like, so you don't have to go to a lawyer every time to renew an agreement, 'cause when you do these corporate agreements, you have to. But every single one of them

8:03 is opt out, actually our newest revision is it's all opt in. We have to have your permission to renew it. But if you say, okay, then the contract can just roll on. So we save the lawyer part,

8:16 but we also, we refuse to trap people. It is wrong, yeah, and it usually leaves a bad taste in your mouth and you don't ever want to use them again. Yep. Okay, so all of this public data is all,

8:31 so is it like required for every operator to submit this data and where do you find it if you weren't using that service? Yeah. Where is it at? Before that existed, where was it? Right, no,

8:44 okay, so yeah, at this most basic level, every single state manages who can poke holes in the ground in their state. That's for water wells too. Like if any of you ever lived out in the country

8:55 and you got to drill a water well, you got to go get a permit to drill that well, which is fine, it's easy to do, but that county, or in this case, the Railroad Commission in Texas, say, you

9:05 have to go to them and say, I want to drill a well here, here's all the information about this well. And then as a matter of public record, they publish that information out so everybody can know

9:15 where these wells are because there's lots of dynamics about - What acreage is the well surface drilled in? Where does the bottom go? Who owns the surface rights in this place? Who owns the mineral

9:27 rights? There's a lot of reasons the public needs to know when a well is permitted. And so, yeah, you file that permit and then that makes it into the public space. And then as you go through,

9:40 'cause huge numbers of permits are filed that are never drilled. And so now it's like, okay, now we need to know when it was drilled. And so that's another filing that comes in and then when it's

9:49 drilled, when it's completed, that's another filing that comes in. And then when it starts producing, they start data basing that. And so all of these pieces are all just regulated and they're

9:59 made a public record and there's lots of reasons for it. But the fact of the matter is, is that

10:06 every state manages their own piece of that data and it goes from permit to plug. So everything in between gets licensed at the state.

Reporting requirements will vary and the way they classify things will vary. I was about to ask that. Yeah. So the data is not, like, normalized across? It is so unnormalized. That's the problem you're

10:29 solving. Yeah, and that's exactly it, and I get so annoyed with it, because we're like, this state decided to rename a column for fun. This state decided to, like, completely botch all these

10:40 connections. It's like, you want to go put someone who actually has to use it in that position. Who is controlling the data for these? I know, and these people, I mean, I love them. I know one hundred

10:52 percent of public agencies are short-staffed, and especially in a place like New Mexico, where, like, Lea County and Eddy County are two of the most popular counties right now and every other county's

11:04 got nothing. And so the state of New Mexico is like, you only need like two people, and you've got literally thousands of permits backing up because those two people cannot handle the sheer amount of

11:15 data coming their way. So, but regardless, the, the, um,

11:21 every state, golly, they just require different things at different levels and different granularities, and honestly, even the sheer amount of typos in the data would blow your

11:32 mind. And so that goes back when I said that, like, we'll just go get the data ourselves. And I've heard other people say that, that is the dumbest statement that you could ever make. The data

11:42 is a wreck. It is unusable, and if you think you can just go pull it from the state and use it, I mean, I'll give you a 50/50 shot if you're grabbing fewer than five wells that you can use the data,

11:53 and that it's reliable. So how does it go from the state public record to do you all go and clean up that data? Yeah, I

12:05 will say this, when we started, we didn't, actually. We were like, we're going to keep it transparent. It's going to be so traceable, and in my mind, as a data guy, I'm like, that's a good

12:13 thing, you know, you can see. We're not jacking with this data. We're showing you one to one. But we quickly came to realize that that's so unusable that it wasn't an option.

12:24 And so before we even went live with the product, we had to come back and like, okay, we have to clean this garbage. And it's so much worse than anyone knows. And if you talk to anyone in the

12:34 industry, they'll tell you the data's bad. It's worse than they're saying, it's horrendous. I feel like I'm getting a small taste of that building Collide AI and just seeing the types of data we're

12:46 handed. And like the headaches our team has from trying to figure out how to use it at scale. Like it's almost like you have to go through every single, and this is an insane amount of documents.

13:04 It's, I feel like as an industry, we should all decide to follow the same. And I guess that's what we're solving, right? Like follow the same format, like each service company have the same

13:14 things. I love the idea and that is very optimistic. It has been tried a little bit. It fails pretty horribly because everyone's got their own ideas of how they do it. And I'd say, engineers,

13:28 that's why. Well, and once they argue enough, they're like, screw this, we don't need this crap. And they go off and do their own thing. So yes. But I mean, God, I think the company Shell,

13:37 Shell, you know, the massive oil company, I think I looked at it, and there's something like 220 versions of that name in the public data set across the US and Western Canada. So like, it's

13:52 that's a minor one, because there are bigger issues. Like people will be like, what's the bottom hole pressure? And you go to the states, and some just say pressure, and some say five-minute

14:03 shut-in pressure, and some say, you know, casing pressure, and some say. But it's all the same thing?

14:12 They can be, or they might be different. Because for each one that has a naming of it, there are five that just say pressure, and you're like, pressure where? How? Anything, help me, God.
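
A minimal sketch of the name-variant problem described above, assuming a crude rule-based approach (lowercase, strip punctuation, drop corporate suffixes). The suffix list and the sample spellings are made up for illustration and are not taken from any state data set or from WellDatabase's own cleanup process.

```python
import re
from collections import defaultdict

# Hypothetical suffixes to ignore when comparing names; a real list would be longer.
CORPORATE_SUFFIXES = {"inc", "llc", "lp", "ltd", "co", "corp", "company", "corporation"}


def normalize_operator(raw: str) -> str:
    """Reduce a raw operator string to a crude comparison key."""
    # Lowercase, spell out '&', and turn remaining punctuation into spaces.
    text = raw.lower().replace("&", " and ")
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    # Drop corporate suffixes; split/join also collapses repeated whitespace.
    tokens = [t for t in text.split() if t not in CORPORATE_SUFFIXES]
    return " ".join(tokens)


def group_variants(names):
    """Group raw spellings under their normalized key."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_operator(name)].append(name)
    return dict(groups)


# Made-up spellings for illustration only.
SAMPLE_NAMES = [
    "Shell Oil Co.",
    "SHELL OIL COMPANY",
    "Shell Oil Co",
    "Shell Western E&P Inc.",
    "SHELL WESTERN E & P, INC",
]
GROUPS = group_variants(SAMPLE_NAMES)

for key, variants in GROUPS.items():
    print(f"{key}: {len(variants)} spellings")
```

Rules like these only catch formatting differences. Typos and genuinely different legal entities would still need a reviewed lookup table or fuzzy matching on top.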

14:22 Anyway, so it's a nightmare. It really is, and that even, I mean, God, I could keep going, so you'll have to shut me up at some point, but even, this is always my favorite, in Texas,

14:34 not to get too technical, but like production is reported at a lease level. It's just a weird oddity that is challenging to work with, just in general, and I can explain more if you want,

14:45 but the, before a well is assigned to a lease, 'cause it's a paperwork to get a well assigned to a lease, but they drill it, they complete it, they start producing it, and it doesn't even have a

14:54 lease yet, so the state has what they call, like, a pending lease production table. And you can go to that table and download that data, and once it gets assigned to a lease, they'll fold it over into

15:04 the lease, and it's all fine in theory. Some years ago, there used to be a disclaimer on it that said something like they studied this data and 50, 60% were duplicated values. They've taken that

15:17 disclaimer off and I can tell you from the data, it's still a problem. And so I'm imagining people who walk in and just grab that data set and say, oh, I can use this. They have no clue that that

15:28 data is now double reported in two different ways. And so, again, we talked about the obvious things, misspellings, difference in nomenclature, but then the unobvious things are like, if you'd

15:42 been in this industry looking at this data for 15 years, you would have seen it come and go and know that it's a problem. If you walk in it today, you'd be like, oh, there's no problem. It's so

15:49 great. Yeah, it actually gives me anxiety hearing you talk. I'm like, how do you do this every day? I love it, I love it. No, it is. So again, not to get too much in depth,

16:01 but every source in our system has to have a dedicated code base because there is no universal kind of translation. So everything has to do one-by-one mapping of what this field means from this state.
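
To make the one-by-one mapping idea concrete, here is a rough sketch assuming each source gets its own lookup from raw column names to a shared model. Every column name below is invented for illustration, and the layout is an assumption, not a description of WellDatabase's actual code. Note that a bare "PRESSURE" column is mapped to an explicitly unspecified field instead of being guessed at.

```python
# Hypothetical per-source field maps: raw column name -> canonical field name.
# The state codes are real; the column names are made up for this example.
FIELD_MAPS = {
    "TX": {
        "API_NO": "api",
        "OPER_NAME": "operator",
        "SI_PRESS_5MIN": "shut_in_pressure_psi",
    },
    "LA": {
        "WELL_SERIAL": "state_well_id",
        "ORG_NAME": "operator",
        "CSG_PRESS": "casing_pressure_psi",
    },
    "NM": {
        "api": "api",
        "ogrid_name": "operator",
        "PRESSURE": "pressure_unspecified_psi",  # no way to tell which pressure
    },
}


def to_common_model(source, record):
    """Translate one raw record into the shared model.

    Returns (mapped, unmapped) so columns nobody has mapped yet are kept
    visible for review instead of being silently dropped.
    """
    mapping = FIELD_MAPS[source]
    mapped = {"source": source}
    unmapped = {}
    for column, value in record.items():
        if column in mapping:
            mapped[mapping[column]] = value
        else:
            unmapped[column] = value
    return mapped, unmapped


# Placeholder values, not a real well.
ROW = {
    "API_NO": "42-000-00000",
    "OPER_NAME": "EXAMPLE OPERATING LLC",
    "SI_PRESS_5MIN": 1250,
    "REMARKS": "n/a",
}
MAPPED, UNMAPPED = to_common_model("TX", ROW)
print(MAPPED)
print(UNMAPPED)
```

Keeping the unmapped columns around is a design choice for this sketch: when a state renames a column, the change shows up as a new unmapped field instead of a silent gap.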

16:14 And honestly, there are times we have to call the state and then be like, what does this field mean? Or you look at their paperwork and like define how they're supposed to report things. I mean,

16:23 you have to dig in to actually understand it all. So but each source gets its own code base. And then all of that rolls up into a data model, which then we run rules over to fix the data. I mean,

16:35 and I'm not talking about rules that are super complex. I'm like, the lat/long for this well, is it in the county they say it's supposed to be in? Yeah. Sometimes it's just not. And so it's that,

16:46 yeah, it's bad, it's awful. So when y'all started, did you start like just in Texas and then you started branching out to other states? 'Cause it seems like that would be very like, it's like

16:57 you dig in and you're like, yes, we're gonna do this, let's start in Texas. Y'all get it figured out. And then you're like, go to the next one?
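
The county rule mentioned a moment ago (is the reported lat/long actually inside the reported county?) can be sketched as a point-in-polygon test. This is a minimal version using the standard ray-casting method; the county name and its rectangular outline are placeholders, and a real check would load actual county boundary polygons.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray cast from the point toward the east.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def flag_location_mismatches(wells, county_polygons):
    """Return ids of wells whose point is outside their reported county."""
    flagged = []
    for well in wells:
        polygon = county_polygons.get(well["county"])
        # An unknown county name is also worth a human look.
        if polygon is None or not point_in_polygon(well["lon"], well["lat"], polygon):
            flagged.append(well["id"])
    return flagged


# Placeholder county outline (a simple rectangle), not a surveyed boundary.
COUNTY_POLYGONS = {
    "EXAMPLE": [(-102.3, 31.6), (-101.8, 31.6), (-101.8, 32.1), (-102.3, 32.1)],
}
WELLS = [
    {"id": "W1", "county": "EXAMPLE", "lat": 31.9, "lon": -102.0},
    {"id": "W2", "county": "EXAMPLE", "lat": 31.9, "lon": -100.0},
    {"id": "W3", "county": "EXAMPEL", "lat": 31.9, "lon": -102.0},  # typo in county
]
FLAGGED = flag_location_mismatches(WELLS, COUNTY_POLYGONS)
print(FLAGGED)
```

The rule only flags records. Deciding whether the coordinates or the county name is the wrong one still takes other evidence.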

17:05 Or did you already work in all of it, all the states? So you kind of knew like what to expect. No, I did not. Unfortunately, it's comical, but no. I, it was exactly like you said, we were

17:16 like, start with Texas, we'll grab that data. We pull it all in, we have it all mapped out and we're like, great. And we go to Louisiana and we're like, huh, okay. What is this? What, does

17:26 that map to this? Does this, where does this go? And then you get, you know, three, four states in and you have to circle back around to Texas as you realize now, three, four states in, that

17:35 like where you put this, probably needed to move here. And like you do these. How many times did y'all remap it?

17:42 I don't even want to say. No, seriously though, like, so we started, our official start for this company was like in '09. And we didn't release a product until 2012. And those three years

17:53 we spent fighting that data, trying to figure that data out. And even at that, when I look back, what we had in 2012 is a joke.

18:04 It's no wonder it was hard to get off the ground because it was not what it needed to be. That would take years more. Okay, so how have y'all been able to use AI at all to help with your, with the

18:17 data at all? We use AI in the most unsexy ways, which, if you get into AI, like, the sexiest things arrive with the most trouble, the most kind of hit-or-miss results and stuff, but like seriously,

18:30 the AI training we do around reading unstructured documents, like PDFs, TIFFs, and all these things, like that has taken what some technologies, I mean, they'd just read it in and it'd be gobbledygook,

18:43 but since we've been able to employ these AI models around reading data, not anything oil and gas related, reading images, we have honed this thing into just an insane precision to where we can

18:54 go grab some of these old data points, so this is funny In 1993, the Railroad Commission only had production

19:05 What is a microfiche? You seriously asked what a microfiche was? It's a microfiche. Yeah, of course.

19:12 It's a thing you would... Oh, man, I'm so old. They had them at the library back in the day, and you would put it in the machine there, and you'd load in these cartridges or these films, and they had this

19:17 big black and white screen, and you'd have to rotate through them. You never heard of this? Oh, my God. They're still at libraries. Yeah. And it scrolls through old newspapers and documents. The

19:20 newspapers are all in there.

19:35 Yeah, those are really cool. Oh. Yeah. But it's a far cry from a computer. Let's make this. I want to go. Okay. I'm going to go find one at the library. Tell me if you find one. Yeah. We used

19:45 one in Pittsburgh during our documentary. We used it to like, we filmed it as B-roll, and I've seen it like in movies and stuff. But like, I guess it's some kind of way they just keep archives in

19:55 a safe way where you can't like put your fingers, gross little fingers all over 100-year-old documents. Like, right? No, it's really amazing. But that's where it works.

20:04 Flip through that? I mean, so yeah, we didn't do that part. However, we were able to get them. The Railroad Commission actually is the one who undertook this project of scanning those, but all

20:15 they did was scan the microfiche. And so if you take the ugliest scan of any image you've ever seen and like cut the quality in half, that's what we're talking about. It's handwritten stuff.

20:25 I was about to ask, yeah, handwritten. And it's like a bad scan. How do you deal with that? It's where the - That a human can't even read. Right, no, and that is a weird area because an AI

20:35 model properly trained reads it better than a human can. And that's what I'm saying. It's so unsexy to talk about, we use it to read microfiche. But it is so useful. It is sexy. It's literally

20:48 CSI-enhanced. Yeah. That's exactly right. I mean, that's as sexy as it gets. That's prime time.

20:56 But they compare it to ChatGPT, where I want to type something that talks back to me. They're like, Oh, that's so great. Or, Make me an image of some dude with eight fingers. Because I don't

21:05 know why AI struggles with hands like it does. Hands and words. But yes, we

21:12 use it for these back-end systems. We have so much AI and even machine learning tuning that's done into those systems, which, again, just helps feed a better quality of data.

21:25 And then on the front end stuff, we've started some things that are pretty cool with AI, some things that we even work with John and them over here about with being able to query our database. But

21:38 this is also a weird thing, too. AI and numbers are funny. And so we don't work that way. We actually trained our AI to use WellDatabase, which is maybe a narrow difference,

21:52 Instead of spitting out an answer, what it does is it uses our tool, which can give you the answer if you knew how to find it. But it uses the tool for you. And then instead of saying, you ask

22:04 it, how many wells were permitted in the Permian last year, it could just spit out 8,222 and you'd be like, Okay, I guess, I don't know. So instead, ours will go through and it'll go apply

22:16 the right filter, zoom you to the right area, give you the right dashboard. And then all of a sudden, you no longer, it doesn't tell you the answer, it shows you the answer. With all of the

22:25 context you might need about who was doing it, when they did it. And it just, it alleviates that idea of, 'cause even now when they cite their sources, it's always Jim's rig count that said there

22:39 was 200 rigs last week and you're like, how do I know if that's true or not? Anyway. It's mind-boggling, like, how LLMs pull from sources. It'll have an answer very confidently and it'll

22:53 source it, but what it does is it'll take it out of the context and reword it. And so there's sometimes where it's like, oh, that's a very definitive answer. And then you go click on the source

23:03 and you see how it is in the article. And it's like, this isn't even remotely what it's saying, but like they Frankensteined the words together. I love it when it's like - I hate

23:14 it. Literally the opposite. That's my favorite one. When you find it, that they stated it, but they literally had twisted it. And they said that it's saying it is something, and the article, if

23:23 you look at it, says it's not - Yeah, I've seen that too, it's ridiculous. It's so much fun. But you know, I love it. The large language model world is really interesting because it gets you really

23:34 thinking about how you learn as a person and why the words and the vectors and the way that it's all built makes sense because you were taught by a teacher standing up there using words that then you

23:47 reframe to understand. And that's what the AI model is doing too, except with

23:54 tons of wrong information, tons of weird sources, like amalgamation of the way words are structured. It's like if you had 17 teachers teaching you, you know, math, but like six of them taught it

24:07 one way and five of them taught it another and then three of them taught it wrong and it's like all in one. So, but yeah, back kind of to the question. Fascinating way to put it. Yeah, but

24:22 the way we use AI and stuff is actually like super effective. It makes me so happy that it's possible because, I mean, ten years ago, we tried some of these things and we had to shelve it because

24:33 the technology wasn't going to make it work. And just over time, it came around and now we do things that are, like, things I never thought would be possible with pulling out data from horrible

24:43 scans and things like that. And I feel like the technology is just getting like. It's like compounding now. It really is. And that's where the area is quicker. Yeah, and it's always been my

24:53 weird thing about our industry's hesitation to

24:57 adopt new technologies. The beauty of what we do is that we're able to break it down to data ingestion, not just oil and gas related things. And so we start to be able to open up tools that maybe

25:12 the medical profession might use or any of these other horribly old industries. Government. Yeah, we are able to get a lot of benefit from what goes into that, which is, yeah, it's really cool.

25:24 But again, it's not something you'll find us posting on our marketing, because people are going to be like, OK, that's interesting, I guess. But it is. You're right. It is the coolest part.

25:34 It is kind of like, figure out the thing no one could read before. Yeah, yeah, that's very cool. Can we get into more about how public data works? So like, if Shell is doing a rig in Midland,

25:47 in the Permian, they have to give up some percent of that information to the Texas Railroad Commission, right? And that's the public data you're able to see or any, like we can literally go online

26:00 and see it ourselves. Right, yeah, so yeah, yeah. And then just let you go on, does Shell like get to keep some of their data for themselves? And like, I'm assuming they want to keep as much

26:13 as possible, right? So like, how does that mindset work with like these big operators? Right, no, that's an excellent question. So there are minimum requirements. There are always going to be

26:21 minimum requirements. And those minimum requirements, you'll find, are largely tied to kind of how leasing works. Because when people lease land to drill on, they lease the land, the surface area.

26:35 And then they're capable of leasing certain depths of minerals, but not others. Um, when you've got these land areas, like section-township-ranges, which Texas doesn't do, but a lot of other

26:48 states do, um, when you own percentages of a section, because the section's 160 acres. And so if you've got 20 acres of this section, someone else has 140, well, when you drill a well in that

26:59 section, a lot of times the way that works is that you get now your percentage of the section from that well. So that well, you know, if they're, and they put 160 acres, you don't get a lot of

27:10 wells per section. You can get something, but anyway, so when you think about the granularity that you have to report, it's largely around that, as well as at what depth, what formations

27:22 it's being produced from. And then of course, since we're on horizontals, which section-township-ranges does it traverse, because honestly, the leasing rights may be bigger,

27:34 but like the individual ownership pieces or mineral rights, those parts will be really broken up. A lot of times these people will have fractions and, I mean,

.00058 percent, you know, mineral royalty rates. And so from the public side, those data points are really important to put in because if you're a mineral owner, which today, you know, a lot of

27:59 minerals are being passed down through families and generations and stuff. And so you'll have someone who gets these minerals and they just have no clue what it means. And so they need to be able to

28:10 go to the state and say, Here's where my land is. What other wells are in my section? What other wells have been drilled in this formation? What other wells are, you know, going underneath my

28:19 land versus, you know, being on the surface? And so those pieces are all a part of what the permit is, as well as some basic contact information, you know, who's doing it, so if you have a

28:28 problem, you can call them and that kind of thing. So that's part of all the permitting. And then when it gets to the other parts, like the completions, that's where the data gets fuzzy as to how much

28:40 they have to report. And it largely exists, I believe, to kind of confirm the permitting information. So, you know, permits are, I'm planning to do this. The completions are, I did do this.
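
The permit-as-plan versus completion-as-actual distinction can be sketched in a few lines of Python. This is only an illustration, not how any state or WellDatabase stores the data: the `PerfInterval` type and `compare_plan_to_actual` helper are hypothetical names, and the depths are illustrative figures in feet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PerfInterval:
    """A perforated interval along the wellbore, in feet (hypothetical type)."""
    top_ft: int
    base_ft: int


def compare_plan_to_actual(permit: PerfInterval, completion: PerfInterval) -> dict:
    """Summarize how the reported completion compares with the permitted plan."""
    return {
        # True when the actual interval sits entirely inside the planned one.
        "within_permit": permit.top_ft <= completion.top_ft
        and completion.base_ft <= permit.base_ft,
        "planned_length_ft": permit.base_ft - permit.top_ft,
        "actual_length_ft": completion.base_ft - completion.top_ft,
    }


# Illustrative depths: the plan is a round-number range, the actual is precise.
planned = PerfInterval(top_ft=12_000, base_ft=16_000)
actual = PerfInterval(top_ft=12_226, base_ft=15_842)
result = compare_plan_to_actual(planned, actual)
print(result)
```

A check like this only confirms that the two filings agree with each other; it says nothing about whether either filing is correct.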

28:53 And so, again, they're similar but different, but you need to know both of them. And so it's a lot of the same type of information. It's just more actuals, with details about, you know, their

29:03 permit might say, we're trying to perf 12,000 to 16,000 feet. And the completion will say we perfed 12,226 feet to 15,842 feet, something like that. But again, all relevant because the way

29:17 leases are broken down, the way mineral rights work and all those things like that. And furthermore, there's a handful of technical details. You can find reasons why the public wanted access to

29:28 that for almost every one of them. So a lot of states require a log to be filed, but that log is not necessarily like every tool that was run and stuff. It's a fragment of it. And then when it

29:40 goes to frack, there's obviously FracFocus, which is an independent group that will take in what was fracked, how much water was fracked, what chemicals were fracked, how much sand was fracked,

29:52 where it was fracked, all those things like that, and that's another one where public knowledge there is pretty pivotal. And then furthermore, the production, again, every year, every month, you might

30:02 get a check from an operator for your mineral rights. You need to be able to go verify that you got the right amount. And so there's some math involved, but still. So I guess what I'm saying

30:12 with all that is that everything circles around what's in the best interest of the public to know about the activity. Now, there'll be some states who go a little further and I can't exactly say why,

30:24 but I appreciate it because they give us just more data and it's more data to work with. And then there's other states that are kind of fuzzy on what they're doing, or you might have one regulatory body

30:37 that handles the permitting and a second regulatory body that handles the taxes on the production. And so tying those two together is a bit of a nightmare. But again, the public data really centers

30:50 around the things that are beneficial for the public to know. And then the industry uses of the data really start to - if they're having to tell us where they're permitting, now we're watching

31:02 activity happen. They tell us where they're drilling. We're seeing where they're drilling, and then the production, again, we see how well they're producing. So now we can take all the information

31:11 about what they completed, what they fracked, and what it produced. And I can use that information when I go to plan my well and make sure that I'm making the most out of what I got. But yeah, to

31:20 the point of what's provided, it is largely around what the public could benefit from. We

31:28 have a lot of customers who are just people off the streets, and that's nice. I like those people. They're interesting. But our biggest customers are oil and gas operators, service companies, for

31:38 all the reasons we talked about, you can utilize that information to hone your services and stuff like that. Yeah. So if someone wants to start a new project,

31:48 you know, obviously they're going to try and take as much public data as possible, but, and like that saves money, right? Like they just, they kind of have a bunch of free information there.

31:57 But

31:59 like, do they still need to like do basic E&P stuff? Like, do they need to, can they skip wireline if there's information? Like how does that actually work when planning like a new project and

32:12 an operation? I think unfortunately a lot of them view it that way. They view it as like a statistical view of like, I have this data for 100 wells around my well. I know enough about what's

32:25 happening underground to make my decisions. And save money. And save money. Save a lot of money. Especially because, you know, we're drilling these, you know, sometimes four-mile laterals.

32:36 Sometimes we forget, especially when we're in the industry, that all of the things we're talking about are happening literally two miles underground and then four miles further away. It is so out

32:49 there what we're actually trying to analyze. Nothing we could ever see in real life. And so anyway,

32:57 to me, that's more of a PE mindset, a financial mindset of like, I'll grab 100 wells and I'll take the average of what they do and that's what this well will do. And it'll be in that ballpark.
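
The "grab the offsets and take the average" approach can be sketched as below. This is a rough illustration under stated assumptions: the function name `estimate_from_offsets` is hypothetical, and the volumes are made-up numbers (thousand barrels), not real well data. It reports the spread alongside the average, since the average alone hides how much the offsets disagree.

```python
from statistics import mean, stdev


def estimate_from_offsets(offset_volumes: list[float]) -> dict:
    """Estimate a new well's output from nearby (offset) wells.

    Returns the average plus simple measures of how widely the offsets vary.
    """
    if len(offset_volumes) < 2:
        raise ValueError("need at least two offset wells to measure spread")
    return {
        "expected": mean(offset_volumes),
        "spread": stdev(offset_volumes),  # sample standard deviation
        "low": min(offset_volumes),
        "high": max(offset_volumes),
    }


# Made-up cumulative volumes for five hypothetical offset wells.
offsets = [410.0, 380.0, 450.0, 395.0, 365.0]
estimate = estimate_from_offsets(offsets)
print(estimate)
```

The estimate is only as good as the assumption that the new well resembles its neighbors, which is the caveat raised next in the conversation.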

33:07 And it probably will be because especially in unconventionals, those zones, they're, okay, there's some variability in it, but they're really kind of similar across all the things. Now, yeah,

33:21 we could get into all the details of what aren't similar and where people shoot themselves in the foot, which is why the science is a mistake to skip over, because while these wells all might have the

33:31 same rock there. You might have different pressures. You might have different natural fractures and all these things need to be measured before you know they exist. Otherwise you go frack 'em and I

33:42 mean, there's so many things that could go wrong. So I won't even go in all those details. But you do get an enormous amount of intelligence by your offset wells. I don't personally believe it's a

33:53 substitute for the science. I think the science is still really necessary if you want to remove as much risk as possible, which these things are expensive, these wells. When you screw up a well,

34:04 you, I mean, that's horrendous. It's a big hit. So saving 100 grand off of a $15 million well,

34:14 I don't think that risk is worth it personally. But you do need, I mean, taking in those offsets gives you a huge insight to what you should expect to happen. I can start. Yeah, I mean, it

34:24 gives you, and honestly, take it a step further where you are talking about maybe you're a new company, you're like, I have a new strategy I want to take on where I'm going to lease this

34:32 particular acreage. I'm going to go in conventional or CCUS or whatever it may be. Well, you build a whole strategy around the public data because you see these large-scale trends and you can make

34:42 your thesis and pitch it to your investors that, hey, I'm seeing this kind of below-the-radar opportunity here. That's very surfaceable in the public data and a lot of companies have gotten funding

34:53 by being able to kind of use that as a baseline. But again, once you get going, use the science because, God, you have it all at your fingertips. Just use it. And you will, honestly, the odds

35:04 of being unsuccessful if you use that, just tank. You will do all right. Yeah. That's funny. Just learning everything in the last 40 minutes, realizing this is a story Brian Sheffield told,

35:17 something I recorded this year, how he pitched investors on a proposal and it was 100 percent public data.

35:26 I mean, that's why he's as savvy and popular as everyone knows him to be, because he literally just did something anyone could do and was funded millions of dollars, just public data. That's how

35:37 crazy and huge it is. It's pretty funny. Well, and it is. It is nuts because we did talk about the purposes and what data is made available. I mean, there's good reasons for it, but like, I

35:46 mean, having the production data from what a well did, knowing exactly what was perforated, what depth they perforated and what they pumped, like you've basically written a formula for what you can expect.

35:57 And again, like I said, there's variables, but still you can build a whole business out of doing that. It's great. But yeah, I will say this, you know, those same kind of gotchas that I talked

36:10 about earlier, like if you don't know they're out there and you go do that same thing and you build a whole business model around it, and then, oh, it turns out you were double counting

36:18 production, like, that will end your world. That will, that will. There's a company - Use WellDatabase. Yeah, use WellDatabase. Use WellDatabase. That's it. There's a company, I'll leave. Well,

36:28 you know, this was public that this happened, and this company has completely reshifted everything they've done. They are no longer the same company anymore, and so I feel fine saying it because I

36:39 know people at this company, the company's Flotek Chemical. I know people at this company, and they do a very good job today. They are absolutely not this company anymore, but rewind about 15

36:48 years ago, they were doing this. They were saying that our chemicals are being utilized in these wells. Here's our production uptick. So use ours, you're going to see this much more return. We're

36:58 seeing it everywhere. Well, one of these, I do believe, was like an activist investor type report person went in and said, hey, these people are over counting these areas. They're double

37:08 counting this, they're doing this, and what they're saying actually isn't true. And their stock price tanked. I mean, just, the bottom fell out. And they actually brought us in to help

37:19 kind of like shore it up before they finally said, All right, we've got to stop this whole data business, focus on chemicals, and today that's what they do. And they're a good company today,

37:27 but that's one example. I mean, you can look it up on the internet. There's stories of it. It's gnarly. Yeah. Anyway. The worst feeling realizing your data is not right.
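
One way double counting can creep into production totals is the same well-month showing up twice, for instance when two sources report the same volume. The conversation does not say how it happened in the case described, so the sketch below is only an assumption-laden illustration: the record layout (`well_id`, `month`, `oil_bbl`), the well IDs, and the volumes are all hypothetical.

```python
from collections import Counter


def find_double_counted(records: list[dict]) -> list[tuple]:
    """Return the (well_id, month) keys that appear more than once."""
    counts = Counter((r["well_id"], r["month"]) for r in records)
    return sorted(key for key, n in counts.items() if n > 1)


def total_deduplicated(records: list[dict]) -> int:
    """Sum oil volume, counting each (well_id, month) only once."""
    first_seen = {}
    for r in records:
        # setdefault keeps the first volume reported for a well-month.
        first_seen.setdefault((r["well_id"], r["month"]), r["oil_bbl"])
    return sum(first_seen.values())


# Hypothetical records: W-001's January volume is reported twice.
records = [
    {"well_id": "W-001", "month": "2024-01", "oil_bbl": 1000},
    {"well_id": "W-001", "month": "2024-02", "oil_bbl": 900},
    {"well_id": "W-001", "month": "2024-01", "oil_bbl": 1000},
    {"well_id": "W-002", "month": "2024-01", "oil_bbl": 500},
]

naive_total = sum(r["oil_bbl"] for r in records)
duplicates = find_double_counted(records)
clean_total = total_deduplicated(records)
print(naive_total, clean_total, duplicates)
```

Keeping the first record is a simplification; when duplicate rows disagree on volume, flagging them for review is the safer choice.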

37:40 But, you know, it's a story for us too, because I can't, I mean, I put my name out there on this data. I, I, my reputation, everything that I promise you this data is good. And, you know,

37:54 we work tirelessly at it and I'm like kind of nutty personal about it, because like if someone finds a problem, I take it personally, like this has to be fixed. And so I put my name on the line every

38:05 day on it. And I've had to, you know, walk back and fix things before. And but it's, it is so critical that you actually, you know, know what, what the data is before you start making these

38:19 multi-million dollar decisions on it. So since 2009, you've been collecting public data and you've been maybe cleaning it up or just keeping it as is, whatever. So that's almost, I can't do math

38:33 in my head. I mean, 15 years of all this. Like do you have everything you've ever done stored somewhere? Do you have a data center? Is it a few hard drives? So what is the size of this

38:45 operation that's like on a digital storage? Oh, man, I

38:51 like to call it big-ish data. It's about 100 terabytes worth of data involved between our structured and unstructured data pieces. At WellDatabase, Josh and I are programmers. And so from a structure

39:04 standpoint,

39:07 we have an entire data center in Greens Point that we utilize. We've got multiple, we have a cage with multiple racks, with dozens of servers and databases that have partitioning and the

39:18 ability to scale in various ways. And then we have all of this replicated to the cloud. So we have it there too. And so

39:29 it's a lot of data. It really is, until Facebook comes and says that they do our entire data set every 30 seconds or something like that. And you're like, okay, maybe it's just big-ish data.

39:42 So it's a little bit in between. It's a manageable size. But again, it requires a significant amount of resources to store it. And our code base is giant. We've got our code. But like I told you

39:55 about, each state gets its own kind of dedicated base, that set of code, which, if you ever hear someone say, we have the scripts to go get the data, like run the other way, because this is

40:06 more than just a few scripts, this is enormous. We have over a million lines of code in our data project. And that's not even talking about our front end, our APIs, any of that, like just the

40:18 data project and our rules, all those things. How many lines do we have? I don't know. A million. We have 200 trillion. A billion.

40:31 How does the data stay up to date? Is it through APIs? No. Because I don't imagine that. Yeah, that's through a nasty hodgepodge. We have an entire system built around this notion that has to be

40:43 like, North Dakota makes their permits available in a PDF at 5 p.m. Central time every day. Go start looking for it at 4:49. When you get it at 5, parse it, push it out to our system. So when they

40:54 change that,

40:57 all of a sudden there's this big glaring red thing in our system. And we learned that too, like one thing we found in our system is that it's better to be missing a data point than have a bad data

41:08 point. And so we've got this kind of flagging system that when our data systems are rolling through and they hit something that's unexpected, it just stops. And then we have these reports that

41:20 we get every day being like, hey, you've got five jobs that need to be looked at in these places. And obviously, our customers are going to notice real quick when the data stops updating. So we

41:31 have it like, honestly, that's the very first thing in the morning, run through any of those and deal with that. And you have to have someone like on call all the time or no? Are you talking

41:40 about me or no? I mean, it really, I will say like that's both you and Josh Like anytime we've worked with y'all, y'all are in the code and y'all are like, oh, we're just like going to do this.
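
The "better to be missing a data point than have a bad data point" idea John described a moment ago can be sketched as a loader that stops a job as soon as it sees something unexpected and leaves it for a daily review list. This is a minimal illustration, not WellDatabase's actual system: the schema in `EXPECTED_FIELDS`, the job names, and the helper names are all hypothetical.

```python
class UnexpectedData(Exception):
    """Raised when a source row does not look the way the loader expects."""


EXPECTED_FIELDS = {"permit_no", "operator", "county"}  # hypothetical schema


def parse_permit_row(row: dict) -> dict:
    """Parse one row, refusing to guess if the layout has changed."""
    if set(row) != EXPECTED_FIELDS:
        changed = sorted(set(row) ^ EXPECTED_FIELDS)
        raise UnexpectedData(f"unexpected fields: {changed}")
    if not row["permit_no"].strip():
        raise UnexpectedData("blank permit number")
    return {key: row[key].strip() for key in sorted(EXPECTED_FIELDS)}


def run_jobs(jobs: dict) -> tuple:
    """Run each job; a job that hits bad input publishes nothing and is flagged."""
    loaded, flagged = {}, {}
    for name, rows in jobs.items():
        try:
            loaded[name] = [parse_permit_row(r) for r in rows]
        except UnexpectedData as exc:
            flagged[name] = str(exc)  # goes on the list to review by hand
    return loaded, flagged


# Hypothetical input: the second source has silently dropped a column.
jobs = {
    "state_a": [{"permit_no": "A-1", "operator": "Example Oil", "county": "Sample"}],
    "state_b": [{"permit_no": "B-7", "county": "Sample"}],
}
loaded, flagged = run_jobs(jobs)
print(loaded, flagged)
```

The trade-off is deliberate: a flagged job means stale data until someone looks at it, but nothing wrong gets published in the meantime.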

41:50 And as founders, like, that's incredible. Like I don't know if it's smart, but I mean, I don't know either. But hey, I love it though. I'm a pro, I'm just a programmer at heart. And I'm not

42:02 the best programmer. Josh is like so much better than me. And that's why I'm so glad he's here, but like I'm a data guy. And so I live in the data code. And if I wasn't doing that, I don't think

42:13 I'd enjoy the job so much. It's one of those things I always want to have one foot in, even though I spend a lot of time doing a lot of other things now, I just really have a kind of extreme

42:25 ownership of that code. Yeah. I will say it's probably nice, like not, like if you want something right away, you don't have to wait, you can be like, All right, I'll do it. I mean, it's my

42:35 toxic trait too, and I wish that I knew how to code, 'cause I would do that too, and I would annoy everyone. Right, no, programmers hate it, because - Oh, they do. Yeah, because you're like,

42:43 We need to do this, and like, Ah, that's gonna take a week, and I'm like, Nope, it's gonna take 30 minutes. No, no, you can't. It's like, You want me to show you? Okay. And they're like,

42:51 Okay, shut up. And then, but yeah, it's fun. That, I mean, this is a maddening job, and this actually leads to a question I wanted to ask. I mean, there, you have been building all this

43:03 data for 15 years, you are so focused and dedicated on accuracy and just efficiency, security. Like, how do you pick the right partners to work with and hire people even below you to do more

43:16 mundane tasks that you know are going to be multiple Johns. You don't want to hire someone who doesn't give a shit. You can't really afford - It's the checks and balances that you're forming. You

43:26 can't afford for someone to just kind of be like, I don't know, you kind of need everyone to be on and trust them and be autonomous so you can sleep at night. How the hell do you do that?

43:39 You know what comes to you, it's an interesting thing because I'm a little, I look at myself on some of these recordings I do sometimes and I'm like, you're a weirdo. You are way too interested in

43:50 this data stuff and so, but then I'll inevitably meet up or catch up with people and they're just as weird as me about it and that's when you're like, you want to come work with us? This is what

44:00 we need. You have to be - More weirdos. Yeah, yeah, more weirdos. No, but it's true though, I won't. I would not in a million years just offshore this kind of stuff, I would not just grab

44:12 any Joe Blow programmer to do it. It's got to be someone who is as passionate, but we also built this whole system. I mean, when we did this, we built our whole company around automation. And so

44:24 things plug in really nicely and they can be monitored really nicely. And so it gives us some confidence that when we put something in place, like we've got a nice checks and balance to red flag

44:35 things real fast. And we've had over 20,000 people in our platform and if I don't spot it, they spot it. So it's, yeah, they're unrelenting. I wish we would have, I wish I would have thought

44:48 about this before, but I was using WellDatabase one day with John and John on our team. And I was using it as just like, hey, show me like what all these lines mean? Like this right here, why

45:03 does this go like this? All of the just a million lines. And he was like explaining all of it and it'd be fun to even do like another episode where you just walk us through what all the lines mean.

45:17 Yeah. Well, and you bring up such a good point on the data on how this comes from the states and what it is. And sometimes I feel like that's an area that people get a little neglected. They just

45:30 assume a permit's a permit's a permit. But there's an actual PDF that gets generated with this permit. You can go read every single field, all the comments, everything that goes into it. And if you go

45:42 to the Railroad Commission, if you just type in Railroad Commission data query, it'll present you with about a hundred different ways of searching for different items. And I don't know how any

45:52 regular person finds anything there. Whereas on WellDatabase, you click on a well and you hit show details and everything's right there. It is very easy to use. I had so much fun just like

46:02 playing around on it. I'm like, I don't know what I'm doing. No, I wanted to make a game. I wanted to make a like an oil tycoon kind of game, but like use real data. So you could be like, no

46:12 wells are drilled. I wanna go drill one here and I wanna perf it here. And then I wanna take the, and then go query our database for the offset data and say exactly what it would have made or

46:21 estimates of what it would have made. And so you can build an oil company out of it. I thought that would be a fun game. It'd be so fun. Like the tan. Yeah, basic oil. I feel like, don't your

46:31 kids program? Oh yeah. They have projects. You should give them, when you have that project. My kids, my kids are great. They respect what we do so much. They love WellDatabase. But they are

46:42 so not interested in any business programming. They're like, I wanna make video games. And I'm like, You know. You're the oil tycoon one. Yeah, until it is right there. So the video game,

46:51 but like the video game industry is insane. It's so good. Did you start and like, did you make a video game before? I didn't start, I mean, I did. It did, it's like a hobby. Yeah, on the

47:02 side for fun. And my son is

47:06 in the middle of making a game right now. He started his game company. It's so much fun. I'm so cool. I'm such a nerd. I love games. That's awesome. But no, I think everyone would enjoy that

47:16 if you could like just have a playing map and like go square off acreage and like go back and reference historical prices as when you drill. And like you could have a timeline anyway. Yeah. All

47:28 right, we're gonna, we're gonna build it. We're gonna use the WellDatabase API. There's so many ways to go. It's like you can do like a GeoGuessr thing where you like guess what acreage you're in

47:38 or basin. There's like Clue. Like there's so many takes, like Monopoly is the easy one. Like, God, like this is, I might need to like censor all this. There's like a billion, this is public

47:51 data going on. No, it is, but yeah. I mean, come back. A circle. Yeah, you could be like, this is 1932. You've got, you know, two shovels.

48:01 Let me go find that salt dome over in Barbers Hill and dig it out, man. Or you just get old handwritten documents from the '40s, and it's like, Is this an eight or three? You just have to figure

48:13 it out. Is it an I or a lowercase L, or a one? Wait, that's actually such, if companies would actually let their private data be public and make a game out of trying to transcribe the

48:28 handwritten notes? Yeah. It's actually genius. It's like a CAPTCHA kind of situation where it starts reading old books. Yeah. That's awesome. Yeah. All right. Well, if you heard any of that,

48:41 you didn't, because that's our idea. Yeah. Let's seal it. It's all in implementation. Yeah. They'd have to contact us for the data, so well, right. But no, it's fascinating. It's such an

48:52 interesting thing. And my kids, they do love it, but they don't love this corporate programming. That's what they don't want to do. They don't want to be mean.

50:59 Colin for a more technical of viewers right now there he has this slide show and it's very fun. Like, you know, like people these days still, like, throw up a thing with like a billion words on it. And he

51:12 has like memes and keeps it short and tight. So very digestible, a very finger on the pulse on culture today on telling your story and stuff, and that's why we're gonna have you back for the 23rd time.

51:25 Absolutely. Yeah, there's got to be some competition. I'm on a way. I'm still waiting here with your ass calling. What's the prize?

51:32 Well, that's your invest. Yeah. Oh, thanks

51:37 And I'll stop deal. All right, John. Thanks for coming on. Thank you so much for having me guys. Appreciate it