Overruled by Data

What makes a modern data platform actually work inside a law firm?

In this episode, Mark Thorogood, Director of Enterprise Data Operations & Software Engineering at Perkins Coie LLP, breaks down how his team moved beyond on-prem constraints to a DIY lakehouse and logical data fabric that unify insights across the firm. Drawing on lessons from the Army and time at sea, Mark explains why standards, simplicity, and focusing on “critical data elements” beat boiling the ocean, and how a business glossary (not just a data dictionary) turns data into decisions.

Mark also shares the nuts and bolts: Databricks and Microsoft Fabric on Delta/Parquet, Denodo for virtualization, and Power BI on the front end, plus the outcomes that matter (costs cut to one-fifteenth, 12× more finished rows, and cycle times moving from months to sprints). We get into API realities (why you’ll cache), medallion-style layers (including their “copper” tier), portfolio-level budgeting beyond matter records, and what’s next with agentic AI producing defensible, explainable analyses across practice groups.

Timestamps:
(00:00) Intro
(01:34) Mark's career journey
(02:48) Lessons from the Army and sailing
(05:09) Leadership and team building
(06:21) The importance of data
(08:27) Building a data platform
(12:52) Implementing a lakehouse
(16:03) Architecture inspiration and choices
(17:47) Business glossary vs. Data dictionary
(22:03) Challenges in scaling and API performance
(23:24) Utilizing API data for enrichment
(26:37) Building and managing a data team
(28:52) Client portfolio budgets and AI integration
(30:43) Future of data analytics and AI in legal
(35:18) Virtualization and semantic layers
(36:50) Resistance to change and overcoming it
(38:57) Practical tips for data platform transition

What is Overruled by Data?

Overruled by Data is the podcast for law firms looking to accelerate their data journey without all the pain points.

Hosted by Tom Baldwin and brought to you by Entegrata, each episode shares real-world stories from law firm leaders who’ve tackled the tough stuff—getting data from all the right places, navigating the AI hype, and scaling operations in a way that doesn’t leave you with a mountain of tech debt.

If you're in a leadership role at a law firm, this show offers valuable insights from those who've been there, sharing what works and what to avoid on your data-driven journey.

[00:00:00] Mark Thorogood: There's always resistance to change, and I'll just throw this out there. The only person in the universe that really wants change is a baby with a dirty diaper. We either change and evolve, because the technological Darwinism is in play. If people do not evolve, they're gonna go by the way of the dinosaur.

[00:00:18] Mark Thorogood: That's the truth of the matter.

[00:00:22] Tom Baldwin: My name's Tom Baldwin. This is Overruled by Data, the podcast for law firms looking to start their data journey or accelerate the journey they're already on. Brought to you by Entegrata. Welcome everybody. Today's guest has built one of the more ambitious data platforms in the big law market. Mark Thorogood is the Director of Data Operations and Software Engineering at Perkins Coie, where his team was an early adopter of a DIY lakehouse approach and has since layered on a logical data fabric to unify insights across the firm.

[00:00:54] Tom Baldwin: Before his 25-year career in legal tech, Mark spent 12 years in the Army, where he learned the value of discipline, planning, and clear intent, skills he now applies to building and running data operations at scale. Outside the office, he's also an avid sailor, which gives him a unique perspective on navigating uncertainty and keeping a crew aligned, which is super cool.

[00:01:14] Tom Baldwin: Today we'll dig into how Mark sees data transforming law firms, the lessons from building early, what he's most proud of, and where he thinks analytics and AI are headed in the legal industry. Mark, welcome to Overruled by Data.

[00:01:28] Mark Thorogood: Hey, thank you Tom. Pleasure to be here. Thank you for inviting me.

[00:01:33] Tom Baldwin: Yeah, so let's dive right into it.

[00:01:35] Tom Baldwin: You know, in your 25-year career in legal, you've led software engineering and data operations teams at Perkins and at McDermott. Looking back, what was the common thread that's carried you from early technology work to running such a large, complex data program today?

[00:01:50] Mark Thorogood: And the number one thing is focus on the critical data elements.

[00:01:54] Mark Thorogood: When I was younger, I tried to boil the ocean. Now I just try to cook lobsters. That is the number one thing I learned. So be practical, focus on what matters most: critical data elements. And, and I could talk about, like, say, Elite, for instance. Elite might have what, 6,000 tables, gobs of columns. You're really looking at about four or five that do your work-to-cash.

[00:02:19] Mark Thorogood: And you can boil that down to about 20, 30 columns at most. And then you can describe the whole money conveyor belt. So focus on what matters most: critical data elements.

[00:02:31] Tom Baldwin: The money conveyor belt. I love that, uh, phrase. It's a good one.

[00:02:34] Mark Thorogood: It's the money conveyor belt.

[00:02:36] Tom Baldwin: It's, and we're, we're gonna dig into this a bit later because I think it's a critical part that folks get really caught up on when they're starting their data journey is where to start.

[00:02:43] Tom Baldwin: So let's, let's circle back to that 'cause it's a really good point. You spent 12 years in the Army before jumping into your legal tech career. What habits from that experience, whether it's discipline or planning or after action reviews, AARs as they're called, um, still influence the way you lead technical teams?

[00:03:02] Mark Thorogood: I would say in the Army you have to have standards. You have to make things iPhone simple for people. You can't overcomplicate things. 'Cause if you think in the military, we're all kind of uniformed and we can deploy, I don't know, 2 million people to Kuwait, eject someone from the country, and be back before Thanksgiving.

[00:03:26] Mark Thorogood: Law firms can't do that 'cause we're too bespoke, too idiosyncratic. You just gotta focus on those standards. I remember what they trained me at: do things correctly, do the simple things correctly. Don't get too cute with the way you're trying to do it.

[00:03:42] Tom Baldwin: Love it. I love it. Outside of work, you and I have talked before.

[00:03:46] Tom Baldwin: I know you're an avid sailor. Does sailing give you any unexpected parallels to building and running teams, whether it's reading the conditions, balancing risk, or keeping the crew aligned?

[00:03:57] Mark Thorogood: I would say that there's a lot of parallels between sailing and, and how you're gonna run a team. 'cause the biggest thing is, is you gotta make sure that the crew knows their job.

[00:04:07] Mark Thorogood: They gotta be read into what that passage is. And, and I'll, I'll, I'll share with you a story. There was one person where the husband died because the wife didn't know how to turn off the autopilot. The guy fell off the sailboat in the middle of the Atlantic. They kept going. She was able to get onto the radio, but by the time they could get someone out there, the person passed away and perished.

[00:04:30] Mark Thorogood: And I know that when I get a crew there, I gotta make sure that they can save me if I fall off that boat. You know, you're 70 miles out in the ocean doing a passage. That's the number one thing they gotta be read in on their plan. Same thing if something goes wrong and things will go wrong every day they go wrong.

[00:04:49] Mark Thorogood: They gotta be dialed in on what their job is, what their roles and responsibilities are, and what they need to react to. So the way you lead a team on a boat is communication. It's the same way you're gonna do it in an organization.

[00:05:04] Tom Baldwin: And you think about kind of meshing those two together, right? Your 25-year legal experience, your military service, and sailing.

[00:05:11] Tom Baldwin: How would you describe some practical examples of your leadership style and practice in building your team at Perkins?

[00:05:19] Mark Thorogood: I would say the practical things. Be supportive. Be a teacher, empower people. You gotta make sure that they're trained, competent, and confident in what they do. Whether they're on a vessel in the middle of the ocean, or if they're trying to implement a dashboard for, you know, a high ranking partner.

[00:05:41] Mark Thorogood: They gotta be trained, competent, and confident. And what you do as a leader, what I do as a leader, is I'm constantly teaching people. I consider myself like a chief resident at a hospital. Everyone, the doctors, they have their cases, they come to us, we talk about their cases: what the etiology is, how to deal with them, what the symptoms are, what the treatment needs to be. And I'm constantly training them so that they can go through their profession and become skilled doctors.

[00:06:12] Tom Baldwin: No difference. Awesome. Switching gears a little bit, everybody has sort of this aha moment in their career where they realize, like, oh shoot, data is really important, and we don't have it. Was there a specific moment or a project where that clicked for you, either at McDermott or at Perkins, where you really fully appreciated the need for a, a data backbone, not just a unified platform for reporting, but something more enterprise scale?

[00:06:42] Mark Thorogood: Well, one, I'll talk about the importance of data and, and how it helped. Basically, data is a representation of reality. You can then take the data and you can make models. You can do what-if analysis, and you can get more precise in your judgment. We all, you know, face Goldilocks situations. Do you buy too little?

[00:07:02] Mark Thorogood: Do you buy too much? You all wanna get it just right. And I remember way back when, this is when the velociraptors ran the earth and there were certain things called servers with storage area networks. I had to make sure we had enough storage area network that would accommodate all our email files. So we built the model and we nailed it.

[00:07:23] Mark Thorogood: I mean, right where it was. And the model was good for the next four to five years. It was constantly tracking. We didn't overspend, we didn't underspend and have to scramble. We got it just right. And that's where I started to see the importance of it. And then I just basically said, Hey, if it works for me, it'll work for others.

[00:07:41] Mark Thorogood: Maybe, you know, I'll apply this to the practice groups. And I had a few trials there and I was successful at it. Actually, I was the manager of application services. Loved the job. I was conscripted to run data, 'cause they basically said, Hey, you can solve these problems. You're gonna become the director of data operations.

[00:08:00] Mark Thorogood: It wasn't my choice, it was Rick Howell's choice.

[00:08:05] Tom Baldwin: I'm serious. So serious. I, I have no doubt. One of my high school basketball coach's favorite sayings was, the reward for a job well done is the opportunity to do more of it. It sounds like that for you.

[00:08:19] Mark Thorogood: If I hadn't put up that dashboard with Power BI back in 2016, I wouldn't be the director of data operations.

[00:08:27] Tom Baldwin: Exactly. Exactly. Um, I, I know you've spoken at a few events and you've talked about standardizing and centralizing data before taking on big projects. What convinced you that that was the order of operations that matters most, rather than just plugging in tools?

[00:08:42] Mark Thorogood: I kind of saw the trajectory that's going on in the world and I looked at like, you know, some of the systems that I admire, like Amazon.

[00:08:53] Mark Thorogood: You can see that they had a centralized, integrated service and they actually used it to run their operations, and then they sold it as Amazon Web Services. You know, so their system where they simplified and integrated, brought things together. I can see the consolidation in markets. I could see the Walmartization of markets.

[00:09:13] Mark Thorogood: You know, where you see all these mom and pop shops. And, and the analogy here is that if you look at plug-and-play tools, it's all like the mom and pop shops are there, and you can replace them with a supercenter. So just walking around and experiencing my reality, I would see the analogs and I'd say, this is the natural flow of the technological Darwinism that we're experiencing and where it's going to lead.

[00:09:40] Mark Thorogood: So, so that was my motivation behind that. And it actually paid off. It actually paid off. We did some statistics recently. We cut costs to one-fifteenth. We were able to produce 12 times more finished rows than we were ever doing compared to last year, where we were using, like, on-prem technologies. And our dataset cycle time went from months to sprints.

[00:10:05] Mark Thorogood: That's amazing. So you can see it and it's right there. So we've lowered the cost to produce finished data products that wrap around a business problem and produce that value. Now, if I would've had a bunch of tools, I would've had this, like, junk drawer, if you will. You know, this cacophony of things that would be very difficult to understand and manage. But you just look at the natural reality out there.

[00:10:29] Mark Thorogood: Just, just walk through the next Costco and say, Hey, how is this supply chain working? And you'll see it.

[00:10:37] Tom Baldwin: If data is like navigation at sea, what was the first GPS signal your team built that replaced gut instinct with real guidance?

[00:10:48] Mark Thorogood: I would say the gut instinct with real guidance was, uh, you know, basically sizing out that SAN. Yeah, that storage area network where we nailed it. The gut instinct: you, you had this welter of conflicting opinions. There were all the people who were like, oh my gosh, I need more, more storage. It was gonna be very expensive.

[00:11:07] Mark Thorogood: And then there were the other people who were cheap and cheerful: I don't need as much. And we would've wound up building this, you know, cottage industry of bolting things on. And I was like, no, we, and we nailed it just right. We were able to throw an arrow instead of just, you know, guessing, you know, trying to feel around in the dark as to where to go.

[00:11:30] Tom Baldwin: Circling back to something you said earlier, you know, you were trying to find sort of those practice projects. Was there a GPS signal on the practice side or the business side where, outside of IT, they started to really kind of perk up and realize the power of data?

[00:11:46] Mark Thorogood: I would say there are various ones of those. And, and if you really look at, like, the analytics of the work-to-cash cycle, that one's more operational, but it does apply to a lot of practices.

[00:11:58] Mark Thorogood: You got engagement scores versus profitability, where you're taking a whole bunch of activity data that's coming in either through Salesforce, through email, through, you know, the Phoenix badge system when people came on prem, and you consolidate that and you're comparing it to outcomes. And they're starting to see the ways, better ways of, you know, how they can adjust their practice, how they can customize their practice to the voice of the client.

[00:12:27] Mark Thorogood: You know, what the client's purchasing habits are, what the client's needs are, uh, and they can prep for that. They can see it in advance: what is the next logical matter that they would have?

[00:12:40] Tom Baldwin: On the show, we like to profile firms that are early in the data journey. And, and in legal, we, we do a, a decent job of sharing, you know, what we can for others' benefit.

[00:12:52] Tom Baldwin: In this section, I, I wanted to kind of pivot over to the choices you made, building out a lakehouse when there really wasn't a lot of understanding of what it was, particularly in the legal market. Things that you really, you know, feel great about. Like those decisions you doubled down on today, what you might do differently to the extent you can share.

[00:13:12] Tom Baldwin: So maybe just start with a high level, 'cause I think your implementation of a Lakehouse is probably the most unique of the ones out there of firms that have done it themselves. Maybe explain a little bit about what you did and then we'll kind of dive into more questions from there. But just for everyone listening, kind of talk about the platform you have today.

[00:13:30] Mark Thorogood: It's infrastructure as code that's, you know, inside of Azure, but we can move that to the Google Cloud Platform or Amazon Web Services if need be. On top of that, we use Delta Lake Parquet files, you know, as our standard. It's all scripted in Python, uh, SQL, or, you know, YAML. Uh, on top of that.

[00:13:54] Mark Thorogood: And then we use, uh, either Databricks primarily, or Microsoft Fabric for more of the, uh, Power BI, you know, dataflows. The, the Fabric and, um, Databricks are both running the PySpark engine. And then on top of that you have Power BI, and it's encompassed with a data virtualization layer. We happen to use Denodo, and what that enables you to do is that you have a one-stop shop to pick up your data.

[00:14:23] Mark Thorogood: Whether it's from an operational system or it's from an analytical system, they get one view into it. If you are using Power BI, obviously you're not going through the virtualization layer, you're just going through whatever the dashboard is there. And that also enables us to have a chatbot and the agentic AI.

[00:14:39] Mark Thorogood: So we use, uh, GPT-5 and we use the o3 GPT reasoning engine for the agentic AI as well, in, in addition to a vector database that we have to translate natural language into, um, something that's parsable by machine. That's kind of a high level thing. I hope that made sense. I hope I didn't geek out on people.

[00:14:59] Mark Thorogood: Some people probably know this,

[00:15:01] Tom Baldwin: other people

[00:15:01] Mark Thorogood: like what the hell are you robot to,

[00:15:05] Tom Baldwin: You know, I think two years ago folks would probably not have been exposed to a lot of those concepts, but I think today firms understand at least the Fabric concept of a lakehouse. You know, Microsoft's done a great job of, of socializing that concept to even the legal market.

[00:15:24] Tom Baldwin: So folks are definitely more familiar with it. The Denodo thing, I think the data virtualization's not a concept that a lot of folks are familiar with, and certainly at least at ILTA, folks that hadn't heard of Databricks got a healthy dose of understanding where Databricks fits in and the power of Databricks

[00:15:41] Tom Baldwin: as a supplement to what is native to the Azure data stack. Right. So all those nuances, I think firms are starting to appreciate more. As an early, I mean, boy, we've been talking about this journey with, with your firm, Mark, for quite some time. Um, looking back on any of the choices you made, again, knowing that, like, you were one of the very first firms to do this, was there anything that you would've done differently out of the gate?

[00:16:06] Mark Thorogood: Oh, of course, of course. And, and I'll touch on that, but one thing I, I, I wanted to talk about is the architecture and where the inspiration came from. Just briefly. I looked to the banks. I looked to the big four. Mm-hmm. You know, I didn't look to legal. And if you look at what BlackRock, TPG and, you know, JP Morgan Chase, Bank of New York Mellon are doing, you'll see they've been using data virtualization since about, you know, 2005 on.

[00:16:34] Mark Thorogood: So that's where I got the inspiration for that architecture. I looked to the banks, follow the money, if you will. Now, your question was, would I do something different? Uh, there were a lot of choices that we made that were good choices. We had a lot of people we benchmarked off of. As you know, Tom, Fireman was part of that.

[00:16:52] Mark Thorogood: You were part of that. James and Nook was part of that. We used various consultants. We went to Gartner. Um, we spoke with John Ladley. If, if anyone read anything about the DAMA-DMBOK, which is the Data Management Body of Knowledge, he's the governance czar. We also hired, you know, Bob Seiner, who wrote Non-Invasive Data Governance.

[00:17:17] Mark Thorogood: So we had a clear view on what the road ahead is and what other people did. But when they were talking to us, they were like, man, you know, we don't have any law firm clients. You know, and, and I read John's, you know, Bob Seiner's book and I was like, I'm gonna give this guy a call and see if he'll do a consulting gig on data governance.

[00:17:35] Mark Thorogood: And he did. And it really kind of paid out. We had our operating model there, so that was good. What would I change? Kinda like that song, I want to go back. One of my favorite songs. I wish I could go back, but I know I can't go back, or something like that. I don't know the lyrics, but the, the melody's in my head. I would provide a business glossary.

[00:17:59] Mark Thorogood: I started with a data dictionary. The addressable market of a data dictionary is the geeks. The business glossary is showing the life cycle: go to market, you know, intake to client, deliver the value, bill and collect, productize the knowledge so you can feed it back into the market and have that cycle go.

[00:18:22] Mark Thorogood: And then what each component of that cycle means. Like, say, in the bill and collect, you have the hour worked, the hour billed, the hour collected. It's very easy, and you describe these concepts that they wrestle with every day in very clear, explainable terms. That's what I would do differently. I wouldn't have started with the nuts and bolts data dictionary stuff and tried to factor that out.

[00:18:49] Mark Thorogood: I would've went from the top down instead of the bottom up.

[00:18:53] Tom Baldwin: Do you have an example of that concept? Have you built that out since then?

[00:18:58] Mark Thorogood: Oh yeah, yeah. We have an example. So what I just described, I would probably provide a data dictionary or, or a business glossary, excuse me, not a data dictionary.

[00:19:06] Mark Thorogood: That's what I would do differently, and I would start with something like this. I mean, this is the basic value stream that we all share. And you can see go to market, and to go to market, you're gonna have your CRM system, you're gonna have your proposal system, you're gonna have your firm's website. Your intake system, or your open matters: you're probably gonna have something like Intapp or, or something of that type.

[00:19:25] Mark Thorogood: And then you're gonna deliver the value. You have all your practice management systems, whether it's comp or FoundationIP, Pattsy Wave, your DMS. Those are the things, and you gotta figure out those critical data elements that link 'em together.

[00:19:43] Mark Thorogood: Remember these things right here. The connective tissue is very, very important. And then you can have your bill and collect, and then your productized knowledge, and then that goes back into how you present yourself to the market, how you identify prospects, and you develop them and you nurture 'em into leads.

[00:20:03] Mark Thorogood: And then I would describe the systems underneath that. Next thing I would do after that, if I just took the bill and collect, I would say, Hey, by the way, of all these things, all these data elements you typically experience, this is what you probably understand: the hour worked, the hour billed, the hour collected. In terminology, that would be like WIP, AR, or receipts, and the metric is either pace, billability, or collectability.

[00:20:28] Mark Thorogood: What is your percentage? Is this a good billability ratio? Are you billing 80% of your time, 90% of your time? If it's lower than 80, it's probably a problem. And then what are the sources? For me to describe WIP, I just need these two tables and I only need a few columns. And then this way, instead of starting with the haystack, you start with the needles of the haystack and you give that to people.
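The metrics Mark names here can be sketched in a few lines of Python. This is only an illustration, not the firm's implementation: the `HoursSummary` type, the field names, and the exact formulas (billability as billed over worked, collectability as collected over billed) are assumptions, and the 80% floor is taken from his rule of thumb above.

```python
from dataclasses import dataclass


@dataclass
class HoursSummary:
    """Hypothetical roll-up of the three hour counts named in the glossary."""
    worked: float     # hours recorded (WIP side)
    billed: float     # hours that made it onto an invoice (AR side)
    collected: float  # hours the client actually paid for (receipts side)


# Assumed threshold, from "if it's lower than 80, it's probably a problem".
BILLABILITY_FLOOR = 0.80


def billability(h: HoursSummary) -> float:
    """Share of worked hours that were billed (assumed definition)."""
    return h.billed / h.worked if h.worked else 0.0


def collectability(h: HoursSummary) -> float:
    """Share of billed hours that were collected (assumed definition)."""
    return h.collected / h.billed if h.billed else 0.0


def needs_attention(h: HoursSummary) -> bool:
    """Flag a summary whose billability falls below the assumed floor."""
    return billability(h) < BILLABILITY_FLOOR


if __name__ == "__main__":
    sample = HoursSummary(worked=1000.0, billed=850.0, collected=765.0)
    print(f"billability={billability(sample):.0%}, "
          f"collectability={collectability(sample):.0%}, "
          f"flagged={needs_attention(sample)}")
```

A glossary entry would then pair each of these plain-language terms with the handful of source columns that feed it, rather than exposing every table.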

[00:20:53] Mark Thorogood: So really, an analogy, in a sense, if you will: instead of giving people a dump truck of dirt, I'd give them the diamonds on a velvet backdrop. I don't want my users going through all that dirt to find the stuff that really is valuable to them. That, that would be the difference. I would've started from the top down in addition to going from the bottom up.

[00:21:18] Mark Thorogood: So kind of like a big vise, but I started from the bottom up. Took a lot of time. Why did I do that? The bottom up is more concrete. It's easier. It's like Maslow's hierarchy. You start with the physical stuff, you go to the stuff that's more social, and then you self-actualize. I would've started from the self-actualization and gone down.

[00:21:40] Mark Thorogood: That's the one thing I would've done differently. I hope. Hope you enjoyed that.

[00:21:45] Tom Baldwin: That was great. No, that's awesome. Thank you, Mark. It makes perfect sense. I think a lot of firms struggle with how to ground themselves in that sort of initial jumping off point, and they get their axle twisted up in a starting point and they think they have to, you know, to your point, boil the ocean when they really just need to find a lobster or two to cook and serve up on a platter and make it, you know,

[00:22:05] Tom Baldwin: perfect. Exactly. Speaking of sort of, you know, okay, going back to your journey, building the lakehouse. You start to build momentum and then you realize you probably hit some point where there was a scaling problem, and it could have been on the schema, cost control, taxonomy, governance. Was there one point where, as you started to get more usage and adoption, you hit that scaling point

[00:22:29] Tom Baldwin: that, again, had you known in advance that this would've been a problem or this would've been an issue, you would've addressed it earlier or simpler?

[00:22:37] Mark Thorogood: I would say, yeah, there's APIs out there that are not gonna be as performant as the vendor will say. You are going to wind up API caching at some point in time, for these reasons.

[00:22:52] Mark Thorogood: Let me enumerate the reasons, 'cause they're so burned into my memory. Here's, here's my emotional baggage about APIs. One, they're slow. Two, they may not be dependable; they'll error out. Three, they may cost you money, 'cause it's every API call, especially at their rate. And they might rate limit you too. So you have all those issues going into play.

[00:23:16] Mark Thorogood: Generally speaking, if you're dealing with clouds, and you'll be dealing with clouds, you'll be very, very motivated to cache API results. It's just a reality.
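The reasons listed above (slow, undependable, metered, rate limited) are the usual case for putting a cache in front of a vendor API. The following is a minimal in-memory sketch under stated assumptions, not the firm's approach: `CachedApi`, its parameters, and the `fetch` callable are all hypothetical, and a production version would more likely persist results to the lakehouse than hold them in a dictionary.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class CachedApi:
    """Wrap a slow or flaky fetch function with a TTL cache and retries.

    `fetch` is any callable that takes a key and returns the API result;
    it stands in for a real vendor client, which is not shown here.
    """

    def __init__(
        self,
        fetch: Callable[[str], Any],
        ttl_seconds: float = 3600.0,
        max_retries: int = 3,
        backoff_seconds: float = 0.5,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._max_retries = max_retries
        self._backoff = backoff_seconds
        self._clock = clock
        self._sleep = sleep
        self._cache: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        hit = self._cache.get(key)
        # Fresh cache hit: no call, no cost, no rate-limit pressure.
        if hit is not None and self._clock() - hit[0] < self._ttl:
            return hit[1]

        last_error: Optional[Exception] = None
        for attempt in range(self._max_retries):
            try:
                value = self._fetch(key)
            except Exception as exc:  # the API "errors out"
                last_error = exc
                # Exponential backoff before the next attempt.
                self._sleep(self._backoff * (2 ** attempt))
                continue
            self._cache[key] = (self._clock(), value)
            return value

        # Every attempt failed: serve the stale copy if one exists.
        if hit is not None:
            return hit[1]
        raise RuntimeError(f"fetch failed for {key!r}") from last_error
```

Injecting `clock` and `sleep` keeps the wrapper testable without real waiting; serving a stale value after repeated failures is a design choice that trades freshness for availability.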

[00:23:28] Tom Baldwin: Let's dig into that a little bit, because I think it's an awesome point, 'cause we see it on our side as well. I think a lot of firms struggle with the vision to move past three or four, you know, source systems. Like, once they get the accounting systems, some, you know, a little bit of client data, matter data, people data, they sort of, they can't see past that.

[00:23:50] Tom Baldwin: The API content really, especially if you start enriching, you know, docket data, let's say, or feeds from PitchBook or deal point data, you start to really get some interesting enrichment. Folks have a hard time with that. Can you give some examples of API sources without naming the ones that don't perform well, but just generally like the API sources and how you're using them.

[00:24:13] Mark Thorogood: I would say some, well, you know, we are getting docket data. We're getting data from the USPTO. We're getting data from, you know, PACER. We get data from other sources. We're looking at implementing the SALI standards. There's an API out there. Uh, we get data from, you know, actually a lot of API data from various places.

[00:24:38] Mark Thorogood: Salesforce and Hive, those are all APIs. They're cloud services that we're, you know, pulling the data back from.

[00:24:44] Tom Baldwin: Yeah. What's the primary use case for pulling in that API content?

[00:24:49] Mark Thorogood: Usually benchmarking your data, making it richer. There's a lot of understanding that you can gain from your practice, and also you can start to compare your practice to other folks.

[00:25:02] Mark Thorogood: Mm-hmm. Um, there's a lot of value there. There's some things that are very, you know, very strategic, but I can say you can pick up a lot of market signals by harvesting that data. Seeing where you stand in the spectrum of a, say a business domain.

[00:25:19] Tom Baldwin: I'm just gonna pivot here a little bit. A lot of folks that've got an on-prem SQL data, M-D-D-E-D-W, whatever you want to call it, think that they can handle all these data requirements with an on-prem SQL data platform.

[00:25:33] Tom Baldwin: How would you respond to folks that have that position when you think about all the data enrichment and data integration and services you're providing?

[00:25:43] Mark Thorogood: It would be very, very difficult. If you got a small use case, you might be able to get away with it. If you're trying to do serious data enrichment, you're probably gonna go to medallion architecture at some point in time, where you're gonna have bronze, you'll have, you know, silver and gold.

[00:25:59] Mark Thorogood: We actually have copper that falls in between. So we have four. You don't have a platinum layer yet? Come on, Mark. No, we don't have a, well, we might have a platinum layer. I don't know. It's coming. I'm never saying it's coming. I'm sure, but I'm not gonna have, I'm not gonna have layer sprawl. That's one thing I wanna push back against.

[00:26:16] Mark Thorogood: Feels like it. I,

[00:26:17] Tom Baldwin: I've never, I've heard of platinum layers, but I've not heard of copper.

[00:26:21] Mark Thorogood: Uh, to explain copper: copper, is it a combination of, I think, bronze and some others? I don't know what the combination is. I'd have to ask whoever invented that term. Doing it on SQL? No, that'd be very difficult. You, you'd have very limited capabilities.

[00:26:36] Mark Thorogood: If you have a a point solution, you might be able to get away with it. I wouldn't recommend it.
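For listeners new to the medallion idea mentioned above, here is a small plain-Python sketch of the three standard layers: bronze as landed, silver as cleaned, gold as aggregated. It is an illustration only. The row fields (`entry_id`, `matter_id`, `hours`) and function names are made up, the firm's actual pipelines run on Spark rather than Python lists, and the "copper" tier is left out because the conversation does not define it.

```python
from collections import defaultdict
from typing import Any, Dict, List

Row = Dict[str, Any]


def to_bronze(raw_rows: List[Row], source: str) -> List[Row]:
    """Bronze: land the rows untouched, tagging where they came from."""
    return [{**row, "_source": source} for row in raw_rows]


def to_silver(bronze_rows: List[Row]) -> List[Row]:
    """Silver: typed, trimmed, de-duplicated; unparseable rows are dropped."""
    seen = set()
    clean: List[Row] = []
    for row in bronze_rows:
        entry_id = row.get("entry_id")
        matter_id = str(row.get("matter_id", "")).strip()
        try:
            hours = float(row.get("hours"))
        except (TypeError, ValueError):
            continue  # e.g. "n/a" or a missing value
        if entry_id is None or not matter_id or entry_id in seen:
            continue
        seen.add(entry_id)
        clean.append({"entry_id": entry_id, "matter_id": matter_id, "hours": hours})
    return clean


def to_gold(silver_rows: List[Row]) -> Dict[str, float]:
    """Gold: a business-ready aggregate, here total hours per matter."""
    totals: Dict[str, float] = defaultdict(float)
    for row in silver_rows:
        totals[row["matter_id"]] += row["hours"]
    return dict(totals)


if __name__ == "__main__":
    raw = [
        {"entry_id": 1, "matter_id": " M-100 ", "hours": "2.5"},
        {"entry_id": 1, "matter_id": " M-100 ", "hours": "2.5"},  # duplicate
        {"entry_id": 2, "matter_id": "M-100", "hours": "1.5"},
        {"entry_id": 3, "matter_id": "M-200", "hours": "n/a"},    # bad value
    ]
    print(to_gold(to_silver(to_bronze(raw, source="time_entries_export"))))
```

Each layer only reads from the one before it, which is what makes enrichment steps easy to add or rerun without touching the raw landing zone.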

[00:26:41] Tom Baldwin: Let's talk a little bit about you. You've built up a team internally. Oftentimes when you build a platform like this in-house, you've gotta rely on some folks. How did you find those folks? How do you reduce kind of that single point of failure when you've got a a team that's building and maintaining and deploying and updating a product?

[00:26:59] Mark Thorogood: You find the folks through your social network primarily, or you can go to the market and then you gotta whittle through a lot of resumes and you'll probably make a lot of mistakes, but primarily focus on your social network to find good talent. We don't operate as individuals. We may, but you know, the people who are very successful in this world usually are very connected.

[00:27:22] Mark Thorogood: So that would be one thing. How do you guard against people leaving? People do leave. You have to standardize your systems. You have to make it simple. Simple is not easy, though. That's where, you know, I learned from the military, you have the discipline to follow up and follow through. Not just get something working, but get it working right, and get it working

[00:27:43] Mark Thorogood: so it's repeatable. You have to have those standards there so it's interchangeable. It's Lego-wise, so you can mix and match systems, and you can also mix and match platforms and people if need be. I do celebrate people, so I, I don't want any of my folks to ever leave, but I know that that's not gonna be the case.

[00:28:05] Mark Thorogood: But you, how do you guard against that? It's like any company in the world. You just have to make sure that you have things that are supportable in the absence of a person.

[00:28:15] Tom Baldwin: Awesome. Thanks Mark. We're gonna switch gears a little bit. And I know sometimes it's hard as we get into some of these things, sharing really specific, the most, probably the most impressive projects and the things you're most proud of are the things that you might be able to talk the least about, but to the extent you can share what initiative made you think to yourself, okay, this is why having a data platform made a difference.

[00:28:38] Tom Baldwin: Whether it was a client engagement analytics project, a self-service dashboard, or something else. Is there something you could talk about that really was like that, you know, moment where you felt like, okay, we've really kind of flexed the muscle of our data platform in a meaningful way for the business.

[00:28:56] Mark Thorogood: I would say one thing is client portfolio budgets. Everybody talks about matter budgets, but if you have a large AFA or a large arrangement, you are gonna have hierarchically arranged budgets. You're gonna have budgets at the project level. You'll have it at a work stream level, you'll have it at the portfolio level, and you'll have it at the program level, and the clients will stipulate that. Accounting systems

[00:29:26] Mark Thorogood: in the legal industry generally support clients and matters, but once you go below that, how do you group all those matters together into an AFA arrangement and track it over time? You have to build some, uh, middle layer, some middleware, and that can be on your lakehouse to join all those bits together in the lens of the client.

[00:29:51] Mark Thorogood: And then if you put some AI on top of it, you can say, am I on time and on task with this budget? Which budget is at the most risk? It'll give you a reasoned response to it. So I would say that's the one thing that is much different because right now a lot of people do hacks to try to take a large portfolio or program and shoehorn it into a system that's not designed for that.

[00:30:19] Mark Thorogood: Knowing that our clients have systems that enable that, like SAP. You know, your large clients don't use law firm technology. And that, that was the difference. So now we're able to operate in the style and the manner that our clients expect, and our clients trust what they understand, and if they see a system that reflects them, they'll understand it and you don't have trust issues.
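
The hierarchical budgets Mark describes can be pictured with a small sketch. This is only an illustration, not the firm's implementation: the `Matter` fields, the ordering of levels (program above portfolio above work stream above project), and the "most at risk" rule (highest actual-to-budget ratio) are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Matter:
    # Hypothetical record: a matter tagged with the client's own hierarchy.
    matter_id: str
    program: str
    portfolio: str
    work_stream: str
    project: str
    budget: float
    actual: float


# Assumed ordering of levels, from broadest to narrowest.
LEVELS = ("program", "portfolio", "work_stream", "project")


def roll_up(matters):
    """Sum budget and actuals at every level of the hierarchy."""
    totals = defaultdict(lambda: {"budget": 0.0, "actual": 0.0})
    for m in matters:
        path = ()
        for level in LEVELS:
            path += (getattr(m, level),)
            totals[path]["budget"] += m.budget
            totals[path]["actual"] += m.actual
    return dict(totals)


def most_at_risk(totals):
    """Return the hierarchy path with the highest actual-to-budget ratio."""
    budgeted = {p: t for p, t in totals.items() if t["budget"] > 0}
    return max(budgeted, key=lambda p: budgeted[p]["actual"] / budgeted[p]["budget"])


matters = [
    Matter("M1", "P", "Pf", "WS1", "Proj1", 100.0, 50.0),
    Matter("M2", "P", "Pf", "WS1", "Proj2", 100.0, 120.0),
    Matter("M3", "P", "Pf", "WS2", "Proj3", 200.0, 100.0),
]
totals = roll_up(matters)
print(most_at_risk(totals))
```

Keying totals by the full path means the same matters answer questions at any level the client stipulates, without forcing the grouping into the accounting system.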

[00:30:47] Tom Baldwin: Yeah. Mark, um, when we think about the future of data analytics and AI in legal, I know you've talked quite a bit at various, uh, events about natural language on top of your data. In the next year or two, what's realistically possible for firms? You know, you've got things like chat-style retrieval with guardrails, RAG on enterprise content, even agent-like systems.

[00:31:08] Tom Baldwin: And before we get into your answer, I know there's no shortage of practice of law use cases around AI, but I think some of the stuff you've been talking about is more on the business side, which seems to be sort of an unlocked area of opportunity for firms.

[00:31:25] Mark Thorogood: Well, we operate on the business side and also the practice of law side.

[00:31:29] Mark Thorogood: I mean, we're using AI to understand various business cycles, you know, whether it's the trademark, patent, or litigation life cycles, in addition to the work-to-cash life cycles that we see internally. So the AI is kind of universal. It can cut across all those different things. You just have to have the data that is kind of AI enabled.

[00:31:49] Mark Thorogood: And what I mean by AI enabled is typically put it into a dimensional model or a graph model, you know, something of that sort. Now, one of the things I'll say, what is possible for law firms? I think what is going to really be the next wave of innovation is gonna be around agentic AI, agents taking questions that are very complex and providing very thought-provoking responses that are defensible because the AI agent will explain how it got those results.

[00:32:20] Mark Thorogood: And I'll say something that I did yesterday. Um, basically I said, analyze the legal work across all these practice groups for five years and tell me where we're growing or declining, and tell me which ones have the highest collection ratios or billing ratios. Tell me the trends over the past five years. That's what I posed to the agent.

[00:32:40] Mark Thorogood: Now the agent had access to all the data in the lakehouse and also some operational systems through our virtualization layer, and I said, go crazy with it. It produced a 36-page report that basically is like, Hey, these are the things that we would recommend. Here are some of the geographies, the industries and sectors that you might want to emphasize, and here are some of the things that you might want to adjust.

[00:33:06] Mark Thorogood: It was startling, and it did it pretty quick. Mm-hmm. It did it in about 20 minutes. You know, the agent was doing these things and then the report, it showed me all the queries and its thought process and the rationale for why it selected certain data sources and not others. Now, in order to AI enable your data, you are going to have to put a lot of metadata around it.

[00:33:29] Mark Thorogood: You're gonna have to describe your data in ways that the machine can interpret, and then you're gonna have to give it some cues so it knows what to emphasize or what to go to under certain contexts. But without that context, the machine has no idea. It's, it's just like getting back to sailing. If people don't understand what you're doing, they don't have that context, they won't be able to operate with it.

[00:33:55] Mark Thorogood: And by the way, in my sailing, I got trained by Margaret Palmer. She's the one who trains the people at Annapolis in the Naval Academy. And she told me flat out on day one, my communication was horrible and I, I was mortified. And she was right. So the thing is, is you just gotta supply that context. Don't make any assumptions.

[00:34:20] Mark Thorogood: People won't know unless you tell them what to know. Same thing goes with the machine.
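
One way to picture the "describe your data and give it cues" idea is a tiny metadata catalog. This is a minimal sketch under stated assumptions: the table names, descriptions, and keyword cues are invented for illustration, and matching cues by substring is a deliberately naive stand-in for whatever a real agent framework would do.

```python
from dataclasses import dataclass, field


@dataclass
class TableMeta:
    # Hypothetical metadata record describing one data source to an agent.
    name: str
    description: str
    use_when: list[str] = field(default_factory=list)  # cue keywords


# Invented example entries; not the firm's actual tables.
CATALOG = [
    TableMeta(
        "billing_summary",
        "Billed and collected amounts by matter and month.",
        ["collection", "billing"],
    ),
    TableMeta(
        "matter_dim",
        "One row per matter with practice group and client.",
        ["practice group", "matter"],
    ),
]


def relevant_tables(question, catalog):
    """Pick the tables whose cues appear in the question text."""
    q = question.lower()
    return [t for t in catalog if any(cue in q for cue in t.use_when)]


def build_context(question, catalog):
    """Render the selected metadata as plain text to hand to an agent."""
    return "\n".join(
        f"- {t.name}: {t.description}" for t in relevant_tables(question, catalog)
    )


print(build_context("What are billing trends?", CATALOG))
```

Without entries like these, the question arrives with no context at all, which is the failure Mark is warning about.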

[00:34:26] Tom Baldwin: That was amazing, and I think our listeners would love to hear those practical use cases. I know for some firms the idea of a lakehouse feels like a far off future thing, but since you're already there, obviously that can't be the end state.

[00:34:40] Tom Baldwin: What's next? Where do you see the industry heading?

[00:34:44] Mark Thorogood: I'll say that we're gonna have more of a business emphasis because a lot of the technical details will be handled with the machine. That's one thing that I see that's gonna, 'cause you can do a lot of Python coding inside, uh, of the machine and have it do the coding for you.

[00:34:59] Mark Thorogood: I also see a lot of the semantical layers because we want to customize and tailor our products to our clients. So you're gonna have to give semantical lenses that put it in their terminology. We have a client that, you know, they don't understand what a matter is. They understand what a project is, so you have to make those translations so they don't feel lost.

[00:35:23] Mark Thorogood: I see virtualization growing. A lot of people are providing virtualization. I saw virtualization, you know, years ago we virtualized so many things. If you remember back to the Ts, we virtualized hardware. Mm-hmm. Right? There was all the VMware and then we virtualized networks, virtual private networks. We virtualized software, Java virtual machine, your JVM.

[00:35:47] Mark Thorogood: We even virtualized our currency, Bitcoin. And now you see all these people talking about zero copy, where you connect to data and you don't copy data. Don't heave the data around. Enable agents to access the data so they can combine it in context so that they can make very profound, thought-provoking

[00:36:09] Mark Thorogood: recommendations. So I see virtualization, really, uh, that's gonna take off along with the semantical layer 'cause they're kind of partners. If you see a lot of the virtualization, it's usually gonna be in the context of semantical, business-facing layers as you translate the data into the tongue of the people who are more strategic.
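
The matter-versus-project translation Mark mentions can be sketched as a per-client "lens" applied when data is presented. This is an illustrative sketch only: the field names, the `client_a` key, and the label mappings are assumptions, and a real semantic layer would live in the virtualization or BI tooling and not in application code.

```python
# Firm-default business labels for raw field names (hypothetical).
DEFAULT_TERMS = {
    "matter_id": "Matter ID",
    "matter_name": "Matter",
    "timekeeper": "Timekeeper",
}

# Per-client overrides: this hypothetical client thinks in projects, not matters.
CLIENT_LENSES = {
    "client_a": {"matter_id": "Project ID", "matter_name": "Project"},
}


def apply_lens(rows, client):
    """Relabel each row's fields in the client's terminology.

    The underlying rows are not copied into a new store; only the labels
    change, and fields without a mapping keep their original names.
    """
    lens = {**DEFAULT_TERMS, **CLIENT_LENSES.get(client, {})}
    return [{lens.get(key, key): value for key, value in row.items()} for row in rows]


rows = [{"matter_id": "M-1", "matter_name": "Acme v. X", "hours": 3}]
print(apply_lens(rows, "client_a"))
print(apply_lens(rows, "unknown_client"))
```

Because the lens is just a mapping layered over defaults, adding a new client's vocabulary does not touch the stored data.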

[00:36:31] Tom Baldwin: That's awesome. Let's end with some practical tips for folks that are maybe on the fence about this or just trying to know where to jump off. We often talk to folks at firms that have a really mature, well-established on-prem data platform that in their minds has worked well for years, if not decades, and they really don't see the reason to move to something more modern.

[00:36:52] Tom Baldwin: What would you say to folks in that position?

[00:36:54] Mark Thorogood: I would say there will be resistance to change. There's always resistance to change, and I'll just throw this out there. The only person in the universe that really wants change is a baby with a dirty diaper. We either change and evolve, because technological Darwinism is in play.

[00:37:12] Mark Thorogood: If people do not evolve, they're gonna go the way of the dinosaur. That's the truth of the matter. You know, personally, I experienced a lot of resistance on my own team. What really amazed me is that when I was pushing forward into the future, the people who pushed back the most were in my own backyard.

[00:37:33] Mark Thorogood: I thought that they're my friends, they're gonna run with me immediately. And they didn't. Ultimately, they did, but it was much more challenging. Now also, I'll say this, what is the future?

[00:37:45] Tom Baldwin: Before you pivot off this, how did you get them over the hump? This is a really important topic. In your own backyard, you had folks that were resistant.

[00:37:52] Tom Baldwin: How did you, 'cause you can't do it through brute force, that's not sustainable. How'd you get them to get on board?

[00:37:58] Mark Thorogood: It was hours and hours of discussion over months and years and it really sucked the life outta me. But you can't give up. It's kinda like you might have someone in your family and you can see, it's like, you gotta change, and if you don't change, your life is gonna be miserable.

[00:38:20] Mark Thorogood: You can't give up on them. But you know what they're doing is not good for them. And I'm sure everyone has these personal experiences and I had that personal experience. It was never easy. I don't want to go through it again, but if I had to, I would. Yeah, and I've, I, you know, I wish, you know what? I wish I had an answer for you, Tom.

[00:38:39] Mark Thorogood: I wish I had this answer. Here's the magic pill. This is super easy. I'd share with everybody. You guys love me. I don't even want to profit from it. I just want everyone to have better lives. It's just not that way. You're gonna have to put in the elbow grease. You're gonna have to go into the gym and do the reps.

[00:38:56] Mark Thorogood: This is the reality. I don't think there's any other way.

[00:39:00] Tom Baldwin: Yeah, fair. If you had someone who's like, Hey, I need to be able to show value in 90 days, how would you advise them to get started?

[00:39:09] Mark Thorogood: I would need to know more about their story, about, you know, 90 days. One, I would sincerely make this recommendation.

[00:39:18] Mark Thorogood: I would look into the market and what the market offers. Because the people in the market are the bumblebees. They go across all the firms. They know what's going on and, and you know, Tom, we go way back. And what you have done is amazing. I would say give you a call. And you are not paying me for this pitch.

[00:39:41] Mark Thorogood: You didn't even ask me to say this. No. But I would say, give you a call. You want to see value. You have these ideas and, and what, the first time I met you was, what, 2010 or something like that? It's been a long time. So I stole the boil and the lobster metaphor from you, it was in your presentation.

[00:40:01] Tom Baldwin: I was wondering. I thought to myself, that sounded familiar.

[00:40:03] Mark Thorogood: Yeah. You said that. I could actually show you the deck that you gave me, and I've been using it. I still use it. It's still valuable. Now you, you're probably asking the question, it was like, should you go through a KPI? Should you get a dashboard? It depends on what people need, but if you really want to figure out how to have that time to value, go talk to the bumblebees.

[00:40:25] Tom Baldwin: Yeah. Yeah. Fair. Hey Mark, this has been awesome. I really appreciate your time. It's great to share some more stories and really, we all really appreciate you sharing your wisdom with the market. Thank you so much and, uh, there's another wrap to Overruled by Data. Thank you so much, Mark.

[00:40:43] Mark Thorogood: Hey, thanks Tom, I appreciate it.

[00:40:47] Tom Baldwin: That's a wrap for this episode of Overruled by Data. If this podcast resonated with you, if you took one or two things away from it, you want to hear more from law firm leaders that have been there and done that, hit the follow button.