Explore the evolving world of application delivery and security. Each episode will dive into technologies shaping the future of operations, analyze emerging trends, and discuss the impacts of innovations on the tech stack.
00:00:05:04 - 00:00:33:00
Lori MacVittie
Hey, you're tuned in to Pop Goes the Stack, where we explore emerging tech like it's a cursed artifact. Curious, powerful, and probably going to break something. I'm Lori MacVittie, and I brought backups. I also brought my co-host, who we're still trying to figure out if he's AI or not, Joel Moses. And today we have a returning guest from earlier episodes, Dmitry
00:00:33:00 - 00:00:35:18
Lori MacVittie
Kit. Dmitry, welcome.
00:00:35:21 - 00:00:36:22
Dmitry Kit
Glad to be back.
00:00:36:25 - 00:01:06:24
Lori MacVittie
Awesome. Well, you're the perfect guest for this particular topic, because today we're talking about, well, of course, we're talking about AI. But, you know, everyone else is chasing generative AI for flash, for productivity gains, all the good things. But the quiet revolution, the one that enterprises can, you know, actually monetize, use, apply more broadly, is on the predictive side, right.
00:01:06:25 - 00:01:36:04
Lori MacVittie
Traditional, classic, you know, the words we use to describe the AI that we'd been doing for a long time before LLMs made their big splash. And recently there was a company that used a more traditional approach to uncover the right formula for a paint that can passively cool buildings by up to 20 degrees. That's cool.
00:01:36:04 - 00:01:42:15
Lori MacVittie
Just paint it and it gets cooler. Like, I'm on this. I know you were in the code, Joel, and I know.
00:01:42:22 - 00:01:58:09
Joel Moses
Hey, look, you know, all you've got to do to get me to a party is talk materials engineering, and I am so there. You know, before we get started, I think it's really important to realize the significance of what we're talking about. So the story came out, and, you know, this paint helps make buildings 20 degrees cooler.
00:01:58:09 - 00:02:20:07
Joel Moses
And it's a breakthrough in terms of the material that was discovered that could actually do this. But let's reverse course just a little bit and talk about the early part of the 1980s, when, through a great deal of painful discovery and a lot of trial and error, researchers discovered a new chemical formulation called lithium cobalt oxide.
00:02:20:07 - 00:02:42:23
Joel Moses
And that became the formula that was incorporated into what we now know as lithium-ion batteries. And those have changed the world. I mean, the density of lithium-ion batteries has enabled us to drive electric cars that, you know, have good range and that sort of thing. But the discovery of lithium cobalt oxide was almost by accident.
00:02:42:26 - 00:03:05:05
Joel Moses
And so we're moving materials science past trial and error. Looking for something with a specific set of properties, something that has a specific valence charge or a particular characteristic, is a lot of trial and error on the part of the experimenters. And the great thing about artificial intelligence is it never gets tired of the churn of experimentation.
00:03:05:07 - 00:03:18:22
Joel Moses
And that's kind of what we're going to be talking about today: the revolution in materials engineering brought on by something called generative adversarial networks. Dmitry, what is a GAN?
00:03:18:25 - 00:03:54:28
Dmitry Kit
Well, just to follow up on that, a lot of these things do come through experimentation. And taking that idea you mentioned: before this paper, the way scientists looked for these materials was by defining the design space. Like, you know, what materials are possible? And then searching through the space, and then simulating the results to see if it produces the kind of things they want.
00:03:54:28 - 00:04:29:15
Dmitry Kit
And as they produce more and more of the things they want, they start focusing more and more on that space. And in the paper they mentioned using genetic algorithms, right? Which is sort of the way to search the space. And that's usually a slow way to do it, right? The approach taken in this paper is incredibly novel, in which they trained a GAN, a generative adversarial network, to produce designs based on the kind of properties you want.
00:04:29:18 - 00:04:59:01
Dmitry Kit
So instead of searching the design space, they're saying, look, we want materials that reflect heat in this kind of manner. And then they can sample from the GAN to generate a whole bunch of possible materials that do that. And then, taking a few of those samples, they can simulate them and actually see if it is in fact producing the kind of properties they want.
00:04:59:03 - 00:05:27:26
Dmitry Kit
And by training the GAN to produce these things really well, they were able to generate 2,000 potential designs a second.
Joel Moses
Wow.
Dmitry Kit
And because they already knew the GAN was already in the right spot for the properties they wanted, they didn't have to search all 2,000. They could just sample a couple, and they did produce real-world materials.
00:05:27:26 - 00:05:33:29
Dmitry Kit
And they demonstrated that the GAN was, in fact, producing the things they expected.
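A minimal sketch of the workflow Dmitry is describing, assuming a conditional generator and a learned forward simulator that are already trained; all names, shapes, and sizes here are illustrative, not from the paper:

    import torch

    # Hypothetical names: `generator` maps (properties + noise) -> candidate
    # design, `simulator` predicts a design's properties. Both assumed trained.
    def propose_designs(generator, simulator, target_props,
                        n_candidates=2000, n_verify=5, noise_dim=16):
        with torch.no_grad():
            # One copy of the desired property vector per candidate design.
            props = target_props.expand(n_candidates, -1)
            noise = torch.randn(n_candidates, noise_dim)
            designs = generator(torch.cat([props, noise], dim=1))

            # Verify only a handful with the cheap, learned forward simulator.
            sample = designs[:n_verify]
            error = (simulator(sample) - target_props).abs().mean(dim=1)
        return sample, error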
00:05:34:01 - 00:05:35:21
Joel Moses
Wow. That's kind of impressive.
00:05:35:22 - 00:05:58:06
Lori MacVittie
Well, now you still have to explain. So back when I was in college, and granted, it was a long time ago, there were just neural networks, NNs. That's all there was. Now there are GANs, CANs, CNNs. Well, all right, there's a ton of them. So we're talking about GANs. So what is a GAN specifically?
00:05:58:06 - 00:06:30:20
Dmitry Kit
Right. So a GAN is a neural network, and in particular they used conditional GANs, where you give it the label, which is the properties you want, plus some noise factor. And initially a GAN produces some random output, because it hasn't been trained; there are random weights in this neural network. And essentially, because you're introducing noise as part of the input, it's just going to produce noise on the output.
00:06:30:22 - 00:06:54:13
Dmitry Kit
The other thing they had, and we haven't talked about this yet, is they trained a neural network to do forward simulation. So given the design this GAN produces, it can actually tell you what properties this material will have, or predict them. And so that's one of the signals you have.
00:06:54:16 - 00:07:27:02
Dmitry Kit
The other one is what's called the discriminator, and that's part of the usual GAN framework. The discriminator is supposed to say whether the input it's receiving is coming from real data or fake data generated by the GAN. And so what the authors did was they combined the prediction of the neural network that says, here's the kind of properties we expect this design to have, and the discriminator that says, oh man, this is completely fake.
00:07:27:02 - 00:07:46:24
Dmitry Kit
Like, this doesn't come from a real distribution at all. To then go back and say, okay, how do I modify my outputs such that I trick the discriminator into thinking I came from the real data set, as well as match the properties the neural network is predicting to the properties that were given to me in the first place?
00:07:47:00 - 00:08:12:12
Dmitry Kit
Right, because that's what I'm trying to recreate. And so, using backpropagation, the standard neural network training paradigm, you change the outputs of the GAN such that it starts looking more and more like the thing you expected it to produce. But because you keep adding the noise at the other end, it's never going to be the exact same thing that it produces.
00:08:12:14 - 00:08:22:00
Dmitry Kit
But it's supposed to be in kind of the same space, right, of designs that you expect to get for those properties.
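A rough sketch of the combined training signal Dmitry is walking through, in PyTorch-style Python. The generator takes the desired properties plus noise, and its loss mixes the discriminator's "is this real?" judgment with a frozen forward simulator's property prediction. Layer sizes, loss weights, and names are illustrative assumptions, and the discriminator's own update is omitted for brevity:

    import torch
    import torch.nn as nn

    NOISE_DIM, PROP_DIM, DESIGN_DIM = 16, 4, 64   # illustrative sizes

    # Conditional generator: input is the desired properties (the "label")
    # concatenated with a noise vector.
    generator = nn.Sequential(
        nn.Linear(PROP_DIM + NOISE_DIM, 128), nn.ReLU(),
        nn.Linear(128, DESIGN_DIM),
    )
    # Discriminator: did this design come from real data or from the generator?
    discriminator = nn.Sequential(
        nn.Linear(DESIGN_DIM, 128), nn.ReLU(),
        nn.Linear(128, 1),
    )
    # Forward simulator: design -> predicted properties (pretrained, frozen).
    simulator = nn.Sequential(
        nn.Linear(DESIGN_DIM, 128), nn.ReLU(),
        nn.Linear(128, PROP_DIM),
    )
    for p in simulator.parameters():
        p.requires_grad_(False)

    bce = nn.BCEWithLogitsLoss()
    mse = nn.MSELoss()
    g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)

    def generator_step(target_props):
        noise = torch.randn(target_props.shape[0], NOISE_DIM)
        fake = generator(torch.cat([target_props, noise], dim=1))
        # Signal 1: trick the discriminator into calling the design "real".
        adv_loss = bce(discriminator(fake), torch.ones(fake.shape[0], 1))
        # Signal 2: simulator-predicted properties should match the request.
        prop_loss = mse(simulator(fake), target_props)
        loss = adv_loss + prop_loss   # backpropagate through both signals
        g_opt.zero_grad()
        loss.backward()
        g_opt.step()
        return loss.item()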
00:08:22:03 - 00:08:51:29
Joel Moses
Right. I heard a statistician friend of mine call GANs Fight Club for numbers, which I thought was kind of an interesting way to put it. But it's essentially creating, out of randomness, guided by a set of parameters, things that adhere to those parameters most closely, by first of all perturbing the input and then running itself against itself, so to speak. Which is an interesting way to do it.
00:08:52:01 - 00:09:13:12
Joel Moses
Now, this technology has actually been released into open source. I believe it's on Microsoft's site. It's called MatterGen. And MatterGen looks like it does a number of other things. The technology the researchers are talking about is generally usable for all sorts of materials engineering. So, if it can formulate paints, what else,
00:09:13:15 - 00:09:17:14
Joel Moses
what else do you think it can do, Dmitry?
00:09:17:16 - 00:09:43:03
Dmitry Kit
Well, this is the interesting part of this work: they trained the simulation net, right? So the problem with exploring the space traditionally is that you still need to confirm, you know, confirm that you are producing what you think you're producing.
Joel Moses
Right.
Dmitry Kit
And that takes time, and usually takes fluid dynamics simulations, or whatever it is.
00:09:43:06 - 00:10:12:29
Dmitry Kit
So here, the way they designed the design space allowed them to train a simulator that could evaluate lots of designs very quickly. And I think that's potentially applicable to any kind of physical simulation system. So materials science, definitely. I think that opens up a lot of doors for any kind of materials science research, which is what makes this an amazing piece of work.
00:10:13:01 - 00:10:34:27
Dmitry Kit
But we can think about any other place with fluid dynamics or any other kind of physical system, where right now we have to run these really large simulations on graphics cards to try to predict what's happening. Maybe we can, you know, train neural networks with the right design.
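A sketch of what training such a learned stand-in might look like: plain supervised regression from design to simulated properties, with the slow physics simulation run only once, offline, to build the training set. Everything here is an illustrative assumption, not code from the paper:

    import torch
    import torch.nn as nn

    # designs: (N, design_dim) inputs; props: (N, prop_dim) outputs from the
    # slow physics simulation, computed once offline as a training set.
    def train_surrogate(designs, props, epochs=200):
        model = nn.Sequential(
            nn.Linear(designs.shape[1], 128), nn.ReLU(),
            nn.Linear(128, props.shape[1]),
        )
        opt = torch.optim.Adam(model.parameters(), lr=1e-3)
        loss_fn = nn.MSELoss()
        for _ in range(epochs):
            opt.zero_grad()
            loss = loss_fn(model(designs), props)
            loss.backward()
            opt.step()
        return model   # a fast, learned stand-in for the slow simulation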
00:10:34:29 - 00:10:52:28
Joel Moses
Yeah, I think that's an interesting thing to note. They call it the flywheel approach. So you have MatterGen, which generates things according to the attributes you want them to have. So it generates a guess at what the molecular structure might look like. And then MatterSim goes through and says, does it really do that?
00:10:53:01 - 00:11:20:11
Joel Moses
And both of them use different branches of artificial intelligence to get there. I believe that, given the design specifications, they're seeing relative error below 20%, which means that for the most part, the bulk of what's being generated is something that has those chemical properties, which is remarkable. If you can translate this into other domains, it's going to have, you know, a profound impact.
00:11:20:13 - 00:11:37:20
Joel Moses
We've already seen this type of approach being used to formulate new drugs, like mRNA, for example. But the design of batteries, fuel cells, digital signal processors, all of them can use this mechanism. It's an exciting time.
00:11:37:23 - 00:11:58:17
Lori MacVittie
Well, who else could use this approach? I mean, fundamentally, the difference between what they're doing and what an LLM is doing, right, is that an LLM is predicting words, and this is predicting certain properties and materials. Which Joel will go on about if we let him. He really will.
00:11:58:19 - 00:12:00:01
Joel Moses
Super exciting. Super exciting.
00:12:00:01 - 00:12:33:21
Lori MacVittie
We'll have a special episode just for Joel to talk about materials science. But in the enterprise, most people aren't designing paints or doing materials science, right? They buy their paint and they buy their batteries. But there are uses for this approach in the enterprise, as opposed to just relying on an LLM. But the question is, what kind of data or problem set in IT might be well served by this kind of approach versus just a chatbot?
00:12:33:24 - 00:12:34:22
Lori MacVittie
Dmitry?
00:12:34:24 - 00:13:08:05
Dmitry Kit
So, obviously cybersecurity is more my area of expertise. And when we look at those signals, like packets and bits, they contain structure that's not necessarily language, or at least not necessarily in the form the large language model was trained on, you know, including the tokenization. Or even if we look at strings in terms of, like, what's being executed or what requests are being made to a web server.
00:13:08:07 - 00:13:53:02
Dmitry Kit
That data has a very specific structure, and it actually is quite limited compared to the space of the entire English language, or any language. And so it is possible to come up with very efficient and effective solutions by starting from the first principles of, you know, here's what my data is, here's how it's distributed, and then building up a solution. Even in this form, right: we use autoencoders and, you know, generative AI to generate signals like the signals we expect to see in the systems we are monitoring, so that we can then detect when we are seeing
00:13:53:02 - 00:14:30:06
Dmitry Kit
things that are not the kind of things we've seen in the past. So, kind of, anomaly detection type approaches. I will also say that reinforcement learning with human feedback actually uses this idea of training a simulator. Getting people to label things is expensive and doesn't quite scale. So once you get enough people to label enough things, you can train a neural network to behave like people. And then you can
Joel Moses
And label things.
Dmitry Kit
can reinforce a large language model with a neural network that runs a simulation of a person at the other end of it.
00:14:30:06 - 00:14:38:01
Dmitry Kit
So this idea of training, you know, a stand-in, basically, for a simulation is an incredibly powerful idea.
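A minimal sketch of that stand-in idea as it shows up in reinforcement learning from human feedback: a small reward model trained on human preference pairs with the standard pairwise preference loss. The embedding size and names are illustrative assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    EMB_DIM = 128   # assumed embedding size for a model response

    # Reward model: a learned stand-in for the human labeler, trained on
    # pairs of responses where people marked which one they preferred.
    reward_model = nn.Sequential(
        nn.Linear(EMB_DIM, 64), nn.ReLU(),
        nn.Linear(64, 1),   # scalar score: how much would a person like this?
    )

    def preference_loss(preferred_emb, rejected_emb):
        # Pairwise objective: the preferred response should score higher.
        margin = reward_model(preferred_emb) - reward_model(rejected_emb)
        return -F.logsigmoid(margin).mean()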
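And a similarly bare-bones sketch of the anomaly-detection approach Dmitry mentioned a moment earlier: an autoencoder trained only on normal traffic features, flagging inputs it reconstructs poorly. The feature dimension and threshold are illustrative assumptions:

    import torch
    import torch.nn as nn

    FEAT_DIM = 32   # e.g., numeric features extracted from an HTTP request

    # Autoencoder trained only on "normal" traffic, so it learns that structure.
    autoencoder = nn.Sequential(
        nn.Linear(FEAT_DIM, 8), nn.ReLU(),   # compress to a small bottleneck
        nn.Linear(8, FEAT_DIM),              # reconstruct the input
    )

    def is_anomalous(features, threshold=0.1):
        with torch.no_grad():
            recon = autoencoder(features)
            # Familiar structure reconstructs well; unfamiliar does not.
            error = ((recon - features) ** 2).mean().item()
        return error > threshold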
00:14:38:03 - 00:15:05:24
Lori MacVittie
Especially in security. And in fact, right before we started actually recording, we were kind of walking through some of those things. And Dmitry basically had an insight. He didn't say it in the same words, but it was his insight, really, that, you know, one of the reasons LLMs work well with a lot of web traffic and APIs is because it is language.
00:15:05:26 - 00:15:42:26
Lori MacVittie
Right? So you can see it. But it is a more limited set, right? Because it's structured. HTTP has a structure, JSON has a structure, data has structures. The key, I think, as we move forward to agents and agentic systems that start actually using LLMs to generate things and ideas and conversations, is that that's no longer structured. So you almost have to work toward an approach that leverages both in order to advance cybersecurity in the AI future we're all heading toward.
00:15:42:26 - 00:15:44:15
Lori MacVittie
Right?
00:15:44:17 - 00:16:14:02
Dmitry Kit
Yeah, that's good. And another way to put this is that large language models are incredible. They have a lot of knowledge. But when we start working with very specific applications, there's a lot of structure we can use to reason about that data. And if we come from the perspective of a large language model, then we're trying to get the large language model to emphasize those structures, to focus on the structures you want it to reason about.
00:16:14:02 - 00:16:36:12
Dmitry Kit
Like, this is an attack, this is not an attack, or whatever it may be in the application. Whereas if we go from the other perspective, we can say, let's actually start from the structures we already have and encode them into the machine learning models, and then have them really focus in on that.
00:16:36:15 - 00:16:44:00
Dmitry Kit
So I think, at the end of the day, the answer is it's probably both, definitely.
00:16:44:03 - 00:17:05:17
Joel Moses
Yeah, definitely. Well, it's great. And, you know, not to put too fine a point on it, but the model that was used to create this new paint was roughly 600,000 parameters. It's not a huge model. And it's just because of what Dmitry's talking about: they applied a focus on the characteristic they were really interested in.
00:17:05:17 - 00:17:29:10
Joel Moses
Not a general language characteristic, but the way that interconnections work with materials and molecules. And that model can be relatively small and still produce results; I believe the known error rate there was 20%. Which means that your search space for the property you're looking for is diminished hugely by generating things that apply to it.
00:17:29:10 - 00:17:33:09
Joel Moses
It's an exciting space.
00:17:33:11 - 00:17:40:09
Lori MacVittie
Yeah. And they can probably run on a lot less powerful hardware as well, right?
00:17:40:12 - 00:17:48:11
Joel Moses
Absolutely. In fact, I think the Microsoft package says you can run it quite well on Apple Silicon, which means MacBooks.
00:17:48:13 - 00:17:53:15
Lori MacVittie
Hey. Well, you know, I've got one. I'll try it out sometime.
00:17:53:18 - 00:17:57:03
Dmitry Kit
I believe they used an NVIDIA 3080, so.
00:17:57:06 - 00:17:58:27
Joel Moses
Very nice.
00:17:58:29 - 00:18:23:10
Lori MacVittie
Well, you know, it's getting late and this has been a great discussion. I learned a lot, including that Joel really likes materials science. I didn't know.
Joel Moses
It's lovely.
Lori MacVittie
But, you know, if there were three things you wanted to leave listeners with, and they're not all materials science folks, right? I think they're probably technologists. Like, what three things did you learn that
00:18:23:11 - 00:18:25:25
Lori MacVittie
you think they should walk away with? Joel?
00:18:25:27 - 00:18:53:09
Joel Moses
Well, first of all, AI is good for more than just chatbots. I think we kind of gravitate toward the thing we interact with most readily, and that would be a chatbot. But generative adversarial networks and reinforcement learning and things like that, which are not quite as sexy, at least from an interactive perspective, are still not only usable, but we're finding new breakthroughs with them all the time.
00:18:53:12 - 00:19:19:00
Joel Moses
The other thing is that the technology is broadly applicable: it helps people design things, and it finds a way to search large amounts of data very quickly with less error. And that's huge in the area of cybersecurity, and in the design of new drugs in the pharmaceutical industry. We haven't begun to see what's going to happen.
00:19:19:02 - 00:19:36:17
Joel Moses
And as I mentioned, in the 80s it was the discovery of one lithium-ion salt, over years and years of research into battery technologies, that broke open the ability for us to drive electric cars today. So I'm excited for what happens in the future with AI on this.
00:19:36:19 - 00:19:42:23
Lori MacVittie
Cool. Dmitry? What is it you think people should take away from this?
00:19:42:26 - 00:20:03:03
Dmitry Kit
So, I think, as we got into deep learning, we sometimes forget that the data is important, and understanding what the data is, and understanding what your problem is. Especially when things don't behave the way you expect them to, sometimes it's important to fall back on, well, you know, what am I feeding it?
00:20:03:05 - 00:20:30:12
Dmitry Kit
Like, what am I even inputting into the models? So I really like the multidisciplinary collaboration these authors had, with computer scientists and so on, where they really understood what their problem is and created a design space that really made sense. And that allowed a lot of these machine learning tools to just eat it up and produce amazing results.
00:20:30:14 - 00:20:33:27
Lori MacVittie
Awesome. Once again, garbage in, garbage out.
00:20:33:27 - 00:20:35:03
Joel Moses
You are what you eat.
00:20:35:05 - 00:20:50:19
Lori MacVittie
There you go. Yes, you are what you eat. I like that. Well, that's a wrap for Pop Goes the Stack. If you made it through unscathed, hey, subscribe and share with a friend who loves danger. We'll poke the next gremlin very soon.