407 Mallick
===

Kevin Folta: [00:00:00] Hi everybody, and welcome to this week's Talking Biotech podcast by Colabra, and thank you to Colabra for sponsoring this podcast. Now, over my career, I've grown up in the age of DNA and RNA. How do you sequence DNA to find novel variants? How do you measure levels of RNA accumulation to be able to infer the relationship between gene expression and maybe some biological trait or function?

Around the turn of the millennium, we started to use microarrays, and these were tools that allowed us to assess a large number of genes expressed at one time. So rather than going one at a time, or maybe two or three at a time if you're really wild, we were doing 8,249, I think, on the first Affymetrix arrays.

And so this was pretty exciting stuff because now you could take a snapshot of any given tissue or any given developmental state and start to get an idea as to what's [00:01:00] happening inside that tissue in terms of gene expression. The problem is that we know from the central dogma of molecular biology that DNA and RNA are only the first two steps, and that the accumulation of proteins and then the activity of proteins is really what matters in the cell, right?

They're really the shakers and the movers and the structural elements of the cell. So here we had an opportunity to maybe look at a protein or two at a time. If you used different methods, maybe you could pick up a few more here and there, but being able to accurately quantify a large number of proteins, or better yet, a global snapshot of proteins, seemed like a bridge too far.

But today I'm talking to Dr. Parag Mallick. He's an associate professor at Stanford University and cofounder and chief scientist at Nautilus Biotechnology. And he says he can explain to me how you can get a global snapshot of [00:02:00] all the proteins in a given cell or tissue. So welcome to the podcast, Dr. Mallick.

Parag Mallick: Thanks so much for having me today.

Kevin Folta: Yeah, this is really cool because we spend a lot of time on this podcast talking about different solutions in either transgenic or other types of viral delivery, all kinds of solutions to problems. We haven't delved much into the tools of proteomics and how we can learn something from the proteome and use it as a diagnostic. There are a lot of things we could do with it.

We'll get to that in a second. So let's start with the beginning. What is a proteome? 

Parag Mallick: That's a great question and a great place to start. I think a lot of people, when they think of proteins, they think of that thing that's on their cereal box. And did I get enough protein today? And of course the reality is that, for any given organism, there are potentially tens of thousands of different proteins that each drive an individual piece.

And so the [00:03:00] proteome is the complete set of proteins within a system. And when we think about measuring the proteome, there are lots of different levels of detail we can think about. We can think about what proteins are there, how much of each protein is there, and whether it is modified in some way. And those really help us understand how the proteins are driving the behavior of a system.

Kevin Folta: Yeah, so it's kind of a snapshot, though, like of a given, say, organ, or maybe blood, or plant leaf, or... So you get the whole population of the structural and catalytic proteins that are present in that organ or tissue or whatever at that given time. Do I have that right?

Parag Mallick: That's right. And I think one of the key things that you're bringing up here is that the proteome itself is an incredibly dynamic thing.

It is literally changing in every cell of your body, in your bloodstream, every second of every day. [00:04:00] And so typically when we talk about proteomes, we talk about things like the cellular proteome or the membrane proteome, the plasma proteome. And we do think about how that evolves, how that changes over time.

But the tools typically will measure one snapshot in time from a given sample. Of course, there's nothing to say you can't take multiple drops of blood from a person and measure how their circulating proteome changes over time.

Kevin Folta: Yeah, this is really cool because in a lot of ways, you know, on the podcast, we've talked about things like various variants in DNA that maybe are predictors of disease, or gene expression where maybe, you know, the RNA is a predictor of disease, but here you're actually looking at the functional endpoints, the proteins themselves that are performing the work inside the cell or serving a structure in the cell.

And so how is a proteome more useful in predicting or detecting different disease states? [00:05:00]

Parag Mallick: Well, the analogy that I often give for the proteome in general, and I'll answer your question about disease states in particular in a second, is that, you know, the DNA is kind of like the blueprint for the house, and the proteome really is reflective of the nails and the studs and all the things that are actually doing the work in the house.

And so when we look at other measures, things like the genome or the transcriptome, those are pieces of a larger regulatory process that defines the behavior of the system, and some aspects of that are very well predicted and controlled by the genome. There are other things where the major regulatory event happens at the transcriptome, but there are many, many, many events where the driver of the behavior of the system occurs because of the protein: changes in the protein, changes in the localization of the protein. They really are the driver of the behavior.

So as a systems biologist, [00:06:00] I don't tend to say that any one 'ome is more important than any other 'ome. But taken together, they give you a more complete view of how biology is driven.

Kevin Folta: Now, that's really good. I never really appreciated this as a guy who grew up doing analysis of RNA. But really, when you look at gene expression, and I kind of grew up during the time when we were just learning about ubiquitination, we were seeing these examples of high levels of RNA but no protein.

And then there were other cases where you would see almost no detectable RNA, but then gobs of protein that was translated from the little bit of transcript that was there. And the whole idea is that the protein quantitation can be happening independent of RNA and really can be a more informative predictor in lots of ways.

And I think we kind of overlooked that a little bit. Was that because it was harder to get a handle on [00:07:00] proteins at a global level in a meaningful way? Or, like, what was the disparity there?

Parag Mallick: Yeah. So, two great observations. The first is really about the difference between transcriptome and proteome.

And I think early on, there were a lot of assumptions made that, you know, if something was transcribed a lot and there was a lot of transcript around, there would be a lot of protein, and they'd be really strongly correlated. Now, over thousands of studies, we've seen that they're not. And biologically, that actually makes sense.

Why would biology go through all the effort of creating an additional layer of regulation if it were just perfectly correlated? And then the other aspect of that is, in order to be correlated, every protein would have to have the exact same translation rate and the exact same degradation rate.

And we know that there's information in variations in those. One of my favorite examples is a protein called HIF-1 alpha, which is involved in [00:08:00] stress response to hypoxia. And it has exactly that phenotype. If you look at the amount of transcript in the cell, in general, it's really high all the time.

If you look at the amount of protein, it's pretty low most of the time, except under hypoxic shock, where the cell turns off its degradation cascade and the levels of the protein will shoot up by an order of magnitude in a matter of seconds to minutes. And the only way that that regulatory mechanism works is because of a handshake between transcriptome and proteome.

Kevin Folta: Yeah. So in other words, you want a lot of that protein around for when you need it. You don't want to have to start it de novo. So you have this protein around that's just being turned over unless it's necessary.

Parag Mallick: Exactly. And there are a number of similar processes in biology.

And so the question is, if this is such an informative, key part of driving the biochemistry and cell [00:09:00] biology, why has it not been measured as extensively as genomes and transcriptomes? And I think the answer to that really comes down to the tools that are available to measure genomes and transcriptomes versus proteomes.

And frankly, we've seen over the last several decades a tremendous advance in the tools that we've had, particularly for genome analysis, sequencing, and RNA-seq, that have made it really straightforward for any biologist in the world to get access to genome and transcriptome. On the other hand, we do have powerful tools in proteomics, dating back many years: 2D gels as the starting point, and then mass spectrometry is really a very important, influential tool, along with targeted methods.

But what we found in general is that while incredibly powerful, these tools are [00:10:00] also fairly complex to use. And so they've been limited to a much smaller set of researchers and a much lower penetrance.

Kevin Folta: Well, how are things done traditionally when we talk about proteomics, either separation or detection? How was it done? And what were some of the major roadblocks that really slowed the global assessment of proteins?

Parag Mallick: Yeah, so I'm going to start with some of the challenges and then dive into the methods.

The proteome is very different from the transcriptome. Probably the first big distinction is just the range of concentrations that proteins can have. In general, transcripts span a dynamic range, meaning the difference between the least prevalent transcript and the most prevalent transcript, on the order of, you know, maybe [00:11:00] three to four orders of magnitude, with most things being present in a relatively small range.

Proteins, on the other hand, within a cell can span a range of seven to eight orders of magnitude, with things like transcription factors being present in a handful of copies and other things like actin, cytoskeletal components, or cell surface receptors being present in millions to tens of millions or hundreds of millions of copies per cell.

And then in blood, this is even more extreme. Again, you might have tissue leakage components that may be present in a handful of copies. And then you have things like albumin that are present in milligram-per-milliliter quantities. So that's a dynamic range of 10-plus orders of magnitude. That's really been a huge challenge for any analytical technology, to both be incredibly sensitive and span a wide dynamic range.

It's just a very hard technical challenge. Another key challenge is just the incredible diversity [00:12:00] of biophysical properties that proteins have. DNA broadly is relatively similar to other DNA. But with proteins, you might have one that's really hydrophobic, really sticky, another one that's very charged, and then everywhere in the middle.

So that's number two in the major challenges: you need an analytical technique that is capable of dealing with all of that diversity. And then number three really just comes down to what tools exist in nature. In nature, we have things like polymerases that are able to copy and make a small amount of DNA into a large amount of DNA.

Unfortunately, with proteins, if you have one copy of something, that's all you've got, and there's no machinery for copying and making more. So you need tools that are able to sensitively detect very small amounts of proteins. Those three challenges have been really substantial and have led to why [00:13:00] the tools that we have today are often not giving us the completeness of the whole proteome, and potentially are a little challenging to work with so far.

Kevin Folta: Yeah, I think I understand. And plus, a lot of the old technologies were dependent upon understanding what's already there so that you could make comparisons to what you're finding.

So it eliminated a lot of the novelty that may be there, especially in small copy number proteins, right? Is that true?

Parag Mallick: I think really you had different tools. So on the targeted side, for instance, a lot of affinity-reagent-based methods would say, all right, well, I've built a catalog of some number of proteins and I'm going to measure those. On the discovery side, typically performed with mass spectrometry, you always had a reference database, but that's true in genomics as well.

But then you were wrestling with the challenge of, how do I measure this thing that's present in one copy alongside something that's present in a million copies? [00:14:00] And so that really was just a substantial barrier.

Kevin Folta: Well, what are the new innovations and how are protein arrays really changing the landscape of analysis of proteomes?

Parag Mallick: Yeah, well, so I didn't fully answer your question about the existing technologies. I'll give a quick summary of that and then dive into what we're working on in terms of single-molecule methods. Broadly, there are two classes of existing proteomic methods.

One is targeted methods that use either antibodies or aptamers to measure specific individual proteins, often assembled into panels of tens or hundreds of proteins at a time. And so those are really great technologies. If, for instance, you want to look at those proteins that are related to inflammation, there might exist panels for that.

On the other hand, you have tools like mass spectrometry, which have a [00:15:00] tremendous range of capabilities. And even within mass spectrometry, there are targeted methods and broad-scale methods. And the dominant workhorse for proteomic analysis is a set of methods called shotgun proteomics methods or bottom-up proteomics methods.

There, you take your sample, you digest the proteins into peptides, inject those peptides into the mass spectrometer, and then use the mass spectrometer to measure the mass of each peptide, and then fragment that peptide and measure the masses of all the fragment ions. And from that information, you can then determine which peptide you're looking at, and then, based upon the intensity of the signal you get, you can infer how much is there.
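To make the first two steps of that bottom-up workflow concrete, here is a minimal Python sketch of an in-silico digest followed by a peptide mass calculation. It is an illustration only, not a description of any instrument's software: it assumes trypsin as the protease (the conversation does not name one) with the textbook cleavage rule, uses standard monoisotopic residue masses, and leaves out fragmentation, charge states, and modifications. The function names and the input sequence are made up for the example.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(protein):
    """Assumed rule: cleave after K or R unless the next residue is P."""
    # Splitting on a zero-width match needs Python 3.7 or newer.
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


if __name__ == "__main__":
    # Hypothetical sequence, used only to exercise the functions.
    for pep in tryptic_digest("MKWVTFISLLLLFSSAYSRGVFRR"):
        print(f"{pep:20s} {peptide_mass(pep):10.4f} Da")
```

In a real experiment the instrument measures mass-to-charge ratios and the matching against a reference database is statistical, so this only shows where the peptide masses come from.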

And so the challenges there really are, again, how do you measure so many things? How do you measure sensitively? How do you avoid biases? And those technologies in general are likely to measure the most abundant species [00:16:00] and have a hard time seeing the lower abundance ones. So that's really the landscape that we were working within.

And again, mass spectrometers are amazing instruments, but they're sophisticated, and the level of expertise and training required to run those workflows effectively is substantial. So the challenge that we came up with, having been in the field for a long time, was to say, all right, what if, instead of trying to improve the existing tools, we started from scratch and wrote down the set of criteria that you might want in your brand new, newly created proteomics workflow?

And you could write down a couple of those. You could say, well, sensitivity is number one. Dynamic range. We want throughput, so we can crunch through a lot of samples.

And then, you know, we've talked a little bit about ease of use, but ease of use is really critical [00:17:00] in helping platforms move beyond analytical chemists and proteomic researchers into the space of the broader biological community. So that was the framework that we used in starting. And the other key observation was, well, we potentially want to measure extremely rare proteins.

Proteins that are potentially present in one copy in our entire sample. That naturally leads you to say, well, if I want to measure one copy of something and I can't make more, then I probably need an assay that is based upon single-molecule counting. Single-molecule counting, just analytically, is definitionally the most sensitive approach that one can take: literally sitting there looking at an individual molecule and counting it.

And then from a quantitative perspective as well, there's no transform to intensity or brightness or area under the curve or any of those things that can lead to distortion in quantitation. [00:18:00] So definitionally, single-molecule counters are both as sensitive and as accurate as one can achieve.

So then the question is, all right, well, we want a single-molecule-counter-based proteome assay. How do we actually do that? And the approach that we've been taking really has three key pieces to it. The first is the creation of a hyper-dense single-molecule protein array, where we take all of the proteins from our sample and immobilize them on a very special nanopatterned chip.

The second piece is interrogating those individual protein molecules over and over and over again with a series of affinity reagents that we've specially designed to not be protein specific. And then the third part is a machine learning framework that takes that data and converts the patterns of binding of these affinity reagents into a set of identities and [00:19:00] quantities of all the proteins in the sample.

So that was pretty high level, but I just wanted you to have the framework to think about these three pieces.

Kevin Folta: Oh, but this is the part that is boggling my mind a little bit. And I'm no stranger to arrays on chips and things like that, because we worked with the first Affymetrix ones that came out.

If you're talking about arraying out proteins, you're taking a sample and then placing the proteins onto a chip or some sort of a substrate where each individual protein is anchored into a specific spot on that chip. And then you identify it with an affinity reagent that then can be detected with... Yeah.

So how do you do that? If it's an affinity reagent, and here I'm getting to the edge of my understanding of things, you have to already have characterized that for a specific protein that it's going to bind to. Is that true? So you have to have a catalog of affinity [00:20:00] reagents for the protein targets you're interested in.

Parag Mallick: So there are really two parts that are a little upside down in the way that our approach works. The first: when you think about your standard arrays, what you typically have is many, many, many molecules. So your Affy arrays are typically measuring a hundred thousand to a million different molecules, and you're getting your quantitation from the brightness of the signal.

And they're annealing onto a capture substrate that's capturing transcript X into location 1, 2 and transcript Y into location 1, 3. And that's typical: you would have your capture reagents down and then you would flow your sample over it. In this case, it's upside down from that. We are taking every molecule from the sample and immobilizing it on an array.

And we're gluing it down. Doing that in a hyper-dense, super-Poisson manner was a very hard [00:21:00] challenge to overcome: to essentially create a giant chessboard where each cell of that chessboard had one individual protein molecule on it. And the way we did that was also a little unusual. Typically, the approaches that people have taken for that are really one of two kinds.

One is using a technique called limiting dilution, where you just dilute your sample a lot, but then you end up filling most of your space with empty space and having Poisson deposition for the rest of everything. So 50% of your locations that have anything would have multiple molecules. So that wasn't a great solution.

The other was to fabricate chips where the landing pad area was so small that you had steric hindrance preventing multiple proteins from landing. But the fabrication of things like that is very challenging. And we really wanted to measure billions of individual molecules. So the folks in the company came up with a really clever approach, [00:22:00] which was, instead of trying to make the landing pad small, we instead make the proteins big.

And what I mean by that is that we take each individual protein molecule and glue it to a very special nanoparticle that we've created inside the company, one that is quite large, on the order of 100 nanometers or so. But unlike traditional nanoparticles of that size, which would have hundreds of thousands of conjugation sites,

we've designed these nanoparticles to have exactly one conjugation site. So each nanoparticle can only hold exactly one molecule. And then the landing pads on the chip are the exact same size as the nanoparticle, so they can only hold one nanoparticle. And so that auto-assembles into a hyper-dense, super-Poisson, single-molecule protein array.
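The trade-off behind limiting dilution, and the reason an array with one molecule on nearly every site gets called "super-Poisson," can be sketched with the Poisson distribution. The Python below is a back-of-the-envelope illustration, not the company's model: it assumes molecules land on sites independently at some average number per site (the `mean_per_site` parameter, a name chosen for this sketch), and all function names are made up.

```python
import math


def poisson_occupancy(mean_per_site):
    """Fractions of sites that are empty, singly, and multiply occupied."""
    p_empty = math.exp(-mean_per_site)
    p_single = mean_per_site * p_empty
    p_multi = 1.0 - p_empty - p_single
    return p_empty, p_single, p_multi


def multi_fraction_of_occupied(mean_per_site):
    """Among sites holding anything, the share holding more than one."""
    p_empty, _, p_multi = poisson_occupancy(mean_per_site)
    return p_multi / (1.0 - p_empty)


if __name__ == "__main__":
    for lam in (0.05, 0.2, 0.5, 1.0, 2.0):
        empty, single, _ = poisson_occupancy(lam)
        print(f"mean {lam:4.2f}: empty {empty:.3f}, single {single:.3f}, "
              f"multi share of occupied {multi_fraction_of_occupied(lam):.3f}")
```

Under this assumption the singly occupied fraction peaks at 1/e, about 37% of sites, when the average is one molecule per site, so random deposition alone cannot fill most sites with exactly one molecule. Diluting further reduces doubles but leaves most sites empty, which is the problem described above.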

Kevin Folta: Awesome. Okay. So that all makes sense. So now let's talk detection. Once you have this array with one molecule per cell, essentially, you know, one piece in every square of the chessboard, right? Then how do you know which one is [00:23:00] where and how much? Well, you know how much by counting. So how do you detect each one?

Parag Mallick: That's right. So as you pointed out, at the single-molecule level, identification is quantification. If I can figure out what every molecule is, I get my quantification by just counting up those identifications. So to your point, I have these things glued down, and the gluing down is really important. They are not loosely adsorbed.

They are glued down. They're not going anywhere. And how do I figure out, I'm looking at some particular cell, who are you? I'm going to carry your chessboard analogy a little bit further than you perhaps intended. But you know, so I have, I have my various chess pieces on the board and I'm trying to figure out you know, are you, are you a pawn?

Are you a knight? Are you a rook? And the way that one would traditionally do that is with affinity reagents that are specific to that type of entity. So I might have the pawn antibody or I might have the knight antibody. And that's fine. [00:24:00] But in order to figure out what everything is, I would have to ask a very large number of questions.

I'd have to ask one question for every type of entity that could be on my board. And that would work, but it'd be really slow. You'd have to ask a lot of questions. So the other very unusual aspect of our platform is that we don't ask protein specific questions. Instead, we ask questions that are much more general.

So you can imagine if I had my chessboard and I were looking at some square, I could ask a question that's pretty general, like: Do you have ears? Are you tall or are you short? How curved are you? Those questions are pretty general, they're not protein specific in any way, but you can see how with that series of questions, I could narrow down pretty quickly to say, oh yeah, you're probably a knight.

Kevin Folta: Yeah. So you could essentially say, well, this one has this [00:25:00] hydrophobic run. You are likely a membrane protein, something like that.

Parag Mallick: So we get slightly more specific than that, but that is the intuition is that we're going to ask a series of much less specific questions. And in our case, the questions we ask are about sets of three to four amino acids.

So do you contain the epitope HHH? Do you contain the epitope RWF? Do you contain the epitope ETR? And while any one of those questions is one that, you know, literally 10% of the proteins in the proteome will say yes to, the combination of several of them is shockingly specific.
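A rough calculation shows why combining many unspecific questions can become very specific. The Python sketch below uses the 10% hit rate quoted above and adds three assumptions that are not from the conversation: each probe's yes/no answer is independent of the others, binding is read without error, and the proteome has 20,000 proteins (a hypothetical round number in the "tens of thousands" range). The function names are illustrative.

```python
def collision_probability(hit_rate, n_probes):
    """Chance that two unrelated proteins give identical answers to every probe.

    For a single probe they agree if both say yes or both say no.
    """
    agree_once = hit_rate ** 2 + (1.0 - hit_rate) ** 2
    return agree_once ** n_probes


def expected_lookalikes(proteome_size, hit_rate, n_probes):
    """Expected number of other proteins sharing one protein's full pattern."""
    return (proteome_size - 1) * collision_probability(hit_rate, n_probes)


if __name__ == "__main__":
    for n in (10, 25, 50, 100, 200):
        print(f"{n:3d} probes -> expected look-alikes: "
              f"{expected_lookalikes(20_000, 0.10, n):.3g}")
```

Under these assumptions two proteins agree on any single probe 82% of the time, so each probe on its own says very little, but the chance of agreeing on all of them shrinks geometrically as probes are added. Real binding is noisy and epitopes are not independent, so the actual number of probes needed would differ.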

Kevin Folta: See, that's really cool.

Now, I could not fathom how this could work. And now I totally get how this can work. That's a really neat trick. So when you say epitope, just for folks who are listening who may [00:26:00] not, you know, be in this exact area, you're talking about the area that the antibody recognizes, the signature on that protein, or the multiple-amino-acid stretch that is present.

And like you mentioned, you know, how often do you find, you know, ELV? It can happen often, but you don't always find that with other triads of amino acids. So when you start to put it all together computationally, you can really figure out what's in each single cell of that.

Parag Mallick: That's exactly right. And that's really where the machine learning piece comes in. You have this string of answers to questions, and then you feed what you've observed into the machine learning algorithm, which says, all right, well, based on what I've observed, yeah, this particular protein is compatible with that pattern of binding.
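As a toy version of that decoding step, and of "identification is quantification," the Python sketch below turns yes/no answers into protein identities and then counts them. It is deliberately simplified: it uses exact pattern lookup rather than machine learning, assumes error-free binding, and runs on four invented sequences. Only the three probe epitopes (HHH, RWF, ETR) come from the conversation; the protein names, sequences, and observed spots are hypothetical.

```python
from collections import Counter

# The three example epitopes mentioned in the conversation.
PROBES = ("HHH", "RWF", "ETR")

# Hypothetical sequences, invented for this sketch.
TOY_PROTEOME = {
    "protA": "MSHHHKLETRAG",
    "protB": "MARWFGGSETRK",
    "protC": "MKHHHRWFLLSA",
    "protD": "MGSSLLKAVDEQ",
}


def signature(sequence):
    """Yes/no answer for each probe: does the sequence contain the epitope?"""
    return tuple(probe in sequence for probe in PROBES)


def decode(pattern):
    """Every protein in the reference set compatible with an observed pattern."""
    return [name for name, seq in TOY_PROTEOME.items() if signature(seq) == pattern]


def quantify(observed_patterns):
    """Count spots that decode to exactly one protein."""
    counts = Counter()
    for pattern in observed_patterns:
        candidates = decode(pattern)
        if len(candidates) == 1:
            counts[candidates[0]] += 1
    return counts


# Hypothetical read-out: one binding pattern per array spot.
observed = [
    (True, False, True),
    (False, True, True),
    (True, False, True),
    (False, False, False),
]
counts = quantify(observed)

if __name__ == "__main__":
    for name, n in sorted(counts.items()):
        print(name, n)
```

A real decoder would have to score candidates probabilistically, because a probe can miss its epitope or bind off-target, and with only three probes many real proteins would share a pattern. The lookup here only shows the shape of the problem.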

Kevin Folta: Very nice.

Okay. So [00:27:00] let's take a break here. We're speaking with Dr. Parag Mallick. He's an associate professor at Stanford University and the cofounder and chief scientist at Nautilus Biotechnology. And this is Colabra's Talking Biotech podcast, and we'll be back in just a moment. And now we're back on Colabra's Talking Biotech podcast.

We're speaking with Dr. Parag Mallick. He's an associate professor at Stanford University and also the cofounder and chief scientist at Nautilus Biotechnology. And we're talking about innovations in proteomics: why proteomes are important, and how novel technologies coupled to machine learning can help us better understand the proteome of an organism or of a tissue or a cell or of whatever we are interrogating.

And what's really cool about this is I bet this can work not just in animal cells, but in plants or bacteria or anything. 

Parag Mallick: That's right. Because we're profiling, we're measuring [00:28:00] sets of three to four amino acids, that's applicable to any proteome sample. And so whether it's plants or bacteria, yeast, human, or mouse, it's really intended to be fully general.

Kevin Folta: Really cool. So when we're talking about this in the parlance of disease diagnostics, that kind of thing, or I should just ask you, when you talk about things like biomarkers, are there some specific examples you could give us of really nice biomarkers that can be detected, that can give us early evidence of a specific disease state?

Parag Mallick: So I think that's a fantastic question. Really, when we think about the landscape of biomarkers, there are a couple of different types. There are screening biomarkers, which are used to give early indication of a disease state. There are diagnostic biomarkers that help us refine what the disease is that we're actually [00:29:00] looking at.

So, for instance, we might have a screening test that says, okay, you might have prostate cancer, and then a further set of biomarkers to say, is this aggressive prostate cancer? Is this benign prostate cancer? We may then have biomarkers to help stratify patients in terms of who is potentially likely to benefit from a therapy.

There is a further set of biomarkers, when somebody is on a treatment, to help us understand: are they responding effectively? Is this working? So we have the predictive, to say it might work, and then the actual monitoring, to say it is working. And then we may have another class of biomarkers, which is really for tracking states of disease.

So for instance, in multiple sclerosis, we may have biomarkers that tell us that someone's having an event. Same thing with heart attacks and strokes. So really, to assess the state. And when we think about the challenges of developing those kinds of biomarkers, oftentimes what we're wrestling with [00:30:00] is sensitivity of the measurement platform.

Some of these signals are really low in abundance. Then there's the specificity of what we're looking at: to what extent is this signal only spiking when somebody has cancer versus when they, you know, banged their knee or got a cold? And so how can we find those? In order to find those, you need to have sufficient quantitative accuracy so that you can differentiate healthy states from disease states.

You need sufficient throughput because some of these may cycle. And so we may want to look at a person over time and see how the cycling rate changes or the spikes in the cycling. It may not be a single time point. And then, on specificity, proteins themselves also can exist in modified states.

So it may not just be that we're looking for some protein to be up or down, but we're looking for a [00:31:00] specifically modified form of it to be altered. Or we're looking at a change in the molecular composition, the molecular heterogeneity, of a protein.

So when we look at tools, and we look at our platform in particular, really the opportunities for addressing those kinds of challenges are, first, on the sensitivity side, being able to look at individual molecules. On the specificity side, we haven't talked about it, but one of the really fun things about looking at individual molecules, and undigested individual molecules, is that you can look and say, oh, this molecule is triply phosphorylated in this way.

That gets at what's called the proteoform, which is the combination of splice variation and post-translational modification. And there's a belief that those forms may be far more specific than the abundance of the protein alone. And the last piece really is about throughput. In order to really do these studies effectively, you need to be able to crunch through [00:32:00] large numbers of samples.

Kevin Folta: Yeah, this is all really neat stuff. Okay, but if we go back to the question of detection, you use this triplet approach to identify amino acid sequences that can be assembled, essentially computationally, to tell you who's in which spot on the chessboard. But in this last section you mentioned modification, and we know that proteins are notoriously, I guess that's the right word, frequently modified, either with ubiquitination or even phosphorylation, you know, glycosylation, so many different ways in which you can modify a protein.

So does this kind of detection tell you about modified forms of proteins or just the whole protein itself?

Parag Mallick: So the answer is yes, but in two very different ways.

So the first aspect is that the platform can use most pre-existing affinity reagents, [00:33:00] particularly those for Western applications, within the platform. So, for instance, your off-the-shelf anti-phosphotyrosine reagent is perfectly usable within the platform to assess whether a given molecule is phosphorylated or not.

Now, it's not able to access the hundreds of different individual types of modifications. You know, methionine oxidation, for instance, would be really challenging; there don't exist affinity reagents for that. But for the big things, phosphorylation, methylation, ubiquitination, et cetera, there are really good targeted affinity reagents that could be used in concert with what we call the multi-affinity probes, the short-epitope-targeting affinity reagents.

So, so that's 1 aspect. And then I guess the other aspect is in a purely targeted manner. If you had a protein of interest for instance, we have a project studying tau, which is a really important [00:34:00] protein in Alzheimer's disease. There exists dozens of affinity reagents targeting individual isoforms of tau or position specific post translational modifications.

And so through the combination of those, you can really say, oh, this is the 2N form. It's been modified at position 181 with a phosphorylation, it's been modified at position 314 with a methylation and really get into a level of detail at the individual protein molecule level that we just haven't seen before.

Kevin Folta: That's really cool. And so when we start to kind of turn the corner here and talk about application, you really can start to come up with specific questions to interrogate a family of proteins like, as you say, tau or, you know, p53 and other tumor suppressors, things like that. You can really start to drill into that and take snapshots of cells to understand the process that you have in question. So help me with some applications of this. Like, where has it been used so [00:35:00] far? Or where is some really exciting application in something like personalized medicine?

Parag Mallick: Yeah, absolutely. So I want to clarify up front that we're still a development stage company.

So it's not available for purchase as yet. Where we've used it has been as part of collaborations that we've pursued, generally in the places where we think that the platform is going to have the most impact. Again, to your prior comment, it is a general purpose tool that should be usable across the spectrum of life science research.

But the early applications that we're targeting are first off in therapeutic development. One of the big challenges we have in therapeutic development is just finding great targets. You want things that are very abundant in your disease tissue and ideally non-existent everywhere else in the body.

And so that therapeutic window that differentiates your [00:36:00] disease from everything else is a major driver of toxicity. And so if you can find targets that have larger therapeutic windows, even if they're lower in abundance, they could be really valuable. So that's really the first place that we think the technology can be applied: finding new targets.

Downstream of that is a series of mechanism of action studies to really say, all right, I've got a lead compound. It's targeting this particular protein. When I hit it with this compound, what happens? How does the cell respond? What pathways change? And accordingly, are there model systems where the pathways change in the way we want?

Are there model systems where nothing happens? And beginning to understand, as part of that mechanism of action, where a therapeutic is effective versus not, which is a stepping stone to the next piece, which is really in biomarker discovery and finding markers amongst those classes I outlined before about therapeutic response [00:37:00] and effectiveness.

So we think those are really the three capstone applications for the platform in therapeutic development and biomarker discovery.

Kevin Folta: It's really neat. So what are some other maybe reaching guesses as to places where this kind of technology may be useful? 

Parag Mallick: Well, I have to say this has been so much fun.

Because in sharing the platform with people and getting their thoughts on, hey, I'd love to use it for this, there have been so many different things that I never would have anticipated. Probably my favorite extreme application of this is I was chatting with some folks from Impossible Foods, and they're like, oh, I'd really love to study hamburgers to, you know, understand what the protein composition is so we can make better, more realistic artificial hamburgers. [00:38:00]

Kevin Folta: I mean, it's a reasonable application of this, because if you're getting the protein mixture correct, you know, that's the way that you could mimic the real deal there. So that's pretty interesting stuff. So, what about in medicine? Are there any particular places where understanding the proteome is something that could give an early diagnostic or early heads-up to the progression of a specific disease? Or maybe even phrase this differently: something like Alzheimer's, where you don't really have a solid way to diagnose until people are symptomatic or postmortem, are there potentially ways that this kind of thing could be used there?

Parag Mallick: Yeah, absolutely. I think Alzheimer's is a really great example. There are proteins like tau, like neurofilament light chain (NfL), that have been implicated. But NfL in particular is really a marker of [00:39:00] tissue damage. It's not a marker of a molecular change in a subset of cells.

And so when we think about the biomarker space, where this is most exciting initially, I do think cancer is a natural place. We know that cancer cells shed proteins, potentially specific forms of proteins, into the circulation. And they do that from when they're very small.

And having the sensitivity to see those shed proteins at their earliest stage would be tremendous. In Alzheimer's disease, as we talked about, there's tremendous opportunity. And then I do also think that there are a number of other diseases that really are protein-driven diseases. Things like amyloidosis come to mind, and a lot of immune disorders where we haven't seen strong genetic associations.

And so being able to look at the proteins that are really [00:40:00] driving the phenotype of the disease, I think those are huge opportunities.

Kevin Folta: Well, sure. I think of dozens of them. I mean, you know, Huntington's disease, prion-related disorders. There are so many. All kinds of different dementias are based upon the aberrant accumulation of specific proteins that, if detected early, maybe would give more opportunities for therapeutics, because people really don't get treatment until you start to see symptoms.

And at that point, the process is well underway. And so this kind of thing is really exciting to me to think about how we may be able to benefit from it. What are some other really interesting potential applications? Is there anything else that really comes to mind where proteomics can inform us about, you know, biomedicine or climate, the environment, something like that?

Parag Mallick: Yeah, well, I mean, I'm gonna lean on you a little bit. I think there are tremendous opportunities in plant science for really dialing into the proteome. I'm not certain what exactly drives the [00:41:00] flavor of a strawberry. But I would have to believe that there's a protein component to it.

Kevin Folta: Yeah, there's a whole bunch of them, actually. You're taking basic metabolites and converting them into specific secondary metabolites with the activity of a couple of catalytic steps. And then those products are broken down on the other side. So you're creating something and maybe not getting rid of it to accumulate a flavor compound.

So the presence of a specific protein variant is something that could easily be assessed using a tool like this. That may be very important in ascertaining how something is going to taste, especially because some of these things are perhaps, as we're learning, maybe regulated through common transcription factors.

So, yeah, maybe something there. 

Parag Mallick: Yeah, really interesting. And then I think, you know, about other aspects like drought tolerance, disease tolerance. What are [00:42:00] the molecular distinctions that are driving that? What are the consequences of an infection on a plant? Can we recognize it?

Can we see it? Again, these are things that are harder because our phenotypic consequence is often that the thing died. And can we maybe get some additional detail that can help us to detect these things sooner, to understand variants and the consequences of those variants? So I think there's tremendous opportunity there as well.

Kevin Folta: Well, this is all really cool. I love this kind of stuff for this podcast because it gives us a hint of the cutting edge of technology. To be honest, I didn't understand this just by reading the website and thinking through it a little bit. I didn't go deep into trying to understand it, but it wasn't really clear to me exactly how it works.

Now it's crystal clear and I can start thinking of different applications for this. But you're doing this as a function of Nautilus Biotechnology, right? Well, tell me about what stage the company is at and when this kind of [00:43:00] technology may be available to the average person, either as a service or as maybe a kit or a core machine you might have at your university.

Parag Mallick: Yeah, absolutely. So as I mentioned, we're a development stage company. We've entered into a few partnerships and collaborations with folks like Genentech and Amgen and MD Anderson, TGen. And then just last quarter, we announced our first access challenge, where we threw it out to the scientific community and said, hey, what would you want to do with this?

And we'll sponsor a few of these applications. And we got some really exciting, interesting projects. So we had three winners of the first access challenge, which we're looking forward to working with over the course of this year, and then opening it up for other folks to kick the tires in a formal early access period, which will be probably end of this year, early next year, in preparation [00:44:00] for bringing the instrument and making it available to folks next year.

Kevin Folta: Super cool. So where would people find out about that next challenge? Because I know one scientist who's very interested.

Parag Mallick: Yeah, absolutely. So I think on our website, there's a great place to inquire for information. And we also are pretty prolific in sharing information about the challenges and access programs there.

Kevin Folta: Very good. So what's the website, or are you available somewhere on social media?

Parag Mallick: Yeah. So the website is www.nautilus.bio. And then we're available on social media at Nautilus Bio on Twitter. And then also Nautilus Biotechnology on LinkedIn.

Kevin Folta: Very good. I totally, totally dig this. I was a little bit maybe intimidated about asking questions about proteomics, because it's not my, you know, cup of tea per se.

But I really appreciate the power of it, and I'm very excited that this kind of thing exists. So, Dr. Parag [00:45:00] Mallick, thank you so much for joining me today on the Talking Biotech Podcast.

Parag Mallick: Yeah, thank you so much for having me today. It's really been fun.

Kevin Folta: And as always, thank you for listening to Colabra's Talking Biotech Podcast.

Share with a friend who maybe doesn't know much about proteomics and introduce them to the new cutting edge, because this is something that may have tremendous applications, not just in medicine, but also just in identifying snapshots of biology and what makes biology tick. What's really cool about this is that it really may be applied as microarrays were applied back in 2000.

This seems to be right at the next cutting edge of how we'll understand how biology works. So this is Colabra's Talking Biotech Podcast, and we'll talk to you again next week.