348 Uhlen
===

[00:00:00] Hi everybody. And welcome to this week's podcast. When we think about genomics, we usually focus on the genome, the DNA blueprint that resides in every cell. We try to connect it to various diseases, important traits and predispositions to disease, whatever. But the DNA is simply the instruction manual of the machine.

And the grand challenge of genomics is how to connect that parts list to function in the complicated machine of the human body. Now, there are ways to do that by functionally studying and analyzing the effects of single genes on various traits. We won't talk about that. Another way to think about this is to understand when and where the parts are in the machine.

The parts are the proteins, the catalytic and structural molecules that are the major components of cells. And they're the central drivers of gene expression, biochemistry, metabolism, development, physiology, whatever. So if we understand where different [00:01:00] proteins are, we can begin to infer their roles in specific biological processes.

And even more, we can start to relate those data to the mountains of gene expression and other genomic data. Our guest today has been a leader in assembling the Human Protein Atlas, a guide to the localization of proteins in specific tissues, and then relating them to other higher-order questions. So our guest is Dr. Mathias Uhlén. He's a professor of microbiology at the Royal Institute of Technology in Stockholm, Sweden, and also the director at SciLifeLab in Stockholm. So welcome to the podcast, Dr. Uhlén.

Thank you very much. It's wonderful to be part of it.

I'm really glad you're here, because we focus a lot on DNA. We focus a lot on RNA and gene expression and a lot of traits that are controlled at those levels. And we sometimes don't give enough credit to proteins, which are actually the molecules doing the work at the end. So this is really exciting. So let's start at the [00:02:00] beginning. Why is it critical for us to understand the location and function of different proteins?

Well, as you outlined in the beginning, the proteins are the building blocks of human life, but also of all life on the planet. So understanding the proteins is, I think, the dream of all of us working with human biology, but also the biology of other species. So, just as you say, the genes are the blueprint, and we have fantastic tools to study the genes and DNA. It is a little bit harder, or much harder, to study the proteins, but of course, since they are the functional units of human life and also other species' life, we need to move to that level to understand human life in a more holistic sense.

And [00:03:00] obviously we need to know which proteins are in different parts of the human body: which are in the brain, which are in the liver, which are in the blood, and so on. And that is what we're trying to do now in the Human Protein Atlas: to create a map of the human body as related to proteins.

Okay. So you mentioned the Human Protein Atlas. So what exactly is the Human Protein Atlas, and what was the main driver to start building this resource?

So I was very much involved in the genome project that was going on during the nineties and culminated in the beginning of 2001, when the first sequence from the genome project was actually launched. And already back in those times, I was [00:04:00] almost obsessed by the thought that we should look at all the coding genes in the genome, the genes coding for proteins. And at that time we thought it was 130,000 genes, each one then coding for proteins. Now we know, 20 years later, that it is actually only 20,000 genes.

And these are the genes that code for the proteins that make us tick. They are the reason why we can function in the world, why we can talk, why we can think, and so on. So obviously it's incredibly exciting to be part of the journey where you actually map out such a very complex organism as humans.

Well, how did we get that so wrong? I mean, from what, 130,000 genes predicted, or at least hypothesized, to come down to [00:05:00] actually 20,000 coding regions. How did we get that so incorrect?

Well, that's an interesting question. I think one of the reasons was that when we were doing the genome project back in the nineties, people looked at the RNA, and in the RNA we have a lot of so-called splice variants, which are variants of RNA. And these were, I think, mistaken for new genes. So now we know that a lot of these so-called splice variants, which are variants of the RNA, actually come from the same gene. So every year the number has come down and come down and come down. And there was a bet back in the nineties on how many genes humans have. And I think no one actually guessed that we would have as few as 20,000. [00:06:00] So I don't think anyone won, actually.

Well, I guess when we think about that, so let's say 20,000 genes, how many different proteins can come from that set of genes?

Well, right now in the databases we have about 80,000 different proteins predicted.

And this is because we have variants. So you have one gene, it codes for a protein, but that protein can then vary in different ways. One is that it can be modified. It could be, as I said, these splice forms, where you add on different parts to it, and so on. But what we are focusing on in the Human Protein Atlas is the non-redundant proteome, the 20,000 proteins which are coded by the 20,000 genes. And we are not focusing so much on the variation of these [00:07:00] 20,000 proteins.

Okay. So you're not focusing on the variation, but also not really focusing on modification, right? So proteins are sometimes decorated with other types of amendments, glycosylation or other things that change their chemical properties or their function.

And so you're really just looking at the core protein itself.

Yeah. That's an interesting statement. When you say that they change function by modification, that seems somewhat controversial. Obviously modification can make an enzyme active or inactive, but I would not say that changes the function of the protein. It's simply an on-and-off switch of the function. And a lot of the decorations, as you say, the glycosylations, also don't really change the function. In most cases, it's more to make it more soluble or maybe to go to another [00:08:00] part of the human body, and so on. But if you actually want to look at the function of proteins, I think it's kind of close to the 20,000 that we now know from the genome.

Very good. So the whole goal of the Atlas is to determine localization. So how do you do that? I mean, how do you go across a broad section of all the proteins and find out where specific proteins are residing?

Yeah. So this is of course the core of the whole project. What we decided about 20 years ago, when we started this project, was to make a so-called antibody to every human protein. So an antibody is actually a protein that specifically binds to one of the proteins in the whole mixture of all of the proteins. And this is of course very much used in the [00:09:00] therapeutic industry, by the pharma companies and so on.

But what we decided to do is something very ambitious, and that is to make one antibody to every protein, the 20,000 proteins. And this is something we spent 10 years on. It was a lot of work; you know, this was almost 1,000 person-years. And now we have this resource of antibodies, which is sort of a hook to actually fish out the protein from different tissues.

And then this is combined with classical bioimaging, where you take out tissues, normal tissues and disease tissues, from humans, and then you stain them with these antibodies to get images. And we are producing a lot of images. So we have about 10 [00:10:00] million images now, which are available in the database.

Each one is then showing one protein in one tissue. So people can go there and look: where is the protein? How does it look? And so on.

The magnitude of this is incredible. I know that I've maybe had one antibody synthesized once, and to have one done and to vet it and to ensure that it's working properly, that was a tremendous amount of work. I can only imagine doing every protein in the human body.

You're right. I mean, we're also working with therapeutic antibodies, and the pharma companies are doing that for cancer and so on. And obviously it's probably about a thousand person-years to generate one antibody. So this was of course a lot of automation. It was a lot of IT and so on to actually make this possible. But in the end we have this resource now. [00:11:00]

It's pretty incredible. So what are some of the things that we've learned from the Human Protein Atlas?

Well, we learned a lot of things, but I should say that the real value of this is that the individual researcher can go in and look at their favorite protein, and then they can see where this protein is localized in the cell, in what organs it is, and so on. One of the surprises, I think, is that people maybe think that they have a kidney protein and they develop, well, something for a kidney disease, but then they go into the Protein Atlas and they realize that this protein is also in the liver or in the brain. And this of course has consequences for side effects in drug development and so on.

So that is the main focus: to provide an encyclopedia of [00:12:00] all the proteins. And you can then go in and see it yourself for your favorite protein. But we have also made a lot of surprising findings by doing this kind of holistic mapping. One is that we now know that we have about 3,000 proteins, 3,000 building blocks, that constitute what we call the housekeeping proteins. These are proteins which are needed in all cells around the body, and they are incredibly important to map and understand. Then we have about 5,000 proteins, so about 25% of the proteins, which are specific for one tissue, one organ, or maybe a few.

And these are of course also very interesting. And I guess this is also very surprising: when we look at these specific proteins, the organ or the [00:13:00] tissue that actually has the most of these specific proteins is the testis, followed by the brain. So this is of course something which is rather surprising. And then of course we have also mapped all the proteins which are actively secreted to blood.

And these proteins are very, very important for precision medicine, for medical diagnostics in the future. And it is interesting to follow them when you have a disease, or actually to explore wellness in people. But what we have found when it comes to the blood proteins is that each one of us, you and I and everybody who is listening, has a unique blood profile, sort of a fingerprint. So you can actually look at the proteins, and you will have something different from your neighbor, and so on. And this [00:14:00] is of course where we're only starting to understand: why do we have these differences, and what does it mean for our health and for our wellness?

Well, that's really, really impressive.

It's really interesting to think about. But when we think about an atlas, the reason an atlas is useful, like a road atlas, is because it has a universal set of guidelines and guidance. And so does the Protein Atlas provide us with any kind of cross-section across populations? Or was this from maybe one person, or is this done from many different individuals that span a variety of different genetic backgrounds and sexes, that kind of thing?

That's a very interesting question. So when we started this, since this is a rather expensive exercise, and even though we did it on relatively few individuals, it still [00:15:00] created 10 million images, and we have created 15 million webpages, so this is quite a massive project. So we focused on relatively few individuals. But we try to always have an age differentiation and always a gender difference, so you can see both the males and females, and also a little bit across different parts of the geography. But what we see, which is maybe surprising for many people, is that we are incredibly similar on the molecular level.

When you look at these proteins in the different parts of the human body, it is almost impossible, at least at the resolution we have, to see differences between different people. This is a little bit different from the blood, as I said, [00:16:00] where we can see differences, but in the tissues we are very, very similar.

And maybe it's not that surprising, since all of us have to do exactly the same things in these tissues. The liver has to be there to detoxify the body, the kidney has to clean the blood, and so on. So we need to have the same function, and we are incredibly similar, at least at the level that we are looking at right now.

I guess my big question is that when you're interrogating tissue with an antibody, you're taking a snapshot of that developmental state. And so, you know, do we look across time as well? Like early development versus late development, and having a sense of what players are there and changing through that?

So it is of course [00:17:00] a very important project to also look during development. It has actually been a little bit challenging for us to get ethical approval for that. So what we have done is that we are focusing on looking at development in model animals, such as pig, macaque, which is a primate, and then also mouse and rat, and trying to learn about development by using these model organisms. And again, these model organisms are very, very similar to humans. Although there are of course quite big differences, you know, a liver in the pig is very similar to the liver in humans, and so on.

Well, all of this seems like just an amazing resource, and I've played around with it a little bit and taken a look, but I need to dig in a little bit deeper, and I will do that.

But after the break, we'll talk a little bit more about the data that are [00:18:00] present and your access to them. This is Colabra's Talking Biotech Podcast, and we'll be back in a moment.

And now we're back on Colabra's Talking Biotech Podcast. And we're speaking with Dr. Mathias Uhlén. He's a professor of microbiology at the Royal Institute of Technology in Stockholm and also the director of SciLifeLab in Stockholm.

And we're speaking about the Human Protein Atlas, this amazing resource, which is an extension of efforts from the Human Genome Project that now looks at the proteins and where they're located through what we've heard is a massive effort to identify when and where they're expressed in different tissues.

And one of the real nice hallmarks of this work is that anybody can access it. So why is it important to have this kind of open-access biological data?

Yeah. So it is very important to have open access [00:19:00] to the data in this kind of project, and there are three reasons for that. One is that basic knowledge should be available to researchers around the world. And obviously this is very basic knowledge about something that almost everybody needs to know, and therefore it's important to have open access. The second reason is that a lot of the future research will be done with artificial intelligence and machine learning, using big data to explore and massage the data, and this should be done by different groups around the world. And if you have open access to the primary data, it can be exported, and a lot of people can look at the data and actually learn from that. The third reason is that it is not [00:20:00] easy to actually validate the results coming out of big data. And therefore, if you have the primary data, it is easier for external researchers to actually validate the conclusions, the results. So it is also important for that reason.

But I think the main reason obviously is that when it comes to fundamental knowledge about us, health, and disease, we should share that in the community. And we are very fortunate that the foundation, the Wallenberg Foundation, a nonprofit organization here in Sweden that has funded this project, is very much in favor of open access to data.

Well, very good. So when you start talking about open access, we start thinking about applications. Can you talk to us about precision [00:21:00] medicine and how the Protein Atlas is really a central assistant in diagnostics or the development of new therapies?

Yeah. So with precision medicine, obviously the objective is to give the right treatment to the right patients and try to move towards more individualized treatments, and of course to use molecular tools, both for diagnostics to stratify patients, so you know how to treat them, but also to monitor the treatment afterwards to see that you're on the right path.

And with the new tools that have been developed, we now have fantastic opportunities. And I think the next 10, 20 years will be an absolute gold mine for all the research moving into precision medicine. A lot of that is [00:22:00] genetics and trying to understand what genetic variants do to people. But even more important, and I guess I'm biased here, would be to actually learn about the proteins in blood and also in cancers and so on, and try to learn from that how you should treat, but also how to diagnose, so you know what disease a particular patient has. So this is very much what we're trying to do now, moving from the human biology, the normal biology, that we have been working on for 20 years, and now in the next 10 years move into diseases and try to see what happens when you have Alzheimer's, what happens when you have Parkinson's, and how does that affect your blood profiles? And can that be used then to treat or find the best drugs [00:23:00] or diagnostics for a particular disease?

I see. That's really a great application, because if you're comparing disease versus normal tissue, you can learn about what proteins are changing. And if you know that, now you can start thinking about potentially designing drugs to target those different proteins that are changing, if they seem to be causal to the pathology that's there.

But you mentioned this idea of blood profiling. Could you really learn something from changes in the blood that correspond to a disease state, based upon proteins that may be detected?

Absolutely. So obviously protein profiling in blood is a huge technical challenge. The difference in concentration between very abundant proteins like serum albumin and very low-abundance proteins like the cytokines and the neurological [00:24:00] proteins is more than 10 billion-fold, and that has made it almost impossible in the past to simultaneously analyze more than maybe a few protein targets in the blood.

This has changed now in the last five to ten years; there have been fantastic developments. One is down in Boulder, Colorado, a company called SomaLogic, and another one is in Uppsala, Sweden, a technology called Olink Explore. And in both these cases, you can now actually analyze thousands and thousands of proteins simultaneously.

And then get very quantitative results. So this is an absolute revolution for all of us that are interested in the blood profiles of humans, both, you know, to analyze wellness, but [00:25:00] also to analyze disease.

Yeah. And maybe even have some predictive power, in saying that we see this particular blood protein come up at the early onset of a specific cancer or something like that.

Yeah. So that is of course the dream. I mean, I would be disappointed if, in a few years, we didn't use these new technologies to screen ourselves maybe once a year. And we would then actually see if we have cancers or other kinds of diseases, maybe long, long before we have the symptoms of that cancer or some other disease.

The challenge here, of course, is what we call false positives. When you do screening, obviously, even if you have only a few samples that are considered positive but are actually negative, you cause a lot of distress in those patients. So it is very [00:26:00] important that you can combine this with a very quick validation to say whether you have this disease or not.

But this is of course something that I am extremely, you know, excited to be part of: developing such tools for the future.

Well, you mentioned that images are available for open access, but are there quantitative data? So, like, can we tell how much of a protein might be available where, and then can those data be related to other datasets of, say, gene expression or other types of metabolic data to understand maybe the basis of disease, or other types of, you know, gene variants that maybe contribute to some sort of disease state?

Oh, that's a very important question, actually. So one of the disadvantages of the antibody technology is that it's not absolutely quantitative. It is sort of [00:27:00] semi-quantitative, and we've been trying to use something called tissue arrays to make it a little bit more quantitative. And we're also combining this with RNA analysis; there is something called single-cell transcriptomics and so on, which is much, much more quantitative, and these taken together can give us rather good indications of the actual levels of a particular protein, both in tissues and, of course, in blood, because we have these new technologies for blood analysis.

When it comes to variation, this is of course very important. And again, this is something that is rather difficult to do in a comprehensive way, except for looking at DNA. But of course this is something that a lot of people are looking at in cancer: if you have this variation, [00:28:00] this of course means that it's more serious, and so on. So this is something that is very interesting for the future.

So this is something that is very interesting for the future. So if people would like to access the data or maybe be able to, you know, scan images or look at the data in the human protein Atlas, where would they be? So the easiest way is simply to Google human protein, not and then you will end up in the link protein off class.org and there you would then be able to then go in.

And if you are interested in a particular protein, You will then search for that one. It's a little bit like Google and then you will get all the information for that protein. And, but of course, it's not that easy to go in and look at all the data since we have 15 million webpages. But of course they have been trying, we have been trying to make them accessible.

When you're looking at something, if you're interested in the proteins in the blood, you have one section. [00:29:00] If you're interested in the proteins, since human cell lines, you go to another section. If you're interested in the proteins in the brain, this is the third section and so on. So we've been trying to then stratify the data.

So it's easy, especially for research. And then. Knows what they're looking for. Very good. And do you put out updates over Twitter or anything like that? Yeah, absolutely. So we have the Twitter account at protein Atlas and we also have try to, to add the webpage also to have news articles. Trying to have that at least one news articles per week.

But the easiest access would be to go to Twitter and look for @ProteinAtlas.

I guess the last question I would have is, I'm a plant biologist. Are there similar efforts underway in plants, or do you think that maybe this would be another extension [00:30:00] of this kind of work?

Well, there are quite a lot of efforts following this in the mammalian models, like pig, macaque, and mouse, and so on. Plants, eh, we have been thinking about that. But of course it's rather expensive to make these antibodies, and the antibodies which are made for humans don't usually work in plants. So you have to redo this. The good thing about moving to pig and mouse and so on is that most of these antibodies that work in humans also work in mouse and so on. So we can reuse the antibodies in a different species setting. But for plants we would have to do it again, and no one has got the funding to do this kind of making antibodies to the proteins. [00:31:00]

Yeah. Yeah, it would be a couple more decades.

So I just had to ask, just to see if it was on your radar. Well, Dr. Uhlén, thank you so much for joining me today. It's a fascinating project, and I'm really excited to talk about it and learn that this is really such a great resource. Thank you for joining me today.

Thank you for having me.

And to the listeners, thank you again for listening to Colabra's Talking Biotech Podcast. Check out Colabra's software and the different tools they have to make your laboratory work more efficient and have all of your data in one shared space. This is the Talking Biotech Podcast, and