Talking Biotech with Dr. Kevin Folta

Parasites are a massive threat to human and animal health, underlying a significant number of important diseases. Dr. Jessica Kissinger of the University of Georgia has led and coordinated efforts to curate growing databases of pathogen genomic information that inform understanding of pathogen evolution, hosts and vectors. This information provides a better understanding of parasite biology, and ultimately provides valuable guidance about how pathogens are prevented and treated.

Show Notes

Parasites are known contributors to human disease and suffering, spanning a wide range of organisms. Dr. Jessie Kissinger from the University of Georgia has spent the last two decades curating genomic data from hundreds of parasites, their vectors and hosts. The information helps researchers generate hypotheses about parasites, and presents a fertile resource for comparing genomes and understanding similarities and differences across this diverse set of organisms. (Vector and Eukaryotic Pathogens Resource Center) (Clinical and Epidemiological resource with DIY analyses and many BMGF studies)
@jcklab   (Dr. Kissinger twitter)  (lab website)

What is Talking Biotech with Dr. Kevin Folta?

Talking Biotech is a weekly podcast that uncovers the stories, ideas and research of people at the frontier of biology and engineering.

Each episode explores how science and technology will transform agriculture, protect the environment, and feed 10 billion people by 2050.

Interviews are led by Dr. Kevin Folta, a professor of molecular biology and genomics.

Hi everybody, and welcome to this week's Talking Biotech podcast by Colabra. This week it sounds like I'm in a tin can, but this is only for the introduction. I'm recording the introduction to this week's podcast through a funny microphone in a funny spot in Chicago, on some emergency family business.

So thank you for bearing with me. Today's podcast is a really important one because it's about parasites, mostly the genomics of parasites and what we've learned from comparative genomics about parasites and parasite evolution. We're speaking with Dr. Jessica Kissinger. She's an expert in parasites and parasite genomics who has curated this database for a couple of decades and is now more involved in many aspects of parasite biology and parasite evolution.

So I'll end the introduction very quickly today and turn this over to the normal recorded interview. I'll start out by saying, welcome to the podcast, Dr. Kissinger, and it'll sound horrible. And then it'll sound really good.

So here we go. And thank you all for listening. I really do appreciate it. We're at episode 371, really all because of wonderful listenership. So thank you so much. Here we go. So welcome to the podcast, Dr. Kissinger.

Welcome, Kevin. It's great to be here.

It's nice to have you on. This is a pretty exciting topic that we don't cover enough, so you're gonna be very illuminating today.

So here we go. Let's start out really fundamental. What's the definition of a parasite, and how is it different from something that we might refer to as a pathogen? Or are parasites pathogens?

Okay. Well, pathogens are things that make their hosts, the things they infect, sick. Now, when one organism infects another organism, it can do so in beneficial ways as well. These types of symbionts can be commensals, in which both organisms might benefit from the relationship. But when the thing that does the infecting causes harm to the host that it has invaded or infected, then we consider it to be a parasite.

Okay. So what about things like viruses and bacteria? I mean, they're, they invade us and they cause us harm. So are those parasites or are they strictly pathogens?

You know, I'm sure somebody will come along and correct me on this. I think there are many opinions on this. I think many would consider viruses micro parasites.

I mean, they come in, they co-opt your machinery. Mm-hmm. They need that machinery in order to survive. They usually cannot survive or complete their life cycle outside. And that is not the case for all pathogens, which often can survive outside of you as well as inside of you. Ah, other environments.

Yeah. You can see the Venn diagram in my head this morning. I was sitting here thinking about this, prepping for this, thinking about where that delineation was, and it seemed like viruses are microparasites. Parasites, we know, go hand in hand with human and animal populations. How significant are parasites in contributing to human health problems worldwide, and maybe some examples from animal health?

Yeah, so they're a huge burden, a really huge burden. And I know you had Stefan's fired down here a few weeks ago talking about his great breakthroughs working on Plasmodium, which is the malaria pathogen. So thinking just about malaria, right? There were over 240 million estimated cases of malaria alone in 2020, which resulted in 627,000 deaths. And this doesn't even include all of the diarrheal pathogens, worms, fleas, lice, right?

In animals, parasites are hugely significant. I'm in Georgia, where Eimeria is a prominent pathogen of chickens, and Cryptosporidium as well. The poultry industry is very large here.

But for those that work in animal husbandry, or even with your own household pets, parasites are very important globally. They infect very large numbers of humans, their pets, their animals, your food sources, right, your livelihood. They're really, really critical. And in some cases they're so endemic, or just there, that they don't always get the attention that they deserve in terms of the extreme levels of morbidity and mortality that they do cause.

Yeah, I agree a thousand percent. We farm some space down here too, and we do more treatment for intestinal parasites, more shots of ivermectin, you know, worms, all kinds of goodies that we are constantly dealing with between pigs, dogs and other critters here on the farm. Chickens, ducks, geese. Ducks don't get 'em.

But for the most part, it's ubiquitous. You're constantly surveilling for pathogens, for parasites rather. And for humans we've talked about malaria, as you mentioned, before on the podcast. What are some other major parasites that affect human health that maybe we don't see on the front page?

All right. Well, I'd say probably among the most common would be toxoplasmosis, caused by Toxoplasma. So for anyone who's ever been around a pregnant woman who was told not to change the cat litter, that's the parasite we're talking about. It's particularly damaging to pregnant women, who are slightly immunosuppressed because they're carrying, and it can infect the developing fetus as well.

Foodborne parasites: Cyclospora, very important. Cryptosporidiosis, Giardia and amoeba, for those backpackers and hikers or global travelers. These are waterborne parasites. Mm-hmm. Some that folks get are less often heard of over here, stateside, but really important. For example, for troops coming back from the Middle East, there was Leishmania.

It's a kinetoplastid parasite. And in that same family are the trypanosomes: Trypanosoma brucei causes African sleeping sickness, and Trypanosoma cruzi causes Chagas disease. So there are a lot of very prominent human pathogens that cause a significant amount of morbidity and mortality.

Is Giardia this amoeba that swims in your ear when you're in the lake in Florida and the alligators don't get you?

No. That would be the famous brain-eating amoeba, Naegleria fowleri.

That's the one I'm worried about. I don't have much left, so I'm real careful. So this is the brain-eating amoeba. Okay. Well, really, I'm just amazed by the number and the scope of these different pathogens. And when you talk about sleeping sickness and Chagas and all of these different diseases, there's so much that we can attribute to this class of organisms that sometimes we don't think about enough.

Maybe because most of this is in the developing world? Or do we still see really profound outbreaks in the industrialized world?

Some of them are restricted to the developing world, but Cyclospora and cryptosporidiosis, for example, happen right here at home. I think one of the largest outbreaks of cryptosporidiosis happened in Milwaukee, probably now, yeah, 15 or 20 years ago. I think nearly 500,000 people became infected. Mm-hmm. Because it's waterborne. It's not killed by chlorine. So we'll hear about it in the news fairly often, with outbreaks at various water parks or water-related activities. So no. And Toxoplasma, it's everywhere.

Matter of fact, it's estimated that roughly a third of the world's human population is infected with Toxoplasma. The acute phase of the infection only lasts for a week or two, and if you're not immunocompromised, the parasite goes and makes a little cyst in your brain and it lies there dormant unless it's activated again.


But is that the toxoplasmosis one? Yeah. Oh, okay. So a third of the people worldwide. But is it really only a problem with pregnant women when they're dealing with a litter box? Or is this something that can become reactivated just because they were once cleaning out a litter box?

Hey, that is one of the more common ways in which people get infected.

The second most common way, so, the cat is what we would call the definitive host, where the parasite has its sexual life cycle, and it produces infectious oocysts that it sheds with its feces. But cats shed feces all over the place. So improperly washed vegetables from the garden are also a source of infection, as is undercooked meat.

So you can have tissue cysts in a variety of organisms, where it's not in the brain but it's in the tissues. And then undercooked meat can infect you as well.

Wow. This is neat stuff. So if we're talking about significant parasites, well, talk about all parasites that you can think of.

There was an effort to really kind of catalog these things at the genomic level and characterize them. So could you tell me a little bit about your efforts that have been done over the last 20 years to curate and develop those data resources?

Yeah, so as we all know, DNA sequencing has become really important to being able to understand the basic biology, but also to study a number of pathogenic organisms, including parasites.

And as a result, governments and funding agencies have dedicated a large amount of resources to helping the scientific community sequence the genomes of very important pathogens, just as we all witnessed with the sequencing effort during our recent COVID pandemic. However, these genomes are much larger than the viral genome.

Matter of fact, the first parasite genome that was sequenced was for the malaria parasite, published back in 2002. I think it will go down on record as the costliest parasite genome. I think it cost nearly 35 million. Wow. Somebody can correct me on that if I'm wrong, but it was incredibly expensive.

It was very AT-rich. It was very, very hard to sequence, but it was so important to get it done. And then there were any number of parasites, especially now that sequencing technologies have made it possible to sequence ever more pathogens with much less material. So things that can't necessarily be cultured can be sequenced now.

There really became a need to collect these data and put them in places where the community could not just access archival records of the initial data deposition, but where you can create living records, where the community can add updates or curate if they want to. And more importantly, as part of this omics revolution, you can add in extra data on gene expression, on the proteomics, on the metabolomics, on the pathways, understanding how the chromosomes and chromatin are modified during gene expression or at different life cycle stages.

And so I was very fortunate to be involved with what is a very large group of researchers in a project that started a long time ago, when I was a postdoc.

Back with David Roos at the University of Pennsylvania, to make databases. Our very first database was PlasmoDB, which originally had the Plasmodium genome in it. There are now thousands of genomes that have been sequenced for different isolates all around the world to look at variation.

But that grew. There was a ToxoDB and a CryptoDB and a TcruziDB, and this all sort of evolved. NIH has pulled together a large number of community resources, not just for parasites but for bacteria and for viruses. In our case, the database became what's called a Bioinformatics Resource Center, funded by the National Institutes of Health as well as several other funders, globally even.

It's called VEuPathDB, for vectors and eukaryotic pathogens. So it contains mosquitoes, ticks, even the snail, which is the vector for schistosomiasis, and the fungi and the protist pathogens. And it contains not only the genome sequences, but a lot of other associated data.

And maybe I can play devil's advocate for a second here, just because I'd love to hear your answer, and because I think it helps other scientists who are listening articulate something very important.

So the question that someone may ask is: okay, so you spend a lot of money, 35 million, on the Plasmodium genome or any of these genomes. What good is a parts list of the genes that are there? I mean, how do you use that to solve a problem?

Ah, well, you know, if you don't know what the parts are, it's very hard to predict where the problem will be, right, and to understand how they're connected.

Different researchers use these types of data in a variety of different ways. There are a number of ways in which I use them, but I'll start out with some of the most common. This is just: what various metabolic capacities does the pathogen or the parasite even contain? What can and can't it do?

Then when you start comparing to other, let's say, global isolates of the same pathogen, you can ask which genes are varying and which genes don't vary. That becomes really important so that you can understand which things are under purifying selection, and very, very important for the pathogen because they can't change, and those that might be under diversifying selection.

Especially if they appear to encode gene products, proteins, that are targeted to the surface, they might be involved in the immune response, right? So then the whole field of people interested in looking at vaccines becomes very interested in that class of genes.

Well, what about the hosts or vectors? What do we know about the hosts and what do we find about specific parasite and host relationships from genomic data?

Oh, that's a great question. So first I just wanna explain a little terminology. Hosts, of course, are anything that can be infected by the parasites, but some parasites have more than one host.

Sometimes there's a host that we refer to as a vector, which carries one life cycle stage. So, for example, we'll go to our fallback parasite today, the malaria pathogen, Plasmodium. It's transmitted by the mosquito. The mosquito serves as a host for life cycle stages that can develop within it, but then transmits the parasite to another host, right, the human.

Inside the human, these parasites only undergo asexual replication. Their definitive host is the mosquito. So some pathogens, parasites in particular, are locked into particular hosts. There is a co-evolutionary history that is very, very long, and these parasites cannot infect other hosts.

And we might all have had that experience of taking our animals to the vet and asking, can I catch it? Well, no, you can't. So they're quite locked in. Others are incredible generalists. Toxoplasma is one of these. It can only complete its sexual life cycle in felids, right, the cat family.

But it can infect almost any other warm-blooded animal. And if we think about Plasmodium, the parasite is much older than humans are. So their original host was the mosquito. I think humans are sort of the plat du jour. Today there are many, many Plasmodium species that infect many, many other types of animals.

Wow, that's really fascinating, because Plasmodium has been around much longer than us. And so how did it... I never thought about that before. That's pretty cool.

Yeah. Well, the one that infects us definitely came up through the primate lineage. That's been really nicely shown in some beautiful work.

And you can see that from genomic data, kind of just how it changes over time in relation to maybe a more primitive subgroup that then further evolved through the primate lineage.

Yeah, I dunno that it has to be primitive. It can infect anything that's alive today, right? But yeah, we didn't talk a lot about animals. There are probably now, I believe, and I'm gonna get this number wrong too, 30, 32 named species of Plasmodium.

Only five of them infect humans. One infects humans and non-human primates. The others mostly infect birds and reptiles. You know, something we don't think about a lot.

No. And bird pathogens become more and more relevant every year because of the bird flu and things like that. So it's been, yeah, avian influenza, you know, a major one this year. So how many parasites are currently curated in VEuPathDB at the genomic level?

So as of our last release, which was November 9th, 2022, we have 671 genome sequences. That doesn't represent 671 species. The number of species is much smaller, but that includes fully assembled different strains as well, not just sequencing for variation. There are thousands and thousands of those. But actual full genome sequences, and that includes the protists, the fungi and the vectors. So that's ticks, mosquitoes and the snail.

That's neat. Is there a genome browser that folks can plug into and just take a peek?

There is absolutely a genome browser that people can go in and take a peek. Although I have to say the strength of our databases is not going in and taking a peek. It's actually testing hypotheses, so I'll just tease you with that.

Ah, wonderful. No, I love that. So you mentioned that there are other data in there as well, other than just strictly DNA data and genomes. Is there, you know, expression data and other data too?

There are. So as of the last release, we have a total of 2,822 data sets, of which only 671 were genome data sets. So you know, nearly three times as many other types of data are in the database. As you intimated, it's mostly RNA-seq expression data, but also ChIP-seq data, which lets people pull down pieces of DNA based on what proteins are bound to it, and DNA variation data sets.

So there's a lot of looking at population studies to understand what's happening, perhaps in the face of drug administration or not, or eradication efforts. We have proteomics, metabolomics, DNA and protein array data, and even small-molecule data, because again, a lot of people look at the database to try and understand therapeutics and metabolic pathways.

Yeah. So you gotta have a little time, rather than just taking a peek. Right.

You can't. Yeah. Right. Well, you know, you don't just take a peek and not understand it.

For me, it's like the old rabbit hole problem. I go in to play in a genome browser and I'm in there forever. But I come out the other side with the smoke coming outta my ears.

We're speaking with Dr. Jessie Kissinger. She's a distinguished research professor and professor of genetics at the University of Georgia. This is Colabra's Talking Biotech podcast, and we'll be back in just a moment.

And now we're back on the Talking Biotech podcast. We're speaking with Dr. Jessie Kissinger. She's a distinguished research professor and professor of genetics at the Institute of Genetics and the Center for Tropical and Emerging Global Diseases at the University of Georgia in lovely Athens, Georgia. And we're speaking about parasites, but mostly the way in which their genomic data has been cataloged so scientists can access it and learn more, and test and develop hypotheses from these genomic data.

We really just set this up in the beginning by talking about what parasites are, how much of a problem they are, and what the databases are like. But how do we make 'em go to work? And are we learning anything about the evolution of parasites from these databases?

Oh, we're learning so much. So the first thing I would like to do is just explain to our listeners that parasites as a group don't represent one evolutionary lineage, right? It's not as if parasitism evolved once and everybody descended from that, or the progenitor was a parasite. Parasitism has evolved hundreds of times on the tree of life, many, many, many times.

And I don't consider myself to be a card-carrying parasitologist, but I am an evolutionary genomicist. So I'm gonna answer the second part of your question. I tend to look at these notorious yet amazing organisms at the level of their genome sequence and their gene repertoires and the pathways that they contain.

And I think some very interesting commonalities have popped out. The first is there's very often a loss of genes. It's not universal, but some of these genomes have become incredibly small, right, nearly bacterial size in some cases. One of the smallest genomes that I work with, Cryptosporidium, is only 9.1 million base pairs.

Wow. The malaria parasite ranges from about 23 to 26 million base pairs. Theileria and Babesia, parasites of cattle and livestock, are like 8 million. I think there's one that is three or four million base pairs. Toxoplasma is one of the large ones at about 66, 67 million base pairs. And to put that into perspective, you know, the human genome is 10 to the nine base pairs.

So that's been very interesting. There's been a lot of loss, and it makes sense: if you're a parasite, you're stealing things from your host, so you can get rid of some baggage if you can get it for free from your host. So some of that loss makes sense. There have been some surprises, though, because, you know, these parasites, I mean, some of them are, but they're not all evolutionarily related. They've popped up all over the tree of life.

And one commonality is that it seems really easy to lose purine synthesis. Lots of parasites have lost the ability to make purines, their DNA residues. And I think it's also interesting to just look at how few genes you can survive with, right?

So the Cryptosporidium pathogen is down to about 3,400 protein-encoding genes. The malaria parasite has a little over 5,000. Wow. I mean, yeah. In my head I just imagine this deck of cards, of these pieces, and yet, you know, with all the money and all of the creative research that the globe has thrown at them, we are still not able to treat them.
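
To put the genome sizes quoted above side by side, here is a minimal Python sketch. The parasite figures are the approximate ones given in the conversation. The human reference of about 3.1 billion base pairs is an assumption added for illustration (the conversation only gives the order of magnitude, "10 to the nine"), and the helper name `fold_smaller` is hypothetical.

```python
# Approximate genome sizes in base pairs, as quoted in the conversation.
GENOME_SIZE_BP = {
    "Theileria/Babesia": 8e6,
    "Cryptosporidium": 9.1e6,
    "Plasmodium (upper end)": 26e6,
    "Toxoplasma (upper end)": 67e6,
}

# Assumption: the conversation only says "10 to the nine" for the human genome;
# about 3.1 billion base pairs is the commonly cited haploid size.
HUMAN_GENOME_BP = 3.1e9


def fold_smaller(genome_bp, reference_bp=HUMAN_GENOME_BP):
    """Return how many times smaller a genome is than the reference genome."""
    return reference_bp / genome_bp


# Print the parasites from smallest to largest genome.
for name, size in sorted(GENOME_SIZE_BP.items(), key=lambda item: item[1]):
    print(f"{name}: roughly {fold_smaller(size):.0f}x smaller than the human genome")
```

Passing a different `reference_bp` lets the same comparison be made against any other genome, for example a bacterial one.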

Yeah, it's really fascinating. I mean, I've never known any of this stuff. It's really, really cool. How much of that is driven by humans, by our intervention, and maybe things like resistance to the drugs that we use to treat them, which I think is really an issue for Plasmodium? And is that reflected in the molecular data within the database?

So, another really outstanding question. The actual evolution of the sorts that I just described to you, I do not believe to be driven by therapeutics. After all, therapeutics have only been out there for the last, you know, I guess maybe close to a hundred years now. A little over a hundred years.

I'm not exactly sure when chloroquine entered the stage, but it was in wide usage by the 1970s. And sure enough, chloroquine resistance developed in parasites. And again, for the listeners, resistance in eukaryotic pathogens doesn't move around, I would say, in the same way we think of it with bacteria, where there is a resistance gene for a drug on a plasmid or something that can be acquired.

In this case, resistance appears to be selected from natural variation existing within the population. In the case of chloroquine, that appears to have happened at least twice around the globe. But because it's a sexually reproducing organism, it can spread its advantageous alleles, and resistance to the drugs spread around the world.

And you absolutely can see this in the isolates. You can go in and survey Plasmodium parasites in a given area and assay not only the level of particular polymorphisms in the DNA that are indicative of resistance to chloroquine, but of resistance to any number of other drugs that have been developed and thrown at this very clever organism, which seems to have the ability to evolve resistance to everything that has been thrown at it, including our newest frontline drug, artemisinin.

And that's really interesting, isn't it? A small number of genes, a small amount of DNA, but a really nice toolbox for figuring out how to get around the problem.

Lots of generations, and a very large number of parasites, right? So when you just think about mutation rates and the large numbers that are there, they can do it. It's infuriating and yet illuminating.

What about new parasites? It seems like when you start looking for things like when they do all these microbiomes and metagenomes of organisms, that maybe some new things might show up.

Anything interesting?

Yeah, so one of the organisms that I work on wasn't even considered a pathogen originally, and it's called Cryptosporidium. It's a waterborne pathogen that causes some of the worst diarrhea of your life or, if you're unfortunate enough to be immunosuppressed, is life-threatening.

And it wasn't really recognized as a pathogen of humans until the HIV epidemic and AIDS. Once there was a large immunosuppressed population, or those undergoing transplant surgeries who were immunosuppressed, all of a sudden you began to see the emergence of pathogens that we didn't know about, Cryptosporidium being chief among them as an AIDS-related pathogen.

And what's really interesting is, as we come to learn about these organisms, you can go out and then ask, well, just how prevalent is it? In the case of Cryptosporidium, there was a really beautiful, groundbreaking study into the causes and prevalence of diarrhea in infants that was funded by the Bill & Melinda Gates Foundation.

And it revealed that Cryptosporidium is the second leading cause of diarrhea in infants under a year old, second only to rotavirus. Right. And that was a shocking finding for something that we didn't even consider a pathogen until relatively recently. So yes, our ignorance is still there.

Well, that's, that's a good thing. I mean, job security, right?

Oh, well, yes. But I'm really happy about the wonderful biotech breakthroughs and sequencing now. I mean, I think our ability to uncover these things is, you know, increasing rapidly. And that's a good thing.

Well, we talked already about the unique genome size and some of the other features that seem to be common to parasites, even though they don't share a common evolutionary lineage.

So are there any other kind of interesting genomic trends that happen around these really diverse organisms?

Yeah, thanks for asking that. So there are two other things that I would point out, and again, I'm an evolutionary genome junkie, so this is what I think is really cool.

First, we find in many of these pathogens very little of what I think most genomic researchers would call the junk, or the repetitive, mobile-element, transposon-containing portions of DNA. A lot of this appears to just have been removed as part of this genome shrinking process, and I think that's really fascinating.

Matter of fact, it led to a really neat discovery when I and another researcher got in a bet as to whether or not there was a eukaryotic genome that didn't have any transposable elements. And there are some genomes without transposable elements in them, but we found some other types of repeats. So that was cool.

And second, another feature that folks might not think about immediately when they think about parasite genomes: because they come from lineages that are so far away from model organisms and the types of organisms that people traditionally study in the laboratory, many of them have 30 to 50% of their genes that we have no idea what they are.

So we can see that they're conserved, because we can sequence many different lineages of these parasites, but we have no idea what they do because they're not related to anything that currently exists in the databases.

Oh, that's really interesting. What about things like ribosomal repeats? Are there still those islands of what are normally considered to be traditionally repeated DNA? Are they much reduced?

Yeah, I would say they're nonexistent in some lineages of parasite. For example, in the apicomplexans, which contain Cryptosporidium, Toxoplasma and Plasmodium, things we've mentioned many times today, there are not the traditional clusters of ribosomal genes. You'll still have the small and large subunits connected together in a typical cassette, but there will just be one cassette on this chromosome and then another cassette on another chromosome, where the organism may only have, you know, five to a dozen of these genes.

And in the case of Plasmodium species, the cassettes have diverged from each other and are actually expressed in different life cycle stages. I mean, it's bizarre.

Yeah, that's super. It is. And one of the fun things about having a really nice genomics resource is that this is really where the hypothesis starts.

You know, it's a hypothesis engine. What are some of your favorite hypotheses that you really could test because of these comprehensive databases?

Oh, outstanding question. As I mentioned earlier, the databases that I've been fortunate enough to be involved in really were designed not for browsing, but for testing hypotheses.

So think about it as a scientist, when you're just like, oh, if we could just develop, you know, a vaccine against this. Then you could say, okay, well, what would your ideal vaccine target look like? And you're like, well, it would be a protein, it would be on the surface, and it would be expressed at a life cycle stage when the pathogen or the parasite is inside, you know, the host.

And, gee, wouldn't it be nice if you knew if it was under diversifying selection, because the immune system sees it? And now, in a resource like VEuPathDB, you can actually go in and you can do that. You can even ask, is it evolutionarily conserved with proteins from other related organisms, or is it not present in the human host?

And so you can take what would've been, you know, a lifetime's worth of work and, looking at existing data, at least say what that list of possible proteins would look like. You can take the five to 7,000 proteins that are in that genome and narrow it down to a list of 20 or 30.

And matter of fact, if you ask that question that I just asked in the Plasmodium database, more than half of the proteins that have already been tried as a vaccine target will pop up on that list. Right. It's really quite cool.
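
The narrowing-down described above is, in effect, a set of yes/no filters applied to every protein in a genome. A minimal Python sketch follows, under stated assumptions: the record fields, gene identifiers and example values are all hypothetical and invented for illustration. This is not the VEuPathDB interface, only the shape of the question being asked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinRecord:
    """Hypothetical per-protein annotations; field names are illustrative only."""
    gene_id: str
    surface_targeted: bool           # predicted to reach the parasite surface
    expressed_in_host_stage: bool    # expressed while the parasite is inside the host
    diversifying_selection: bool     # varies across isolates, so the immune system likely sees it
    has_human_ortholog: bool         # a related protein is present in the human host


def vaccine_candidates(proteins):
    """Keep only the proteins that satisfy every criterion from the conversation."""
    return [
        p.gene_id
        for p in proteins
        if p.surface_targeted
        and p.expressed_in_host_stage
        and p.diversifying_selection
        and not p.has_human_ortholog
    ]


# Made-up records standing in for the several thousand proteins in a real genome.
EXAMPLE_PROTEINS = [
    ProteinRecord("gene_A", True, True, True, False),    # meets every criterion
    ProteinRecord("gene_B", True, True, False, False),   # conserved, not diversifying
    ProteinRecord("gene_C", True, True, True, True),     # also present in the human host
    ProteinRecord("gene_D", False, True, True, False),   # not on the surface
]

print(vaccine_candidates(EXAMPLE_PROTEINS))
```

Each criterion removes a large share of the genome, which is how a starting set of several thousand proteins can shrink to a short list of a few dozen.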

No, that's neat. So go to the other side of it, from hypotheses to conclusions.

Are there any good examples that you can think of where the database was a fundamental part of answering a really important question?

The databases have been a part of answering hundreds, if not thousands, of questions. Too many to talk about. And that's primarily, and this is, I think, one of the hidden gems of the database that's hard to quantify, because of the time that's been saved.

If you think about it, if every research lab had to go to archival repositories, retrieve every data set, clean it, and then normalize every data set to each other in order to be able to test some of their hypotheses or ask "what happens if" type scenarios, the amount of time, graduate student time, postdoc time, grant money, salary money spent trying to do that is enormous compared to the facility of the database.

And so we get reports all the time. The databases collectively have thousands of literature citations and papers where they facilitated the research.

And I don't know, we're over 20,000, 25,000 Google Scholar citations where people mention the database in their papers.

But I did pull up, in terms of the second answer to that question, sort of a juicy tidbit of something exciting that came out of the databases. And that was where somebody who was working on Plasmodium was engineering the parasite to express a protein called VAR2CSA, which binds a specific type of chondroitin sulfate that's exclusively expressed in the placenta. So, as your readership may know from listening to this podcast, women in their first pregnancy, and their babies, are particularly susceptible to malaria.

And the placenta can become... And they were able to show that this same chondroitin sulfate modification is present on a high proportion of malignant cells that can be targeted by this recombinant protein. They demonstrate how an evolutionarily refined parasite protein can be exploited and used to target a common but complex malignancy-associated glycosaminoglycan modification on these tumors.

And so I think it was a really, really exciting finding, not only for developments within the world of parasitology, but for showing some of the versatility and the power that comes from bringing data together from diverse sources into a data ecosystem that allows you to cross boundaries between fields now.

Really nice. Is there any good news on malaria that's been spawned from a more comprehensive understanding of the variation that happens inside Plasmodium?

There's so much that has been and is being uncovered with respect to malaria biology. I guess one of the most interesting things that we've learned has to do with the population genetics of the parasite.

We've learned how incredibly diverse this parasite is across the planet and the countries where you find it. But more importantly, in those countries that have been undertaking eradication efforts and are getting closer to succeeding, we've learned what happens to the parasite populations, and we're learning that they lose diversity as you're getting closer to eradication.

And again, I think that's also a really important finding as we think about strategies for how we are going to handle the very important humanitarian goal of eradicating Plasmodium. And then I think, again, having the database with annotated pathways, knowing which genes are under selective pressure, knowing which pathways might have a choke point, it just allows people to come at tackling these really important pathogens from any number of vantage points and perspectives.

And I know that I've heard you speak before about VEuPathDB and some other associated online resources that really can help people who are interested in learning about genomics and how to use genomic data. Maybe some YouTube videos. Can you give a hint of some of the resources that are available as add-ons to this database that can help somebody start to think about the resources within?

Yeah, absolutely. So two things. First, I'm going to shamelessly plug the databases. It's VEuPathDB, V-E-U-P-A-T-H-D-B, for vector and eukaryotic pathogen.

We have a YouTube channel and an active Twitter channel and a Facebook page, and there are a large number of resources where a researcher can learn and go to the database and learn how to use it. We have an outstanding outreach staff. We have several full-time people who help to answer questions and guide researchers through depositing data, mining data, all of these sorts of things.

I also wanna put in a plug for our newest databases. We have one dedicated to microbiomes, MicrobiomeDB, that has some of these same really dynamic, flexible queries where you can go in and slice the data, like "I need all X that do this or have property Y." And also ClinEpiDB, which has clinical epidemiological data in it as well.

So same sort of thing: "I need all study participants that showed this response, or were infected with X or Y, or had this type of housing or plumbing, and also showed..." and you can pull in other correlates as well.

That sounds really good. I'll make sure I include all of these URLs inside the show notes.

So would you recommend these databases and resources as a good starting point for folks who wanna understand bioinformatics and maybe want to have several genomes or a multitude of genomic resources at their fingertips to really kind of play with and test their own hypotheses?

I'd absolutely recommend them to wet lab researchers who have a hypothesis that they wanna test and who might work on one of the pathogens in our databases, or the vectors, or their hosts.

We have a few hosts, but we only have host-pathogen-associated data sets. As for beginning to learn bioinformatics, I would say less so, primarily because most of the bioinformatics in our resources is taken care of. The biggest thing to learn is the different data types and what types of information each omics technology is able to give you, so you can think about the ways that you would like to slice and dice them.

I think anybody who's interested in thinking about multi-omic approaches to data mining, I really do encourage you to go have a look at our YouTube videos, even if you don't work on our particular organisms, because I think we've come up with a fairly innovative and intuitive approach to tackling this problem.

But as for just learning how to do what I'd call mainstay bioinformatics, we absolutely make a resource called Galaxy available to users, where they can go and run tools against our genomes. You can map your RNA-seq data, your proteomics data, et cetera. You can look for orthologs, genes that are shared across the data sets.

But it's probably not the best learning environment for hardcore things. All of the basic things are made doable in a web browser, though, which is fantastic for those that wanna focus on the...

Well, all of this has been really fascinating. I've learned a lot from this particular episode. So if I or other people would like to continue to follow you, where would they follow the database, or maybe you personally, online?

Well, they can follow the database on our Twitter feed at VEuPathDB, V-E-U-P-A-T-H-D-B, or me at @jcklab.

Perfect. Well, thank you very much for joining me today. Really great episode, and best wishes going forward. And if you get anything, well, I shouldn't say if, I always do that. When something exciting hits, if you would like to talk more about it, please let me know. It'd be nice to talk to you again.

It'd be my pleasure, Kevin. Thank you for a lovely conversation.

And thank you as always for listening to this episode of Colabra's Talking Biotech podcast. Tell a friend, write reviews, do what you can to help share the word, because the exciting research is being done, and it's our job to communicate it, to make others excited too. So share this story about toxoplasmosis across the Thanksgiving dinner table. This is the Talking Biotech podcast, and thank you very much for listening. We'll talk to you again next week.