Conversations with scientists

This podcast is with Dr Sethuraman Panchanathan who directs the US National Science Foundation. He talks about his nickname, about AI and data science, about training AI models, about transparency, about the language of collaboration, competitiveness, about talent. He says: "I think what we need as a nation is not only to unleash every ounce of talent in our country, the domestic talent, at full force and full scale. And we should welcome and aggregate and retain every ounce of global talent at full force and full scale."  (Art: J. Jackson, Music: Golden Era by Steven Bedall and licensed from artist.io.)




What is Conversations with scientists?

Scientists talk about what they do and why they do what they do. Their motivations, their trajectory, their setbacks, their achievements. They offer their personal take on science, mentoring and the many aspects that have shaped their work and their lives. Hosted by journalist Vivien Marx. Her work has appeared in Nature journals, Science, The Economist, The NY Times, The Wall Street Journal Europe and New Scientist among others. (Art: Justin Jackson)

Podcast transcript: A conversation with NSF director Dr Sethuraman Panchanathan

Note: These podcasts are produced to be heard. If you can, please tune in. Transcripts are generated using speech recognition software and there’s a human editor. But a transcript may contain errors. Please check the corresponding audio before quoting.

Dr Sethuraman Panchanathan
The day I came to NSF, I've been saying this, because I firmly believe in this. I think what we need as a nation is not only to unleash every ounce of talent in our country, the domestic talent, at full force and full scale.
And we should welcome and aggregate and retain every ounce of global talent at full force and full scale.
Vivien
We should welcome and retain every ounce of global talent. Strong words from Dr. Sethuraman Panchanathan who is the director of the US National Science Foundation. In case you are job-hunting or thinking about studying in the US or planning a career that involves AI in the US, these words might speak to you. You will hear more from Dr. Panchanathan in a moment.
Hi and welcome to Conversations with scientists, I’m science journalist Vivien Marx. I write stories and do multimedia, for example for the Nature Portfolio. My stories always have space constraints, of course, that’s normal, and I just think it’s fun to give you all a listen to some things people said as I reported a story. Today’s podcast is with Dr Panchanathan and it’s about AI, so I just want to give a bit of background.
These days, alas, some funding news in science is not great. That’s true in the US and also in countries like Argentina and Brazil.
In the United States, for example, the US BRAIN Initiative, that’s the acronym, the long form is Brain Research Through Advancing Innovative Neurotechnologies, will have fewer new grants and some current grants will be trimmed.
The National Institutes of Health’s All of Us program, which is collecting health and genomic data from what will ultimately be one million people, is seeing big cuts to its funding, and staff numbers will be cut for this program, too.
But last week some good funding news reached 35 labs and groups and the announcement was made at The White House. A link to the list of funded projects is in the show notes.
They are the first to be funded as part of the pilot project called the National Artificial Intelligence Research Resource, the NAIRR Pilot. It is devoted to offering better and broader access to AI resources, which means computing resources, remote access to supercomputing, but also datasets such as those you need to train AI models.
The awarded projects sit in many areas. They are about fine-tuning models that look at the interactions between antibodies and proteins; there’s a project focused on improved weather prediction; there are projects about trustworthy, ethical AI. And others.
The announcement about these projects was at the White House because the White House Office of Science and Technology Policy and the National Science Foundation set up a Taskforce to build NAIRR, which is a national cyberinfrastructure.
Besides spurring innovation and increasing the diversity of talent, the plan with NAIRR is to improve the capacity of AI and advance trustworthy AI. NAIRR is led by NSF and there are 12 other federal agencies and 26 non-governmental partners, including the companies Amazon Web Services, Databricks, Google, Hugging Face, IBM, Intel and Microsoft.
https://nsf-gov-resources.nsf.gov/2023-10/NAIRR-TF-Final-Report-2023.pdf?VersionId=2RqgASgtGLzEI6QKsMIL.MWITnjgrmh_
The NAIRR pilot came about because of the Biden Administration’s executive order, signed in October 2023, on the Safe, Secure and Trustworthy Development and Use of Artificial Intelligence.
https://www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/
Part of the order is a requirement to invest in AI education and in programs to attract AI talent from around the world so that they come, and, this is a quote, “not just to study, but to stay,” because the idea is that the AI companies and AI technologies of the future are to be made in America.
For all those labs and scientists struggling with complex visa situations, this is perhaps good to know. And I know this podcast has listeners all over the world, thank you for that. And for people thinking about studying and working in the US this is perhaps also good to keep in mind.
90 days after the Biden Administration executive order was issued, the NAIRR pilot, the National Artificial Intelligence Research Resource pilot, was launched. Projects were solicited and the first have now been awarded.
https://nairrpilot.org/awarded-projects
When the announcement was made last week I got the chance to chat with the director of the National Science Foundation, Dr Sethuraman Panchanathan. He was a bit hoarse when he spoke with me; he was going from meeting to meeting and there was a commencement address mixed into his schedule, too. I thought I would ask him about NAIRR and more broadly about AI.

At his White House presentation, Dr. Panchanathan mentioned he asked an AI to shorten his name and the result was Panch. I wanted to fact-check that because I had heard in previous presentations that people liked to use the nickname Panch, so maybe it wasn’t that new. Here’s Dr Panchanathan.

Sethuraman Panchanathan [5:15]:
I was just joking, in a light-hearted way, that, you know, when you have a very complex thing these days, it is always good to know: how do I deal with this complexity? So I was sort of making that kind of a point that we all seem to go to AI to help us relieve complexity. And so I was just joking around.

Vivien
The NSF director had just come back to Washington after giving the commencement speech at Northeastern University. I asked what he told students. There were likely students graduating with a PhD among them. They, and postdocs, but certainly students, might be wondering about the resources they need to use AI. Yes, of course funding is part of this. But they might not be part of the labs that won these NAIRR awards that were just announced.

Sethuraman Panchanathan
I didn't address this specifically at Northeastern. But the question itself is very, very important. And people do ask me about this. You know, I've always looked at this as literacy, right?
Because, you know, nobody ever asks, do we need English language literacy, right? So to me, I will tell you, this is not just now: when I founded the School of Computing and Informatics in 2003, I talked about the fact that informatics literacy was important for every student of the university.
And now that has matured to data science literacy, and now, AI literacy. I still believe that computing literacy, AI literacy, informatics literacy, data science literacy are very, very important for any student of any discipline.
Why do I say that? Because information is in every discipline. And therefore, these tools and understanding are exceedingly important for people to be successful in doing whatever they are doing. Whether it is discovery work, whether it is, you know, work with enriching, enhancing, you know, activities, right, I think it’s an exceedingly important thing to have as a toolkit in your back pocket, to be able to express, in the simplest form, whatever it is that you’re doing.

Vivien
Beyond the awarded projects, through NAIRR, US-based scientists can apply for access to computing resources for their projects. NAIRR gives people access to resources with ‘compute resource credits’, so I wondered how far those credits reach.

Sethuraman Panchanathan [7:40]
That's a very, very good point. See, it is not just getting the credits, whether it is compute or model credits, or other kinds of credits. It's what you do with it, as you rightly point out. And that's where I find partnerships are extremely vital. See, what happens is, in these kinds of things, you want to not just have a computational scientist who can work on AI, fundamental core activities. Absolutely. But when it comes to application scientists working with computational scientists or AI experts, these kinds of questions can be answered in a much better way, when you're working across disciplines, mutually inspiring each other.

Vivien:
Working across disciplines is not always straightforward because language, the jargon, can get in the way. But, says Dr. Panchanathan, the language you need also emerges as you start to collaborate.

Sethuraman Panchanathan [8:40]
The common language development, actually, interestingly, Vivien, evolves when you start talking to people, even though you may not have the common language.
Oftentimes, people have this notion: if I don't have the language, I cannot communicate. The language emerges because you start to communicate, right? And you start to evolve: Oh, you meant that, I thought you meant this. Okay, now we know that this is what we're meaning. And we will then either adopt one of them or find a new term that describes that. That's how we evolve, you know.
And so I find that happening. And I'll give you an example. Again, unfortunately, I'm gonna go back to the 2004-2005 time.
I started hiring people, a lot of people between disciplines; I hired a person between anthropology and computer science, right. And at that time, it was thought that I had gone crazy or something. But I always believed, Vivien, that if you want to build computational systems for working with humans, for humans with humans, helping with human tasks, then you need to understand the fundamental core of what humans are all about, and what the human spirit is all about. Therefore, working with colleagues, social science colleagues, is exceedingly important, right.

So then comes this language issue that you alluded to; then people start to communicate. In fact, there are stories, I won’t take time today, of how things evolved. There was so much skepticism when I asked people to have a cup of coffee and talk about things.
But after that, they would come back and tell me, ‘Oh, my God, I didn't realize there is a new discipline that we can work on, by bringing our ideas together, right.’ And this anthropologist and a computer scientist came up with the idea of social networking as something that they could work together on, right?
Now these terms are more commonplace, but that's what it is about, you know, that's what it's about. And this is the age, this is the age of cross-disciplinary, trans-disciplinary, interdisciplinary kinds of inspirations that help not only enrich what you do, but, you know. In fact, I would argue that it also enables new areas being birthed, which you would have never thought about before, without the fusion of those kinds of ideas. AI can make that happen much faster.

Vivien
Collaboration is not always easy. What also gets challenging, I hear, is preparing data to be fed into AI models. Cleaning data takes time, lots of time, and that can get tedious. I wondered what kind of consolation he offers people getting bleary-eyed from data prep for AI.

Sethuraman Panchanathan [11:20]
What I tell them: yes, it depends on where you want to spend your energy and resources and what you get for them, right? If you're willing to invest the time and energy in cleaning the data, and offering it to a thing that could now make possible amazing connections, new revelations that you might not have gleaned, then is it worth it? That's the first point that I would say.
The second, to use your word, console, the consolation here is, there will be a time, through this process, when there will be people who will develop AI tools that can actually help with cleaning data, organizing data; AI will do that, too, right. So we will get there, we will get there; there is always a process to go from here to there.
And you always weigh the benefits that you accrue because of the tireless work that you put in. The same postdoctoral fellow or PhD student, you ask them: of course, you work really hard, don't you, but the end goal is a paper in this phenomenal publication called Nature. Now, you would die for it, wouldn't you? I would. As a faculty member, for a paper in the journal Nature, I will do anything, sit and slog 18 hours a day, in order to do whatever it takes to expand the frontiers of my thinking, analysis and discovery potential, so that I might get the paper out to show to the world that I have this fantastic idea that is worthy of, you know, pursuing further. So that's the point at the end of the day: how much are you investing in terms of your energies and time and resources? And what are you benefiting from that?

Vivien
One of the many issues with AI is that a training set for an AI model might seem perfect, but the AI can’t generalize from the training set and analyze new data well.

Sethuraman Panchanathan [13:15]
Generalization is generally about availability of lots of data, right? To go from where you are to generalization, that requires a lot of data. I understand the anxiety. You know, let me tell you, it's very interesting, because of how I go back in history: my own PhD work was in vector quantization 30 years ago. And that was a machine learning problem for images, to be able to learn from images, to be able to code images for compression purposes, right.
So, there are two kinds of things, right. So let me give this as an example. Now, you will take all kinds of images, and you will generate a code book, as we called it, a codebook, from all kinds of images. The problem was that when you tried to code a specific image, that codebook was not as good as it could have been if I only took the images from that kind of images.
What do I mean by that? Let us say I had radiology images and face images. If I created a codebook from all images, like radiology images, face images, nature images, bird images, you know, whatever it is, when I tried to code a face image, it only had so much accuracy of representation.
But if we took all face images, and then I created a codebook, that would be much more accurate, right. But that requires me to get a variety of face images to be able to create the codebook; it also requires then for me to live with the fact that I may not be able to compress as much, because now you have to send the codebook to the person that is receiving it.
So what happens with these kinds of things is more data is required, and the elegance of what you're doing may not be realized to the fullest when you try to do it all in the beginning, you know.
These are very good questions, by the way, because, you know, early in the development of anything, there is always this anxiety, the aspiration, rightly so, to want things to be perfect.
And that is the quest of research: as much perfection as we can get. But it takes time, and patience and perseverance. So that's what this is about; there are no silver-bullet solutions yet. Maybe it is just that everybody is trying to look at this as, you know, as they say, not every hammer is looking for a nail. So we are in that stage right now, so trying to treat everything the same way, it can be frustrating. And I understand the hype that sometimes gets built into these things, right? Oh, you have used AI, people say. So if you don't know something, that's it, how do you use AI? And yes, you can use AI, but it depends on what you use it for, how you use it, where it is in the evolution to accrue the full benefits of it. This is still a nascent, evolving situation.
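
Note: to make the codebook trade-off Dr. Panchanathan describes a bit more concrete, here is a minimal sketch. It is not his original vector quantization work; it assumes scikit-learn's KMeans as a stand-in quantizer and synthetic arrays in place of real face or radiology images, so every name and number in it is illustrative.

    # Minimal sketch of the codebook trade-off described above, with k-means
    # as the vector quantizer. Synthetic data stands in for real image patches.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    # Pretend these are flattened patches from two very different image types.
    face_patches = rng.normal(loc=0.0, scale=1.0, size=(500, 16))
    radiology_patches = rng.normal(loc=5.0, scale=1.0, size=(500, 16))
    mixed_patches = np.vstack([face_patches, radiology_patches])

    def codebook_error(train, test, codebook_size=8):
        # Build a codebook (the cluster centers) from `train`, then measure
        # how well it represents `test` as mean squared reconstruction error.
        km = KMeans(n_clusters=codebook_size, n_init=10, random_state=0).fit(train)
        reconstructed = km.cluster_centers_[km.predict(test)]
        return float(np.mean((test - reconstructed) ** 2))

    # A codebook built only from face patches represents face patches better
    # than one built from the mixed collection, echoing the point above.
    print("mixed codebook, error on faces:", codebook_error(mixed_patches, face_patches))
    print("face-only codebook, error on faces:", codebook_error(face_patches, face_patches))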

Vivien
Another issue with AI can crop up when one does not keep test and training data separate, and that then makes it hard to verify whether your AI is giving you reliable conclusions.

Sethuraman Panchanathan [16:20]
To verify the veracity of the performance of something, you want to make sure the test data is not necessarily present in the training data.
But let's talk about this a little bit more. If you're really interested in fine-tuning a system to perform, then your dataset should be reflective. The training set should be reflective of what you're trying to perform.
In other words, I talked about the radiology example: if you want the radiology images to perform really well, and if your training data had very few radiology images, of course you will not have that. But that doesn't mean that the same test images have to be in the radiology datasets. It's good to exclude them and then train it with those radiology images still, in order to be able to get the optimal result.
But again, I'll tell you, it depends on the dataset size. If your training dataset is large, it doesn't matter whether the test sets were inside or not inside. It is only when the training sets are small, and I understand these concerns, when people are claiming that my performance is great, then you need to have, as I said, the veracity of your test; then, you know, people question it. So we want to be careful about how we go about it; it depends on the context.
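
Note: as a minimal sketch of the point above, here is one common way to hold test data out of training so reported performance can be checked. The dataset (scikit-learn's built-in handwritten digits) and the logistic regression model are illustrative assumptions, not anything discussed in the interview.

    # Hold out a test set before any training so that reported accuracy
    # reflects performance on data the model has never seen.
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_digits(return_X_y=True)

    # Split once, up front; the held-out images never influence training.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0, stratify=y
    )

    model = LogisticRegression(max_iter=2000)
    model.fit(X_train, y_train)

    print("accuracy on held-out test images:", model.score(X_test, y_test))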

Vivien
With AI comes a certain amount of opacity. Looking at someone else’s results and considering their models and how well their predictive power is working, it’s not an entirely transparent process. Some groups researching AI are looking at this. Here’s Dr. Panchanathan.

Sethuraman Panchanathan [18:05]
Even beyond this, you know, being able to relate to other people's datasets or models and things of that nature. We've always said that in AI, we want it to be verifiable, explainable AI. I mean, even if it is a black box, people want to understand, how does this black box work at a macro level?
But now let's come to the question that you asked. Clearly, you know, you want people to be able to, you know, have their datasets and models be available for others, with a certain amount of transparency, so that people who are using it know what they are using and what benefits they can accrue by using that. So that level of transparency is very, very important. And I can tell you that that's what you get with the NAIRR project, right.
Because when you are not wedded, in a sense, to a certain outcome, a commercial outcome or other kinds of things, then, the academic sector is always about openness, transparency, reciprocity, surely, because, you know, I will give you mine, you give me yours, all of that. But it is about openness and transparency, and science prospers in an open world; with transparency, science thrives and prospers. And that's very important. And that's why, when you have these kinds of resources available, you're not binding people by any means.
And you're giving them an opportunity to be a lot more transparent with the data, a lot more ready to engage, explain, you know, be willing to offer the time. That's what scientists do. They don't just say, you know, here's my thing, just deal with it.
They're willing to sit and talk about it, they're willing to help people. That's what the scientific community is all about. So I'm hoping that NAIRR as a resource is going to make people think much more along the lines of, you know, trustworthy, responsible, ethical, equitable, you know. Of course, we want to think about privacy-preserving, security, and all that; all of that is important, too. But you want the fundamental tenets of the scientific process, that integrity, to be maintained through the service, and I have the confidence that it is going to make that, you know, much more possible and explicit.

Vivien
AI in and of itself isn’t automatically an equitable technology. If you want to use it, you need computing power to do so, and unless you know someone with well-endowed computing resources, maybe even access to a supercomputing center, some ideas are not going to come to fruition. So how can one make AI equitable?

Sethuraman Panchanathan [20:50]
The equitable part, Vivien, to be very specific: I think I mentioned this in the remarks at the White House. Since coming in, and before that, of course, I came to NSF with this. And if you look at all my actions at NSF, all my programs, it is this fundamental tenet of belief: talent and ideas are democratized.
They're everywhere, all across our nation, across the socio-economic demographic, across the diversity of the nation. It is our responsibility as a nation to ensure that every ounce of talent is inspired, motivated, realized, so that the nation benefits from that. And through that, of course, we are offering opportunities for everyone everywhere, but you're also making possible innovation everywhere and anywhere. This has been the singular focus of everything I've worked on at NSF.
And AI happens to be an exemplar application that demonstrates this. That's why you have the 25 AI Institutes launched in the last three years, spanning all across the nation, across applications, of course, with the fundamental core of AI at the same time being enriched and enhanced and empowered.
The Expand AI program is about helping minority-serving institutions be given the ability to build the talent that is there in those institutions. So I want institutions, minority-serving institutions, community colleges, technical institutes, research institutions, all of them, because the nation needs every ounce of talent and ideas.
Because we can then not only out-compete from a global competitiveness perspective, but we will guarantee economic prosperity, national security and health and well-being for all our citizens, and prosperity for all citizens. I'm a fundamental believer in that.
And AI therefore, which we talked about, has got to be, as we talked about, something that people have in their back pocket to be able to express their ideas and talent; it should be available for everyone.
And so, back to the question that you asked: it is not limited to a chosen few who have connections or resources, access to resources; anybody, anybody who has good ideas, could then get access to these resources. I think the President made it very clear in his executive order, and the bipartisan support of Congress is very, very exciting. And that is a view, you know, across all of Congress, to be able to make this a thing that, you know, everybody benefits from.

Vivien
NAIRR is called a two-year pilot project, which makes it sound like it’s not here for long.

Sethuraman Panchanathan
No, I don't look at this two-year thing as an end; the two-year thing is a first step. Let's be very clear, because we never want anything to be started and ended.
Because this requirement, you know this, Vivien, it's only going to increase, it's only going to expand. Now, the two-year mark is more to challenge ourselves to build something that can show the promise, the potential, the possibility, and lay the pathway, all the pieces: promise, potential, possibility, pathway, so that the progress can get established.
So we are going to look at this, as you know, at NSF; we are going to be continuing to work on this to make sure everybody has access to this on an ongoing basis, to be able to do this, because the democratization need will never go away.

Vivien
Statistics were released earlier this year in a report that included the Science and Engineering Indicators. A link to the report is in the show transcript https://ncses.nsf.gov/pubs/nsb20243 The United States performs more research and development than any other country, but as the report points out, the country’s competitiveness relies on foreign-born scientists and engineers.
In many US labs there are visa challenges and there is some general belt-tightening going on with budgets. Not everyone is feeling empowered right now.

Sethuraman Panchanathan [24:35]
We talked about competitiveness; our competitors are starting to hyperinvest. So this is a moment for us to continue to invest, and scale our investments, so that we can not only out-compete and out-innovate any nation on Earth, but, more importantly, to make sure that we are providing the opportunities for everyone, right here in our nation.
The day I came to NSF, I've been saying this, because I firmly believe in this. I think what we need as a nation is not only to unleash every ounce of talent in our country, the domestic talent, at full force and full scale.
And we should welcome and aggregate and retain every ounce of global talent at full force and full scale. This is an additive mission. I have said that for far too long we have brought the global talent and substituted for domestic talent. And now we need to make sure that the global talent and the domestic talent are additive, and global talent is retained, domestic talent is unleashed both at full force and full scale.
That's what our nation needs to out-innovate, to remain in the vanguard of competitiveness, yes, but also to make sure that we are realizing the full potential of this innovative capacity of our nation and beyond. So that's the first comment that I will make.
Okay. And so it's very clear from the Indicators that the trajectory of investments, the federal investments, needs to be, you know, a lot more into the future. So that's very clear.
And the second thing that I also see is that talent and ideas have to be nurtured at all levels: at the K-12 level, at the community college level, at the four-year undergraduate college level, and at the research university level. Talent and ideas: multiple pathways have to be provided for talent, so that it can express itself.
And on top of that, you want to be able to build the upskilling, the reskilling. You talked about AI; people are going to enter, exit, enter, exit. That's why I use the word pathway, not pipeline, anymore.
People can come in and go out, come in and go out; that is the future. Yesterday, in my Northeastern University address, I pointed to lifelong learning: the only constant is change. And lifelong learning is an important imperative that people walk away with. I said the following thing, and I just made it up, I guess I always do that, on the spot: the word commencement could have been called conclusion. It is not called conclusion. It's called commencement, which means this is the start of your journey of learning. It is not the end of your journey of learning.
So I think what I would like to say is that the Indicators really point to, again, the same words I will use: the potential, the promise, the possibilities. Now we develop the pathways so that we can make progress, all the five P's.

Vivien
And that was Conversations with scientists. Today’s guest was Dr. Sethuraman Panchanathan, director of the National Science Foundation. And a shout-out to Michael England at NSF and his colleagues who helped set this conversation up. The music in this podcast is Golden Era by Steven Bedall, licensed from artist.io. And I just wanted to add, because there’s confusion about these things sometimes: NSF didn’t pay for this podcast and nobody paid to be in this podcast; this is independent journalism that I produce in my living room. I’m Vivien Marx, thanks for listening.