Policy Prompt

Scientists and AI: Partners in Discovery (understanding AI’s role in scientific research with Rebecca Willett)

Episode Summary

AI is transforming how scientists discover, predict and solve problems in our world.

Episode Notes

Artificial intelligence (AI) has had a profound impact on science, from data analysis to scenario simulation and predicting protein structure — its full potential is still unknown. Today, many scientists are dedicated to better understanding AI and how to integrate it into research to accelerate the pace of scientific discoveries without compromising rigour and principle. Is there a future where AI will make new scientific discoveries on its own?   

Join hosts Vass Bednar and Paul Samson as they speak with Rebecca Willett about the role machine learning and AI play in scientific research now and how she sees it impacting scientists in the future. Rebecca is a professor of statistics and computer science at the University of Chicago and faculty director of artificial intelligence at the University’s Data Science Institute. Her research focuses on machine learning and making sense of complex, large-scale datasets, as well as data science. Rebecca completed her PhD in electrical and computer engineering at Rice University and is a member of the Computer Science Study Group at the Defense Advanced Research Projects Agency (DARPA). 

In-Show Clips:

00:15:03: NOVA scienceNOW, "What Will the Future Be Like?": FoldIt: A Protein Puzzle Game (PBS LearningMedia, 2013)

00:15:17: Nature Video: Foldit: Biology for gamers (YouTube, August 4, 2010)

Mentioned:

Further Reading:

Credits:

Policy Prompt is produced by Vass Bednar and Paul Samson. Our supervising producer is Tim Lewis, with technical production by Henry Daemen and Luke McKee. Show notes are prepared by Rebecca MacIntyre, Libza Manna and Isabel Neufeld, who also handles social media engagement. Brand design and episode artwork are by Abhilasha Dewan and Sami Chouhdary, with creative direction from Som Tsoi.

Original music by Joshua Snethlage. 

Sound mix and mastering by François Goudreault. 

Be sure to follow us on social media. 

Listen to new episodes of Policy Prompt on all major podcast platforms. Questions, comments or suggestions? Reach out to CIGI’s Policy Prompt team at info@policyprompt.io.

Episode Transcription

Rebecca Willett (guest):

Right now, AI tools cannot do a lot without a lot of human intervention and guidance and expert knowledge. But the field is changing fast and it's hard to say what things will look like even just a few years from now. And so we're collecting data to overcome this scarcity problem, but we're trying to do it in a principled way, because it's not like the data is just lying around and we have to soak it up using some web crawler. We have to make very deliberate decisions about what data to collect with limited resources, in a way that you don't necessarily see in, say, commercial AI contexts.

Paul Samson (host):        

Vass, how's it going?

Vass Bednar (host):         

It's good. How are you?

Paul Samson (host):        

Good, good, good. I'm going to start off with a question for you right away.

Vass Bednar (host):         

Okay.

Paul Samson (host):        

AI can now beat us at math, spot patterns-

Vass Bednar (host):         

Speak for yourself.

Paul Samson (host):        

... we'd miss. Well, you must be pretty darn good. Spot patterns that we would miss, and even write interesting passable poetry in some ways. But can it actually experiment like a scientist? Can it invent something genuinely new?

Vass Bednar (host):         

That is a huge question, and I've noticed some people talking about vibe coding with AI and engaging with questions of physics or mathematics that the technology is helping them maybe get closer to breakthroughs. So your question is really about the evolving role of these algorithmic systems. Can AI applications be truly creative and generate new ideas that weren't pre-programmed in some way? And of course, with that creativity could come good ideas but also bad or unhelpful or hurtful ones. So my sense is that the jury is still out on whether AI will be truly creative or not, but it is kind of exciting and interesting to think about how maybe a conversation with a large language model can test or challenge people's thinking in the advanced sciences.

Paul Samson (host):        

Yeah, it can't be ignored. It is there in our face now and everyone's asking these questions. So today we're going to tackle those issues in discussing the topic of what's known as AI for science: what and who's involved in that, and where it might be going. Outside of the science community, AI for science is not really getting the attention I think it deserves, because it does get at those really deep issues that you're talking about, Vass, including ethical ones. But ultimately, how much science will be done by AI in the future? There are lots of guardrails around the science we do now. Will that hold with AI science in the future?

Vass Bednar (host):        

I don't know. I love that you ask me as if I'll know anything.

Paul Samson (host):        

Please answer. Please tell us.

Vass Bednar (host):         

Why don't we ask our incredible guest today? Joining us is Dr. Rebecca Willett. She's a professor of statistics and computer science at the University of Chicago, and also the faculty director of AI at the Data Science Institute. She's been widely recognized for her many contributions, including by the National Science Foundation. She's also a member of the Computer Science Study Group at DARPA. That's the part of the US Department of Defense that's responsible for the development of emerging technologies. She's incredible.

Paul Samson (host):        

Yeah, and she's trained as an electrical and computer engineer, and her research is focused on machine learning and making sense of those complex and large-scale data sets and data science. She's the perfect guest today.

Vass Bednar (host):         

Hi, Rebecca. Welcome to Policy Prompt.

Rebecca Willett (guest):

Thank you so much, Vass, Paul. It's just a delight to be here with you today.

Paul Samson (host):        

Yeah, it's great to speak with you, and thanks a lot for joining us. And so to start off, could you define a little bit what we're talking about here? And let me put this in two ways. One, when you talk about your own work on AI, what do you include? What's in there? What isn't in there? What's AI for the work you're doing? And maybe describe machine learning, given you're focused on that. And then secondly, could you describe what AI for science means to you in a broad sense?

Rebecca Willett (guest):

Right, yeah. So let me start actually with the second question, and I'll start with a story from when I was in graduate school when I was studying medical imaging. So I was across the street from my department at the hospital to talk with a collaborator, and they had this sign in the elevator that said, "If you sign up for our fMRI study, we'll give you a CD with all of your brain data."

Vass Bednar (host):         

Oh, wow.

Rebecca Willett (guest):

So this was an offer I could not refuse, and I immediately signed up. And it was super cool. I laid in the scanner and they were trying to understand how our brains respond to pleasure by having us drink Kool-Aid, literal Kool-Aid.

Vass Bednar (host):         

Yum.

Rebecca Willett (guest):

And I just learned a lot. But a couple of weeks later, the professor who was leading this study gave me a call and he said, "Becca, when we started analyzing the data, we got to your scan and we found a brain tumor, a meningioma. And you need to go get a clinical scan." And I did. And they were able to remove it, and the biopsy was negative and everything was totally benign. But I think this is a really interesting story because this took place 20 some odd years ago. And if we think about how that story might look today, AI could just transform every aspect of this.

It's affecting the way neuroscientists design their experiments, the way we have scanners acquire data, the way we form images, the way that we do diagnoses, the way we decide whether surgery is the right thing to do, and even what drugs get prescribed after a surgery. And I think that this example revolves around healthcare, but there are many areas of the sciences where AI is having a profound impact. Things like trying to understand the laws of nature or rules of life, making better batteries, building quantum computers, predicting extreme weather, or even just understanding what happened right after the Big Bang. And so I think this is an extremely exciting area.

Paul Samson (host):        

Thanks for personalizing it. And we will jump into some of those example areas that you're talking about, where things might go.

Rebecca Willett (guest):

Right.

Vass Bednar (host):         

Better batteries, yeah, sign me up.

Rebecca Willett (guest):

We need it.

Vass Bednar (host):         

Maybe just to expand from there a little bit, and to be really basic, frankly, I think sometimes when, say, students are thinking about ways to continue their studies and what they're pursuing, we frame science and algorithmic applications as being almost different and distinct pathways, but absolutely they're not. And there's this fantastic integration and these new applications that we're going into. Could you tell us a bit more about the evolution of the space? Where these disciplines that maybe have traditionally been more separate have started to come together in these productive and really fascinating ways?

Rebecca Willett (guest):

Yeah, absolutely. I think going back to Paul's first question, if we try to think about, well, what is AI and what is machine learning? I think at their core they're really about how do we learn from data? How do we recognize patterns that allow us to make accurate predictions for future examples? So for instance, when you're a kid, you're trying to learn the difference between cats and dogs, and people point out things like the size or the ear pointiness. And you build a mental model, so that when you see a new animal, you can predict which it is. We try to train computers the same way. We give them lots of examples of images and label them either cat or dog. And hopefully from those examples, they can figure out a model that will allow them to make an accurate prediction for some new image.
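To make the idea of learning from labeled examples concrete, here is a minimal, purely illustrative sketch in Python. The two numeric features and all of the data are invented for the example and are not drawn from any study discussed here.

```python
# Toy supervised learning: "cat vs. dog" from labeled examples, using two
# made-up numeric features (body size in kg, ear pointiness from 0 to 1).
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[4.0, 0.9], [3.5, 0.8], [5.0, 0.95],    # cats
              [20.0, 0.3], [30.0, 0.2], [8.0, 0.4]])  # dogs
y = np.array([0, 0, 0, 1, 1, 1])                      # 0 = cat, 1 = dog

model = LogisticRegression().fit(X, y)                # build the "mental model"

# A new animal the model has never seen: 6 kg, fairly pointy ears.
print(model.predict([[6.0, 0.85]]))                   # likely [0], i.e., cat
```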

So then as we think about integrating that kind of core idea within the sciences, the first thing I think many of us might think of is, oh, I go off and I collect my scientific data and then I'm going to plug it into a machine learning system so that I can understand something more about that data. And that is a super valuable use case, but I think it's only one component of the potential that AI has in the sciences. And in particular, when we were in grade school, we learned about the scientific method. And I really see AI as having the potential to play an important role in almost every aspect of the scientific method. So not only data analysis, but the way that we even generate hypotheses in the first place, or the way that we run simulations to try to understand different phenomena and how things might progress or test different theories, or even the way that we design future experiments in future data collection.

And so as people try to push the boundaries of what AI is capable of, in the context of scientific machine learning and scientific AI, I think these are the questions that are really at the forefront. How do we give this powerful tool to scientists so that they can do their science in a really rigorous way?

Paul Samson (host):        

Yeah, it's fascinating even now as you apply AI to certain disciplines as a leveraging tool. Let's talk about a couple of examples where it's enabled science to move forward. The best-known one is probably DeepMind's AlphaFold in biology, predicting protein folding structures. But there are others, I think, and emerging ones in materials science and things like that. But are most of those augmentations of existing science? In the sense that PhD dissertations were written on protein folding before, it just took a really long time to do them. So there's not something brand spanking new there, but it's an augmentation. Is that kind of where most of this AI for science is clustered? Or is it starting to get into those new hypotheses? If you could give us a couple of examples, that would be super interesting.

Rebecca Willett (guest):

Yeah, fantastic question. So in the context of things like AlphaFold and understanding protein folding, I think some of the early work would come up with a mathematical description of sort of the energy of different protein configurations. Some are lower energy and some are higher energy, and we think that the low energy configurations are just more natural and maintainable. And so then we would computationally search for the configurations with the lowest energy. But that wasn't really data-driven in the way that AI and machine learning are. We would come up with this mathematical optimization formulation, try to solve it, and then see if it seemed consistent with what we observe in practice.

Whereas in contrast, machine learning methods like AlphaFold are saying we just have tons of examples of proteins and knowledge of how they fold; this is in fact stored in the Protein Data Bank. And so can we use a machine learning tool like a neural network to come up with a prediction of how a new protein is folded, one that is accurate but isn't based purely on this notion of energy, but rather places much higher emphasis on actual examples that we've observed in nature? And so that's not really leading per se to a new hypothesis, but it is a major paradigm shift from the way that a lot of the classical methods examining protein folding operated. And it really highlights the power of machine learning, and what we can do by integrating these rich sources of data.

Well, you asked for another example where maybe we're uncovering something new, and that's of course a harder task. But what I would point you to perhaps is some of the work people are doing on using AI to improve weather forecasting. So in some of the early days, people would train a neural network to take the current state of the weather and predict the weather out to say one or two weeks, and they would train their machine learning models off of all of the old weather data that we had, maybe some simulation data in addition. And what they found in the early days is that they could get very accurate forecasts, but with much lower computational cost than the classical numerical weather forecasting systems could achieve. So that, right away, was exciting.

But more recently, some of the models, the machine learning models that have been explored are actually producing forecasts that are sometimes more accurate than what we would get with the classical numerical weather prediction methods. And I think it's not entirely clear why that is. I think one big question is, what's implicitly represented in these machine learning models that's not in the classic physics-based models that we've used? And if we could uncover that, maybe we'd understand something new about weather systems. And so I think there's potential for these things to lead to new discoveries, but certainly they're achieving accuracies that we hadn't necessarily anticipated a priori.

Vass Bednar (host):         

Is there a participatory element to any of that weather work? I ask because Paul was mentioning AlphaFold and that was throwing me back to Foldit, which I came across years and years ago as a kind of example of participatory gamification for the public good. And this was for science.

Rebecca Willett (guest):

If I understand correctly, you're referring to citizen science where citizens get engaged in the scientific process either by running tools or looking at data on their own computers. Is that right?

Vass Bednar (host):         

Sure, yeah. Foldit was an online puzzle video game.

Clip:                         

Adrian's first science video game began when biochemists had a problem. To fight diseases, they needed help solving protein puzzles. Computers are terrible at visual puzzles, but humans are great at them.

Clip:                         

We are here to provide a sort of refinement to the folding algorithms that protein servers use, to try and do this automatically. As we get better and feed back to the Baker lab, the automated algorithm for the computers will get better and ultimately we'll be left with nothing to do.

Vass Bednar (host):         

And people would sort of play around and kind of fold the structures of selected proteins best they could. And then some of the highest, I guess, scoring elements were then analyzed by scientists in terms of is there a native state that could be applied to those proteins? So it was like a blend. You're right, citizen science, citizens, an open invitation for people to sort of participate or play around with something that kind of had a higher scientific application.

Rebecca Willett (guest):

Well, I will tell you that there are examples of just trying to get the public engaged in science in a variety of ways. I'm not deeply familiar with the project you just described, but one example is Zooniverse or the Galaxy Zoo where astronomers collect gorgeous images of galaxies throughout the universe and they will have citizen scientists look at those images and help with the process of labeling them like, well, what's the geometry or morphology of the galaxy?

Vass Bednar (host):         

That's cool.

Rebecca Willett (guest):

And that information can be really useful when we look to training machine learning algorithms, for instance. And so I think that's really exciting. I think in addition, I've seen cases, for instance, when people were trying to develop COVID therapeutics where they were allowing people to submit candidate molecules that could be developed potentially into a drug. I remember one speaker telling us about this, and one candidate that was really effective at killing off COVID was Vodka. It had some side effects though. But yeah, there's a number of different ways in which I think people can get engaged in these processes and make really valuable contributions. And also just get excited about where the science might lead us.

Paul Samson (host):        

Vass, it makes me think that you were playing games like Foldit about protein folding when you were a kid. I was in tree forts and stuff, so I didn't see those games. But it's great. So here's another question that's a bit of a... It's a tough one to answer, and I don't expect you to have any kind of formal view on it or certainly an institutional view.

But one of the things about AI is that some AI-generated discoveries, let's call them, or findings or conclusions, would lack interpretability, or the kind of typical explanation that you might use in a legal process to say, "This is how we proved X and here are all the steps that were taken, how we unpacked it." And so in some ways there's an opportunity in that, that you can just shortcut to good outcomes or good conclusions. But does it require us to have faith in AI sometimes, where we won't be able to prove our math homework? It's just too time-consuming and even impossible to go through that. So how do you think about this issue? And presumably, it's evolving to become an even more prominent issue?

Rebecca Willett (guest):

Of course, scientists really care about the rigor of their science, and don't like to take much scientifically on faith, including AI. And so this is of course a concern. There is research being done on things like interpretable AI, but many of the methods out there have their limitations. And so we can't, at this point, rely exclusively on those tools. So I would say a couple of things. First of all, there's certainly work, perhaps more on the engineering side, where we're less seeking new scientific knowledge and more designing, say, a new material or a new battery, where our goal is to do better than the current state of the art. And there I think there's no question that AI could be a valuable tool, even if we can't interpret why it selected the new design that it selected. It can still lead us to more efficient systems or better properties of our materials or what have you. That's a context in which interpretability is a relatively minor issue.

I would also say that there are people working on things like equation discovery, or what people sometimes call symbolic regression, where what you'd like to do is take data observations of some physical system and use the kinds of tools that we use in machine learning and AI in order to find an equation that can describe that data. Now, that's a hypothesized model of the system that's being observed, but I think that can then be useful for helping to design future experiments to find evidence that supports, or does not support, that hypothesis. And so I don't think we're at the point where AI can do all of that automatically. I think that these things are more of an interaction between human scientists and AI tools that can make them more efficient or perhaps generate hypotheses that they might not have identified otherwise. And hopefully, together, that will accelerate the pace of scientific discovery.
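As a purely illustrative sketch of the sparse-regression flavour of equation discovery (a generic toy, not any specific tool named in the episode; the system and the candidate terms are invented for the example):

```python
# Toy equation discovery: fit dx/dt as a sparse combination of candidate terms,
# then read the surviving terms off as a hypothesized equation.
import numpy as np

# Made-up observations of a system that truly follows dx/dt = -2x + 0.5x^3.
x = np.linspace(-2, 2, 200)
dxdt = -2 * x + 0.5 * x**3

# Library of candidate terms: [1, x, x^2, x^3].
library = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Ordinary least squares, then threshold tiny coefficients to zero ("sparsify").
coeffs, *_ = np.linalg.lstsq(library, dxdt, rcond=None)
coeffs[np.abs(coeffs) < 0.1] = 0.0

terms = ["1", "x", "x^2", "x^3"]
recovered = " + ".join(f"{c:.2f}*{t}" for c, t in zip(coeffs, terms) if c != 0.0)
print("dx/dt ~", recovered)   # expect something like -2.00*x + 0.50*x^3
```

The recovered expression is only a hypothesis about the system; as noted above, its real value is in suggesting which follow-up experiments would support or refute it.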

Vass Bednar (host):         

You mentioned having your brain data or information exported to a CD-ROM. I feel like I'm still thinking about what did that mean? Is it just imagery? What else did they have, and do have? It strikes me that so much of the AI for science conversation depends on the availability of large and high quality data sets, volunteered or not maybe. Could you kind of take us backstage a little bit and help us understand how scientists are overcoming this challenge of data scarcity or noisy data from so many disciplines? And how data is obtained or volunteered or shared or produced in the first place?

Rebecca Willett (guest):

Yeah, excellent question. That's really at the core of a lot of scientific machine learning questions. We were talking about AlphaFold earlier, and I mentioned that it was trained on the Protein Data Bank. And it took decades for that to be built. People started building this using federal research dollars long before we were imagining using AI to predict folding.

Vass Bednar (host):         

Oh, wow.

Rebecca Willett (guest):

And that was transformative, right? I mean, the work done by DeepMind to build AlphaFold is undoubtedly phenomenal, but it would not have been possible without that excellent high quality data. And so I think in many settings though, we don't have something that's quite so clean where we've got so much data, and where the data is directly related to the thing that we care about. In that case, protein folding. So for instance, if we're thinking about climate systems, the physical models that we have are maybe on elements of the climate or the weather state that are a little bit different from what we can actually measure with our sensors. And so there's some indirectness there that can be difficult to overcome.

And I think in other settings, we just don't have nearly as much data as we would like. We have this data scarcity issue that you mentioned. And then the question is, how do we even decide what data is most important to collect, for instance if we've got limited resources? So one of my colleagues here at the University of Chicago, for instance, studies microbial communities. So I mix a bunch of species of microbes together and maybe ultimately I'd like them to perform some task, for instance, break down plastics or improve somebody's gut microbiome. And so you could imagine, well, if I grew a whole bunch of these communities and measured their efficacy, then I could build a machine learning model. But there are just trillions of combinations; we can't possibly try them all out. So which are the ones that we actually build in the lab and grow and measure? It's very hard to answer that right up front.

And one of the approaches that people are developing is these sorts of iterative or sequential methods where I might grow a few different communities, see which ones work better or worse, and then use a preliminary machine learning model to figure out what's the next set of experiments I should run? What's the next set of microbial communities I should grow to grow my data set? So I'm kind of using machine learning in order to build that data set or to guide the way that I collect data. And so we're collecting data to overcome the scarcity problem, but we're trying to do it in a principled way, because it's not like the data is just lying around and we have to soak it up using some web crawler. We have to make very deliberate decisions about what data to collect with limited resources, in a way that you don't necessarily see in, say, commercial AI contexts.
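Here is a minimal, hypothetical sketch of that iterative loop, with run_experiment() standing in for the lab work; the candidate designs, the surrogate model choice and the budget are all invented for illustration.

```python
# Sequential experimental design: fit a surrogate model to the experiments run so
# far, then choose the next experiment where the model is most uncertain.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

rng = np.random.default_rng(0)

def run_experiment(design):
    # Stand-in for growing a microbial community and measuring its efficacy.
    return float(np.sin(design.sum()) + 0.05 * rng.normal())

candidates = rng.uniform(0, 3, size=(200, 4))     # 200 possible designs, 4 knobs each
tried = list(rng.choice(len(candidates), size=5, replace=False))
results = [run_experiment(candidates[i]) for i in tried]

for _ in range(10):                               # budget: 10 more experiments
    model = GaussianProcessRegressor().fit(candidates[tried], results)
    _, std = model.predict(candidates, return_std=True)
    std[tried] = -np.inf                          # don't repeat experiments
    nxt = int(np.argmax(std))                     # most uncertain candidate
    tried.append(nxt)
    results.append(run_experiment(candidates[nxt]))
```

The model is not producing the answer here; it is only prioritizing which data to collect next under a limited budget.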

Paul Samson (host):        

I do want to follow up here. Let's talk a little bit more about synthetic data. First of all, what it is: that it's produced data. But there's good and there's bad and there are different uses. Could you just describe a little bit how you would frame the issue of synthetic data, and its pros and cons, let's say? First of all, what it is.

Rebecca Willett (guest):

So one example of synthetic data would be data that comes from a simulation. So for instance, when people are making climate forecasts, they're running models that are grounded in first-principles physics, and they run those on giant and very expensive supercomputers. And so this is helpful because we maybe know a lot about the physics of weather and climate. And so by running these simulations, we can make accurate predictions.

And the climate in a hundred years might be very different than the weather that we're experiencing today. And so if we were to use weather data from today, and even for say the past 50 years, and try to learn a model using machine learning from that data, it might not be predictive at all of what we're going to see in 50 or a hundred years, because we know that there are trends and things are changing and evolving over time. And so using the simulations in addition to the weather data that we've already observed can help us incorporate that knowledge of physics encapsulated in simulations into our machine learning models. And I think that's going to just be far more accurate than just relying on observed data that might not be representative of the future we want to predict.

Paul Samson (host):        

So that's in a way hard synthetic data that's quite framed, or even kind of curated, in a certain way to give you something that is clearly valuable. There's a lot of other synthetic data that comes from more dubious sources, or even intentional misinformation that could be trying to pollute data sets. Are there significant risks from random data that gets captured, maybe not so much in scientific experiments, but in broader data capture? Do you see some concerns around uncontrolled synthetic data?

Rebecca Willett (guest):

That's a great question. I think in the context of some of the scientific applications that we've been discussing, I haven't seen that be a huge concern yet. For instance, when NASA has some space telescope downloading data, that data stream is very carefully controlled and very secure. But one thing that people have been exploring is using large language models, things like ChatGPT, to help scientists understand the literature, right? Maybe I have a question about something I've observed in an experiment and I go to one of these large language models and I say, "Well, what part of the literature would help me understand what I've observed in my experiment?" Or, "How can I think about this?"

And I think that's where the data that's been used to train those large language models, I wouldn't necessarily call it synthetic data, but it might not... Some pieces of that data used to train that are going to be more useful than others. And it's not entirely clear how to control those models or to guide those models to ensure that they're producing accurate and useful results for scientists. It is an ongoing effort within the scientific community, but I think many challenges still remain.

I think one of the key problems is extrapolation. I know everything that I've already observed and now I'm just trying to say, "What if I did something totally different, what would happen?" And that's an area where traditionally machine learning kind of struggles because we're trying to learn from examples. And if the examples look nothing like what we're trying to predict, then it's pretty hard. I'll just give you an example. You mentioned at the jump about AI being used to solve math problems. And so one class of math problems is doing say symbolic integration. So when you take calculus, maybe you've got a homework problem where they say, "What's the integral of X squared?" And then you have to come up with an expression describing that integral. So that's symbolic integration. So I'm not outputting a number, I'm outputting an expression.

So there were some early benchmark tests where large language models were doing really well on this task. But then there was a team from the Technion in Israel that said, "What if we just altered that test a little bit and we add in a bunch of constants?" So for instance, what's the integral of A times X to the 2B? So when we learn calculus in high school or college, we learn how to handle those extra constants A and B and how that would affect the solution. But these large language models that were doing so well on the vanilla problems were really struggling as soon as these constants were added in.
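For readers who want to see the task itself, here is the same pair of problems in SymPy. This only illustrates symbolic integration; it is not the benchmark or the models being discussed.

```python
# Symbolic integration: the "vanilla" problem versus the same problem with constants.
import sympy as sp

x, a, b = sp.symbols("x a b", positive=True)

print(sp.integrate(x**2, x))           # x**3/3
print(sp.integrate(a * x**(2*b), x))   # roughly a*x**(2*b + 1)/(2*b + 1)
```

A person who knows the power rule handles the constants the same way in both cases; the reported finding was that models doing well on the first kind of problem stumbled on the second.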

Half the challenge is just assessing the accuracy of these AI models and making sure that the tests we run are comprehensive enough to identify what their weaknesses and failure modes might be. Ensuring that they have sufficient accuracy to really help us with science is going to be a major challenge, especially as people try to use these large language models within their scientific workflows.

Vass Bednar (host):         

Picking up on that, I mean the discipline of science has long valued transparency, skepticism, falsifiability. Does integrating these AI methods kind of challenge those norms or harmonize with them in some way?

Rebecca Willett (guest):

I believe scientists are going to remain skeptics, but I think that using these tools is not anathema to that. I think that we can think of this as a new way of searching among multiple complicated hypotheses to identify things that are more plausible or more consistent with data. And using those tools can be impactful.

And I don't think scientists are going to just accept the output of an AI system blindly, but I think if it says, "Oh, here's..." Just hypothetically, right, if an AI system says, "Here's a hypothesis that might be worth exploring," then a scientist would potentially consider that and think about whether that makes sense, why it does or does not make sense, whether it influences what new experiments they want to run. But I think any kind of rigorous, careful scientist would use this as a jumping-off point instead of a final result.

Vass Bednar (host):         

Policy Prompt is produced by the Centre for International Governance Innovation. CIGI is a nonpartisan think tank based in Waterloo, Canada, with an international network of fellows, experts and contributors. CIGI tackles the governance challenges and opportunities of data and digital technologies, including AI, and their impact on the economy, security, democracy and, ultimately, our societies. Learn more at cigionline.org.

Paul Samson (host):        

So thinking forward a little bit, the way things are evolving, we touched on it already, but the interdisciplinarity of AI for science and a lot of projects that are going on now. At the center of that, often, are data scientists. And so it seems like the role of the data scientist is evolving a lot, to be at the center of, or integral to, a lot of scientific research and discovery. Are data scientists now becoming co-discoverers in that way? And is that kind of how your institute at the University of Chicago operates? Are you now a facilitating node, to some degree, in a way that didn't exist previously? Are things evolving?

Rebecca Willett (guest):

Yeah, I think so. So we run a number of different initiatives, including training programs, one supported by Schmidt Sciences, where we train people whose backgrounds are in the natural sciences to use modern AI tools. But simultaneously, we're training data scientists to advance and develop the next generation of AI tools to tackle problems that existing off-the-shelf methods cannot tackle. So for example, as I mentioned, people are using AI to improve, for instance, weather forecasting. And there's a desire to use these tools to also improve climate forecasting. And so within that context, one core challenge is that many of the AI models are unstable, in the sense that what these models might do is take the current state of the weather and make a prediction about what the state would be six hours from now. And so if I want to make a prediction about what the weather will be in two weeks or in two months, I would take that model and I would apply it recursively, just over and over and over again, in order to get to a point where I'm making a prediction, say, two weeks or two months out.

The problem, when I talk about instability, is that if I keep running this process long enough, the outputs will blow up. It'll produce things that are completely inconsistent with what we know about physics, like predicting the daily high being a thousand degrees, that kind of thing. We just have no reason to expect that to occur. And so data scientists and machine learning and AI experts, including people in my group, have been thinking about how do we rethink the way that we design and train neural networks so that they can make accurate forecasts and sidestep this whole instability issue? So that when we apply them for a long time, things remain consistent with known physical constraints and known physical properties.
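A minimal, invented sketch of that recursive rollout, with step_model() standing in for a learned six-hour emulator whose small systematic error compounds:

```python
# Autoregressive rollout: apply a one-step forecast model over and over.
import numpy as np

rng = np.random.default_rng(0)

def step_model(state):
    # Stand-in 6-hour emulator: nearly the identity, but with a tiny amplification
    # (1% per step) plus noise -- a crude way to mimic a slightly unstable model.
    return 1.01 * state + rng.normal(scale=0.1, size=state.shape)

state = np.full(10, 15.0)        # toy "weather state": temperatures in degrees C
for _ in range(240):             # 240 steps x 6 hours = 60 days of recursion
    state = step_model(state)

print(state.round(1))            # values have drifted far beyond anything physical
```

The research question described above is how to design and train the one-step model so that this long recursion stays consistent with known physical constraints.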

And so this is where you really do need this collaboration and interplay between the scientific domain experts and the people with real expertise in the underlying computer science principles, mathematics principles, and statistical principles. Just as another example, in the sciences, often we don't want to only make a prediction about what would happen, but we want to have some uncertainty associated with that prediction. You don't want me just to tell you, "It's going to rain tomorrow." You would like me to tell you, "Oh, it's got a 60% chance of rain tomorrow."

Vass Bednar (host):         

Honestly, Becca, if you could tell me for sure that it was going to rain tomorrow, that is actually what I want to know.

Rebecca Willett (guest):

Yeah, the problem is if I tell you it's for sure going to rain, and I'm wrong.

Paul Samson (host):        

That it definitely won't rain is what I want. I don't want rain.

Vass Bednar (host):         

Sorry to interrupt you.

Rebecca Willett (guest):

No, no, no, that's a great point. But yeah, what we'd like to do is to accurately tell you how certain we are. As an AI, I would like to be able to output not only what my prediction is, but how confident I am in that prediction. And I want that to be real in some sense, not just a made-up estimate of uncertainty. And so statisticians are working jointly with domain scientists and computer scientists to figure out how we can actually assess what these uncertainties might be, so that people have a sense of when the AI model is very confident versus a little uncertain.
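One simple, hypothetical way to attach an uncertainty to a prediction is to train an ensemble on resampled data and report the spread; the humidity feature and rain labels below are made up for illustration.

```python
# Bootstrap ensemble: refit the model on resampled data and look at how much the
# predicted probability of rain varies across the ensemble.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

X = rng.uniform(0, 100, size=(500, 1))                           # made-up humidity (%)
y = (X[:, 0] + rng.normal(scale=20, size=500) > 60).astype(int)  # 1 = it rained

probs = []
for _ in range(50):                                   # 50 bootstrap resamples
    idx = rng.integers(0, len(X), size=len(X))
    m = LogisticRegression().fit(X[idx], y[idx])
    probs.append(m.predict_proba([[75.0]])[0, 1])     # P(rain) at 75% humidity

print(f"P(rain) ~ {np.mean(probs):.2f} +/- {np.std(probs):.2f}")
```

This is only one of many approaches; the point is that the forecast comes with an honest statement of how confident the model is, rather than a bare yes or no.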

Vass Bednar (host):         

Well, and picking up on that collaboration, do you see a universe in the future where an algorithmic system is sort of cited or stands as a co-author on scientific papers or even kind of a co-theorist?

Rebecca Willett (guest):

That's a fantastic question, and I don't know. I know that there is a lot of interest in using AI tools to augment and help with human creativity. And I think of that as including things like hypothesis generation, or a little bit more on the math side, conjecture generation. And so in that sense, I mean, I certainly see these things as having the potential to collaborate and add to whatever we discover. The right way to attribute credit though is unclear. Right now, AI tools cannot do a lot without a lot of human intervention and guidance and expert knowledge. But the field is changing fast and it's hard to say what things will look like even just a few years from now.

Paul Samson (host):        

So in a way, co-authorship or co-theorist accreditation is maybe a little bit further out. But right now, there's very much a question about transparency around what AI tools were used in this experiment or in this research result, in the social sciences as well as the natural sciences. And I think it's becoming a bigger and bigger issue as trust goes down, in a way, about what's really the source of the information I'm looking at. So the accreditation is probably going to become more important. Right now, in your work, it's just standard to talk about the kind of machine learning that would be used; those are tools you've been using for a long time. So it's perhaps less new for you, but it feels like this is becoming a real issue about just full transparency on what's going on behind the curtain of who's generating what.

Rebecca Willett (guest):

Right. For the most part, the work that I and my collaborators are doing, we would precisely describe the AI tools that we're using, or more commonly, how we've needed to change and adapt existing tools to be more effective for different scientific problems. And so there's not really an attribution issue there. And even when I see people trying to use things like large language models to assist them with a literature review. Like I type in, "Oh, tell me what the scientific literature currently says about topic X."

Even then I see that more as sort of a starting point where maybe it'll output a list of papers, but then it's still our job to go and look carefully at those papers, assess the quality of the research, and make sure that the way that the AI tool described the conclusion is accurate and sufficiently nuanced for the task at hand. And so, at least in my group, we're not at a point where we would offload any key component of our research to an AI system, but we certainly can use them for things like idea generation, literature review. And then of course, as kind of a core of our research on developing the next generation of AI tools.

Paul Samson (host):       

It's a lot harder in the social science domains right now, I think. And so it's more of a Wild West as to what's appropriate and what's being used, and I think scientists are more comfortable with using these kinds of tools and being clear about it. So it's a space to keep watching. A big question, of course, that is always out there is how does bias come into play in any discussion of AI? In the context of AI for science, bias, I would just say, is a feature, not a bug, of human systems. There are biases that are there. So this idea of a zero-bias world, to me personally, is not possible. But is there a risk of biases being amplified, or creating particular soft spots in science, and having significant implications? Is there an increased risk of bias perhaps in this age of AI?

Rebecca Willett (guest):

Great question. So first of all, when we talk about bias, I think for many of us it's a loaded term, right? Many of us might immediately start thinking about things like racial or gender bias. And those are of course significant challenges, particularly in the social sciences; in the natural sciences, I think they play less of a role. I think bias in a much broader sense, where we're more predisposed to some predictions than others because of the AI tool that we're using, is a much more general phenomenon than just worrying about racial or gender bias, for instance. So just as an example, you could imagine that if I were using a large language model to assist with my literature review, it might be biased towards papers coming out of famous universities, and be less likely to give me results coming out of less well-known universities or less well-known groups, even though those might be highly relevant and very high quality research. So that would be a type of bias that would concern me.

But I would also say there are very strong biases in any of these systems in a more technical sense. In the sense that, how do I want to phrase this? Imagine that we're building an AI system to help with medical imaging. So you go to get an MRI and the scanner collects a bunch of data. Now, if you and I look at the raw data, it just looks like this arbitrary list of numbers. It's not something that a radiologist can visualize. And so we have to use computer algorithms to form an image. And recently, there's been a lot of work on using AI and machine learning tools, neural networks in particular, to take that data and translate it into an image. That process, of course, is going to have implicit biases built in, because there are maybe an infinite number of images that could be a good fit to the data.

And what we're searching for among all those infinite images is the one that not only fits the data, but which also is somehow consistent with the way that we know brains look, or the kinds of training data that we have available to train our AI system. So any of these systems are going to have biases, and some of them are extremely useful, right? I'm biased towards things that are more likely to correspond to a physical reality. That's exactly what I want. And so the challenge is just figuring out, well, what are these biases? Can we assess them? Can we use that assessment to identify bad biases that we want to eliminate or where we are lacking the good biases about consistency with physics that we want to preserve?
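To make the "many images fit the data" point concrete, here is a tiny invented inverse problem where the measurements underdetermine the image, and the regularizer (a built-in bias) decides which reconstruction gets reported:

```python
# Underdetermined reconstruction: y = A @ x has many solutions; the penalty term
# lam * ||x||^2 is the "bias" that picks one of them.
import numpy as np

rng = np.random.default_rng(2)
n_pixels, n_measurements = 50, 20                 # fewer measurements than unknowns
A = rng.normal(size=(n_measurements, n_pixels))
x_true = np.sin(np.linspace(0, 3, n_pixels))      # toy "image"
y = A @ x_true

lam = 0.1
# Regularized least squares: argmin_x ||A x - y||^2 + lam * ||x||^2
x_hat = np.linalg.solve(A.T @ A + lam * np.eye(n_pixels), A.T @ y)

print(np.linalg.norm(A @ x_hat - y))              # fits the measurements well...
print(np.linalg.norm(x_hat - x_true))             # ...but the answer reflects the prior
```

In modern learned reconstruction, the role of the lam * ||x||^2 penalty is played by a neural network trained on example images, which is exactly where the implicit biases described above enter.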

But in the social sciences, things can get quite challenging. I mean, just as one example, you could imagine that I have a bunch of data representing police interactions with a community and that those have reflected biases of society. And if I use that data to train a machine learning system and ask it, "Well, what should we do in the future?" Then it's going to be biased towards doing what we've done in the past, which is reflective of those societal biases. And so trying to make sure that these systems are not susceptible to those types of biases is extremely important. It's not the subject of my own research, but I think as we see these kinds of tools being used in increasingly high stakes settings, not only criminal justice but also healthcare and finance and other things, it's extremely important to make sure that they are not reflecting problematic biases from our past that we want to move beyond.

Paul Samson (host):        

I want to come in with another question right now that is future-looking again. There are students listening to this podcast, there are people planning their careers as young researchers, or just people heading out into the professional world. How do we prepare the next generation of researchers to use AI? Especially if they're going into AI for science, what's the right mix right now? If you're a student, an undergrad, do you want to be super deep in domain expertise, or do you want to move off that a little bit, making the mastery of some of these tools and applications a really important part of your journey? Which may not have been part of the curriculum as clearly in recent years.

Rebecca Willett (guest):

I think that deep domain expertise will always be valuable, no question. But I would encourage students to try to understand the foundations of AI and machine learning, because even if they're not developing the next generation of tools themselves, it's highly likely that they will be users of AI-based tools. And if you don't understand where those tools came from or what their potential pitfalls or failure modes are, then you're more susceptible to making mistakes with them without even knowing what you should be checking for or what the potential issues are.

And so I don't think everybody needs to be an expert in the state-of-the-art development of new AI models, but I think they need to understand the extent to which these are not magic black boxes, the extent to which we understand some of the issues associated with them and where things can fall short and what kinds of tests they need to run if they're using these models within their scientific research. And so I think that's just going to become an increasingly important part of science education moving forward.

In terms of people who are really focusing on developing AI, I think that it's also increasingly important that they understand not only the coding and software development aspects, but also the mathematical and statistical aspects of it. So understanding, for instance, biases from the perspective of mathematics or understanding statistical challenges like how to deal with uncertainty quantification, are all going to be extremely important. And important for people who want to develop new tools and want to do it in a way that's going to be effective and useful across many different domains.

Paul Samson (host):        

And even credible, like in the sense that these weren't people just playing around with AI tools, they kind of understood the systems and the right questions and outputs in a way.

Rebecca Willett (guest):

I think that's right. And I'll tell you, I think that one big challenge that we face is that we don't have good core principles right now on how to design the next revolution, let's say in neural networks. There are common architectures that maybe people already have heard of, like convolutional neural networks or transformers. And these are great designs; they've been extremely powerful tools in many different contexts. But at the same time, the way that we come up with new architecture designs that aren't just sort of incremental changes of existing designs is, at this point, a little bit ad hoc.

We don't have core design principles that could let me tell you, without a ton of experimentation, what kinds of architectures are going to work better or worse. And so when we lack that foundational understanding, it kind of limits our research in new AI methods to being a little bit ad hoc and a little bit guess and check. And so I would also hope that people will continue to try to understand those kind of foundational challenges so that we can get to a point eventually where we can make the design of new systems more principled.

Vass Bednar (host):         

I find the messiness of the ad hoc guess and check kind of charming and wonderful too. And with that in mind, just to round out, Becca, is there one kind of scientific problem you'd love to see solved with AI in our lifetime, your lifetime? Is there something that you're following, or sort of see as being very promising? Maybe give us something to grab onto and think about looking ahead.

Rebecca Willett (guest):

Yeah. I don't know if I can narrow it down to just one.

Vass Bednar (host):         

Fair.

Rebecca Willett (guest):

I think some of the things that we've talked about with improving climate modeling, with improving the design of better materials, with understanding these treasure troves of astronomical data that we suddenly have access to, and improving things like therapeutics, different drugs, other kinds of therapies based on the microbiome. All of these are very important to me, and I am really hopeful that AI will let us have progress in these spaces much more quickly than we would have otherwise.

Paul Samson (host):        

So you're saying there's plenty to work with now. There's reason to be enthusiastic about what we can do with even existing data sets, right? We're not waiting for something new, we can work with what we've got.

Rebecca Willett (guest):

That's context-dependent. So there are some settings where existing data sets are extremely powerful and we just need to figure out the best way to leverage them. And then there are other settings where we definitely do need new data. So for instance, the drug design process is an iterative process; we can't test out every molecule simultaneously. Or we talked before about microbial communities. Definitely new data collection is needed in these settings. In settings like astronomy, we've got new images coming from the Vera Rubin Observatory on a regular basis now. And so that data set just continues to grow, and it's super exciting. All my astronomer friends are watching this daily. And so yeah, I think it's enormously powerful.

I would also just add, though, I think even though we talked about machine learning as learning from data by example, a lot of work that people are doing in the sciences is saying, "How can I learn from data when I also have physical knowledge? How can I impose physical constraints, or how can I leverage simulations in addition to observed data, to get the best possible predictions?" And there, I think, there's also just a ton of potential, because people have spent decades building beautiful models and understanding core principles. And combining those with data is where I think the real future lies.

Paul Samson (host):        

That's great. Thanks for spending time with us today. That was a wonderful conversation, and thank you again.

Rebecca Willett (guest):

My pleasure. Thank you so much for having me.

Paul Samson (host):        

So AI for science is a super active area in technical universities around the world right now, and in national laboratories, etc. But it doesn't get that much attention because it is very technical. And people are probably still thinking, yeah, it's kind of far away before we really enter a new phase. But it may not actually be that far away. And our guest, Becca, was great at describing what's going on in an accessible way, and this is not that far out, really.

Vass Bednar (host):         

It's not. I wonder about attribution and authorship here in science too, when we talk about breakthroughs and what facilitates them, and what we're appealing to or relying on these algorithmic systems to help us surface. Of course, there are algorithmic elements to many dimensions of science, as we heard, especially with genomics. So I loved learning more about how the scientific community is grappling with this, but also doing its best to tentatively embrace the best elements for the best outcomes.

Paul Samson (host):        

Yeah, totally. So it's super exciting to think about that. You think about those breakthroughs in pharmaceuticals, new materials even. The list goes on and on. And we do hear a fair amount about that, but it's daunting to think that we may fairly soon have these autonomous laboratories. Think of something that's working 24/7 with a network of supercomputers. What are they going to cook up? That's a serious set of computers. Do we lose control somehow with that? It's an example of tech moving fast while policy and governance just can't keep up, and it might be in our face very, very soon, with lots of positives but a lot of unknowns.

Vass Bednar (host):         

With that in mind, are we at risk of sort of giving up some of the best elements of the pursuit of scientific breakthroughs? I think about, not as a tangent but certainly related, coverage of AI avatars who are selling items 24/7 online because they're not humans. They can just keep trying to sell you things. And sure, there's a micro economy around that and money's kind of going somewhere, and we can frame it as a breakthrough, but what are we gaining and what are we losing when we again appeal to these systems as a substitute for, instead of a complement to, the work that we already have to do? Huge conversation. One that's ongoing, and I'm glad we got to probe a little more on it.

Paul Samson (host):        

Yeah, you just scared me with those avatars coming after you, as opposed to texts that say, "Please sign up for something." You just kind of ignore them. But if an avatar is kind of videoing you every 20 minutes, that's going to be... That's intense.

Vass Bednar (host):         

Policy Prompt is produced by me, Vass Bednar, and CIGI's Paul Samson. Our supervising producer is Tim Lewis, with technical production by Henry Daemen and Luke McKee. Show notes are prepared by Lynn Schellenberg. Social media engagement by Isabel Neufeld. Brand design and episode artwork by Abhilasha Dewan and Sami Chouhdary, with creative direction from Som Tsoi. The original theme music is by Josh Snethlage. Please subscribe and rate Policy Prompt wherever you listen to podcasts, and stay tuned for future episodes.