ChemCatBio Webinar: Artificial Intelligence for Catalysis (Text Version)

This is the text version for the ChemCatBio Webinar: Artificial Intelligence for Catalysis video.

ERIK RINGLE: Well, hello, everyone. And welcome to today's webinar, artificial intelligence for catalysis presented by the Chemical Catalysis for Bioenergy Consortium or ChemCatBio. I'm Erik Ringle, the communications representative for the consortium. And before I introduce our speaker today, I'd like to just cover a few housekeeping items, so you know how to participate and learn more about the consortium.

ChemCatBio has several resources available on our website, chemcatbio.org. You can find tools and capabilities, publications, webinar recordings, interactive technology briefs, and a lot more. We regularly update this site, so be sure to bookmark it and keep an eye on the news section on the homepage for notices on ChemCatBio's sponsored events and opportunities like this one today.

You can also navigate to our tools from the website, including the catalyst property database and CatCost. Another great way to get updates from the consortium is through the ChemCatBio newsletter, called The Accelerator. Subscribe to this resource to learn more about ChemCatBio events, funding opportunities, publications, new technologies, and research projects, research teams, and more.

A little bit on participating in the event today. You will be in listen only mode during this webinar. You can select audio connection options to listen to your computer audio or dial in through your phone.

You may submit questions to our panelists today using the Q&A panel. If you are currently in full screen view, click the question mark icon located on the floating toolbar at the lower right side of your screen. If you're in split screen mode, that Q&A panel is already open and is located at the lower right side of your screen.

To submit your question, simply select all panelists in that Q&A dropdown menu. Type in your question and press Enter on your keyboard. You may send in those questions at any time during the presentation, and we will collect these, and if we have time, address them during the Q&A session at the end today.

All right, now, if you have technical difficulties, or you just need help during today's session, I want to direct your attention to the chat section. This chat section is different from the Q&A panel I just talked about. And it appears as a comment bubble in your control panel.

Your questions or comments in that chat section only come to me. So please be sure to use that Q&A panel for content questions for our panelists. All right, automated closed captioning is available for the event today.

To turn it on, select show closed captions at the lower left side of your screen. And we are also recording this webinar. We will post it on the ChemCatBio website in the coming weeks along with these slides.

Please see the URL on the screen here for where to access that. I'll put it in the chat here in a second as well. All right, lastly, just a quick disclaimer I need to read.

This webinar, including all audio and images of participants and presentation materials, may be recorded, saved, edited, distributed, used internally, posted on the U.S. Department of Energy's website or otherwise made publicly available. If you continue to access this webinar and provide such audio or image content, you consent to such use by or on behalf of DOE and the government for government purposes. You acknowledge that you will not inspect or approve, or be compensated for, such use.

OK, our speaker today is Rajeev Assary. Assary is a computational chemist and group leader at the materials science division of Argonne National Laboratory. He is also the Argonne principal investigator of the U.S. Department of Energy's Consortium for Computational Physics and Chemistry or CCPC for short.

Rajeev's interests include AI for materials chemistry, decarbonization, and digital discovery for energy storage and conversion. Rajeev has published over 200 peer-reviewed manuscripts and proceedings. And as another quick means of introduction, I'd like to start us off with a short video, and then we will dive right into the presentation.

RAJEEV SURENDRAN ASSARY: I am a computational researcher. I use supercomputers to do computations that can enable catalysis or chemical reactivity at surfaces. Similarly, I use power of data science techniques to come up with connections we can make to predict new catalysts or better conditions for catalysis.

Yeah, this is an awesome opportunity to connect with the various aspects of catalysis from discovery all the way to deployment. So the computations, as an enabling technology, it can tune into various aspects of this research. So that is one of the reasons I'm part of the ChemCatBio community from the day one.

We have a very good working relationship. If we don't know certain things, we will reach out immediately so that we can accelerate our learning much more effectively, basically.

ERIK RINGLE: All right, Rajeev, the floor is yours.

RAJEEV SURENDRAN ASSARY: OK. Good morning. Good afternoon to everyone. Thank you, Erik, for the kind introduction and organizing this ChemCatBio webinar. First of all, it's a great pleasure to contribute something to the ChemCatBio webinar as a decade long CCPC member of this ecosystem.

So today's webinar is all about AI for catalysis. First of all, I am a group leader at materials science division at Argonne. My major research interests are energy storage and chemical catalysis and decarbonization.

The various tools, such as AI for materials, autonomous material discovery, these are used in moving towards clean energy and sustainability. So these three topics have some strategic importance to Argonne National Lab. And I'm happy to be involved with this.

So ideally, what we need is utilize this power of AI and high throughput experimentation and this robotic experimentations to make discovery of tomorrow's materials much faster so that we can achieve decarbonization and clean energy and sustainable economy. So let me start with a catalysis dream. Imagine if you could design, discover, and demonstrate, deploy optimal catalysts faster and cheaper with the help of AI.

I think that's the real promise of AI at the moment. So if you look at the bottom of the slide, you can see an optimal catalyst, which is a-- this is something we really decide based on multiple optimal properties. For example, cost has to be cheap. Our operating cost has to be cheap.

And it has to-- catalyst has an optimal selectivity or high selectivity and stability and recyclability and all of those objectives. So I want to-- we want to make sure that we maximize all these positive objectives using all the available tools at the moment, for example, literature, limited experiments, analysis, modeling, and the AI tools. I think the great aspect, a great opportunity for AI is AI can help analyzing all these things.

And hopefully, AI can help us moving to design and discover new effective catalysts. So now, what are these AI tools? I mean, if you look at the slide at the bottom, I mentioned about AI.

AI is about creating intelligent agents that can reason, learn, and act autonomously. Most importantly, it has to exceed or match human intelligence. So we know that human intelligence is a subjective question, same as artificial is itself a subjective question.

So almost all the tools we developed over the last couple of decades can be coupled into-- or a part of, a subset of this AI, for example, machine learning, where we use algorithms to understand the data or classify the data or predict certain properties. A simple example is image classification. You can classify images in multiple buckets.

Similarly, we can predict, let's say, cost of a material or cost of a product. A step up to this machine learning is deep learning, which is also a subset of machine learning. But it has been developed much, much faster. And it's highly parallel mechanism for understanding complex patterns of the data, utilizing deep neural networks.

The principles, we can-- it's highly parallel. And it is GPU, or graphics processing unit, compatible. So quite a lot of recent developments all happened in the deep learning era of image recognition and natural language processing, et cetera.

The top two are recent developments. GM means generative models. And LM indicates language models. So language models is the recent trend of machine learning world.

It's utilizing our predictive capability of, for example, text prediction based on contextual learning of what's happened in the MDL universe. So it has access to almost all the world wide web information and many publications and reports. So these are the generalized, some generative models, some large language models also, I will mention towards the end of the presentation.

But majority of the scientists at the moment are working in the interface between machine learning and deep learning to understand their data. So how can we use this AI and computations together to make some solid contributions to catalysis? So I just want to point out, I'm a computational modeler so my catalysts are much more beautiful.

If you look at the right side, this is a beautiful catalyst. The surface in this case is a molybdenum carbide catalyst. So three types of activities of computations of AI can help the ecosystem, catalysis ecosystem. One is AI-guided design.

So there are two type of approaches you can use. You can use an informatics-based approach: collect all the data or all possible information regarding the catalysis. It's a library of fundamental science.

And also, if we know existing catalyst regions or existing catalyst materials and conditions, we may be able to help to optimize these conditions. So AI-guided design is mainly used in the discovery and design and optimizations. The bottom two lines are a little bit more tricky. For example, creating new materials.

For example, two processes there is automated chemical synthesis, that's synthesis of a catalyst using robotics, and also automating or AI-enabled materials characterization. These two areas are emerging areas. I think we require-- we need quite a lot of developments in robotics and also data management to make more progress.

In the final part, I think this is the-- the world is going such a way that we need to access all the public, accessible information regarding the catalyst. And also, it has to be verifiable and also has to be accurate. Otherwise, we are modeling based on completely wrong information.

So catalysis knowledge accessible for everyone has to be one of our prime concerns. And this has to be included. The accurate information has to be included as part of next-generation foundational models, et cetera.

So one slide about AI for materials science in general. If you look on the left hand side, I think this is the larger umbrella of AI and machine learning and deep learning. So if you look at it, machine learning, it's a commonly used technique because computationally, you only require 100 or more data points, and you can start making the power of certain algorithms available. For example, Naive Bayes or random forest or Gaussian process regression.

So these are all computationally less intensive because they have a closed form mathematical solution. So you don't need a supercomputer to run any of this, provided you have a decent data, good data. Deep learning, it mentioned-- there is quite a lot of development happening with deep learning.

They always require more than 500 or 1,000. In my experience, many times you need 10,000 or 50,000 data points, high fidelity data points, to make a meaningful prediction. But it can understand any complexity using, let's say, a simple function, it can optimize or understand any function as a solution.

For example, the input of your model can be just a text or, let's say, a descriptor matrix, it's a tensor or even an image or a video or a combination of all of these as a multi-modality thing, this deep learning has developed into that point. On the right hand side, I have shown a picture from a paper published by Meta and Carnegie Mellon University. I think it's called Open Catalyst Project.

So this is probably the largest data set available for catalysis in terms of atomistic modeling. So this is around more than a million DFT calculations. It's still limited to simple oxides, unary oxides, and binary oxides.

And also, it's limited to the hydrogen evolution reaction and the oxygen evolution reaction. Nevertheless, this is one of the best models out there. And they do not have any kinetics information.

So one more slide about the ecosystem associated with these materials. For example, quite a lot of development happened in the last couple of years. This is showing a bunch of examples.

ChemCatBio itself have a data hub, can provide quite a lot of DFT information. Similarly, Materials Project, it is a database for solid state materials. And Materials Data Facility is another example.

So the slide meant to show that the community has started making some targeted data, data generation so that we are all getting some benefits for utilizing this data to develop machine learning models and some sort of predictive design. So this is also meant to show AI for catalysis is an extremely growing area. If you look at any flagship catalysis journals like ACS Catalysis or Green Chemistry, the amount of AI papers coming out for predicting some catalyst conditions or a catalyst active site or exploring the complexity of reaction networks or accelerating the kinetic predictions are enormous. And we have plenty of opportunities to grow.

For example, if you look at the general AI papers are in the 10 to the power 4 is the relative intensity. I think now, LLM, for example, large language models are creeping up because everyone like this promise of large language models. AI for material science and material informatics, they are a teeny fractions at the moment.

But quite a lot of discovery requires some information. So I think this histogram, the bar will definitely be going to grow up. So let's recap. Now, we have a lot of tools available for catalyst design.

If you are a computational chemist like me, I think this is an exciting period of time. So I group these tools as six categories. So first two of them are expensive.

For example, first one, my favorite tools is physics-based simulations, DFT, and molecular dynamics simulations so that you can look at the catalyst active site and possible chemical reactions at a very small length and time scale but reasonably accurately. To extend this timescale and length scale, you can make statistical models or a Newtonian dynamics or a Monte Carlo type of simulations.

So you can-- that is the category number two. And the three is a data-driven catalysis there. So there, we can utilize quite a lot of this data developed in the number one and two category and the experiments to make reasonable predictions.

Four is on the fly approaches. For example, if you are interested in doing on the fly understanding of a spectra or a synthesis of materials, we need to make sure we will get an actively learned system, so that we're improving the predictions after every iterations. So it's more a self-supervised learning.
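
The on-the-fly loop described above can be sketched as a pool-based active learning loop. This is a toy illustration under stated assumptions, not the workflow used in the talk: the hypothetical `oracle` function stands in for an expensive measurement or simulation, the model is a nearest-neighbor lookup, and the uncertainty is simply the distance to the nearest labeled point.

```python
import math

def oracle(x):
    """Stand-in for an expensive experiment or simulation (hypothetical)."""
    return math.sin(3.0 * x)

def predict(x, labeled):
    """Predict with the value of the nearest labeled point."""
    nearest = min(labeled, key=lambda p: abs(p - x))
    return labeled[nearest]

def uncertainty(x, labeled):
    """Use distance to the nearest labeled point as a crude uncertainty."""
    return min(abs(p - x) for p in labeled)

def active_learning(pool, n_queries):
    """Label the most uncertain pool point on each iteration."""
    labeled = {pool[0]: oracle(pool[0])}  # seed with one labeled point
    for _ in range(n_queries):
        candidates = [x for x in pool if x not in labeled]
        if not candidates:
            break
        query = max(candidates, key=lambda x: uncertainty(x, labeled))
        labeled[query] = oracle(query)  # pay for one new label
    return labeled

pool = [i / 20 for i in range(21)]  # candidate conditions in [0, 1]
labeled = active_learning(pool, n_queries=5)
max_error = max(abs(predict(x, labeled) - oracle(x)) for x in pool)
print(len(labeled), round(max_error, 3))
```

Each pass spends one new label where the current model knows least, which is the sense in which predictions improve after every iteration.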

And five and six is the latest trend about large language models or, for example, programs like ChatGPT or Llama. This kind of information is-- this will take care of a lot of information from the internet or all the available data. I think this is the place we need verifiable data, and ChemCatBio, CCPC, and all the community can play a vital role in providing very accurate information.

So I just want to point out, quite a lot of exciting work happening in the atomistic scale modeling, mesoscale modeling, and continuum scale modeling at CCPC. And in my own experience over the past 10 years, I think we've worked quite a lot of chemistry from vapor phase chemistry to surface catalysis, ring opening chemistry, reductive etherification, ketonization, et cetera. So, so much beautiful chemistry topics happening.

So we started applying this. These are all complex chemistries. So the AI and DFT and machine learning, we started applying to this slowly. So now, back to our two topics.

I am very much interested to tell you the story. This is two stories. One is about designing molecules. Another is understanding properties of the catalyst.

So these two works are published in Digital Discovery last couple of years. I think both of them are open source journals. I think all the data, all the GitHub information are free to get. So you can download-- you don't need any subscription.

So let's get the first point. Here, I'm talking about how to discover a new molecule. So in this case, this molecule is a hydrogen carrier. It's very similar to, let's say, a benzene or a cyclohexane. So they can store hydrogen.

These molecules can store hydrogen so that you can use a catalyst to take hydrogen from the molecule, which is a dehydrogenation reaction. And you can put the hydrogen back whenever you need, whenever you can using other catalysts or hydrogenation catalysts. So that is also-- catalysis is one of the most important step.

But in this point, I'm just talking about, so there are some molecules already available for storing hydrogen. A simple example is methanol and ammonia. So in this case, we are talking about a completely new type of molecular understanding or a new type of molecule.

So the question we asked is that, is there any molecules in the chemical universe that we haven't thought about utilizing as a hydrogen carrier? I think that is the question we asked. So for example, first-- so the scale is slightly big. So that's one of the reason I'm showing this study.

So we know that there are some molecules available based on chemical intuition and also studies by organic chemists. For example, Bob Crabtree from Yale has come up with this type of molecule. It's called liquid organic hydrogen carrier, around 35 molecules or 40 molecules.

So if you know any of these molecules, so first thing we can look at it, is there any molecule look alike that has similar properties? I think that is the concept we took here. So we looked at one data set precisely. I just want to tell you about GDB-17 which is 17 heavy atoms.

So we identified a data set from literature that is called GDB-17 which is around 166 billion molecules. So we used this similarity criteria. In this context, the criteria is called a Tanimoto similarity.

It's about looking at how these molecules are similar with respect to this 35 molecules that we know. And when you look at this large number, 166 billion, not everyone can do it. I think this is one of the reasons we utilize the power of supercomputers.
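
The similarity screen described above can be illustrated with a minimal sketch. It assumes each molecule is already represented as a fingerprint, written here as a set of integer "on" bits; in practice a cheminformatics toolkit would generate these. The fingerprints, names, and cutoff below are hypothetical and are not taken from the study.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of 'on' bits."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)

def best_similarity(candidate, references):
    """Highest similarity of a candidate to any known reference molecule."""
    return max(tanimoto(candidate, ref) for ref in references)

# Hypothetical fingerprints: each set lists the bits that are switched on.
known_carriers = [{1, 4, 7, 9}, {2, 4, 8}]
library = {
    "mol_a": {1, 4, 7, 10},
    "mol_b": {3, 5, 6},
    "mol_c": {2, 4, 8},
}

threshold = 0.5  # illustrative cutoff, not the value used in the study
hits = {}
for name, fp in library.items():
    score = best_similarity(fp, known_carriers)
    if score >= threshold:
        hits[name] = score
print(hits)
```

Because each candidate is scored independently of every other candidate, a screen of this shape splits naturally across many compute nodes.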

So in this case, we looked at the 166 billion molecules around 8,000 nodes of supercomputer, around 3 million molecules per second. So we identified that there is around a million molecules have a highest amount of similarity with our test set or the realistic molecules available from the chemistry studies. So from that, also, we have 1 million itself, a lot of molecules to recommend to an experimentalist.

So we reduce it further using some scoring criteria. The development of the scoring criteria is also a chemistry-involved process. You need to understand how the molecules looks like.

So there are some rules we utilized to understand the-- utilize for a heuristic screening. So the details of a scoring criteria is in the publication. Further, what we have done is we have utilized the DFT-based approach.

For example, in this molecule, if you want to take hydrogen out and put hydrogen back in, it has to be reasonably easy within 40 to 70 kilojoules. So with that information, and also, we looked at these molecules, whether they have a melting point and boiling point matches to the liquid organic hydrogen carriers. So we use a practicality screen around 37 new molecules from this one.
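
A filter of the kind described above might look like the following sketch. The 40 to 70 kJ energy window comes from the talk; the liquid-range temperatures, the `Candidate` fields, and the example molecules are assumptions made up for illustration, not the criteria or data from the publication.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    dehydrogenation_energy_kj: float  # computed energy to release hydrogen
    melting_point_c: float
    boiling_point_c: float

def passes_screen(c, e_min=40.0, e_max=70.0, t_low_c=0.0, t_high_c=100.0):
    """Keep molecules in the energy window that stay liquid over a range.

    The liquid range (t_low_c to t_high_c) is an assumed placeholder.
    """
    in_window = e_min <= c.dehydrogenation_energy_kj <= e_max
    stays_liquid = c.melting_point_c < t_low_c and c.boiling_point_c > t_high_c
    return in_window and stays_liquid

# Made-up candidates for illustration only.
candidates = [
    Candidate("cand_1", 55.0, -20.0, 180.0),
    Candidate("cand_2", 85.0, -30.0, 150.0),  # outside the energy window
    Candidate("cand_3", 60.0, 40.0, 210.0),   # melts above the low bound
]
shortlist = [c.name for c in candidates if passes_screen(c)]
print(shortlist)
```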

I think the data set is available on GitHub. But a couple of them are-- a couple of the examples shown here, these molecules were reported not as liquid organic hydrogen carrier. But the story is not finished. I think this is-- we are continually looking at the opportunities for doing the experiments, real experiments, which is most likely the difficult steps for this process.

And also, as I mentioned earlier, we have this area of liquid organic hydrogen carriers is getting a lot of attention. Part of-- maybe there is an opportunity for mixture of liquid organic hydrogen carriers. So there is a need of more data.

One of our coworker, Hassan Harb, has already developed 10,000 accurate energetics of these materials, 10,000 of these new molecules. And hopefully, that can also help us to develop further information regarding the system. And also, we are interested in developing some sort of kinetics index for this.

So that was a complicated paper. I think one of our coworkers from Argonne, Joe Harmon, he wrote a very nice taxpayer description of these highlights. I think, a majority of the group members started understanding much more about this project after reading this nice description.

Let me get back to the catalysis problem here in my talk. To switch suddenly from molecules to catalysts, so this is completely developed as part of ChemCatBio and CCPC. So one of the principal objectives is how to develop active and inexpensive deoxygenation catalysts.

We know that-- a majority of you guys know already, removing oxygen is one of the most difficult process in the catalyst-- in the catalysis of biomass. So to get this, I think we started looking at an example of a problem. In this case, is removing oxygen from bio-oxygenates and the bio-oil.

So we utilize a catalyst of molycarbide. And one of the reasons is majority of the best catalysts are a platinum on alumina or platinum on some other oxides. These are expensive.

So if it's a molycarbide type of catalyst, they're reasonably cheaper. But they all-- typically almost all catalyst, all the transition metal catalysts, has a tendency of oxidation, which is thermodynamically inevitable. So bottom there is a couple of publications. This catalyst can be used even also for a CO2 reduction.

So the question is, when you have a catalyst, when you remove oxygen, they bind on the surface of the catalyst. So now, we have not only just a catalyst, but it's a oxygenated catalyst. But now, if you want to regenerate the catalyst, you need to remove the oxygen by feeding hydrogens.

So, for example, oxygen can be removed as a hydroxides. And hydroxides combine with each other to remove water. So if you can see that oxygen binding is strong, it's not a secret. It's strong so it's very, very difficult to remove the oxygen as water.

But if you-- further, if you decorate the surface with something like a nickel, the removal is a little bit easier. So this is something we want because we want to regenerate the catalyst a bit more easily. This is very clean facets and calculations done on molycarbide (100) surfaces. It's very clean chemistry here.

So now, we ask the question, can we design a molycarbide catalyst or something like this to facilitate oxygen removal much faster with reasonable stability? So the key message here is oxygen binding, it can be used as a descriptor. So in future, if you can predict the oxygen binding of a catalyst, it is much more desirable to predict this oxygen binding of this catalyst.

And this can be helpful for construction of a real catalyst and understanding local reactivity. But there is not much data available. And also, in this kind of systems, we are only looking at oxygen binding as a descriptor. It's very simple systems.
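
To make the descriptor concrete, here is a minimal sketch of how an oxygen binding energy is often computed from total energies. The talk does not state which reference was used; referencing to half of a gas-phase O2 molecule is one common convention and is an assumption here. All energies below are made-up numbers.

```python
def oxygen_binding_energy(e_slab_with_o, e_slab, e_o2):
    """E_bind = E(slab+O) - E(slab) - 0.5 * E(O2), all in eV.

    More negative values mean stronger binding under this convention.
    """
    return e_slab_with_o - e_slab - 0.5 * e_o2

# Made-up total energies in eV, for illustration only.
e_o2 = -6.0
surfaces = {
    "undoped": (-105.0, -100.0),  # (slab with O, clean slab)
    "doped": (-204.0, -200.0),
}
binding = {
    name: oxygen_binding_energy(e_with_o, e_clean, e_o2)
    for name, (e_with_o, e_clean) in surfaces.items()
}
weakest = max(binding, key=binding.get)  # least negative binding energy
print(binding, weakest)
```

Ranking surfaces by this single number is what makes it usable as a screening descriptor: a weaker-binding surface should give up its oxygen as water more readily.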

So we decided, as part of CCPC, we develop this catalyst data using in silico approaches. For example, in this case, we looked at molycarbide with their seven lower miller indices and almost all possible type of termination. So around 54 type of terminations. And we doped the surfaces with almost 23 dopants from the periodic table.

So we come up around 20,000 catalyst structures. And we have 20,000 catalyst structures, then we include the binding energies of oxygen. Oxygen can bind pretty much a lot of places. There are sophisticated tools, like a CatKit and ALCC; computational tools can be used to develop this stuff.
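
The way such a structure set grows can be sketched as a simple enumeration. The counts of 54 terminations and 23 dopants come from the talk; the placeholder dopant names and the uniform number of adsorption sites per surface are assumptions, with the site count chosen only so that the total lands near the figure quoted above.

```python
from itertools import product

n_terminations = 54  # surface terminations mentioned in the talk
dopants = [f"dopant_{i}" for i in range(23)]  # placeholders for 23 elements

# Each doped surface is one (termination, dopant) pair.
doped_surfaces = list(product(range(n_terminations), dopants))
print(len(doped_surfaces))  # 54 * 23 = 1242 pairs

def with_sites(surfaces, sites_per_surface):
    """Expand each doped surface into one entry per oxygen adsorption site."""
    return [(t, d, s) for (t, d) in surfaces for s in range(sites_per_surface)]

# sites_per_surface is a made-up uniform number; real surfaces differ.
structures = with_sites(doped_surfaces, sites_per_surface=16)
print(len(structures))  # 1242 * 16 = 19872 entries
```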

So the long story short is 20,000 calculations or 40,000 calculations cost around 15 million CPU hours. I think, we got through an ALCC allocation at Argonne. I remember, I think, Trevor and Jeremy were very supportive of this one to get this allocation.

So utilizing this data, now, we have a catalyst, and we have binding energies. Utilizing this data, we develop a graph neural network. So in the future, we don't have to do DFT calculations. We can just show a graph of the structure, graph of the catalyst or local site, you can predict the energy.

So this approach is called a local coordination graph neural network. So now, we have a-- let's say on the left side, there's a histogram that's showing 20,000 binding energies of oxygen on this catalyst. On the middle, we utilize this graph to predict the binding energy and compared with the computed binding energy because we already have the truth value.
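
The general idea of predicting an energy from a graph of the local site can be sketched with one round of message passing followed by a readout. This is a generic toy example, not the local coordination graph neural network from the publication: the site, the one-hot features, and the weights are all hypothetical, and the weights are untrained.

```python
def message_pass(features, neighbors):
    """One round: each atom averages its own and its neighbors' features."""
    updated = {}
    for atom, feat in features.items():
        group = [feat] + [features[n] for n in neighbors[atom]]
        updated[atom] = [sum(col) / len(group) for col in zip(*group)]
    return updated

def readout(features, weights, bias=0.0):
    """Sum atom features, then apply a linear layer to get one number."""
    pooled = [sum(col) for col in zip(*features.values())]
    return sum(w * x for w, x in zip(weights, pooled)) + bias

# Hypothetical local site: an adsorbed O bonded to two Mo atoms, one near C.
features = {
    "O": [1.0, 0.0, 0.0],  # one-hot element encoding: [O, Mo, C]
    "Mo1": [0.0, 1.0, 0.0],
    "Mo2": [0.0, 1.0, 0.0],
    "C": [0.0, 0.0, 1.0],
}
neighbors = {
    "O": ["Mo1", "Mo2"],
    "Mo1": ["O", "C"],
    "Mo2": ["O"],
    "C": ["Mo1"],
}

hidden = message_pass(features, neighbors)
weights = [-1.0, -0.5, 0.2]  # untrained, arbitrary numbers
print(round(readout(hidden, weights), 3))
```

After one round, the feature vector on the oxygen atom already encodes what it is bonded to, which is the information a trained network would map to a binding energy.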

So while we publish this data, I think the mean absolute error was 0.18 eV. That means if you just show a graph structure, it will predict the binding energy with a reasonable set of accuracy. I think, now, we have even reduced this significantly lower because of the help from a computational force.
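
For reference, the mean absolute error quoted above is the average size of the gap between predicted and computed values. The sketch below uses made-up binding energies, not data from the publication.

```python
def mean_absolute_error(predicted, reference):
    """Average absolute difference between paired values."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - r) for p, r in zip(predicted, reference))
    return total / len(predicted)

# Made-up oxygen binding energies in eV, for illustration only.
dft_values = [-2.10, -1.50, -3.00, -0.80]
model_values = [-2.00, -1.70, -2.90, -1.00]

mae = mean_absolute_error(model_values, dft_values)
print(f"MAE = {mae:.2f} eV")
```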

So on the extreme right hand side, you can see that around 15k data set is needed to even predict this one. And these are quite good numbers because molycarbide is not like a simple palladium catalyst or palladium oxide catalyst. So this is a little bit more complicated catalyst.

So we made a-- we are very happy that we got a decent value, while we publish the data. One more quick example of power of this deep neural networks. If you look at it on the extreme right hand side, if you take out a machine learning model and do a gradient of the final output, you will get a normal gradient or saliency map.

So if you carefully see the saliency map higher and lower, it indicates oxygen binding is less and weak. For example, tantalum, we know that tantalum is a higher Lewis acid than cobalt or nickel. They can bind-- they bind the oxygen very strongly.
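
A gradient-based saliency map of the kind mentioned above can be sketched as follows. The model here is a hypothetical stand-in function rather than a trained network, and the gradient is estimated with central finite differences instead of automatic differentiation, so this shows the idea only.

```python
def toy_model(x):
    """Hypothetical model: a weighted sum plus one interaction term."""
    return -2.0 * x[0] + 0.5 * x[1] + 0.1 * x[1] * x[2]

def saliency(model, x, eps=1e-5):
    """Absolute gradient of the output with respect to each input feature."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += eps
        down[i] -= eps
        grads.append(abs((model(up) - model(down)) / (2 * eps)))
    return grads

x = [1.0, 2.0, 3.0]  # made-up input features
scores = saliency(toy_model, x)
most_influential = scores.index(max(scores))
print([round(s, 3) for s in scores], most_influential)
```

Features with a large score are those where a small change moves the prediction most, which is why the map can be read against chemical intuition about which atoms control oxygen binding.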

And also, from a t-SNE plot, we can also see the binding energy is also associated with this number of coordination with oxygen. So it is taking care of almost-- the machine learning model is taking care of almost all the physical constraints or molecule knowledge we are providing. So this is a good opportunity to utilize this machine learning for probably in future for a [INAUDIBLE] construction kind of stuff.

Similarly, I think Joe also wrote a nice article about this one. The beauty of this one, you can predict this oxygen binding in milliseconds compared to five hours of periodic DFT calculations. So I think that is the beauty.

And hopefully, we will be utilizing this model for other binding of, let's say, carbon and oxygen for CO2 utilization, as well as water formation. So in the last two slides, I just want to quickly mention that recently, there is a huge interest in natural language processing and these large language models. This is one interesting quote I saw.

LLM can do jaw-dropping things, but nobody knows exactly why. It's from MIT Technology Review. This is true. I think, we-- in the community, I think there is a lot of information out there.

Some of them are useful. And some of them are not really useful. And on the right hand side, you can see quite a lot of large language models now. And autonomous agents are now leveraging chemistry and materials knowledge. It's quite a lot of them.

I think this area is growing so rapidly at the moment. So we need to utilize these tools for a catalyst design or a process design in the coming age. For example, there are ways you can do it now.

We can look at any of this, let's say, CataLM, which is a latest electrocatalyst large language model released from China. These kind of large language models, you can take it and you can retrain them or fine-tune them based on some existing catalyst data, which you trust a lot so that it can take care of what is happening in the outside world and whatever the data set system you have.

And also, you can provide quite a lot of your domain knowledge from publications that you have and you collected over the period of many years as a PDF document. And you can convert this to some sort of a RAG model, which is called a retrieval-augmented generation model, to generate more new chemistry. So soon, I think-- I think, now, Siri also wants to work with OpenAI. So soon, you may be able to ask Siri for catalyst recommendations. I think the text on the right, it's a bunch of possible outcomes you get from Siri.
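
The retrieval half of retrieval-augmented generation can be sketched without any language model at all. In this minimal example, which is an illustration under stated assumptions rather than a production pipeline, passages are ranked by bag-of-words cosine similarity and the best match is pasted into a prompt. The passages are short notes paraphrased from this talk, and the function names are hypothetical; a real system would use learned embeddings and then send the prompt to a model.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Count lowercase whitespace-separated tokens."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, passages, k=1):
    """Return the k passages most similar to the question."""
    q = bag_of_words(question)
    ranked = sorted(passages, key=lambda p: cosine(q, bag_of_words(p)),
                    reverse=True)
    return ranked[:k]

def build_prompt(question, passages):
    """Assemble the text that would be sent to a language model."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\nQuestion: {question}"

passages = [
    "nickel decoration makes oxygen removal from molybdenum carbide easier",
    "methanol and ammonia are known hydrogen carriers",
    "platinum on alumina is an expensive deoxygenation catalyst",
]
question = "what makes oxygen removal easier"
top = retrieve(question, passages, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

Grounding the prompt in passages you trust is what lets a domain expert check the answer against its source.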

But verifying these things require quite a lot of domain knowledge. So this won't take away anybody's job, but it will create-- it will give you a large amount of tools that you will be able to utilize for your discovery, catalyst discovery. I think that's all I have.

And in summarizing this one, I think, in a high level, I showed molecular discovery is possible from an extremely large chemical space, if you can combine with the atomistic modeling. And also, data-driven material property prediction depends on the data. And near term, these binding energies and the microkinetic modeling can be solved for a given catalyst in terms of designing the catalyst. I think the experimental synthesis of a catalyst or the measurement or verifying the kinetics is going to be crucial.

Similarly, the red hot topics is: there is a vast opportunity for us in community to use the growing strength of AI, especially the large language models and deep learning. And I showed the examples of we can do-- we can develop huge data sets with dedicated data generation efforts. So as I mentioned, there is a huge opportunity for AI for synthesis of catalysts, as well as characterization of catalysts.

With that, I thank you CCPC and ChemCatBio folks. And I'm happy to answer your questions. Thank you.

ERIK RINGLE: Awesome. Well, thanks, Rajeev. Very interesting and informative presentation. We do have time for a couple of questions. So as a reminder for the attendees today, you can use the Q&A box to submit your question right now.

Just type in your question, and we will ask it to Rajeev, if we have time. And just to kick things off, Rajeev, I'll warm things up with a question here. So what would you say is the role of ChemCatBio in the AI world, as you've described it in your presentation today?

RAJEEV SURENDRAN ASSARY: Yeah, I think the good thing about ChemCatBio is a huge ecosystem. I think we have both of them. We have modeling. We have experiments and characterization, synthesis and everything.

So a topic-- if you are interested in a genuine catalyst area, I think we can provide a significant amount of data sets from the design to testing, which is the high-fidelity data of the catalyst. Which is not easy for, let's say, a university. Because we can provide a-- we can cover large aspects of this catalyst discovery process.

ERIK RINGLE: Great. And then a question about catalyst characterization. Is there a role of artificial intelligence in doing that kind of work?

RAJEEV SURENDRAN ASSARY: Yes, I think. So catalyst characterization often is a complex process. I think the signals are complex. So you need some sort of data reduction techniques.

So the AI can definitely help, I think. And there is a huge value for it because we need to understand what is happening on the catalyst surface very clearly so that we know what is the desired outcome of the catalysis process. So the imaging techniques and multi-scale modeling can really help.

So AI in this area is slowly emerging. I think this is the time we should invest more time and effort in characterization.
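
[Editor's illustration, not part of the spoken webinar: the answer above mentions "data reduction techniques" for complex characterization signals without naming one. As a minimal sketch, assuming principal component analysis as the technique (our choice, not the speaker's), the Python below reduces each synthetic single-peak spectrum to one score. The data, the function names `make_spectrum` and `first_principal_component`, and all parameters are hypothetical.]

```python
import math
import random


def make_spectrum(center, n_points=50, width=3.0, noise=0.01, rng=None):
    """Synthetic single-peak 'spectrum' (hypothetical data, not from the talk)."""
    rng = rng or random.Random(0)
    return [
        math.exp(-((i - center) ** 2) / (2 * width ** 2)) + rng.gauss(0, noise)
        for i in range(n_points)
    ]


def first_principal_component(rows, n_iter=200, seed=0):
    """Return (component, scores) for the leading principal component.

    Power iteration on the (unnormalized) covariance matrix, applied
    implicitly as X^T (X v) so the full matrix is never built.
    """
    n, d = len(rows), len(rows[0])
    # Center each channel so the component captures variation, not the mean.
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]

    # Random start vector; a uniform start could be nearly orthogonal
    # to the direction that separates the two peak positions.
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    for _ in range(n_iter):
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all spectra identical: nothing to reduce
        v = [x / norm for x in w]

    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Two hypothetical "materials" whose peaks sit at different positions.
rng = random.Random(42)
spectra = [make_spectrum(15, rng=rng) for _ in range(6)]
spectra += [make_spectrum(35, rng=rng) for _ in range(6)]

component, scores = first_principal_component(spectra)
for label, score in zip(["A"] * 6 + ["B"] * 6, scores):
    print(label, round(score, 3))
```

[In this toy case, each 50-point spectrum collapses to a single number whose sign separates the two groups. Real characterization data would likely need more components and careful preprocessing.]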

ERIK RINGLE: OK, I have another question that just came in. So how does or how do large language models convert graphs and pictures, if only the catalyst data is available from literature or just image data? Let me know if you need me to repeat any of that.

RAJEEV SURENDRAN ASSARY: No, I understand. I think all of the large language models can take care of text or tables or images and get the information. Large language models are generally good at learning the context and making connections from one to another. So you show many of these images and much of this context. For example, molycarbide has this spectrum.

And [INAUDIBLE] carbide has another spectrum. So they start understanding the relationship between these, the images, as well as the text and the tables and everything. So it requires a little bit more fine-tuning because of this.

These are not necessarily completely available in the open data sets, open literature. So we may have to fine-tune a little bit for our specific task. So that area is autonomous agents. It's an activity flourishing at the moment.

ERIK RINGLE: Yeah, thanks for that. A couple more questions came in. Is ChemCatBio or ElectroCat or CCPC interested in developing an LLM or GPT or agent model?

RAJEEV SURENDRAN ASSARY: Yeah, developing any model is not an easy task. It requires quite a lot of-- I mean, we're talking about millions, tens of hundreds of millions of GPU time and manual time. So we may not develop anything. Most likely, we will gather information on the catalysts of interest and probably use fine-tuning approaches. So it requires significant development investments.

ERIK RINGLE: All right, I think we have time for maybe just one more, Rajeev. And this is a little bit of a longer one. So I'm just going to read it here. Since the atomic scale models can calculate efficiency and performance based on specific atomic structure and catalyst atom location, can these tools be used to, one, predict performance from a mixture of catalyst structures and two, show potential synergistic roles for such mixtures? Let me know if you need me to repeat any of that.

RAJEEV SURENDRAN ASSARY: No, no, no, I understand. I think this is a very good question. I think an atomistic model itself is a very simple representation of what's going on in the catalyst. I think to understand how this synergy works on various catalyst sites, I think, we probably need to construct a set of model situations where the active sites coexist and chemistry happens between these sites. I think we need a dedicated-- some deep dive studies or a descriptive analysis of this to answer that for sure.

ERIK RINGLE: All right, well, I mean, I think we're right at the end of our time here. I appreciate all the questions and engagement. And thanks to everyone who joined us. Rajeev, thank you for sharing your experience and expertise and research here.

Just as a reminder to the group, a recording is going to be posted on the ChemCatBio website as soon as it's available, probably in a couple of weeks or so. If you have questions in the meantime, we encourage you to contact ChemCatBio through our website, chemcatbio.org. Or you can email Rajeev. You can see his email on the slide here.

And then while I have you, I'll make one last plug for the ChemCatBio newsletter called The Accelerator. I'm going to put a link in the chat. This is a great resource to keep tabs on further updates or other events like this one from the consortium.

So with that, I think we will take our leave. Have a great rest of your day. And remember to stay tuned for future ChemCatBio webinars. Thank you.

RAJEEV SURENDRAN ASSARY: Thank you.