Episode 84: Martin Pacesa on BindCraft, an Automated Pipeline for De Novo Protein Binder Design

April 14, 2026 | How is BindCraft, the automated pipeline for de novo protein binder design, changing the protein design industry? Martin Pacesa, assistant professor of pharmacology at the University of Zurich, joins The Chain to discuss how BindCraft is helping non-protein designers learn how to design proteins—and how the role of the protein designer will evolve. With host Chris Bahl, their conversation explores how BindCraft may help with drug development, its role in the democratization of tools and resources, and how to work more collaboratively with AI models. Pacesa will also be speaking at May’s PEGS Summit in Boston.

Links from this episode:

AI Proteins

UZH - Institute of Pharmacology and Toxicology

University of Zurich

Models & Molecules Podcast

GUEST BIO

Martin Pacesa, Ph.D., Assistant Professor, Pharmacology, University of Zurich
Pacesa’s interest lies in using computational and experimental methods to interrogate the interactions between proteins and nucleic acids and how we can accurately extract the dynamics of such interactions. He did his doctorate in structural biology (cryoEM and crystallography) with Professor Martin Jinek studying the mechanism of CRISPR-Cas9 gene editor activation and off-target tolerance. He then did his postdoctoral research in the lab of Professor Bruno Correia, developing computational methods for the solubilization of membrane proteins and the design of highly specific protein binders.

HOST BIO

Chris Bahl, Ph.D., Founder and CEO, AI Proteins
Chris Bahl, M.S., Ph.D. is the president, CEO, and founder of AI Proteins. Prior to AI Proteins, Chris was a founding faculty member at the Institute for Protein Innovation in Boston, with co-appointments at Boston Children's Hospital and Harvard Medical School. Chris pioneered the ability to computationally design miniproteins de novo as a postdoctoral fellow with Nobel Laureate David Baker at the Institute for Protein Design in Seattle. Chris is dedicated to spreading knowledge in computational protein design through initiatives he founded such as the Boston Protein Design and Modeling Club and the Latin American Workshop on Artificial Intelligence for Protein Design. Chris serves on the executive council of The Protein Society, and the executive board for Rosetta Commons. He is a technical advisor for IMPRINT Labs, and a scientific advisor for Applied Photophysics and BioLoomics.

TRANSCRIPT

Welcome And Introductions

Announcement 0:01

Welcome to The Chain, the podcast exploring the lives, careers, research, and discoveries of protein engineers, scientists, and biotech professionals. We look at the impact their work is having on the field and where the industry is headed. Tune in to stay up to date on the newest advancements and to hear the stories that are impacting the world of biologics.

New Lab Plans And BindCraft 2

Chris Bahl 0:25

Hello and welcome to the Chain Podcast. If you've tuned in, I'm guessing it's because you've tried out BindCraft at some point. It is spectacular. I'm here with the first author of BindCraft, Martin Pacesa. He's a new assistant professor at the University of Zurich. He's also the young scientist keynote at this year's PEGS Boston Conference. I'm your guest host for today, Chris Bahl. I'm the founder and CEO of AI Proteins. All right, so to start us off, Martin, so excited to chat with you. Can you please tell us a little bit about what are the first projects that are going to come out of your new lab?

Martin Pacesa 1:01

Yeah, thanks, Chris. It's great to talk to you about this, and it's great to be here. Well, my lab literally just started; it was formed officially on the 1st of January 2026, so it's an easy day for me to remember and to celebrate. The very first project that we're obviously working on is BindCraft 2, to get people even more excited. So we're doing some finishing touches on that, and there's a lot to be excited about. I think there are some new things coming out that I haven't seen in other binder design papers yet, so there's going to be a lot of cool new features to try. But I'm going to keep it under wraps to keep the suspense up in the months to come. But that's not, I would say, my overall goal. I think my lab is more under the blanket of programmable biology across scales. So going from, at this point, designing protein interactions in a more static way, then going on to programming more dynamic interactions, and then going to bigger scales, so not just proteins alone, but assemblies, ligands, and nucleic acids as my particular passion. And then going all the way, a few years from now or a few decades from now, trying to achieve some level of programmable biology on a cellular scale. That would be my big dream, and why I got into this in the first place.

Chris Bahl 2:19

That's very cool. For folks tuning in who maybe haven't heard of BindCraft, this is a very exciting tool that I think is the first real example of truly democratizing protein design for non-protein designers. And the reception from the community has just been spectacular. So we'd love to learn a little bit more about your motivation for developing BindCraft. Like, what inspired you to start working on this?

Martin Pacesa 2:48

Well, I think there had already been pipelines before that tried to democratize protein design, like ColabDesign from Sergey Ovchinnikov, upon which BindCraft is heavily based. And I think I started playing around with ColabDesign and AlphaFold when it came about during the pandemic. I was at home, just had my computer, and I was like, okay, this is something I want to try out. And this was before I really worked on protein design. And then when I joined Bruno Correia's lab, one of the big things in Bruno's lab was binder design, because at that time it was still a very difficult thing to do. You had to screen thousands of designs to get any semblance of a hit. And I really, really hate yeast display. So I tried to do everything possible not to do yeast display. And this meant basically coming up with a new way to approach design. And we had a bunch of targets that, with the traditional Rosetta techniques, we were not able to get any designs for. And I basically spent about six to nine months doing the background coding on BindCraft, and then we tried it on the first design, which is the classical PD-L1 case. And I think the previous best that we were able to achieve was like a one-in-a-thousand hit rate.

Martin Pacesa 4:03

And then I did the first pull-down just as a proof of principle, and suddenly I got seven out of ten hits, and I was like, there's no way, something went wrong. And then we repeated the experiment a couple of times to realize, oh wow, actually the hit rate's pretty good. So we tried it on a couple of other cases. I started working with Lennart Nickel on this, and he had a bunch of targets that he was not able to design against. And yeah, it turned out to be pretty good. And we never intended this to be a project; we just wanted it as a tool internally to be able to do some cool biology. But yeah, at that point, after we'd already done like 10 targets, we realized, well, I guess we should publish this. It was great. We really put a lot of effort into making sure it's easy to use for non-computationalists, because I'm also originally not a computationalist, and we have a lot of people in our lab who also aren't, but we also wanted them to be able to use it. And I think that's really how it became what it is: this tool that many people could use without any prior experience in protein design.

Chris Bahl 5:02

That's awesome. You know, I think a lot of the attention that BindCraft has gotten is because of the pretty spectacular success rates. And I think you're well on your way to negating the need for yeast display for initial hit discovery, at least for many targets.

How The Pipeline Filters Designs

Martin Pacesa 5:17

Absolutely.

Chris Bahl 5:18

So, you know, the actual algorithm itself kind of pieces together a couple of different algorithms, right? Like you've got AlphaFold2 and ProteinMPNN, both, you know, for backbone generation as well as filtering. Which of these right now do you think is the big bottleneck? And what are some of the take-home lessons from the automated filtering?

Martin Pacesa 5:42

Yeah, I think the reason we made it multi-stage is because oftentimes if you have a generative tool that's trained to do one thing, it will always give you an answer, no matter whether that answer is good or bad. With this kind of multi-stage approach, you have AlphaFold multimer, which has been trained on complexes, hallucinate sequences that presumably have good interfaces because it knows things about interfaces. Then you have MPNN, which improves the solubility of those initial designs, because we found very early on that hallucinated sequences on their own are not very soluble. And then lastly, and this is the main bottleneck, but at the same time the reason why we have such high experimental success rates, is prediction with the AlphaFold monomer model. So AlphaFold monomer meaning a model of AlphaFold that has never seen a complex and was only trained on individual chains, which is a little bit counterintuitive because one would want a model that can predict complexes. But actually, the rationale was that if a monomer model that has never seen a complex is able to predict that binder in complex with its target, then the interface must be so obvious that it's probably a very good interface. So this was what was going through my mind when I was putting this together, and why it was assembled this way. And I think that is the main bottleneck. So we definitely lose a lot of potential binders at that step. But I think the trade-off is that, on the other hand, you're enriching for the most obvious solutions, which is why we do get such high experimental success rates. Going forward, I must admit, I would probably not change the way I did it if I had to retrospectively make different choices. I think that's really the counterintuitive choice that made it so powerful.
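
The staged logic described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not BindCraft's actual code: the function names (`run_pipeline`, `toy_hallucinate`, `toy_redesign`, `toy_repredict`), the `Design` record, and the 0.8 confidence cutoff are all hypothetical, and the three stages are toy stand-ins for multimer hallucination, MPNN redesign, and monomer re-prediction.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Design:
    name: str
    sequence: str
    confidence: float  # hypothetical score from the final re-prediction stage


def run_pipeline(
    seeds: Iterable[int],
    hallucinate: Callable[[int], str],
    redesign: Callable[[str], str],
    repredict: Callable[[str], float],
    threshold: float = 0.8,  # assumed cutoff, not a BindCraft default
) -> List[Design]:
    """Keep only designs that survive every stage; may return an empty list."""
    passing = []
    for seed in seeds:
        initial = hallucinate(seed)   # stage 1: generate a binder sequence
        soluble = redesign(initial)   # stage 2: rewrite it for solubility
        score = repredict(soluble)    # stage 3: independent re-prediction
        if score >= threshold:
            passing.append(Design(f"design_{seed}", soluble, score))
    return passing


# Toy stand-ins so the sketch runs without any structure-prediction model.
def toy_hallucinate(seed: int) -> str:
    return "A" * (5 + seed)


def toy_redesign(sequence: str) -> str:
    return sequence.replace("A", "K", 1)


def toy_repredict(sequence: str) -> float:
    return len(sequence) / 10


if __name__ == "__main__":
    hits = run_pipeline(range(5), toy_hallucinate, toy_redesign, toy_repredict)
    print([d.name for d in hits])
```

The point of the structure is that the final stage acts as a gate rather than a generator, so the pipeline can legitimately return nothing for a given target.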

Learning From Failed Binders

Chris Bahl 7:24

Nice. So, you know, there are many designs that are filtered out and don't make it to the end of the pipeline. And then even while the success rates are very high, there's still a lot of designs that don't work when we go make them. And one of the things I guess that's been hard in the field for so many years is how do you learn from stuff that doesn't work? So just curious, with your protocol, what can we learn from the stuff that doesn't work?

Martin Pacesa 7:52

Yeah, I think about this a lot, especially in the last few months. You know, how can we get over this last hurdle of getting from this, on average, 10% success rate to at least 50%? I don't think we're ever quite going to get to 100% because, you know, failure is relative. It could be the way your experiment is set up, or it's just a bad day for one of the designs. I think the limitation will always be the model, or your oracle that you're scoring with. So I think we've sort of reached the level that AlphaFold can give us. Like, this is the best false positive rate we can get. And I think the only way to move forward is we either make better data sets, like a different training regimen for the PDB structures, or actually start associating experimental data with that. The problem is, where do you get all this binding data in a reproducible manner? This is very tricky. And unless you pour millions of dollars into that, I don't think that is a very solvable problem. So I think the next step would be to find more clever representations of these interfaces or interactions that could help the models discriminate better.
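
To put the 10% and 50% figures in practical terms, a small calculation shows how many designs one would need to test to be reasonably sure of finding at least one binder. This is a sketch under two assumptions that are not from the conversation: each design succeeds independently with the same probability, and "reasonably sure" means a 95% chance. The function names are hypothetical.

```python
import math


def chance_of_at_least_one_binder(hit_rate: float, n_designs: int) -> float:
    """Probability that at least one of n independent designs binds."""
    return 1.0 - (1.0 - hit_rate) ** n_designs


def designs_needed(hit_rate: float, target_chance: float = 0.95) -> int:
    """Smallest number of designs giving at least target_chance of one hit."""
    if not 0.0 < hit_rate <= 1.0:
        raise ValueError("hit_rate must be in (0, 1]")
    if not 0.0 < target_chance < 1.0:
        raise ValueError("target_chance must be in (0, 1)")
    if hit_rate == 1.0:
        return 1  # every design binds, so one is enough
    # Solve 1 - (1 - p)^n >= target for n, then round up.
    return math.ceil(math.log(1.0 - target_chance) / math.log(1.0 - hit_rate))


if __name__ == "__main__":
    for rate in (0.10, 0.50):
        print(rate, designs_needed(rate))
```

Under these assumptions, a 10% success rate calls for 29 designs while a 50% rate calls for 5, which is the kind of reduction in synthesis and testing that the jump would buy.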

What It Takes To Be A Drug

Chris Bahl 9:11

That's fair. I guess the oracle component is interesting, right? And maybe it dovetails into another question. You know, I think a lot of folks are using BindCraft to make research tools, and it's really awesome for that. But there's obviously a big industry that wants to use these protein design tools to make drugs. And a binder in a research tool is a very different bar than making a drug. And the oracles, you know, don't understand anything about physics explicitly, right? There's no temperature or pH or solvent concentration. And the structures that they're trained on, like those in the Protein Data Bank, are also mostly cryo structures, so they're frozen proteins. So they don't necessarily even represent what a protein is going to look like in solution at room temperature or at 37 Celsius. So what do you think of as maybe the key advances that we still need to make to go from "will it bind" to "it's a drug"?

Martin Pacesa 10:15

Absolutely. I think the only reason binder design is, let's say, the most successful part of protein design problems is because the binding problem doesn't really require any knowledge of dynamics, because you just want the protein to bind and be pre-ordered, and that way you prepay an entropy cost, and, you know, it binds a lot better. So I think this is the reason why it works so well compared to something like enzyme design, where you have to know about the transition states, about side-chain conformational heterogeneity, etc. I think a lot of it is solvent, you know. We really neglect the effects of solvent and ions, which are in the end the biggest enemies of your binder binding to its target, because it first has to displace all the solvent. And unless we start accounting for this, I think it will be really difficult to reach, let's say, that one design, one binder paradigm or state of the field. So I would say we need to start incorporating some of that. I see machine learning as a great tool for the time being, but I would love to get to a stage where we don't need machine learning anymore because our physics knowledge, or at least the models, are so good that, you know, machine learning just got us over the hurdle.

Chris Bahl 11:30

So sounds like you're advocating for old-fashioned learning, as people used to call it.

Martin Pacesa 11:35

Oh, yeah, yeah. I guess that's what I'm used to from back in the day.

When Models Resist User Intent

Chris Bahl 11:40

Nice. Well, maybe speaking of that and the algorithms maybe being too smart for their own good, you know, recently you sort of jokingly mentioned that BindCraft can sometimes outsmart users by rejecting weak binding sites that people might have selected and kind of redirecting the binder to a potentially more favorable one. Obviously this might have a heavy dose of bias from multiple sequence alignments, whether explicit or implicit. But I would love to explore that thread a little bit around AI agency versus user intent. As we move towards this more autonomous design, how do we start to work more collaboratively with our models to pick what we do?

Novel Interfaces And Hard Targets

Martin Pacesa 12:23

Yeah, that's a great question, because in the end, the machine learning model only learns what you show it, right? It mostly learns within distribution. It's very rare that such models learn anything out of distribution. And the hallucination approach allows you to go a little bit out of that distribution because you only use the model as a scoring device, not as a true generator. So, the "too smart for its own good": what I mean by that is that if you choose a bad binding site, the loss function in BindCraft is constructed in a way that it sort of pulls you away from that bad site, for two reasons. First of all, sometimes we don't know what a really good binding site is. We have these general rules like, oh, it has to be somewhat hydrophobic and, you know, not too loopy or not too flexible. But in the end, we still use AlphaFold to score. And if we target sites that AlphaFold doesn't like to begin with, then we're never gonna get a solution. So it's a little bit of a self-fulfilling prophecy in that sense, but it seems to work in the wet lab, which in the end is the only thing I consider a real success. So, you know, if you see a paper with only in silico benchmarks, I don't know. I gotta see the wet lab. So far, the wet lab has been pretty good to us with this strategy. And the reason this happens in the first place is because the loss function is a composite, so it's not just binding, but it's also the foldedness and the compactness of the protein, because in the end, these are all properties that you will need to be able to produce that protein. If you can't produce it, you can't test it, and then, well, you have a nice model in silico, but that's as far as it gets you.
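
A toy version of such a composite loss makes the "pulled away from a bad site" behavior concrete. This is a minimal sketch, not BindCraft's loss: the term names, the weights, and the example penalty values are all invented for illustration, with lower values meaning a better design.

```python
from typing import Dict

# Hypothetical weights; the real pipeline's terms and weighting are not
# described in this conversation.
WEIGHTS: Dict[str, float] = {
    "interface": 1.0,    # how poorly the binder contacts the target
    "foldedness": 0.5,   # how poorly the binder is predicted to fold
    "compactness": 0.5,  # how extended or floppy the binder is
}


def composite_loss(terms: Dict[str, float],
                   weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of per-property penalties (lower is better)."""
    return sum(weight * terms[name] for name, weight in weights.items())


# Same binder quality, but the scoring model dislikes the user-chosen site.
at_chosen_site = {"interface": 0.9, "foldedness": 0.2, "compactness": 0.2}
at_other_site = {"interface": 0.3, "foldedness": 0.2, "compactness": 0.2}

if __name__ == "__main__":
    print(composite_loss(at_chosen_site), composite_loss(at_other_site))
```

Because an optimizer only sees the total, a design that drifts toward the better-scoring site is rewarded even though the user asked for the other one, which is the behavior described above.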

Chris Bahl 14:03

So does that mean that the types of interfaces that BindCraft can design have to look something like a natural protein interface? Or do you think it can do stuff that's pretty unlike that? I guess maybe another analogy: thinking about, for example, AlphaGo, the algorithm that learned how to play Go and was beating all the grandmasters, it was coming up with novel tactics that no human had ever developed before. Do you think BindCraft is able to come up with novel tactics for creating interfaces, or is it really just using the same playbook that it's observed from nature?

Martin Pacesa 14:39

It certainly learns some tricks, it certainly learns the rules of the game, but we have so far yet to see it fully recover a natural interface. So it's probably doing some sort of Frankenstein thing under the hood. But because these interfaces are all so different, it's really hard to detect any patterns. The only patterns you really see are if you go down to peptide binders, where sometimes you see anchor residues that have been previously shown. But on the other hand, so far we've had very few proteins that we were not able to bind, even proteins that don't have any structures in the PDB at all or don't have any traditional binding sites. My favorite example is the allergens from our paper, because those are sort of the prototypical non-bindable proteins. They're supercharged, they don't really have any known natural binding partners as far as we know, they're really terrible targets, and yet we were able to make pretty efficient binders against those.

Chris Bahl 15:35

I mean, that's super impressive. Maybe pulling the thread a bit on the stuff that didn't work: is there any insight that can be provided there on the nature of these interfaces that don't seem to be bindable?

Martin Pacesa 15:48

I wish I had a good answer for that. Not really; there doesn't seem to be a pattern. Okay, obviously, sites that are very loopy and floppy are a bit more difficult to bind than others, but in terms of, for example, how charged they are or how hydrophobic they are, not necessarily. The one saving grace, I would say, is that every time we were able to get passing designs, we also got binders in the wet lab, but we had cases where we were not able to even get passing designs. So we had some targets where, no matter how long we ran the pipeline, we didn't get any passing designs. And this is kind of what I like about it: it's not going to always give you an answer just because you want it, and then you go into the wet lab and you spend two months trying to test them and nothing comes out. You sort of have this pre-filter step there, so you know, okay, if I'm not getting anything, is it even worth going into the wet lab with some of these designs? This is why we've stuck with it for so long, even though there are a lot cooler models out there right now. This sort of self-check is something that's missing for me in those. And I think this is particularly why wet lab biologists are so drawn to it: they know if they get designs, it's worth testing them. But if they don't get anything, well, then maybe it's not even worth trying the others.

Chris Bahl 17:03

So do you think there's something to be learned about the druggability of a specific site if BindCraft can't come up with a nice solution for it?

Martin Pacesa 17:12

I don't think so. I wouldn't go that far. I think it certainly has its biases and it certainly prefers very well-defined binding sites, such as on cellular receptors. And this bias, of course, comes from the AlphaFold training. But I don't think personally that there is such a thing as an undruggable site. You're either just using the wrong modality or just looking at it wrong. I think every protein site is druggable. So I think there's a binder for every target out there. That's very bold.

Announcement 17:41

If you're enjoying our conversation today, we recommend a podcast called Models and Molecules, produced by Enpicom. They host candid discussions with former R&D leaders, computational biologists, and AI innovators about the ideas, challenges, and technologies redefining how drugs are discovered. Visit enpicom.com slash models-molecules, or see the link in the description. At The Chain, we'd love to hear from you. Please subscribe to the podcast and give us a rating. It helps other people find and join the conversation. If you've got speaker or topic ideas, we'd love to hear those too. You can send them in a podcast review.

Druggability Versus Specificity In Practice

Chris Bahl 18:24

I think we could have a whole conversation about the physics of specific binding sites. Personally, I'm not convinced that any surface can be bound. Also, druggable is a very charged word, because if it's quote unquote druggable, it's not just, I mean, I can make a binder to any surface right now off the top of my head. Right. So that's not a drug, right? Even though it's a binder.

Martin Pacesa 18:53

So no, it's very different, the function and the binding. And yeah, the binders are mostly for research use; the therapeutics and drugs are for therapeutic use. And there one also has to know the biology, not just the binding site. And here, I guess, circling back to your previous question, BindCraft could be helpful because it does go for the co-evolutionarily conserved binding sites, preferably. And we do see it: if you go to receptors, almost every receptor we can get a good binder against. Whereas if you go for some more soluble proteins, that's where it becomes a little more difficult. So maybe it does say a little bit about that. But yeah, in terms of specificity, that's another question. That's a question I would love to answer in more depth. We've of course tested the specificity of some of these. We barely see any off-targets. We always try to test against the most similar protein we have on our hands to make sure it's as relevant as possible. Of course, it's not always possible. I think it would be really interesting to see what the specificity is on a genome-wide level, depending on, of course, how much you express the protein. But I think this will really depend on your choice of target site, because the optimization is to make sure that your binder's interface is as geometrically and chemically compatible with your target as possible. But if you choose a site, of course, that's very common. Well, yeah.

Binding Kinetics And Antibody Challenges

Chris Bahl 20:14

That's a really good point. And maybe it brings me back to something you said a few minutes ago about the rigidity of the proteins that you design and the entropy of binding. Like, it's pre-organized in the binding-competent conformation. I mean, all the proteins that we design, when we look at the binding kinetics, we find that, kind of regardless of what the equilibrium binding affinity is, our association rate constants, you know, how quick does it bind, are really fast, always, because the structures that we design are super rigid and pre-organized in the binding-competent conformation. It's not an ensemble that's sampling a bunch of diverse states. But this is really, I would say, super different than what you see with antibodies, right? Where you've got these big loops, they're pretty flexible, and you can have much slower association kinetics with traditional antibodies. And so I think these designed proteins, because they're so much more rigid than an antibody, are gonna bind really differently than a typical antibody. And just thinking a little bit about it: is that good? Is it bad? What are the pros and cons of each of those? And maybe along those lines, how far off are we from being able to design antibody CDRs with BindCraft-like ease? Because I think that's kind of the, you know, trillion dollar question.

Martin Pacesa 21:40

I do agree with you. I think it's both a good thing and a problem, the fact that these interfaces are preorganized, because there is not really any induced fit happening. So usually what we see is very quick on rates, but we also see very quick off rates most of the time. I would say it's quite rare that we actually see very slow off rates with these types of binders, probably because that interface is so rigid and it's also more easily displaced by solvent because of that. And a level of induced fit would be a good thing. And I think we are working our way towards that. In fact, these hallucination approaches are actually really good for that because they're easily adaptable, unlike, for example, some diffusion approaches where you always have to train or fine-tune a model to be able to get a new functionality. And, you know, you then have to do the whole benchmarking loop all over again. Whereas with hallucination, and this is why I'm a big fanboy of it, you have a pre-trained model, you know what it can do. The only limit is how well it can predict things, but then you can always just slap a new loss function on top of it every time you want something new, like an antibody. So we are definitely experimenting with that. Not gonna lie, of course, antibodies are super difficult, and any model that relies on co-evolution is an enemy of antibodies, which really don't have any, at least not in the areas where it matters. So I think antibody design is gonna hit a ceiling for a while until we figure out a better way to predict. And I think here we might have to go old school at some point, with either some docking or some induced-fit level of docking, to actually be able to score these things much better.

Democratizing Access To Compute

Chris Bahl 23:17

So I guess the antibody design problem is maybe a good point to shift gears and talk a little bit about democratization. So, you know, BindCraft has enabled a lot of folks to do protein design who didn't train in a protein design lab, which is really, I think, how our field has historically worked. If you didn't train with a protein designer, it was very unlikely that you would ever be successful at doing protein design. So with the democratization of the techniques, there's also some discussion now about access to hardware, right? I know there's lots of folks who are brilliant, doing really insightful research in resource-strapped labs around the world, who want to use these tools but maybe don't have access to tons of high-end GPUs to run this sort of stuff. And I'm just wondering how you think about increasing the accessibility of these tools.

Martin Pacesa 24:17

Yeah, absolutely. Our first version of BindCraft was certainly not easy on the GPUs. That's something we're actively trying to address because, well, it's also costing us a lot of money to run these designs. And we would really like to make it more accessible so that you could run it on a regular home PC. I wouldn't say it's super limiting, because you can still run designs on regular gaming GPUs, so I can run it on my home gaming computer. I mean, of course, you're limited with the size, but generally, since you trim your target, you can usually fit very well into most of these GPU sizes. It still runs for pretty long, but if we do comparative head-to-head benchmarks, for example with classical, more established tools like RFdiffusion, it sort of comes out to the same generation time if you count the whole filtering, the entire pipeline, to get a decent number of binders. So even though each individual step is maybe faster, you have to do a lot more of it to actually get some decent designs. So I don't think that the computation time is any more limiting than with other tools right now. We're trying hard to make it better. And I think it's sort of becoming inevitable that every lab needs to have some sort of access to a cluster or GPU resource if you want to do this. It can be expensive, but it still is very small compared to what the wet lab experimentation costs. So I'll be honest, I'm willing to spend a few hundred extra bucks to just run a few more designs, maybe with better filtering, rather than spend two months in the wet lab and money on synthetic genes. So I hope it gets better to the point where we can really improve not just the generation time but the filtering, to a point where I just order one gene and get one binder. This is the dream I want to achieve: that I know for a fact that this binder will bind.

Wet Lab Data Beats Paper Benchmarks

Chris Bahl 26:14

You mentioned the importance of wet lab a few times, and I completely agree. Do you have ideas for a kind of BindCraft equivalent in the wet lab, or is it really using BindCraft to slowly chip away at the domain of the wet lab?

Martin Pacesa 26:30

So now in my new lab, we're trying to set up more automated assays, let's say, so something where you can test more things in parallel, not because I want to go into higher-throughput regimes, but because it allows you to screen binders against more targets. So basically just increase the throughput of the tools we can develop rather than having to screen more, if that makes sense. So going back to yeast display? Oh god, no, no, no, no, no, no, no. Just classical in vitro. For me, working with purified proteins is the cleanest way because there's no noise. The only thing that you're really looking at is the interaction between two proteins. Any sort of high-throughput or in-cell assay is always gonna have some level of noise. And if we want to use this sort of data to improve the models, I think in these lower-data regimes you just cannot afford any noise, or only minimal levels of noise. So we're never gonna collect millions of data points this way, but at least the data points we get are really reliable and can be used to guide the models further. So I would say wet lab is absolutely essential. And I see a lot of papers and a lot of preprints these days showing this benchmark: whoa, we're two percent better than the previous state of the art. And I'm like, yeah, is that relevant? I mean, we can always make pipelines faster and churn out more designs, but if they don't work, then what's the point?

Open Source Models Need More Structures

Chris Bahl 27:57

Yeah, I agree. What's the line? "In God we trust; all others must bring data." I mean, I'm an atheist, but I still really enjoy that phrase. Yeah, you need to see the data in order for it to be believable. But as far as, you know, all the folks that are publishing things that are incrementally better that may or may not be validated, there's also the phrase "baby steps towards greatness." And you know, if a 2% improvement is real, I think that's still valuable. You know, I have some concerns around people that are working in the opposite direction that you are, Martin, where, rather than open sourcing and democratizing access, it's closed source. And I think nobody is really that threatened by a closed source model that's 2% better. My concern is when you stack 2% better over the course of a few years, and if the open source community is not keeping up, then all of a sudden the closed source model, if you believe it, might be twice as good, three times as good as what's free. And I think that's where we start to potentially be in danger of having a few folks control protein design. That's not necessarily good for global science. So how do you think about ensuring that these tools stay freely available?

Martin Pacesa 29:18

I think I'm a little more optimistic there. I don't think the models, even the closed source ones, are significantly better or can be significantly better than the ones that are open source. In the end, it's really going to come down to the data that you use. I mean, unless you're a huge company that is able to generate its own data sets; okay, then you have an advantage. And there I believe you can probably make better models. But since most of the models that we work with today, or at least the very successful ones experimentally, rely on protein structures that have been solved and deposited in the PDB, I really doubt there's gonna be a closed source company that can outcompete what's out in the PDB. And I think the only real way forward, and this is my message to the young people, is to keep solving more structures. Structural biology is not dead. We still need more and more interesting structures, particularly the things that AlphaFold cannot predict. So as I said, I'm a little more optimistic. I'm not against closed source per se, you know, capitalism and all that, but I don't see closed source overtaking open source anytime soon.

Advice For Investors Betting On AI

Chris Bahl 30:23

And that's comforting to hear. And maybe, you know, speaking now directly to any investors that might be listening, there are several VC funds that have placed, you know, triple-digit million dollar bets on companies whose value proposition is creating a closed source algorithm that smokes anything in the public domain. So I guess, what advice would you have for those people in the investment community who are wanting to place bets in this area?

Martin Pacesa 30:48

Yeah, I would say models are temporary; data and biologics are the things that are gonna make a difference in the long run. I think in the end, it doesn't matter how good your model is if you have no idea what to do with it. I'm a bit, you know, pragmatic about this, and don't get me wrong, I love models and I have one foot in some startups, but in the end, I want to see the therapeutic potential of your model. If you're just gonna keep telling me, well, I'm 10% better than the competitor, I don't care. Your competitor has cured cancer. So yeah, I think it really will come down to what these companies are gonna do with their models. Just partnerships, I feel, are not enough these days because, you know, big pharma can change their mind from one day to another based on where the financial winds blow. So these companies will eventually have to rely on their own IP generation. And this is where we're gonna see who makes it out alive in a couple of years.

Chris Bahl 31:50

Yeah, for sure. So maybe shifting into you know safer waters for the next question. Gotta be a little controversial here.

Martin Pacesa 31:59

Oh, yeah. I mean, I'm never getting any investor interested in me again, but thanks.

Protein Design As Targeted Biology Tools

Chris Bahl 32:05

No, no, I mean, I think that's very wise advice. But okay, so, you know, thinking about academics doing protein design, it really used to be that only protein designers could use protein design. And, you know, if you were a protein designer, you collaborated super broadly with lots of folks with biological questions. Do you think that is still going to be the role of the protein designer going forward? And how do you see that evolving?

Martin Pacesa 32:31

I personally hope so. Right now we're working on making models, but I think the reason we're doing this is to do cool biology. And it's something that I would prefer to do at some point. I don't want to just be a guy who develops models. I want to do something with them. Right now our struggle is finding good and interesting targets. But on the other hand, you know, it's opening up so many worlds. If you think about it, for example, the way biological pathways used to be probed was with genetic knockouts, which would take super long. And if you knock out a protein, you knock out all of its auxiliary functions and, you know, the whole cell goes haywire. But now you could just introduce a binder that only, let's say, knocks out a particular function of that protein in a very controlled manner. And this is sort of a new type of tool that we didn't really have before, because with antibodies, well, you have to fix the cells for them to go inside, but with these mini-binders or even peptides, you can now really try to do cool things in a cell. You can see what's happening to the signaling. Can I bias the signaling towards a particular way, and in a way develop new types of model organisms, or even model cell lines, where it's no longer mimicking a mutation, but you could even mimic a particular phenotype with the binder? I find it's really a way of biasing, let's say, biology towards a particular way, and this is very exciting. This is an avenue I would very much like to explore once we're at the point where, you know, we don't have to screen too many binders to get there.

Student Advice And Protein Literacy

Chris Bahl 34:07

It's a very exciting new era. I think the success of our field has also attracted a lot of students who want to learn protein design and study in protein design labs. What advice do you have for those students who are interested in protein design?

Martin Pacesa 34:23

Oh yeah, I'm flooded with emails from people wanting to join the lab. And it's really exciting to see people use the computational tools. I think again it's about the accessibility. Because let's face it, you know, people could have used Rosetta back in the day, but you would have had to spend a long time to even learn how to use these tools, whereas now it's really almost the click of a button. I think for the students, certainly don't just go for the computational part and neglect the wet lab. Try to learn the wet lab part as well. And I would say for the last few years, the most successful projects I've seen around me were the ones where people had, you know, their feet in both waters, both the computational and the wet lab part, because you sort of have a good sense of what you should be designing for and how. So definitely try out both. I think this would be my advice. Like, don't just say, "Oh, I'm just going for the computational part," or just the wet lab part. Definitely try both. It's harder. You know, you will have to learn some new things, but the reward will be much greater.

Chris Bahl 35:28

So I was chatting with a friend of mine recently, Nick Polizzi, and he was bemoaning the loss of protein literacy, as he was calling it. And maybe another way of saying it is, you know, we grew up in an era when, if you designed proteins, you looked at them with your eyeballs and thought, hey, does this look like a protein? I've, you know, spent countless hours training my own, you know, moist neural network on what protein structures look like. And, you know, does this pass my own brain test here? And we saw this paper from NVIDIA come out recently with their new binding model. And eyeballing some of those things, I would have never ordered them because they don't look like folded proteins. And you know, they put that in a manuscript, clearly showing that they don't know what proteins look like. There's no protein literacy there, it's just computational metrics. So I think that's maybe an important thing to highlight for students: protein literacy is not yet obsolete.

Martin Pacesa 36:25

Oh, absolutely. I think in the end, what was also helpful for me is that I came from a structural biology background. So I was building those structures before I started designing them. And before AlphaFold, you literally had to build those structures, you know, residue by residue into the electron density. And then, because you spend hours looking at the really local interactions, you sort of get a sense of what goes with what and how these things should look. And I must admit, sometimes the BindCraft designs challenge what I've known, like challenge my own protein literacy. We had this one particular design against this designed beta barrel. It's in one of the main figures. And when I looked at it, I was like, duh, this is really wonky. Like, there are these disconnected beta sheets; this doesn't look like a real protein. And I showed it to Bruno at the time, and he was like, "Yeah, man, no way this is gonna work." And lo and behold, it was one of the best binders we ever got. And at that moment, I started thinking, like, wow, maybe these things have extracted principles that I sort of had no chance to learn, and maybe I have to trust them a little bit. But again, don't just blindly trust metrics; actually look at those structures. At least, you know, the most obvious errors are gonna pop up there.

Chris Bahl 37:39

That interesting-looking protein was a good binder, but what was the thermostability like? Was it folded well?

Martin Pacesa 37:45

Oh, yeah, yeah. It was thermostable to more than 90 degrees; we could do whatever with it.

Chris Bahl 37:50

That's super cool.

Martin Pacesa 37:51

Yeah, it honestly blows my mind sometimes when I think about it. Because again, I'm used to the extreme cases. I had a project in my bachelor's where I purified a yeast complex that was stable for roughly 15 minutes on ice. After that, it was dead. And now, if you compare it to, you know, these little things that you could literally dehydrate and rehydrate, and they're perfectly fine. It's almost crazy. Nice.

Closing And Upcoming Conference Talk

Chris Bahl 38:17

Cool. Well, Martin, it's been super fun chatting today. Thanks so much for taking the time. And I can't wait to see your talk here in just a few weeks at PEGS Boston.

Martin Pacesa 38:27

Thanks. I'm very excited. Thanks for having me, and for all the great and thoughtful questions.