Welcome And Panel Introductions
Announcement
Welcome to The Chain, the podcast covering the lives, careers, research, and discoveries of protein engineers, scientists, and biotech professionals. In today's special episode, we hear a panel discussion on near-term challenges for ML and AI in biotherapeutic R&D, recorded live at the PEGS Summit in May in Boston. Peter Tessier, Albert M. Mattocks Professor of Pharmaceutical Sciences and Chemical Engineering at the University of Michigan, moderates the conversation. Here he is, introducing the panel.
Peter Tessier
So before we get started, I just want to introduce this distinguished panel. First, Melody Shahsavarian, who's Senior Director of Data Strategy and Digital Transformation at Eli Lilly. Her group is involved in helping make data and models available and usable for both wet lab and dry lab scientists across biologics discovery and optimization. Andrew Martin is emeritus professor of bioinformatics and computational biology at University College London. Of course, a lot of us know him as a co-founder of abYsis; I call it "Absis" or "Abesis". He can tell us which way it should be said. He's worked in this field for a very long time, roughly 40 years, and was involved in very early applications of AI in the area of antibody structure prediction and so on.
Peter Tessier
Andrew Buchanan is senior vice president, head of discovery in a stealth-mode biotech developing new antibody molecules for clinical development. He's worked for 25 years in this area of antibody discovery and early development. And he's actually been working on AI and ML for over 10 years now, with investigators in both industry and academia.
Peter Tessier
Norbert Furtmann is head of Biologics AI and Design in Large Molecule Research at Sanofi. His work is closely aligned with computational design and optimization of protein therapeutics, including VHHs and multi-specifics. Bernhardt Trout is a professor at MIT and has extensive work in the area of molecular engineering, protein therapeutics, formulation, and machine learning. A lot of us know his group for contributing many ML methods for predicting antibody properties like aggregation, viscosity, and so on.
Peter Tessier
Konrad, I'm gonna say his name wrong, so let me take a minute. Krawczyk is founder and CEO of NaturalAntibody, a company focused on AI/ML-driven biologics engineering. He's also assistant professor at the University of Southern Denmark, and he's recognized as an expert in the area of computational antibody design. Okay, so let's get started. And again, my request to the panel is let's make this practical. Let's try to make our thoughts something where the audience can leave this panel smarter and better prepared, ready on Monday morning to start thinking about how to change how they're using AI and ML.
Where AI Works And Where It Fails
Peter Tessier
Okay, so opening question. And before I ask it, please, let's make this specific. I'd like to avoid a lot of theoretical thoughts and try to make this as specific as possible. What's one biologics R&D problem where AI/ML is already generally useful today? You're using it, it's useful, it's changed the way you do your work or the work of the colleagues you're involved with. And what's one area where we're just not there yet? We could say it's hype, we could say it's just not ready, however you want to say that. So, what is a practical example of where it's working, and where is it not working yet? Who would like to start? Konrad?
Konrad Krawczyk
Right, it works. So I would definitely say that for lead optimization, for instance, that's where it can actually work. Because even if you have a few data points, I mean tens, maybe hundreds, then you can actually iterate to make your biologic better. So that's where it does work. One area where we know that it is, let's say, if I put it diplomatically, challenging, is mostly with clinical stuff. Because we do look at data from clinical trials, we look at the readouts from the different cohorts, and it is extremely difficult to forecast how a biologic would fare in clinical trials. So that's where I would say that it's very challenging.
Peter Tessier
Okay, and can you be a little more specific on the clinical part? What are we talking about? Immunogenicity, efficacy, safety, anything in particular that comes to mind?
Konrad Krawczyk
Yeah, so initially we focused on immunogenicity. And the challenge was of course just data collection. Because even if you do have different data points for different cohorts, it is not a given that the data are going to be comparable. And if you want to train a model, then of course you want a uniform, essentially comparable, distribution between them. So that was the first point. But then also with any other readouts, like you mentioned, they are extremely dependent on the metadata. So we find that it might be easier to look at the metadata of the indication that you're targeting to actually forecast where the issues might lie, not at the actual sequence, which is quite, you know, controversial.
Speaker
Okay, Norbert?
Norbert Furtmann
So maybe I start with where we still see challenges, and I'm very much aligned with what Konrad just said. But maybe to go even a bit further in the workflow, predicting function is, I think, incredibly tricky. Many of the tools and programs we are using are directed at affinity prediction, which by itself I think is a holy grail problem, to have generalizable affinity prediction. But in the end, what we are aiming for is not only binding, it's interference with a specific pathway. And many things can be explained by structure, but not everything. I think specifically prediction of functional readouts is something where there is lots of room to improve further. And going to things which are working well, I think it's not all solved, but in the domain of developability predictions there was lots of progress.
Norbert Furtmann
And I think there are quite some models which can help, based on pre-trained data sets on stability or aggregation, to filter molecules, and where AI can provide value to replace wet lab experiments by predictions, or at least to filter so that you have to test fewer molecules. Another part where AI can help, I think, is diversification: to get to more diverse panels in terms of, I don't know, addressing different epitopes in early selections, or to make sure that you have a broader distribution of general biophysical properties of your molecules. I think AI or computational tools can do a good job there and provide an additional perspective over what we can get from the wet lab readouts.
Peter Tessier
Could I push you a little bit on this? You mentioned developability, and that's a place where you're seeing impact and it's working. How well is it working? Are you making actual decisions based on these models? Is it changing what you're advancing? How confident are you? Could we push on that a little?
Norbert Furtmann
I think it's hard to give a general answer because it depends on which property. We might have properties which are better understood and where we have more data, where we have data sets of higher quality. So, for example, let's say thermostability. I think we are quite good already at selecting or filtering molecules based on predicted thermostability. What we also presented here at the conference is a model for colloidal stability prediction of multi-specifics, a rather simple model based on surface patch information, where we can get quite a solid correlation between the predicted and the experimental colloidal stability. And specifically in the space of multi-specifics, where the design spaces are huge, if we have solid models, it can simply help to explore the huge design spaces and to limit and de-risk the panels of molecules we then actually put into experimental testing.
But there are also other properties where we failed to build models to predict them. I think it depends on the quality of the data, on the diversity, and on how well the property is understood. Can we get the right molecular descriptors, whether sequence/PLM-derived or structure-derived, to get that correlation? But compared to properties like affinity, where you have that antigen layer, I feel confident that with improvements in high-throughput technologies and in the quality, size, and diversity of the data, the advancements we see will continue, while other areas like function and affinity prediction are much more challenging.
Peter Tessier
I think one practical implication here is: is the field going to get to a place where we stop measuring certain properties because we're confident enough in predicting them? Or is the field gonna get to a place where they're doing many fewer experiments because the confidence in the models is high enough to afford that? Could you comment on that?
Norbert Furtmann
I mean, I don't see us fully replacing the experimental testing with predictions as of today or in the near future. We are very much convinced that you need a tight integration between wet lab and in silico, in iterative loops, to learn from each other. But where we see benefit is de-risking, so that we can select, based on the predictions, molecules to go into experimental testing where we do not see a surprise afterwards, like we have a good binder, but in the end stability fails, aggregation fails. So I think we can move ahead more high-quality molecules than in the past, but we still need to test and confirm. And I think that's very important, because we see progress in data sets, but we need to keep testing and funnel that information back into the models. Okay.
Peter Tessier
Melody, do you have any thoughts on that or, more broadly, what's working and where is the biggest gap right now?
Melody Shahsavarian
Yeah, I think I agree with Norbert. What has proven to work is this triaging step, where we can implement AI both for predicting the attributes that we're looking for in the molecules and for getting better diversity based on structure prediction. So to your question of, you know, do we have proof of this working? I'll mention that we actually did this study. It's a few years before my time that this AI-enabled, sort of intelligent, triaging workflow was established in the discovery organization at Lilly. And they did a study to look at molecules that get to development and their success rate, and the number of surprises or liabilities that development gets has been significantly reduced during this time. So there is definitely proof that it's impactful.
Peter Tessier
Could I ask you, when you talk about triaging, what are you considering? Is it function? Is it affinity? Is it developability? What does that triage look like?
Melody Shahsavarian
So it's developability, yeah. Developability, but also diversity, trying to get larger diversity in both sequence space and epitope space.
Peter Tessier
Okay. Absolutely. Yeah, and where do you see the opportunities, or what's missing?
Melody Shahsavarian
Yeah, the opportunity, I think we were talking about this before the panel started. I really would like us to be able to apply these sorts of methods to more complex molecules.
Peter Tessier
By complex, do you mean multi-specifics? Do you mean non-antibodies?
Melody Shahsavarian
It could be multi-specifics, perhaps conjugates, you know, ADCs, ARCs.
Peter Tessier
Okay, thank you. Andrew, any thoughts on this, on where you see things working or where the opportunities are? And by the way, I love your socks. He's got antibody socks on, okay? So very appropriate. Thank you.
Andrew Buchanan
Good thought. In essence, the technology is brilliant, but the limitations are in trying to apply it: it will only work if you have the appropriate data. Where teams tend to have the appropriate data is in developability, because it's a generic property of the Ig fold. You could be really challenging and say, are these models really learning anything? They're just neighborhood, anyway. But it is really helpful in the triage because you can design hundreds, thousands, tens of thousands. You're not limited in what you can design computationally, but they are really useful for triaging down. There are really good examples of classifier models. That's real-world useful, but even that is only for some aspects of developability. There are other important aspects. Even in that, how well do we understand how well a CHO cell will make it? You might do cell-free expression, you might do HEK expression, but that's not CHO expression.
Andrew Martin
And in research, we will do transient CHO. That's not a stable CHO cell line. These are data gaps that are really worth exploring, but it's also about getting your institutes to buy into doing that, to generate essentially machine-ready data. I think, Norbert, you talked about it a bit in your talk: actually committing to generate this data. There are papers about it, but those papers hardly ever publish all the data that you really need to implement it. So even between our institutes. But those are some of the challenges in that one aspect.
The other aspect, about structure, is: has anything really changed in the amount of structural data that's available in the public domain? And we need to get beyond the static view, because we essentially take some aspect of an antibody's biology or pharmacology out of the system. We study it by structure, by sequence, by function, and then we try and insert it back into the context that it all exists in. And that's the kind of pinch of reality. Part of that in the structure space is dynamics and finding ways to get beyond static views.
How Reliable Developability Predictions Are
Peter Tessier
Maybe just to push back for a minute. It seems like there's some advocacy for, you know, we're making progress in the developability space. For you or anyone else on the panel, what would be the hardest developability property, or what are the hard ones right now? Even though the field's making progress, what's still the hardest for us? Do you have a thought?
Andrew Buchanan
I have a thought. Most of the time in the industry, we run platform processes, and those processes work really effectively, both in discovery and in development. But when you enter the world of multi-specifics, although we're getting good at it, we don't have such a track record. And I think a simple aspect is understanding expression in stable CHOs, because stable CHOs are metabolically stressed in a way that our transient systems aren't. I think that's a fascinating area to explore. Yep.
Peter Tessier
Okay. Bernhardt, can we take it to you, especially as a developability expert? What's your perspective? You've been working in this field a long time. Where do you see the field? Where do you see success, and what's missing?
Bernhardt Trout
Sure. So it's good to hear that they're used for screening or triage, whatnot. I think these methods started being used in the 2010s and have been propagating more and more. Since you said to be practical, I'll say a few practical things at the end regarding that. And again, by developability, we mean aggregation, sorting, triage, viscosity, and various chemical liabilities, deamidation, oxidation, and whatnot. What's missing is: we can do, as you said, triage with maybe 80% accuracy, as a rough estimate. In our lab, we keep working to get to 90% with more complicated models. But I don't think that's generally the case. And probably to do more than just sorting or triage, we need to get above 90%. So I think that's challenge number one.
And then as far as the next step, I would say formulation design. We started doing some of that, and I think that would be something well suited to machine learning, because you actually generate a lot of data when you do formulation work. So I think that could be an exciting new area, which has already begun to be done.
As far as the practicality, I'll just mention two things. One is that these sorts of methods are generally available in a lot of off-the-shelf software. We work with BIOVIA Pipeline Pilot, but I think most of the major companies have this. So there's not a huge investment, because your companies probably already have the software.
Peter Tessier
Could you repeat the name of that again? It's...
Bernhardt Trout
Oh, Pipeline Pilot is the one that we use, but there are many, many others.
Andrew Buchanan
Even Amazon, the everything bookstore, can now deliver you said software?
Bernhardt Trout
Yeah, you can get it from Amazon. Yeah, exactly. The biggest challenge that I've seen, and I've worked with companies in collaboration for over 20 years, is having made a decision, like MedImmune did many years ago, that they're gonna commit to this, and actually committing resources, that is, resources of people. I've been involved with so many companies that said, we want to implement developability, this person's gonna do it, and then a month later, two months later, they're called away on another high-priority project, and then it kind of fizzles away.
Structure, Epitopes, And Affinity Reality
Bernhardt Trout
Okay.
Peter Tessier
Thank you. Andrew, do you have thoughts on this, on where things are working and where the opportunities are?
Andrew Martin
Sure. So not in the antibody space, but in the general protein space, I think we really have now come to the stage of solving the protein folding problem with AlphaFold-type software. It really is working very well, but it still doesn't work so well for the CDRs, which is the bit that we're interested in in terms of protein structure prediction, and particularly for CDR H3. And despite this idea that a lot of people have that antibody CDRs are very flexible, this is really not generally true. We published a paper on that a year or two ago. So, yes, dynamics is important, but it's probably not as important in terms of the structure of the CDRs as people tend to think. There are some exceptions, of course, there are always exceptions in biology, and some CDR H3s are more flexible, but in general they're not. So that is still a problem and an opportunity in modelling antibody structure, which is something I've worked on for 40 years. So that's one area.
Another area: well, everybody's spoken about developability, and that's something that we've worked on recently as well, and which I'll be talking about in my talk tomorrow. But also related to that, a big problem is if you have a panel of antibodies that you've raised against a particular protein and you want to know the ones that actually bind to the epitope that you're interested in. So predicting epitopes is a big problem. There was an independent assessment done a few years ago which really showed that none of the methods worked. We like to think we have now got slightly beyond that using fairly novel AI approaches. It's still not brilliant, but it's a lot better, particularly when you know the antibody that you're thinking about using.
And taking that a step further, can you then rank antibodies on affinity? You know, affinity measurements are difficult to keep consistent across different assays and so on. That's why we decided not to take a regression approach, where you're actually trying to predict numbers, but to predict ranking. And we have had some success with that. So I think there are a number of areas where things are improving at the moment.
Can I make a comment on this affinity question?
Peter Tessier
I think a short one, please.
Andrew Buchanan
Yeah, going to ranking is good, but I think this is where, rather than just staying in the AI/ML bubble, it's actually about integrating with physics. And I don't necessarily agree with your comments about dynamics. Molecule flexibility is key. We're so used to static views; yes, CDRs might not look flexible, but we only look at them when they're solved in a structure and we can see them. There was some brilliant work that Charlotte Deane's group have done, repeatedly trying to get us to look beyond the static. What its impact will be, I don't know. But affinity prediction, I think, is another problem really worth working on, and this is where physics will help, particularly the kind of FEP approaches that chemistry pioneered; these methods are now moving into antibodies.
Data Becomes The Real Bottleneck
Peter Tessier
I want to transition here to the topic of data. In our meeting, when we talked before this, I actually started to get worried this whole panel would be just about data, because everybody here has very strong feelings about data. I think a lot of us understand that. If we accept that data, not the models, model architecture, or computational methods, is now the bottleneck for AI in biologics R&D, the question here is, what do we do about it? What specific data is most urgently missing? Who's responsible for generating it? Is it a good idea that everybody's generating it internally and keeping it? Should there be sharing? How do we curate it? So, what I'd like to do is go through this; I think everybody has their opinions about data here. And I'd ask for non-overlapping opinions. So if somebody makes a comment about something, then okay, let's move on. Because you guys had very strong feelings about this in many different aspects, right? Everything from how the data is curated and collected to SOPs, sharing, and so on. So, Konrad, I'm gonna start with you. When this question of data comes up, what's your first thought about what is needed next? If we're providing practical, actionable advice to people here, guidance for the future, where would you start?
Konrad Krawczyk
So essentially, I think it's not a secret that we are collecting a lot of data. We started with public data sets, just cleaning them up.
Peter Tessier
And what kind of data? Just be a little specific. What are you collecting? Everything?
Konrad Krawczyk
Okay, yeah, that's everything: essentially, you know, structures, literature, and patents. And the reason for that was that we want to see where the gaps are. And quite clearly, one very big gap is developability data. It's not that there are only a few data points out there, but they are at very different conditions. So what I would say is, essentially, walk before you can run. It's great to focus on a lot of formats, it's great to focus on different conditions. But if you just make a foundational data set for developability, at just one condition, and we can predict it very, very well, then this could be transferable to other problems. Because this is something that we have seen. If you train your model on a set of developability properties at a single condition, and then you apply it to smaller data sets that might have been measured at slightly different conditions, then there is predictive power where it wasn't there before. So, walk before you can run.
Peter Tessier
Okay. Norbert?
Norbert Furtmann
Yeah. I mean, I agree. I see the same, that we need to understand the data and get to an understanding of the readout which shows us that we predict something meaningful in the end. But to move away from what Konrad has been saying, I think we see a gap between utilizing legacy data and the kind of data that is really the most informative to train AI models. So I think it's definitely worth investing in curation of legacy data. But on top of that, it's worth investing in data generation specifically to feed AI systems, outside, let's say, of the context of portfolio programs. In our portfolio programs, the goal is to be as fast as possible, to walk up the hill, to get to the molecules which ultimately check all of the different boxes in terms of function, CMC, developability, whatever. But in the end, to get to good AI models, you need to have diversity in the data sets. You need to have a good balance between well-behaving and not well-behaving molecules. And getting there, it really helps to change the mindset from how do we generate data within programs to how do we generate data which could have the best benefit, or the most information, to improve our models. I think this is a balance and a mindset change which would be very beneficial to advance how we are using data and how data could contribute to increasing the performance of our models.
Announcement
Are you enjoying the conversation? We'd love to hear from you. Please subscribe to the podcast and give us a rating. It helps other people find and join the conversation. If you've got speaker or topic ideas, we'd love to hear those too. You can send them in a podcast review.
Legacy Data, Metadata, And Standards
Peter Tessier
You know, one quick question on this: are we making a mistake in still using legacy data? Is that interfering with progress, as the world advances and methods get better at generating data? Do you see cases where we should not be using legacy data, or where it's biasing us? Does anyone have an opinion on this?
Andrew Buchanan
I definitely have an opinion. No, I don't think it's hampering teams. All these methods move on. When you're generating data, there's data for platform foundational models, but for data on the candidate drug that you're going to take to phase one, the most important thing is to make it as translationally relevant as possible. So for an example of one way we're hampered: what temperature do we all measure our affinity at? That would be room temperature, because it's convenient. That is not translationally relevant. You need to measure affinity at 37 degrees Celsius, because that's where your drug is going to work. So we probably are hampering ourselves with that old data, but technically it's more of a challenge. Okay. So that's a practical example: do your affinity at 37 and feed it into the models.
Norbert Furtmann
I mean, I think we should not forget about legacy data. It's there, and there might be big treasures which we can use. But you could see that maybe the data grew over time and assay conditions have changed. So you need to figure out, is what I have here comparable? But it might still be a valuable starting point. Then, from that starting point, if you define the goal as how do I improve my model, maybe the best way is not to just keep collecting data from programs, but to think about, for example, utilizing active learning strategies: what would be the next data point to generate, not in the sense that I will generate a good molecule, but that I will generate information which helps to improve the model some more.
Peter Tessier
Melody, do you have thoughts on data different from what's been shared already, or different areas we should think about?
Melody Shahsavarian
I think it's been touched upon, but I was going to mention standardization and having common ontologies when we capture data. So establishing that and ensuring that the data we capture moving forward complies with them; for example, enriching the data with all the metadata around conditions, like how the data has been generated and the methods that have been used in generating it. That's very important within an enterprise, but also across industry, right? If we get to a place where we have common data standards across an industry, it will open huge opportunities for cross-sharing of data and for some of the efforts that are ongoing to collect it.
Peter Tessier
Is that realistic, do you think? Is it realistic for us to get to a place where we have common standards for developability data, for example? I mean, I don't know the answer. Does it feel realistic?
Andrew Martin
Can I just chip in on that? In the field of microarrays, that was something that was done fairly early on. Lots and lots of groups working in microarrays sat down together and developed an ontology for storing all the relevant information. So I think it's really about motivation and getting people to sit down together. As one of the people in microarrays said, they sat and argued over pizza for hours; they wouldn't be let out of the room until they came to a conclusion. I think that's what we need to do really for all these sorts of data as well.
Andrew Buchanan
I think there is a will in industry to do it. You see it in some of the federated learning opportunities. I'm no longer involved in them, but there are at least two, AISB and FAIT, where big pharmas and different biotechs are attempting to federate and share data. But it's going to come back to Melody's point, which is so valid, in terms of how comparable our data is. What can we do as an industry to agree a standard: this is a good way to report data? How can vendors who provide kits get data standards and ways to export data out of the things? I think industry standards are so important. The telephone industry has them. We don't in discovery, but development does.
Peter Tessier
Bernhardt, do you have a thought?
Bernhardt Trout
Sure, I have a thought. I agree with the points on standardization and whatnot. I have a suggestion, though, of something that could be done today without standardization, or actually with a sort of auto-standardization, but someone needs to collect the data. We can go company by company and interview the scientists who did the development projects and then categorize them. Was it extremely difficult? Was it pretty difficult, or very difficult, or just difficult? You could categorize them, and that could potentially help with screening, but someone has to do that.
Andrew Buchanan
So they do that in the image groups, who are very good at doing that. I haven't seen it done from a CMC perspective. But in these federated efforts, that's essentially what they're doing. People have strong opinions, but there should be more dialogue in this space to try and, as somebody said, sit over the pizzas and work it out. Sorry, Andrew, your commentation.
Peter Tessier
Can
you
can
you
just
expand
that
just
so
we're
all
clear
on
that?
When
you
say
is
it
difficult,
extremely
difficult,
can
you
be
more
specific
about
collecting
the
data?
About
what?
Flesh
that
out
a
little
bit?
Bernhardt Trout
Ah
just
going
through
the
whole
development
process.
Peter Tessier
I
see,
I
see.
Bernhardt Trout
So
looking
at
sort
of
molecules
and
how
difficult
they
were
to
develop.
Right,
imagine
interviewing,
Peter Tessier
I
see.
Right.
Bernhardt Trout
Imagine
if
you
could
then
use
that
for
predictive
methods
when
you
start.
Peter Tessier
That
makes
sense.
Very
interesting.
Andrew,
do
you
have
uh
thoughts
around
data?
Andrew Martin
Yeah,
again,
I
mean
I
think
the
problem
certainly
working
in
an
academic
setting
is
getting
hold
of
large
enough
data
sets,
particularly
of
things
that
have
failed.
And
you
know,
all
these
companies
are
sitting
on
huge
amounts
of
data
related
to
molecules
that
they
haven't
then
taken
forward.
But
those
data
are
actually
invaluable
for
making
predictions
and
things.
So
uh
obviously
the
data
on
things
that
have
succeeded
is
is
also
hugely
useful.
But
you
know,
those
data
do
tend
to
be
published
somewhere,
uh,
as
Konrad
was
saying,
you're
collecting
from
patents
and
so
on.
But
for
the
things
that
have
failed,
that's
much
harder
to
find.
Andrew Buchanan
So
on
the
publishing
fail
data,
there
are
one
or
two
journals
now
that
are
committing
to
publish
essentially
it's
to
deal
with
the
reproducibility
crisis.
So
there
are
ways
to
do
it,
but
again,
people
don't
do
it.
Andrew Martin
Yeah,
people
don't
see
a
great
advantage
in
doing
it,
but
it's
much
more
of
an
advantage
actually
to
the
whole
community.
But
again,
some
sort
of
federated
database
to
store
all
these
things
would
be
absolutely
fantastic.
Peter Tessier
I
oh
we
need
to
transition
here.
How
about
a
30-second
comment?
We're
gonna
transition.
Norbert Furtmann
I
don't
want
to
make
a
comment,
but
maybe
if
there
are
any
quick
thoughts
on
synthetic
data
generation.
I
mean,
we
have
all
thinking
about
the
gaps
is
like
collecting
the
data,
generating
the
data.
But
I
think,
like,
for
example,
uh
no
not
putting
out
an
opinion
here,
but
the
advancements
in
structure
prediction
tools.
I
mean,
there's
also
gap,
how
slow
structures
are
still
growing
to
get
to
better
structure
prediction,
but
with
the
power
of
AlphaFold-like
structure
prediction
tools.
So
any
thoughts
on
like
augmenting
experimental
data
with
synthetic
data
here?
Peter Tessier
Anyway,
yeah,
and
let's
let's
uh
we're
gonna
sort
of
get
there.
So
we
need
to
we
need
to
get
to
the
elephant
in
the
room,
De Novo Design Without The Hype
Peter Tessier
right?
Everybody
knows
this
is
an
AI/ML
panel,
and
the
elephant
in
the
room
is
you
know,
where
are
we
in
de
novo
design?
Zero
shot
predictions.
How
impactful
is
this
today?
Are
you
using
it?
Is
it
being
used
as
one
of
your
main
tools
in
parallel
with
other
antibody
generation
methods,
in
parallel
with
other
antibody
optimization
methods?
Is
this
a
good
use
of
our
resources
right
now,
or
is
it
proving
to
be
a
distraction?
Konrad,
can
we
start
with
you?
Konrad Krawczyk
All
right,
so
I'm
slightly
biased
here
because
you
know
I
do
work
with
companies
and
we
do
consult
on
actually
like
a
usage
and
deployment
of
such
AI
tools.
Peter Tessier
Oh,
we'd
like
to
hear
your
bias.
Konrad Krawczyk
Yeah,
so
my
bias
is
essentially
that
those
tools
can
work
sometimes,
yeah,
but
like
they
do
not
just
like
work
if
you
just
download
them
and
press
the
button.
Okay,
yeah.
Peter Tessier
So
should
the
people
in
this
room
be
using
them
after
your
experience
and
all
you've
seen?
Should
they
be
being
used
today?
Konrad Krawczyk
I
would
definitely
say
they
should
be
explored,
yeah,
because
in
certain
cases,
like
you
know,
they
can
produce
answers,
like
they
can
produce
binders
faster,
or
at
least
hypotheses
faster
than
your,
let's
say,
like,
you
know,
a
big
library
campaign.
Yeah.
On
the
other
hand,
usage
of
such
tools,
like
you
know,
should
not
actually
stop
you
from
doing
what
you
can
do
in
the
lab.
Yeah,
so
like
this
is
actually
a
conversation
that
like
we
have
with
teams,
yeah.
Like
we're
looking
at
their
workflows,
and
if
using
such
tools
would
actually
prevent
them
from
doing
what
they
have
been
doing
always,
yeah,
then
we
just
say,
like,
look,
that's
perhaps
not
for
you.
Yeah.
Okay.
So
looking
for
opportunities.
Peter Tessier
Okay.
Norbert?
Norbert Furtmann
I
mean,
we
are
very
much
committed
to
making
it
work.
I
mean,
like
as
a
to
to
give
like
an
insight.
I
mean,
the
majority
of
our
programs
is
driven
by
traditional
discovery
approaches.
But
I
mean,
we
are
very
much
committed.
We
set
up
a
specific
group
just
focusing
on
de
novo
and
we
use
it
in
parts
of
our
programs.
I
don't
see
it
like
at
the
moment
that
de
novo
will
replace
all
of
the
other
discovery
technologies,
but
it
will
be
another
tool
in
a
toolbox.
So
you
might
have
like
even
like
in
past
programs,
that
maybe
immunization
is
not
working,
and
you
go
with
synthetic
libraries.
So
now
you
have
maybe
immunization,
synthetic
libraries,
and
de
novo.
So
multiple
tools,
and
I
would
definitely
recommend.
I
mean,
we
see
the
future
that
this
technology
will
evolve,
that
it
will
generate
value.
It's
simply
like
a
completely
different
approach
how
to
tackle
the
problem.
But
we
are
not
convinced
that
it's
like
the
magic
button
which
you
have
been
mentioning.
So
I
think
you
should
build
know-how
around
it.
And
instead
of
thinking
about
you
have
a
pipeline,
and
that
pipeline
works
for
each
and
every
target,
for
each
and
every
epitope,
you
should
build
a
de
novo
toolbox
with
different
complementary
methods,
which
like
where
one
method
maybe
can
like
overcome
the
bias
from
another
method,
and
kind
of
you
build
expertise,
you
have
the
experts
in-house,
you
have
access
to
the
right
tools,
and
then
you
customize
it
to
the
problem.
Norbert Furtmann
So
we
are
very
much
committed,
we
see
it
as
a
future,
but
I
think
we
also
like
would
like
to
have
a
realistic
view
and
that
it's
like
a
complex,
tricky
technology
where
you
need
to
build
the
expertise,
where
you
need
to
know
what
method
to
use
when,
and
then
yeah,
we
we
see
successes.
So,
I
mean,
to
be
to
get
very
concrete,
I
mean,
we
have
targets
where
we
struggle
to
to
let's
say
generate
binders
with
traditional
approaches,
and
we
were
successful
with
de
novo
approaches.
We
could
like
evolve
the
binders,
so
it's
not
like
zero
shot
and
you
have
like
a
highly
affine
binder,
but
you
get
to
weak
binders,
you
evolve
them
not
only
in
terms
of
binding,
but
also
in
terms
of
function,
and
you
end
up
in
a
functional
building
block
originating
from
de
novo,
where
you
could
close
a
gap
where
traditional
approaches
fail.
On
the
other
hand,
de
novo
also
fails
on
target,
so
it's
not
like
it's
a
magic
bullet
and
it
always
solves
the
problem.
Peter Tessier
Melody?
Melody Shahsavarian
I
essentially
agree
with
everything
that
Norbert
said.
Uh
it's
early
days.
Uh,
we
also
have,
we
see,
you
know,
early
positive
signal.
It's
improving,
but
you
know,
we're
we're
also
using
it
as
another
tool
in
the
toolbox
along
with
uh,
you
know,
other
uh
traditional
methods.
There's
case
studies
where,
you
know,
similar
to
what
Norbert
said,
the
de
novo
uh
has
worked,
but
where
other
you
know
other
sort
of
platforms
have
failed,
hasn't
been
zero
shot,
not
yet,
at
least.
And
in
programs
that
we
use
it,
definitely
it's
uh
you
know,
it
it's
it
leads
to
having
more
diverse,
you
know,
it
gives
you
other
different
options
than
other
other
platforms
that
we
run.
So
at
the
end
we
get
a
more
diverse
uh
panel
of
hits.
Peter Tessier
I
mean
to
me,
this
seems
like
a
very,
very
important
point.
That
are
you
seeing
successes
in
cases
where
conventional
methods
are
not
succeeding,
even
if
it's
only
hit
generation
that
needs
to
be
further
optimized.
And
you're
saying
both
of
you,
you've
seen
that
before.
That's
correct.
That
seems
like
a
very
important
moment
in
the
field,
you
know,
when
you
start
to
see
that.
Andrew.
Andrew Buchanan
Uh
epitope-specific
design
is
the
holy
grail
for
this
space,
which
a
few
years
ago
was
nonsense,
but
now
it
is
actually
it
can
work,
as
you
say,
alongside
other
methods
that
you
choose.
It
has
an
appropriate
use.
I
don't
use
it,
so
I'll
not
cover
it
anymore.
Peter Tessier
Okay.
Bernhardt,
I
wondered
if
we
could
rephrase
this
a
little
bit,
you
know,
from
a
zero
shot.
Often
we're
thinking
about
affinity
prediction,
but
you
know,
in
the
developability
space,
you
could
think
in
the
same
way.
Suppose
you
have
a
problematic
molecule.
How
close
are
we
to
you
know
being
able
to
predict
the
sort
of
you
know,
a
small
panel
of
things
that
that
solves
that
problem?
Do
you
have
thoughts
on
that?
Bernhardt Trout
Yeah,
so
again,
right
now
we're
at
the
stage
where
we
can
kind
of
screen
for
potential
problems
as
far
as
actually
solving
those
problems.
I
actually
think
that
the
technology
is
there
in
terms
of
just
broadly
the
descriptors,
the
the
machine
learning
methods.
The
issue
is
again
the
data
generating
and
then
training
models.
So
we're
not
really
we're
far
away
from
the
point
of
training
the
models.
Peter Tessier
Okay,
but
probably
more
case
studies
and
you
know,
testing
and
implementation.
Bernhardt Trout
Well,
even
within
a
company,
there's
lots
of
information
that
needs
to
be
gathered
together.
But
that's
a
project
in
itself.
Peter Tessier
Okay.
Andrew,
do
you
have
thoughts
around
this
area?
Andrew Martin
Yeah,
again.
So
I
uh
agree
very
much
with
uh
what
others
have
said,
but
uh
I've
always
been
very
skeptical
about
the
uh
de
novo
idea.
You're
gonna
have
to
do
some
testing.
Uh
just
relates
to
a
question
that
came
up
earlier.
Are
you
going
to
need
to
test?
Well,
yes,
clearly
you
are.
And
a
lot
of
the
papers
that
have
been
published
on
de
novo
methods
uh
essentially
produce
better
libraries,
which
I
think
is
good,
but
it's
it's
not
really
de
novo.
So,
you
know,
there
are
a
couple
of
papers
now
that
do
seem
to
uh
succeed
for
or
much
more
closely
succeed
for
de
novo
work.
But
it's
very
interesting
that
uh
other
panel
members
have
said
that
it
works
for
things
that
they
haven't
been
able
to
raise
antibodies
against
through
conventional
uh
methods
because
uh
the
papers
that
have
been
published
are
fairly
sort
of
standard
targets.
And
also
taking
those
sorts
of
ideas
back
to
what
happened
early
on
in
3D
protein
structure
prediction
in
general,
there
were
a
lot
of
cases
where
people
published
things
that
seemed
to
work
fantastically
well.
But
that's
because
they
tried
a
hundred
different
examples
and
they
published
the
one
that
worked,
the
protein
that
they
could
work.
So
it
it's
difficult
to
tell
with
the
papers
that
are
really
sort
of
saying
we're
doing
really
well
to
know
uh
how
many
failures
they've
also
had.
But
it's
it's
very
interesting
to
see
that
it
is
providing
some
success
for
people.
But
I
don't
think
it
would
be
a
first
route
of
doing
anything
for
for
quite
some
time,
yeah.
Norbert Furtmann
I
think
you
made
a
very
good
point.
I
mean,
lots
of
the
things
are
maybe
it's
how
you
define
de novo,
right?
Because
some
things
could
be
libraries.
If
you
see
how
many
molecules
are
tested,
you
could
argue
like
if
it's
like
a
hit
by
chance
or
if
it's
a
hit
by
design,
if
you
go
those
library
approaches.
And
I
mean,
we
could
fully
agree,
and
it's
also
a
question,
I
mean,
how
hard
you
try,
right?
Like,
and
then
I
mean,
even
if
something
works,
like
it's
still
a
different
if
it
works
or
it
gives
you
an
edge
over
the
traditional
discovery
technologies.
And
there
I
think
it's
still
the
early
stages,
but
worthwhile
to
invest
as
another
component
within
a
toolbox.
Andrew Martin
Uh
absolutely
about
about
the
libraries.
I
mean,
there
have
been
lots
of
companies
for
some
time
working
on
improved
libraries.
So
I
was
a
consultant
for
one
company
who
wanted
to
do
de
novo
design,
and
I
said,
don't
do
that,
do
better
libraries.
And
that's
what
they
did,
and
they've
been
very
successful.
So
uh,
you
know,
I
think
that's
it's
a
differentiation.
Peter Tessier
Short
response.
Andrew Buchanan
Yeah,
I
I
think
these
structure-guided
informed
or
structure-enriched
libraries
are
super
helpful.
Peter Tessier
Okay.
Hard Targets And Epitope Specific Design
Peter Tessier
Okay.
I
want
to
change
this
a
little
bit
from
thinking
about
thinking
about
targets
for
a
minute.
You
know,
I
think
that
when
we
hear
about
de
novo
and
we
think
about
the
potential,
you
know,
the
question
is
there
are
high-value
targets.
Now,
there's
some
difference
of
opinion
about
how
much
demand
there
is
for
these
high-value
targets,
but
it's
certainly
been,
you
know,
there's
certainly
these
attractive
membrane
proteins,
GPCRs,
ion
channels,
and
so
on
that
have
been
hard
to
target
with
conventional
methods.
I
guess
the
question
that
I
have
is
for
those
of
us
that
are
using
this,
that
are
practitioners,
are
you
seeing
success
against
hard
targets?
Since
this
is
often
being
portrayed
as,
you
know,
the
The
killer
application
of
de
novo
is
against
things
that
the
ant
it's
an
antigen
problem
in
many
ways.
Very
difficult
antigen,
very
difficult
to
get
immune
responses,
difficult
to
analyze,
or
or
it's
the
nature
of
the
target,
it's
an
agonist
or
something
where
it's
just
it's
it's
complex.
I
guess
are
you
seeing
practical
successes
there?
And
maybe
some
of
us
that
have
commented
specifically,
Norbert,
can
we
start
with
you?
Norbert Furtmann
Sure.
I
mean,
I
cannot
share
too
much
about
the
the
targets
in
particular,
but
I
mean
the
example
I
just
just
made
a
few
minutes
ago.
I
mean,
it
depends
how
you
define
a
challenging
target.
And
I
mean,
the
target,
the
example
I
was
uh
referring
to
is
kind
of
a
target
like
where
it
was
hard
to
get
an
immune
response
with
the
traditional
approaches,
and
that
could
be
solved
via
de
novo.
I
mean,
we
have
experience
with
more
than
a
single
target,
so
with
a
panel
of
targets,
but
I
think
we
are
not
there
that
I
could
give
you
a
statistically
uh
significant
response.
So
I
mean,
we
pick
the
targets
not
based
on
how
challenging
they
are,
but
with
relevance
for
our
portfolio
programs.
And
sometimes,
like,
it's
a
target
because
of
a
challenging
uh
immune
response.
Sometimes
it's
a
target
because
where
you
know
you
need
to
have
a
challenging
mode
of
action,
like
agonism.
And
for
this
agonism,
you
know
you
need
to
like
address
a
specific
epitope.
And
with
de
novo,
maybe
that's
one
of
the
advantages.
I
mean,
you
don't
fish
randomly,
but
you
can
you
can
go
against
that
specific
epitope.
But
maybe
also
the
success
rates
as
a
last
comment.
So
what
we
see,
but
also
what
you
see
in
the
literature,
is
a
bit
defined
like
by
the
properties
of
the
epitope.
So
it
seems
to
be
a
bit
more
tricky
addressing
hydrophilic
epitopes
and
addressing
like
epitopes,
let's
say,
where
you
need
to
design
charged
or
hydrogen
bond-based
interactions,
than
when
you
address
hydrophobic
interfaces.
Peter Tessier
Truly
interesting.
Melody,
your
thoughts?
Melody Shahsavarian
Yeah,
I
mean
it
it's
early
days,
but
I
would
say
that
the
response
to
your
question
is
yes.
We
have
seen
success
against
GPCRs,
let's
say.
Again,
you
know,
early
days,
but
there's
promise
there.
Peter Tessier
Exciting.
Comment?
Andrew Martin
Yeah,
in
my
working
for
a
previous
big
pharma,
they
they
also
see
success,
but
it
it
it's
again,
they're
early
leads.
Peter Tessier
Okay.
So
we're
gonna
take
uh
questions
from
the
audience.
So
if
anyone
wants
to
come
up,
please
come
to
the
mic
while
we
have
sort
of
our
last
question
for
the
panel.
Where To Invest Next
Peter Tessier
Our
last
question
for
the
panel
is
really
thinking
about
sort
of,
you
know,
where
are
we
going?
What
does
the
future
look
like?
So
I'd
like
a
30-second
sort
of
response
to
keep
it
short.
If
you
could
reset
the
field's
priorities
for
AI,
ML
and
biologics,
or
if
you
could
just
emphasize
from
your
opinion,
you
know,
where
do
you
think
the
where
do
you
think
the
most
important
investments
are
or
should
be
made
right
now?
You
could
also
go
as
far
if
you
want
to
be
controversial
as
what
should
we
stop
doing
as
well,
if
you
have
any
opinions
like
that.
Okay.
But
just
like
what
is
a
takeaway
you
have?
What
what
would
you
say
for
practitioners,
people
in
the
trenches,
where
where
should
where
should
they
invest
or
their
their
company
invest,
and
maybe
even
where
should
they
stop
investing?
Konrad.
Konrad Krawczyk
Yeah,
so
altogether
I
would
say
that
investment
in
looking
into
like
how
those
computational
AI
methods
can
help
with
the
workflows
is
very
important
because
otherwise,
like
in
there
are
many
tools
coming
out
every
week.
You
know,
some
of
them
might
be
downloaded,
some
of
them
might
be
internalized,
but
they
are
essentially
gathering
dust.
Yeah.
So
if
you
essentially
like
you
make
an
effort
into
like
you
know,
looking
at
how
those
methods
can
improve
your
workflow,
then
this
is
going
to
be
something
Peter Tessier
But
still,
for
the
practitioners
in
the
room,
that's
not
easy,
right?
Like,
how
do
you
actually
do
that?
Because
it's
it's
so
overwhelming
with
the
pace
of
things.
So
walk
before
you
can
run,
yeah.
Konrad Krawczyk
Like
and
try
not
to
do
everything,
but
just
like
you
can
do
something.
Yeah,
so
we
do
have
methods
that
even
like
in
very,
very
simple
models.
If
you
have
few
data
points
for
lead
optimization,
just
like
you
know,
looking
at
certain
readouts,
you
can
make
a
model
out
of
that.
Yeah,
it
already
generates
hypotheses,
yeah.
Peter Tessier
Okay,
okay.
Norbert?
Norbert Furtmann
So
maybe
picking
up.
I
mean,
you
covered
a
bit
like
how
to
get
into
it,
right?
And
I
mean
how
how
we
kind
of
did
it.
I
mean,
talking
to
the
wet
lab
scientists
and
identifying
what's
the
biggest
gap
and
where
could
we
help
with
bringing
in
silico
approaches
and
then
like
prioritizing
is
that
realistic
that
we
could
solve
that
via
computational
approaches
or
not?
But
besides
that,
like
uh
as
a
second
take,
I
mean,
I
feel
at
the
moment,
like
we
are
trying
to
augment
wet
lab-driven
workflows
with
computational
tools
to
make
kind
of
like
the
traditional
discovery
pipelines
more
efficient,
faster
with
in
silico
tools.
If
you
think
de
novo
and
the
evolution,
how
where
drug
discovery
might
go,
you
might
rethink
kind
of
that
that
whole
approach
and
think,
okay,
maybe
it's
in
the
future
workflow
might
be
computational
driven,
AI
driven.
And
how
do
you
build
that
lab
in
the
loop
kind
of
concept
that
you
customize
your
wet
lab
to
be
the
perfect
counterpart
to
the
in
silico
tools?
And
this
is
where
I
would
invest.
Like
it's
it's
two
different
concepts.
And
I
think
we
need
to
rethink
how
we
run
our
value
chain
or
how
would
we
like
to
run
our
value
chain
in
future,
considering
the
advancements
in
AI
and
computational
tools?
Peter Tessier
You
know,
we
didn't
talk
about
this
a
lot
in
this
panel,
but
you
know,
the
panel
had
a
very
strong
feeling
about
this
idea,
you
know,
of
this
lab
in
the
loop
of
data-driven
AI
usage,
right?
And
sort
of
that
was
a
common
theme
that
implementation
of
AI
needed
to
be
paralleled
with
strong
data
generation.
Melody,
do
you
have
any
takeaway
message
or
any
suggestion
for
the
audience
and
where
to
put
their
efforts?
Melody Shahsavarian
So
I
I
I
guess
I
mean
I've
said
this
like
five
times.
I
think
I
already
sort
of
have
a
biased
uh
view,
but
uh
from
from
my
perspective,
investing
more
in
complex
molecules
and
data
generation
is
a
part
of
it,
but
also
you
know,
starting
to
work
on
models
that
will
address
these.
Peter Tessier
You
know,
Melody
was
emphasizing
the
fact
that
you
know
often
information
about
the
monospecifics
is
not
predicting
certain
aspects
of
the
bispecifics
or
multi-specifics.
So
actually
directly
generating
data
on
multi-specifics
is
really
important.
Peter Tessier
Andrew?
Andrew Buchanan
Generate
wet
lab
data
that's
relevant
to
your
kind
of
drug
target
profile.
There
was
a
brilliant
example
of
a
paper
from
a
group
from
Seoul,
where
they
have
a
new
platform
you
can
measure
church
expression
and
affinity.
They
used
fairly
standard
model,
machine
learning
models
with
structure,
and
they
could
predict
things
that
would
beat
an
empirical
scientist
day
in
and
day
out.
So
I
think
wet
lab
generated
data
will
transform
the
actual
use
of
this
kind
of
tech
because
it's
data
constrained.
Peter Tessier
And
can
we
just
push
you
on
that?
Do
you
see
that
as
it's
going
to
accelerate?
I
mean,
do
you
see
this
as
a
declining
enterprise
where
you
know
data
generation
is
very
important
but
will
become
progressively
less
important
as
the
models
get
better?
You
don't
agree.
Andrew Buchanan
No
way
that
wet
lab
data
is
going
away.
If
you're
an
early
career
person,
become
good
at
data
science,
talk
to
them.
But
if
you
like
the
lab,
the
lab
is
not
going
away.
Peter Tessier
And
will
it
diminish
some,
or
you'd
push
back
on
that
too?
Andrew Buchanan
I'm
also
gonna
push
back
on
that.
Okay,
there's
some
brilliant
folks
that
show
you
examples
that
you're
gonna
push
this
molecule
through.
It's
a
multi-specific;
biology
hasn't
seen
it
before,
a
CHO
cell
hasn't
seen
it
before.
You
don't
know
what's
gonna
happen.
You
can't
triage
your
way
through
it.
Peter Tessier
Okay.
Bernhardt Trout
I'll
answer
your
question
for
those
who
are
developing
machine
learning
methods.
We
need
new
descriptors.
Currently,
our
chemistry
is
hydrophobic,
hydrophilic,
charge-charge,
and
various
structure
factors.
Chemistry
is
probably
a
lot
more
complicated
than
that.
Peter Tessier
And
what's
it
gonna
take
to
do
that?
It
does
feel
like
we
need
the
disruption.
You
know,
you
have
experience
in
fields
outside
of
this
field.
Can
we
learn
something
from
others,
you
know,
in
catalysis?
Can
we
bring
things,
you
know,
how
how
do
we
get
out
of
the
same
uh
sort
of
feature
perspective
we've
had
for
years?
Thoughts
on
that?
Bernhardt Trout
I
think
it's
uh
bringing
ideas
from
other
fields
is
always
uh
helpful.
I
I
think
it's
uh
being
creative,
it's
in
a
way
the
opposite
of
implementing
machine
learning
is
thinking
about
the
chemistry
and
developing
new
theoretical
approaches
to
chemistry.
Peter Tessier
Makes
sense.
Okay.
Andrew?
Andrew Martin
Just
a
couple
of
words
really
that
I
think
the
strength,
the
value
is
in
developability.
You
know,
it's
it's
triaging
your
your
data
so
that
you
don't
waste
time
down
the
pipelines.
That's
that's
really
the
I
think
the
biggest
advantage.
Peter Tessier
Okay.
Anybody
else
have
a
short
comment,
Bernhardt?
Bernhardt Trout
I
would
always
go
with
structure-based
methods
if
I
could.
Okay.
Andrew Martin
Yeah,
I
would
say
echo
what's
been
said
already,
that
it
depends,
but
for
a
lot
of
things,
the
power
of
protein
language
models
and
the
antibody
protein
language
models
really
encode
structural
data
in
some
sort
of
hidden
way
within
the
sequence
information.
So
I
think
that
can
be
useful
for
an
awful
lot
of
things.
But
for
certain
things
we
find
that
using
structure
in
prediction,
so
things
like
epitopes
and
affinity,
I
think
you
need
structure.
Peter Tessier
Thank
you.
Well,
unfortunately,
we've
come
to
the
end
of
our
time.
And
you
know,
I
think
it
would
be
nice
if
we
all
thank
this
very
thoughtful
panel
of
scientists.
Thank
you
so
much
for
inspiring
us.