CHI's Drug Discovery and
A Framework for Applications of Genomics and Proteomics to Drug
The last decade has been marked by an unprecedented
boom in the number and variety of technologies used to discover and develop new
drugs. Many of these new technologies arose from work surrounding the Human
Genome Project, and we at Cambridge Healthtech Institute have been in the
enviable position of having a front row seat to these remarkable developments.
To support us in tracking and analyzing the progress of these technologies and
their applications, we've developed a dynamic, integrated framework that we
call our Drug Discovery and Development Map.
This framework is shaped by a particular view
taken by researchers seeking to describe, measure, understand, and ultimately
predict biology. These tasks require the generation of data, with the nature of
the data depending upon the specific types of questions being asked. Our
framework focuses on two key components, genes and proteins, and considers five
major types of data for each one. The five types of data encompass structure,
expression, variation, function and integration.
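The two components and five data types described above form a grid of ten categories. As a minimal sketch, that grid can be written out as a small data structure. The names `COMPONENTS`, `DATA_TYPES` and `build_framework` are hypothetical, and the example entries are only one reading of where topics mentioned in this article might sit; they are not taken from the published map.

```python
from itertools import product

# The two components and five data types named in the text.
COMPONENTS = ("gene", "protein")
DATA_TYPES = ("structure", "expression", "variation", "function", "integration")


def build_framework():
    """Return a dict with one empty example list per (component, data type) pair."""
    return {pair: [] for pair in product(COMPONENTS, DATA_TYPES)}


framework = build_framework()

# Illustrative placements only: an assumed reading of the text, not the published map.
framework[("gene", "structure")].append("gene sequencing")
framework[("gene", "expression")].append("microarray expression profiling")
framework[("gene", "variation")].append("SNP genotyping")
framework[("gene", "integration")].append("gene networks")
framework[("protein", "structure")].append("3-D conformational structure")
framework[("protein", "function")].append("protein-protein interaction studies")
framework[("protein", "integration")].append("protein pathways")

# Print the grid, one category per line; "-" marks a category with no example yet.
for (component, data_type), examples in framework.items():
    print(f"{component:8s}{data_type:12s}{', '.join(examples) or '-'}")
```

Keying the dictionary on (component, data type) pairs keeps the ten categories explicit, so further tools or assays can be attached to a category without changing the structure.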
A wide variety of tools and assays are used to
generate this range of data, with several different approaches used for any one
type of data, and some tools used for more than one type of data. As shown in
the map, the last column under "Biology" represents data
interpretation, with examples given for each of the ten categories. In each
case there are actually two subsets of interpretation: databases containing
that type of data, and bioinformatic algorithms for comparison, analysis, data
mining and other purposes. To make the map more readable, this
level of detail has not been broken out.
Over the last decade, researchers have
studied all ten types of data, but there have been clear shifts in patterns of
emphasis. Some of these shifts have resulted from technological progress that
allowed higher throughput or better results to be achieved, while in other
cases, the maturation of one area has served as the foundation for increased
emphasis in others. For example, seven or eight years ago the greatest emphasis
was on gene sequences. In fact, since genomic DNA sequencing was considered by
many to be too formidable a project at that stage, it was cDNA, or the expressed
genome, that drew their attention. As progress was made toward obtaining data
for nearly all expressed gene sequences, attention shifted in three
key directions. For those involved in the Human Genome Project, the effort of
gaining sequence information at the level of the entire genome began to ramp up.
For others, the next focus was still in genomics, but beyond the static genome
and into the dynamic patterns of gene expression. For still others, the key
field for further development was proteomics, where the ultimate actors in
health and disease can be studied comprehensively.
The rapid development of microarray technology
spurred gene expression profiling's rise in prominence. As sequencing efforts
matured, it also became clear that beyond achieving a consensus sequence for the
human genome lay the even larger challenge of understanding the basis of human
genetic variability (particularly with regard to single nucleotide
polymorphisms, or SNPs). This attention to SNP genotyping, which began just a
few years ago, was greatly aided by the foundation laid with gene sequence data.
All three of these genomic segments, sequencing,
gene expression profiling and genotyping, greatly benefited from technological
advances that resulted in dramatically higher throughputs and rapid declines in
cost. For some other areas it has proven to be more of a challenge to achieve
performance improvements of the same magnitude. Defining the function of each
gene, or more specifically, the protein it encodes, has been particularly
challenging. It's just not yet possible to study gene function with the type
of "massively parallel" approaches used in those other fields.
For proteins, amino acid sequences are comparable
to nucleic acid sequences for genes, but the richer data is 3-D conformational
structure, since it provides much better clues as to biological activity. While
the techniques for determination of protein conformational structure have
improved, this process remains relatively slow. Other areas of
proteomics have also labored under the lack of comparable technical
breakthroughs to replace difficult 2-D gel analysis, even though promising new
approaches, including protein arrays, are under development. There has been a
huge increase in the throughput for one type of functional proteomics,
protein-protein interaction studies based on yeast 2-hybrid assays. This was
fueled by high interest, and achieved by industrializing and automating the
assays, rather than through any significant technical advance.
Ultimately, the new genomic and proteomic
technologies are not just about generating reams of disparate bits of data; they
aim to provide a unified view of complex biological systems. The first step in
this process is generating gene networks from gene sequence and expression data.
Such studies do not require new tools as much as sophisticated and comprehensive
approaches to data compilation. Correspondingly, protein pathway studies pull
together data about how changes in protein expression levels modulate the
expression of other proteins in a cascade fashion. In our framework, integration
at the protein level has been extended into systems biology, which can be
described as the integration of genomic, proteomic and metabolic data.
Beyond discovery-oriented biology lies the actual
development of marketable diagnostics and therapeutics. In the commercial realm,
most of the value ascribed to genomic and proteomic technology and data is tied
directly to the pharmaceutical industry's ability to translate that
information into such products. For diagnostics, tests based on genes
(mutations, SNPs), gene expression profiles and protein biomarkers are being
added to the more standard diagnostics of clinical chemistry or immunoassays.
Much of the impact of genomics on drug development thus far has been focused on
the identification and validation of biological targets. While much of this
research on targets is based only on comparisons of the biology of health and
disease, sooner or later it becomes critical to integrate the activity of
chemical compounds with the body.
There are two different ways in which chemistry
comes into play: in the form of chemical probes, or as compounds being evaluated
as potential leads or drugs. The use of chemical probes to elucidate biology is
the basis of chemical genomics. A large series of compounds are individually
introduced into cells, with the aim of identifying a cell that then undergoes a
specific phenotypic change. By identifying the compound introduced into that
cell, and then finding which gene or protein was bound by the chemical probe,
the researcher succeeds in finding both a genetic link to a change in phenotype
and a chemical probe that can cause that change to occur.
Genomics and proteomics can also be used in
compound evaluation, by providing molecular details about the effect of a
compound on the body. This approach may highlight mechanisms of action or
toxicity, both of which can be critical for further compound optimization. The
other way in which genomics and proteomics can be employed for drug development
is through pharmacogenomics, which focuses on the relationship between drug
responses and biological variation. Pharmacogenomics comprises the study of
variations in targets or target pathways, variation in metabolizing enzymes (pharmacogenetics)
or, in the case of infectious organisms, genetic variations in the pathogen.
Finally, just as biological data has databases and tools for analysis that
constitute bioinformatics, data about chemical structures and activities
While this framework provides a relatively simple
view of this complex field, it has proven to be quite useful for aiding
visualization of relationships, identification of bottlenecks and prediction of
trends. Cambridge Healthtech Institute has effectively used this framework to
aid in the selection and structuring of conferences and reports, as well as for
analysis in reports and consulting projects. Additional levels of detail can and
have been added to this framework, such as mapping specific technologies to more
detailed steps in the drug development process. Once such technology maps have
been created, it is possible to map technology developers into each space.
For more detailed explanation of this framework, or to view an example of a