11:00am - 12:00pm Conference
Registration and Poster Set-up
11:30am - 12:30pm
Luncheon Workshop
"Data Integration at Amgen"
Presented by Mark Jury, Amgen
1:00pm Chair's Opening Remarks
Dr. Georges Grinstein
1:10 Mining the Biomedical
Literature using Semantic Analysis and Natural Language Processing
Techniques
Dr. Ronen Feldman, Assistant Professor, Mathematics and Computer
Science Department, Israel's Bar-Ilan University and Chief Scientist,
Clearforest, Ltd.
The information age has made it easy to store large amounts of data
electronically. The proliferation of documents available on the web, on
corporate intranets, on newswires and elsewhere is overwhelming. Search
engines only exacerbate this overload problem by making more and more
documents available in a matter of a few keystrokes. This information
overload is directly mirrored in the biomedical field, where scientific
publications and other forms of text-based data are produced at an
unprecedented rate. Text mining is the combined, automated process of
analyzing unstructured, natural language text in order to discover
information and knowledge that are typically difficult to retrieve. In
this paper, we focus on using text mining as it applies to the biomedical
literature. In particular, we are interested in finding relationships
among genes, proteins, drugs and diseases, to assist in explaining and
predicting complex biological processes. We will describe the LitMiner™
system that we have developed for this purpose; in particular, we will
focus on the KDD CUP 2002, which serves as a formal evaluation of our
system.
1:50 Integromics in Drug Discovery:
Practical Tools for Integrating Genomics, Proteomics, Bioinformatics, and
Chemoinformatics
Dr. John N. Weinstein, Senior Research Investigator, Laboratory of
Molecular Pharmacology, National Cancer Institute, National Institutes of
Health
After microarray experiments (or other "omic" studies)
have been done in the pharmaceutical context, one's first task is to
analyze the data statistically. But that leaves open the Big Question:
what do the results mean biologically and pharmacologically? A number of
practical approaches and computational tools for addressing that question
and integrating different types of data will be discussed. Included is a
set of program packages available from our group and collaborators through
http://discover.nci.nih.gov: MedMiner, MatchMiner, GoMiner, CIMminer, and
LeadScope/LeadMiner.
2:30 Poster and Exhibit Viewing,
Refreshment Break
ONTOLOGIES
3:15 Ontologies in Breast Cancer:
Concepts vs Words
Dr. Michael Liebman, Director, Computational Biology and Biomedical
Informatics, Professor, Cancer Biology, Abramson Cancer Center of the University
of Pennsylvania
Ontologies are correctly defined as hierarchies of concepts, but the term is
frequently applied to mean controlled syntax, database schemas, semantic networks
or thesauri. In using an ontological approach to extract knowledge about
disease progression and disease presentation, including co-morbidities, we have
extended the approach of ontology construction to incorporate critical temporal
domains. Towards this goal, we have applied LexiMine (SPSS) as a method for
syntactical analysis of free text to establish the value in the analysis of full
articles versus abstracts in knowledge extraction.
3:45 MedScan - Automatic Text
Mining and Data Extraction System
Dr. Ilya Mazo, President, Ariadne Genomics, Inc.
MedScan uses natural language processing to extract information about
proteins, small molecules and their interactions from PubMed abstracts. The
system performance and the role of ontology and efficient term recognition will
be discussed. The knowledge management system that integrates MedScan and gene
expression analysis tools will be presented.
4:15 Opportunities for Text
Mining, supported by Ontology-Based Knowledge Assembly Description
Mr. Chris Sole, Director, Technical Liaison, Jarg Corp./SemanTx Life
Sciences, Inc.
Curation costs for medical text-bases are escalating rapidly, and manual
efforts to maintain up-to-date indexing have significant built-in biases. This
presentation will review how ontology-based text mining applications offer a
clearer path for linguistic recognition than simple word-match or synonym-based
approaches. The need is clear right through the product lifecycle, from
discovery to clinical trials, treatment and payer claims analysis. The
technology should form the basis for ongoing knowledge management processes in
organizations.
4:45 Flexible IT Structures For
Ontology Networking
Mr. John Wilbanks, Director, Genstruct
Most ontologies and taxonomies are "hard-coded" - a user
predefines the relationships amongst the entities in the ontology (for example,
father-son defines the relationship between two men in a family ontology). This
hard-wiring creates trouble in science, where relationships are multivariate and
change in response to new data. Incellico's CELL technology allows users to
construct ontologies (or topic maps, data models, taxonomies) "on the
fly" out of semantic relationships, creating a flexible ontology network
that can grow and respond to new information, regardless of data format, type or
source.
5:15 Panel Discussion
5:45-6:45 Networking Reception (hosted
by Cambridge Healthtech Institute)
7:30am Coffee and Technology
Workshop (Sponsorship Available)
INTEGRATING DIVERSE DATA SOURCES
8:30 Chair's Remarks
Dr. Donald Jackson, Senior Research Investigator, Applied Genomics,
Bristol-Myers Squibb
8:35 Improving the Drug Discovery
Process Through Information Technology Approaches
Dr. Herschel J.R. Weintraub, Principal Consultant, IBM Life Sciences
The pharmaceutical and biotech industry is facing unprecedented challenges.
Enormous amounts of data must be collected and analyzed. Sources of these data
include public sources (e.g. genomic data), in-house instrumentation and
robotics, in-house laboratories, and the literature. Timely access to this data
by scientists, management, and others, and the ability to gain knowledge from
it, is essential in order to remain competitive. An assessment of the issues
faced by the industry and possible solutions will be presented.
9:05 Intelligent Integration of
Heterogeneous Bioinformatics Data Sources
Dr. David Silberberg, Senior Computer Scientist, Research and Technology
Development Center, Johns Hopkins University Applied Physics Lab
We present an approach to integrating distributed and heterogeneous
bioinformatics databases based on simplified query specifications, model-based
automated query formulation, and ontology inference techniques. This approach
simplifies the query formulation task for users and applications, simplifies the
data source enrollment task for data managers, and simplifies the system
integration task for the federated system managers. We will present the results
of a seamless integration of RefSeq, OMIM (Online Mendelian Inheritance in Man)
and Ensembl, which demonstrates new levels of data integration potential for
biologically relevant access to all available information about genes, proteins
and sequences.
9:35 Bioinformatics: Not Just for
Sequences Anymore
Dr. Donald Jackson
Drug discovery bioinformatics requires integrating multiple types of
information (nucleotide and protein sequences, mRNA and protein expression
measurements, model organism data, alternative splicing, single nucleotide
polymorphisms, and more). Data from public and proprietary databases, alliances,
and internal research must be combined into a unified picture that
experimentalists can access. Our experience highlights tools and strategies for
data integration and presentation that are applicable to pharmaceutical, biotech
and academic researchers alike.
10:05am Poster and Exhibit
Viewing, Refreshment Break
10:45 "The Open Science
Alliance" - Accelerating Data Integration to Support Pharma Collaborative
eR&D
Dr. Jeffrey Spitzner, Chief Scientific Officer, Rescentris
Optimizing R&D portfolio and pipeline decision support and management -
from bench-top discovery science, through validated systems in clinical trials
and manufacturing, to the executive suites - requires badly needed protocols and
standards for data integration and interoperability. This talk will discuss the
"Open Science Alliance(TM)", a focused activity of 15 Fortune 50
pharma and 30 other companies designed to achieve interoperability
many years sooner than other methods would. This talk will cover strategies, new
XML technologies and standards being created and considered for all key types of
R&D data and their XML-based ontologies to support all aspects of
Collaborative eR&D, including decision support, and the final aims and
timetables of the Alliance.
11:15 Informatics-Driven
Chemistry in Action: The Success of Data Integration in a Discovery Research
Organization
Dr. Ramesh Durvasula, Senior Director, Applied Science, Tripos,
Inc.
The integration of data from multiple disciplines has been achieved at the
technological level in many discovery organizations. However, the challenge
facing these organizations today is how to leverage the tsunami of available
data, to find relevant trends in this data, and to embed the data mining within
an efficient business process. In this presentation, we will discuss the
strategies used for seamlessly integrating our library design environment
(enumeration, virtual screening, diversity, etc.) with our chemistry operations
systems (inventory, reagent ordering, purification, etc.). With such a tight
integration, we are able to achieve maximum productivity from our
high-throughput and medicinal chemistry projects, including unprecedented
success in various therapeutic projects. The technologies we have developed to
support both our in-house discovery operation and many of our clients'
operations will also be presented. Topics will include data integration,
compound lifecycle management, and electronic lab journal strategies.
11:45 Panel Discussion
12:15 Luncheon
CLOSING PLENARY SESSION:
EFFECTIVE DATA MANAGEMENT FOR DRUG DISCOVERY
1:45pm Chair's Remarks
Dr. Georges Grinstein
1:50 Information Pathways
in Pharmaceutical R&D
Dr. Otto Ritter, Associate Director, Bioinformatics, Enabling
Science and Technology, AstraZeneca R&D Boston
Molecular pathways are useful models for representing biological
processes at the cellular level. Pathways are usually represented as
graphs, where molecules or molecular complexes are the nodes, and
molecular interactions are the edges. In one step of generalization, where
we take any biomedical entities as nodes and any general relationships as
edges, we get an associative network as a representation of biomedical
knowledge. If we take one more step in this generalization process and
include any information assets and any transformations or associations, we
get information pathways as models representing (pharmaceutical) R&D
processes. It's useful to know that at all three levels of interpretation
(molecular, biomedical, R&D), we can actually re-use the same software
components for data management, analysis, and visualization.
2:10 Standards to Enable
Information Integration
Dr. David Benton, Director, Knowledge Integration and Discovery
Systems, Informatics and Knowledge Management, R&D IT, GlaxoSmithKline
Information integration is widely acknowledged to be both a great need
and a major challenge facing pharmaceutical R&D. The principal
obstacle to information integration systems is heterogeneity at virtually
all levels of the pharma information stack. This talk will address the
sources of this heterogeneity, question the premise that any technical
solution can solve the problems posed by heterogeneity, and propose that
any non-trivial information integration will require shared ontologies and
domain models. It will also address whether such shared ontologies and
domain models can be: (1) developed entirely in-house; (2) acquired from
vendors; or (3) developed as open standards by the R&D community.
2:40 Poster and Exhibit
Viewing; Refreshments and Desserts Served
3:30 Microarray Gene
Expression Analysis and Data Integration
Dr. Heng Dai, Senior Scientist, Bioinformatics, Drug Discovery,
Johnson & Johnson PRD
Data integration and interpretation remain major
challenges in microarray data analysis. It is essential to integrate data
from diverse resources including molecular annotation, expression,
pathway, disease and pharmacological databases. We have developed methods
and tools to effectively integrate this data into a central database,
which can be easily accessed through a web interface. This data can be
further analyzed with data mining and visualization tools, such as Omniviz,
to identify novel interactions and associations between medical,
biological and chemical entities.
4:00 New Challenge for Drug
Discovery Informatics: Information and Knowledge Integration
Dr. Abdel Laoui, Head, Chemoinformatics, Aventis Pharmaceuticals
A pressing task for the pharmaceutical industry is to develop new drug
discovery informatics solutions designed to deal with the challenges
emerging in today's data-rich environment - challenges arising from the
volume, diversity, and variable quality of data being generated. Data
pipelining has emerged as a practical technology for accelerating the
discovery process. The companies that will be successful will be those
that can bridge the gap between Bioinformatics and ChemoInformatics
quickly. At Aventis we have implemented a new paradigm in drug discovery
informatics, which we call Chemical Biology. We will present this
integrated, multidisciplinary, knowledge-based approach along with
the corresponding new enabling technology.
4:30 Speaker to be
announced
5:00 Panel Discussion
5:30 Close of Conference
There
are many sponsorship opportunities for your company to maximize its
exposure and influence. They include conference-specific sponsorships,
technology workshops, networking receptions, delegate bags, etc. We
are also ready to work with you in customizing a solution to meet your
specific marketing objectives. Make a lasting impression by taking
advantage of these marketing tools.
For exhibit and sponsorship
information, please contact Carol Dinerstein at 781-972-5471 or dinerstein@healthtech.com.
TRAVEL INFORMATION
Special Airline Discounts Available
Special Zone and Discount Fares have been established for this conference with
United Airlines. Please call United Airlines Meeting Reservation Desk at
800-521-4041 and reference ID#579YS.
HOTEL INFORMATION
Wyndham Baltimore Inner Harbor
101 W. Fayette Street
Baltimore, Maryland 21201
T: 410-752-1100 • F: 410-752-0832
Cut-off date: August 29, 2003
$179 single/$199 double occupancy
Please call the hotel directly to make your room reservation. Identify yourself
as a Cambridge Healthtech Institute conference attendee to receive the reduced
room rate. Reservations made after the cut-off date or after the group room
block has been filled (whichever comes first) will be accepted on a
space-and-rate-availability basis. Rooms are limited, so please book early.
CALL FOR POSTERS
Cambridge Healthtech Institute encourages attendees to gain further
exposure by presenting their work in the poster sessions. Please fill out the
registration form, with the poster title and primary author. To ensure
inclusion in the conference CD, a one-page summary must be submitted and
registration must be paid in full by August 22, 2003.
Initial Listing of Poster Presentations
Extending MicroArray Explorer with R
Dr. Peter F. Lemkin, National Cancer Institute
Multiresolution Analysis of 2-D Electrophoretic Gel Images
Dr. Nicolas Nafati, Research Engineer, INSERM
Model Centric Data Integration and Visualization
Dr. Christophe Schilling, Chief Technical Officer, Genomatica, Inc.
Data Integration to Enable Drug Discovery: A Microarray Perspective
Dr. Soheil Shams, BioDiscovery, Inc.