Held concurrently with CHI's Data Visualization and Interpretation 
and immediately following CHI's Third Annual Microarray Data Analysis.

Effective communication leads to productivity, and well-designed databases can help achieve effective communication, but there is no quick and easy way to design and build databases in biological science. The complexity of biology and the lack of human consensus can create confusion, but with a well-designed database and a creative attitude, objectives can be met. Well-designed, well-integrated information systems can facilitate communication between researchers, which is paramount in the drug development decision making process. First, data must be considered, which includes information architecture, commonality, ontologies, and quality control. Second, integration, extraction, and compilation of data through the pipeline leads to clearer analysis of pre-clinical knowledge and to identification of drug candidates. Finally, understanding this consolidated information drives decisions for productive investments. This is a must-attend meeting for those working with databases, from architects to software engineers, and those working in drug discovery and development, including biologists, chemists, managers, analysts, and end-users.

Scientific Advisors:
Ms. Amy Dasch, Genzyme Corp.
Dr. Eric Neumann, Beyond Genomics

Wednesday, September 24

8:00am Pre-conference Short Course Tutorial Registration and Coffee

8:30-11:30 Pre-conference Short Course Tutorials (*Separate Registration Required)

Pre-conference Short Course Tutorials

COURSE ONE*

Integrating Visualization and Data Mining for Microarray Analysis
Dr. Georges Grinstein, Professor, Computer Science Department; Director, Institute for Visualization and Perception Research; Director, Center for Biomolecular and Medical Informatics, University of Massachusetts Lowell; and Founder and Director, Research & Development
This short course will provide an overview of visualization and data mining techniques, discuss how current systems integrate them, and examine the role of high-dimensional data. We will highlight the exploration process and provide various application examples. We will also discuss where visual analytic systems should be heading. Several demos will be presented.

Course Participants Will:

  • Gain a fundamental background in visualization
  • Understand the role of visualization in discovery
  • Have an overview of the different techniques and how they fit in with analysis
  • Understand the integration issues
  • Gain knowledge on what the current systems provide and how to compare systems

Who should attend?
Biologists, chemists, analysts, software developers, statisticians, bioinformaticians, and managers in discovery biology and drug development laboratories.

COURSE TWO*

Enterprise Database Integration for Researchers
Dr. William J. Pjura, President, Altionics, Inc.
Enterprise Database Integration for Researchers presents a visual modeling approach to identifying entity objects and packaging them into cohesive subject areas. The short course examines several publicly web-accessible chemical, pharmaceutical, and biological databases from the perspective of analysis and design criteria, focusing on identifying the common characteristics shared by the databases and their unique characteristics. These common and unique characteristics are identified and packaged into cohesive subject areas and remodeled into extensible and adaptive enterprise database architectures. Subsequently, the eXtensible Markup Language (XML) is introduced, and existing chemical, biological, and pharmaceutical XML-based vocabularies are reviewed. The importance of XML in the integration of data from disparate sources and the design of XML schema to facilitate the integration of these databases are examined, and the implementation of the XML schema is demonstrated.

Course Prospectus:
Course participants will have the opportunity to apply database analysis and design concepts to the analysis of several publicly web-accessible chemical and biological databases. Attendees will learn how to identify the common characteristics shared by the databases and to identify their unique characteristics. They will see how the concept of packaging cohesive objects is applied to designing a database architecture that supports the integration of apparently disparate databases. Subsequently, the eXtensible Markup Language (XML) will be introduced and the existing chemical, biological, and pharmaceutical XML-based vocabularies will be reviewed. The importance of XML in data integration from seemingly disparate sources and the design of XML schema to facilitate integration of these disparate databases will be examined. Finally, the implementation of the XML schema will be demonstrated.

Who should attend?
This course is recommended for researchers who are responsible for experimental design, are familiar with basic database analysis and design concepts, and have an ongoing need to effectively integrate and analyze data from internal and external databases.

*Separate Registration Required

11:00am - 12:00pm Conference Registration and Poster Set-up

11:30am - 12:30pm Luncheon Workshop

"Data Integration at Amgen"
Presented by Mark Jury, Amgen

1:00pm Chair's Opening Remarks
Dr. Georges Grinstein

 

Joint Kick-off Keynotes

1:10 Mining the Biomedical Literature using Semantic Analysis and Natural Language Processing Techniques
Dr. Ronen Feldman, Assistant Professor, Mathematics and Computer Science Department, Israel's Bar-Ilan University and Chief Scientist, Clearforest, Ltd.
The information age has made it easy to store large amounts of data electronically. The proliferation of documents available on the web, on corporate intranets, on newswires and elsewhere is overwhelming. Search engines only exacerbate this overload problem by making more and more documents available in a matter of a few keystrokes. This information overload is directly mirrored in the biomedical field, where scientific publications and other forms of text-based data are produced at an unprecedented rate. Text mining is the combined, automated process of analyzing unstructured, natural language text in order to discover information and knowledge that are typically difficult to retrieve. In this paper, we focus on using text mining as it applies to the biomedical literature. In particular, we are interested in finding relationships among genes, proteins, drugs and diseases, to assist in explaining and predicting complex biological processes. We will describe the LitMiner™ system that we have developed for this purpose; in particular, we will focus on the KDD CUP 2002, which serves as a formal evaluation of our system.

1:50 Integromics in Drug Discovery: Practical Tools for Integrating Genomics, Proteomics, Bioinformatics, and Chemoinformatics
Dr. John N. Weinstein, Senior Research Investigator, Laboratory of Molecular Pharmacology, National Cancer Institute, National Institutes of Health
After microarray experiments (or other "omic" studies) have been done in the pharmaceutical context, one's first task is to analyze the data statistically. But that leaves open the Big Question: what do the results mean biologically and pharmacologically? A number of practical approaches and computational tools for addressing that question and integrating different types of data will be discussed. Included is a set of program packages available from our group and collaborators through http://discover.nci.nih.gov: MedMiner, MatchMiner, GoMiner, CIMminer, and LeadScope/LeadMiner.

2:30 Poster and Exhibit Viewing, Refreshment Break

 

ONTOLOGIES

3:15 Ontologies in Breast Cancer: Concepts vs Words
Dr. Michael Liebman, Director, Computational Biology and Biomedical Informatics, Professor, Cancer Biology, Abramson Cancer Center of the University of Pennsylvania
Ontologies are correctly defined as hierarchies of concepts but are frequently applied to mean controlled syntax, database schemas, semantic networks or thesauri. In using an ontological approach to extract knowledge about disease progression and disease presentation, including co-morbidities, we have extended the approach of ontology construction to incorporate critical temporal domains. Towards this goal, we have applied LexiMine (SPSS) as a method for syntactical analysis of free text to establish the value in the analysis of full articles versus abstracts in knowledge extraction.

3:45 MedScan - Automatic Text Mining and Data Extraction System
Dr. Ilya Mazo, President, Ariadne Genomics, Inc.
MedScan uses natural language processing to extract information about proteins, small molecules and their interactions from PubMed abstracts. The system performance and the role of ontology and efficient term recognition will be discussed. The knowledge management system that integrates MedScan and gene expression analysis tools will be presented.

4:15 Opportunities for Text Mining, supported by Ontology-Based Knowledge Assembly Description
Mr. Chris Sole, Director, Technical Liaison, Jarg Corp./SemanTx Life Sciences, Inc.
Curation costs for medical text-bases are escalating rapidly, and manual efforts to maintain up-to-date indexing have significant built-in biases. This presentation will review how ontology-based text mining applications offer a clearer path for linguistic recognition than simple word-match or synonym-based approaches. The need is clear right through the product lifecycle, from discovery to clinical trials, treatment and payer claims analysis. The technology should form the basis for ongoing knowledge management processes in organizations.

4:45 Flexible IT Structures For Ontology Networking
Mr. John Wilbanks, Director, Genstruct
Most ontologies and taxonomies are "hard-coded": a user predefines the relationships amongst the entities in the ontology (for example, father-son defines the relationship between two men in a family ontology). This hard-wiring creates trouble in science, where relationships are multivariate and change in response to new data. Incellico's CELL technology allows users to construct ontologies (or topic maps, data models, taxonomies) "on the fly" out of semantic relationships, creating a flexible ontology network that can grow and respond to new information, regardless of data format, type or source.

5:15 Panel Discussion

5:45-6:45 Networking Reception (hosted by Cambridge Healthtech Institute)

 

Thursday, September 25

7:30am Coffee and Technology Workshop (Sponsorship Available)

 

INTEGRATING DIVERSE DATA SOURCES

8:30 Chair's Remarks
Dr. Donald Jackson, Senior Research Investigator, Applied Genomics, Bristol-Myers Squibb

8:35 Improving the Drug Discovery Process Through Information Technology Approaches
Dr. Herschel J.R. Weintraub, Principal Consultant, IBM Life Sciences
The pharmaceutical and biotech industry is facing unprecedented challenges. Enormous amounts of data must be collected and analyzed. Sources of these data include public sources (e.g. genomic data), in-house instrumentation and robotics, in-house laboratories, and the literature. Timely access to these data by scientists, management, and others, and the ability to gain knowledge from them, are essential in order to remain competitive. An assessment of the issues faced by the industry and possible solutions will be presented.

9:05 Intelligent Integration of Heterogeneous Bioinformatics Data Sources
Dr. David Silberberg, Senior Computer Scientist, Research and Technology Development Center, Johns Hopkins University Applied Physics Lab
We present an approach to integrating distributed and heterogeneous bioinformatics databases based on simplified query specifications, model-based automated query formulation, and ontology inference techniques. This approach simplifies the query formulation task for users and applications, simplifies the data source enrollment task for data managers, and simplifies the system integration task for the federated system managers. We will present the results of a seamless integration of RefSeq, OMIM (Online Mendelian Inheritance in Man) and Ensembl, which demonstrates new levels of data integration potential for biologically relevant access to all available information about genes, proteins and sequences.

9:35 Bioinformatics: Not Just for Sequences Anymore
Dr. Donald Jackson
Drug discovery bioinformatics requires integrating multiple types of information (nucleotide and protein sequences, mRNA and protein expression measurements, model organism data, alternative splicing, single nucleotide polymorphisms, and more). Data from public and proprietary databases, alliances, and internal research must be combined into a unified picture that experimentalists can access. Our experience highlights tools and strategies for data integration and presentation that are applicable to pharmaceutical, biotech and academic researchers alike.

10:05am Poster and Exhibit Viewing, Refreshment Break

10:45 "The Open Science Alliance" - Accelerating Data Integration to Support Pharma Collaborative eR&D
Dr. Jeffrey Spitzner, Chief Scientific Officer, Rescentris
Optimizing R&D portfolio and pipeline decision support and management, from bench-top discovery science through validated systems in clinical trials and manufacturing to the executive suites, requires badly needed protocols and standards for data integration and interoperability. This talk will discuss the "Open Science Alliance(TM)", a focused activity of 15 Fortune 50 pharma and 30 other companies designed to achieve interoperability many years sooner than other methods would. This talk will cover strategies, new XML technologies and standards being created and considered for all key types of R&D data and their XML-based ontologies to support all aspects of Collaborative eR&D including decision support, and the final aims and timetables of the Alliance.

11:15 Informatics-Driven Chemistry in Action: The Success of Data Integration in a Discovery Research Organization
Dr. Ramesh Durvasula, Senior Director, Applied Science, Tripos, Inc.
The integration of data from multiple disciplines has been achieved at the technological level in many discovery organizations. However, the challenge facing these organizations today is how to leverage the tsunami of available data, to find relevant trends in this data, and to embed the data mining within an efficient business process. In this presentation, we will discuss the strategies used for seamlessly integrating our library design environment (enumeration, virtual screening, diversity, etc.) with our chemistry operations systems (inventory, reagent ordering, purification, etc.). With such a tight integration, we are able to achieve maximum productivity from our high-throughput and medicinal chemistry projects, including unprecedented success in various therapeutic projects. The technologies we have developed to support both our in-house discovery operation as well as many of our clients' operations will also be presented. Topics will include data integration, compound lifecycle management, and electronic lab journal strategies.

11:45 Panel Discussion

12:15 Luncheon

 

CLOSING PLENARY SESSION:
EFFECTIVE DATA MANAGEMENT FOR DRUG DISCOVERY


1:45pm Chair's Remarks
Dr. Georges Grinstein

1:50 Information Pathways in Pharmaceutical R&D
Dr. Otto Ritter, Associate Director, Bioinformatics, Enabling Science and Technology, AstraZeneca R&D Boston
Molecular pathways are useful models for representing biological processes at the cellular level. Pathways are usually represented as graphs, where molecules or molecular complexes are the nodes, and molecular interactions are the edges. In one step of generalization, where we take any biomedical entities as nodes and any general relationships as edges, we get an associative network as a representation of biomedical knowledge. If we take one more step in this generalization process and include any information assets and any transformations or associations, we get information pathways as models representing (pharmaceutical) R&D processes. It's useful to know that at all three levels of interpretation (molecular, biomedical, R&D), we can actually re-use the same software components for data management, analysis, and visualization.

2:10 Standards to Enable Information Integration
Dr. David Benton, Director, Knowledge Integration and Discovery Systems, Informatics and Knowledge Management, R&D IT, GlaxoSmithKline
Information integration is widely acknowledged to be both a great need and a major challenge facing pharmaceuticals R&D. The principal obstacle to information integration systems is heterogeneity at virtually all levels of the pharma information stack. This talk will address the sources of this heterogeneity, question the premise that any technical solution can solve the problems posed by heterogeneity, and propose that any non-trivial information integration will require shared ontologies and domain models. It will also address whether such shared ontologies and domain models can be: (1) developed entirely in-house; (2) acquired from vendors; or (3) developed as open standards by the R&D community.

2:40 Poster and Exhibit Viewing; Refreshments and Desserts Served

3:30 Microarray Gene Expression Analysis and Data Integration
Dr. Heng Dai, Senior Scientist, Bioinformatics, Drug Discovery, Johnson & Johnson PRD
Data integration and interpretation remain among the major challenges in microarray data analysis. It is essential to integrate data from diverse resources including molecular annotation, expression, pathway, disease and pharmacological databases. We have developed methods and tools to effectively integrate this data into a central database, which can be easily accessed through a web interface. This data can be further analyzed with data mining and visualization tools, such as Omniviz, to identify novel interactions and associations between medical, biological and chemical entities.

4:00 New Challenge for Drug Discovery Informatics: Information and Knowledge Integration
Dr. Abdel Laoui, Head, Chemoinformatics, Aventis Pharmaceuticals
The new issue in the pharmaceutical industry is to develop new drug discovery informatics solutions designed to deal with the challenges emerging in today's data-rich environment: challenges arising from the volume, diversity, and variable quality of data being generated. Data pipelining has emerged as a practical technology for accelerating the discovery process. The companies that will be successful will be those that can bridge the gap between bioinformatics and chemoinformatics quickly. At Aventis we have implemented a new paradigm in drug discovery informatics which we call Chemical Biology. We will present this integrated approach, which is multidisciplinary and knowledge-based, along with the corresponding new enabling technology.

4:30 Speaker to be announced

5:00 Panel Discussion

5:30 Close of Conference


 
There are many sponsorship opportunities for your company to maximize its exposure and influence. They include conference-specific sponsorships, technology workshops, networking receptions, delegate bags, etc. We are also ready to work with you in customizing a solution to meet your specific marketing objectives. Make a lasting impression by taking advantage of these marketing tools.

For exhibit and sponsorship information, please contact Carol Dinerstein at 781-972-5471 or dinerstein@healthtech.com.

TRAVEL INFORMATION
Special Airline Discounts Available
Special Zone and Discount Fares have been established for this conference with United Airlines. Please call United Airlines Meeting Reservation Desk at 800-521-4041 and reference ID#579YS.

HOTEL INFORMATION
Wyndham Baltimore Inner Harbor
101 W. Fayette Street
Baltimore, Maryland 21201
T: 410-752-1100 • F: 410-752-0832
Cut-off date: August 29, 2003
$179 single/$199 double occupancy
Please call the hotel directly to make your room reservation. Identify yourself as a Cambridge Healthtech Institute conference attendee to receive the reduced room rate. Reservations made after the cut-off date or after the group room block has been filled (whichever comes first) will be accepted on a space-and-rate-availability basis. Rooms are limited, so please book early.

CALL FOR POSTERS
Cambridge Healthtech Institute encourages attendees to gain further exposure by presenting their work in the poster sessions. Please fill out the registration form, with the poster title and primary author. To ensure inclusion in the conference CD, a one-page summary must be submitted and registration must be paid in full by August 22, 2003.

 

Initial Listing of Poster Presentations

Extending MicroArray Explorer with R
Dr. Peter F. Lemkin, National Cancer Institute

Multiresolution Analysis of 2-D Electrophoretic Gel Images
Dr. Nicolas Nafati, Research Engineer, INSERM

Model Centric Data Integration and Visualization
Dr. Christophe Schilling, Chief Technical Officer, Genomatica, Inc.

Data Integration to Enable Drug Discovery: A Microarray Perspective
Dr. Soheil Shams, BioDiscovery, Inc.

 

 

 


Phone: 781-972-5400, Fax:  781-972-5425
Email: chi@healthtech.com