
Next-Generation Sequencing Data Analysis - Day 2



Day 1 | Day 2 | Day 3 | Download Brochure 

Tuesday, August 14

7:30 am Breakfast Presentation (Sponsorship Opportunity Available)

8:30 Java and Jive Discussion Groups

Grab a cup of coffee and join one of the discussion groups. These are moderated discussions with brainstorming and interactive problem solving, allowing conference participants from diverse areas to exchange ideas and experiences and to develop future collaborations around a focused topic.



Assembly’s Role in Gene Annotation 

9:30 Chairperson’s Remarks

Herve Tettelin, Ph.D., Associate Professor, Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland

9:35 Improving Pan-Genome Annotation Using Whole Genome Multiple Alignment

Herve Tettelin, Ph.D., Associate Professor, Department of Microbiology and Immunology, Institute for Genome Sciences, University of Maryland

Rapid annotation and comparison of genomes from multiple isolates (pan-genomes) are becoming commonplace due to advances in sequencing technology. Genome annotations can contain inconsistencies and errors that hinder comparative analysis even within a single species. We developed Mugsy-Annotator, which identifies orthologs and evaluates annotation quality in prokaryotic genomes using whole genome multiple alignment. Mugsy-Annotator identifies anomalies in annotated gene structures, including inconsistently located translation initiation sites and genes that appear disrupted because of draft genome sequencing or because they are pseudogenes.

10:05 Written in the Genome: Assembly Consequences for Gene Annotation

Liliana Florea, Ph.D., Assistant Professor, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine

As a growing number of species are being sequenced and assembled to various degrees of completion, a key question is: how does the quality of the assembled sequence affect gene annotations? We found that even between successive high-quality draft assemblies there were significant changes in the annotations, due primarily to genome mis-assembly events and local sequence variation. We illustrate with a comparison of annotations in assemblies produced with different methods, from reads generated by diverse sequencing technologies, and undergoing various amounts of curation.

10:35 Coffee Break in the Exhibit Hall with Poster Viewing

11:15 Identifying Genomic Features by Chromatin Architecture

Michael J. Buck, Ph.D., Assistant Professor, Department of Biochemistry, The State University of New York at Buffalo

The accurate identification and annotation of genomic features is key to understanding the human genome. However, finding these locations within the genome can be a laborious and expensive undertaking requiring site-specific assays. Even more difficult is identifying entirely new classes of genomic features. To facilitate identification and characterization of new classes of genomic features, we have developed chromatin architecture alignment and search tools. We have successfully applied these tools to various datasets from multiple genomes and will present our results for insulators, enhancers, transcriptional start sites, and origins of replication.

11:45 Evaluating NGS Factors Affecting Prediction Performance, and Exploring Feasibility of NGS Meta Analysis

May Dongmei Wang, Ph.D., Associate Professor, Biomedical Engineering, Electrical and Computer Engineering, Hematology and Oncology, Winship Cancer Institute, Georgia Institute of Technology

We have been developing computing algorithms and infrastructure to evaluate which of the various parameters involved in NGS data analysis may affect prediction performance. In addition, we have been exploring whether meta-analysis can be used in NGS data analysis.

12:15 pm A Community-Inspired Collaborative and Scalable Data Technology Platform

Andreas Sundquist, Ph.D., CEO & Co-founder, DNAnexus

DNAnexus’ instant genomics data and analysis center is a novel, open, and flexible cloud-based platform for DNA data management, analysis, and visualization. It allows collaboration in a secure environment and the integration of your own and external tools into custom workflows to make any DNA data analysis a success.

12:30 Luncheon Presentation (Sponsorship Opportunity Available) or Lunch on Your Own


Assembly and Alignment’s Role in Analysis 

2:00 Chairperson’s Remarks

Tom Schwei, Vice President and General Manager, DNASTAR, Inc.

2:05 Assessing and Improving the Reliability of Short Read Alignments

Matthew Ruffalo, Research Scientist, Department of Electrical Engineering and Computer Science, Case Western Reserve University

As the developers of alignment software optimize their algorithms with respect to various considerations, the relative merits of different software packages remain unclear. However, for researchers who generate and use NGS data for their specific projects, an important consideration is choosing the software that is most suitable for their application. Here, we identify criteria to evaluate the performance of alignment algorithms, compare various algorithms with respect to these criteria, and discuss how their output can be used to accurately assess the reliability of mappings.

2:35 Performance Metrics in the Cloud: A Storage Perspective

Sanjay Joshi, Solutions Architect, Life Sciences, EMC Isilon Storage Division

As the process of assembling and interpreting genomic data moves to the Cloud, understanding its performance dictates operational costs, resources and planning for growth within a storage context. We present the most common use cases for NGS on the Cloud and their performance guidelines and best practices.

3:05 Improving Structural Variation Analysis with Next-Generation DNA Sequencing Data

Anna Ritz, Research Scientist, Department of Computer Science & Center for Computational Molecular Biology, Brown University  

Detecting structural variants (SVs) in mammalian genomes using next-generation DNA sequencing data is complicated by the fact that many SVs lie in repetitive regions.  Sequence reads from such regions have numerous possible alignments to the reference genome, and determining whether ambiguous alignments result from an SV is challenging.  We describe probabilistic algorithms for SV detection that combine multiple signals indicating the presence of a variant.  We demonstrate the results of our algorithms on Illumina and Pacific Biosciences sequencing data.

3:35 Towards a Human Clinical Grade Genome – EdgeBio and The Archon Genomics X PRIZE presented by Medco

Justin Johnson, Director, Bioinformatics, EdgeBio

The Archon Genomics X PRIZE presented by Express Scripts created a “Validation Protocol” that is helping to define for the first time what it means to have a complete and accurate “medical grade” whole human genome sequence. This presentation will describe the Validation Protocol in detail.

3:50 Refreshment Break in the Exhibit Hall with Poster Viewing

4:30 How Basic Assembly Parameters Affect de novo and Reference-Guided Next-Gen Sequence Assembly

Matthew R. Keyser, Next-Gen Applications Scientist, DNASTAR, Inc.

Most sequence assembly programs provide adjustable parameters that can dramatically affect assembly quality and commonly used benchmarks like assembly time and contig size. However, assemblies that are optimized for speed and contig size often show a decrease in assembly quality that may impede downstream analysis. This discussion will investigate how adjusting basic parameters affects assembly quality and performance in both de novo and reference-guided projects, and the need to use assembly software that provides a proper balance.

5:00 Assembly and Alignment Panel Discussion with Afternoon Speakers

Moderator: Tom Schwei, Vice President and General Manager, DNASTAR, Inc.


Discussion Questions to be Addressed:

1. Which data analysis problems are essentially solved today and which still remain to be solved?

2. What data assembly and analysis work is best done by bioinformaticians and what work is better done by the end user life scientist?

3. What is it going to take to get next-gen sequencing truly ready for clinical applications? 

4. How will nanopore technologies affect assembly and alignment?

5. Why are people using multiple sequencing platforms to solve their problems and how do they approach assembly and alignment when using data from more than one sequencing platform?

6. Is there a time when de novo assembly algorithms and approaches will be so good that people no longer need to use a template or reference for assembly?  If so, what will it take to get there?

7. What guidance or recommendations are there for scientists who are new to assembly and alignment challenges as to tools to consider, hazards to watch out for, or other lessons you’ve learned that they can potentially benefit from?

5:30 Close of Day and Dinner Short Course Registration

5:45-8:45 Dinner Short Courses

