Sequencing Data Analysis and Storage

 



Now-generation sequencing platforms are capable of generating gigabytes of data in a single sequencing run, leading to terabytes of data in a single experiment. Data storage, transfer, and analysis will therefore be the rate-limiting steps in turning this new sequencing data into knowledge. Sequencing Data Analysis and Storage convenes the engineers who are developing the sequencing platforms, the biological researchers who are designing and running the experiments, the biostatisticians who are analyzing and interpreting the data, and the software developers who are managing and storing the data. Each specialty provides a unique perspective, and all must be integrated into a cohesive, comprehensive team to decipher the sequencing data deluge.

MONDAY, MARCH 15, 2010

8:30 am Short Course Registration

 

Recommended Pre-Conference Short Course *

9:00 am – 12:00 pm

(SC2) DATA MANAGEMENT AND STORAGE:
THE NEXT HURDLE FOR NGS

Click here for the complete short course description.

*Separate registration required 

 

1:00 pm Conference Registration


MANAGEMENT AND STORAGE

2:00 Chairperson’s Remarks
David Dooling, Ph.D., Assistant Director, Genome Center, Washington University

2:05 FEATURED SPEAKER

Pergamum: Evolvable, Reliable, Energy-Efficient Disk-Based Archival Storage

Ethan Miller, Ph.D., Professor, Computer Science; Associate Director, Storage Systems Research Center, University of California, Santa Cruz

As the world moves to digital storage for archival purposes, there is an increasing demand for reliable, low-power, cost-effective, easy-to-maintain storage that can still provide adequate performance for information retrieval and auditing purposes. To address this challenge, we developed Pergamum, which stores data in a network of “bricks”, each of which contains a disk, low-power CPU, and flash memory. This talk will describe the Pergamum architecture and the approaches we are using to ensure that data is never lost due to device failure, that data can easily be located in the system, and that the system can evolve as devices and even technologies change.
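The brick-based design described above can be illustrated with a small sketch. This is not Pergamum's actual implementation; the `Brick` and `Archive` classes, the placement scheme, and all names are hypothetical simplifications of the ideas in the abstract: each brick pairs a disk with a flash-resident index, and every block is replicated across distinct bricks so one device failure never loses data.

```python
import hashlib

class Brick:
    """One storage 'brick' (hypothetical simplification).

    A real Pergamum brick pairs a disk with a low-power CPU and flash
    memory for metadata; here one dict stands in for the disk and
    another for the flash index used to locate and verify blocks.
    """
    def __init__(self, brick_id):
        self.brick_id = brick_id
        self.disk = {}    # block_id -> bytes
        self.index = {}   # block_id -> sha256 checksum

    def put(self, block_id, data):
        self.disk[block_id] = data
        self.index[block_id] = hashlib.sha256(data).hexdigest()

class Archive:
    """Places every block on `replicas` distinct bricks so that no
    single device failure loses data, and answers 'where is block X?'
    queries from the per-brick indexes."""
    def __init__(self, n_bricks, replicas=2):
        self.bricks = [Brick(i) for i in range(n_bricks)]
        self.replicas = replicas

    def store(self, block_id, data):
        # Deterministic placement: hash the id, then pick `replicas`
        # consecutive bricks (a stand-in for real placement policies).
        start = int(hashlib.sha256(block_id.encode()).hexdigest(), 16)
        for k in range(self.replicas):
            self.bricks[(start + k) % len(self.bricks)].put(block_id, data)

    def locate(self, block_id):
        return [b.brick_id for b in self.bricks if block_id in b.index]

    def read(self, block_id):
        # Return the first replica whose checksum still verifies.
        for b in self.bricks:
            data = b.disk.get(block_id)
            if data is not None and hashlib.sha256(data).hexdigest() == b.index[block_id]:
                return data
        return None

archive = Archive(n_bricks=8)
archive.store("run42/reads.bin", b"ACGT" * 1024)

# Simulate losing one device: the block survives on its replica.
lost = archive.locate("run42/reads.bin")[0]
archive.bricks[lost].disk.clear()
archive.bricks[lost].index.clear()
assert archive.read("run42/reads.bin") == b"ACGT" * 1024
```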

2:50 Information Technology in the Era of Next-Generation Sequencing

Eddy Navarro, M.B.A., Computer Systems Manager, Storage, J. Craig Venter Institute

This presentation will discuss how JCVI’s Information Technology department has developed solutions to address both the demanding storage and computational requirements involved in supporting a next-generation sequencing and genomic research organization. The talk will focus on the need to stay at the technical forefront while developing more cost-effective solutions in these tougher economic times. Topics will include storage, backup and disaster recovery, virtualization, and the cloud.

3:20 Networking Refreshment Break, Exhibit & Poster Viewing

 

4:00 UCSD’s Triton Resource – A Common High Performance Computing Platform for Next Generation Sequencing Projects

Ron Hawkins, Director of Industry Relations, San Diego Supercomputer Center

This presentation will provide an overview of high performance computing and storage systems at the San Diego Supercomputer Center and how they are being employed to support data storage and computational analysis for multiple next generation sequencing projects. Topics covered will include conventional computing clusters, large-memory symmetric multiprocessing (SMP) systems, a prototype computing system using solid state disk (SSD) or “flash” memory, and high performance storage systems. The presentation will discuss the potential benefits of a centralized computing infrastructure supporting multiple sequencing projects.

4:30 CloudBurst, Crossbow, and Contrail: Scaling Up Bioinformatics with Cloud Computing

Michael Schatz, M.S., Research Assistant, Center for Bioinformatics & Computational Biology, University of Maryland

One of the main challenges for computational biologists is creating efficient algorithms to match improvements in high throughput sequencing. Here we describe how CloudBurst and Crossbow use cloud computing for mapping and genotyping whole human genomes at deep coverage in an afternoon. We’ll also describe how our new program, Contrail, uses cloud computing to scale up de Bruijn graph construction and analysis for the assembly of large genomes from short reads.
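The de Bruijn graph construction mentioned above can be sketched on a single machine. This is not Contrail's code (Contrail distributes the same idea across a Hadoop cluster); the function names and toy reads below are hypothetical. Each k-mer in a read contributes one edge from its (k-1)-mer prefix to its (k-1)-mer suffix, and walking unambiguous edges recovers a contig.

```python
from collections import defaultdict

def build_graph(reads, k):
    """De Bruijn graph: every k-mer adds an edge from its
    (k-1)-mer prefix to its (k-1)-mer suffix."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

def walk(graph, start):
    """Follow unambiguous (out-degree 1) edges to spell out a contig."""
    contig, node = start, start
    while len(graph.get(node, ())) == 1:
        node = next(iter(graph[node]))
        contig += node[-1]
    return contig

# Three overlapping toy reads reassemble into one contig.
reads = ["ACGTC", "CGTCA", "GTCAG"]
graph = build_graph(reads, k=4)
print(walk(graph, "ACG"))  # prints "ACGTCAG"
```

Scaling this up is exactly where cloud computing helps: edge generation is embarrassingly parallel per read, which is why it maps naturally onto a MapReduce framework.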

5:00 Panel Discussion with Afternoon Speakers

5:30 – 7:00 Welcoming Reception in the Exhibit Hall

