1000 Genomes Project to unleash the power of pharmacogenomics

23 Jan 2008 | News
An international research consortium has announced plans for a $50 million effort to create the most detailed and medically useful map of human genetic variation to date.

An international research consortium announced plans for the 1000 Genomes Project, a $50 million effort to sequence the genomes of at least a thousand people and create the most detailed and medically useful map of human genetic variation to date.

The map will enable researchers to relate genetic variation to particular diseases, laying the foundations for pharmacogenomics, in which people will routinely have their genomes sequenced to predict individual risk of disease and response to drugs.

The 1000 Genomes Project would have been unthinkable only two years ago, according to Richard Durbin, of the Wellcome Trust Sanger Institute in Cambridge, UK, one of the members of the consortium.

“Today, thanks to amazing strides in sequencing technology, bioinformatics and population genomics, it is within our grasp. So, we are moving forward to examine the human genome at a level of detail that no one has done before, expanding and accelerating efforts to find more of the genetic factors involved in human health and disease.”

Any two humans are more than 99 per cent the same at the genetic level: the small fraction of genetic material that varies among people is expected to help explain individual differences in susceptibility to disease, response to drugs or reaction to environmental factors.

Using recently developed catalogues of human genetic variation, such as the HapMap and the Wellcome Trust Case Control Consortium (WTCCC), researchers already have discovered more than 100 regions of the genome that contain genetic variants associated with susceptibility to common human diseases such as diabetes, coronary artery disease, prostate and breast cancer, rheumatoid arthritis, inflammatory bowel disease and age-related macular degeneration.

However, these studies have to be followed up with costly and time-consuming DNA sequencing to help pinpoint the precise causative variants. The new map will enable researchers to zero in more quickly on disease-related genetic variants, speeding efforts to use genetic information to develop new strategies for diagnosing, treating and preventing common diseases.

Greater sensitivity

This new project will increase the sensitivity of disease discovery efforts across the genome fivefold and within gene regions at least tenfold. Current methods can detect rare variants that have a significant consequence, such as those that cause cystic fibrosis and are studied in affected families, or relatively common variants, such as those described in 2007 by the WTCCC, many of which have weak effects on common disease.

“Between these two types of genetic variants - very rare and fairly common - we have a significant gap in our knowledge,” said David Altshuler, of the Broad Institute of Massachusetts Institute of Technology (MIT) and Harvard University in Cambridge, MA, who is the consortium's co-chair and was a leader of the HapMap Consortium. “The 1000 Genomes Project is designed to fill that gap, which we anticipate will contain many important variants that are relevant to human health and disease.”

Importantly, the 1000 Genomes Project will map not only the single-letter differences in DNA (single nucleotide polymorphisms or SNPs), but also structural variants - rearrangements, deletions or duplications of segments of the human genome. The importance of these variants has become increasingly clear in the past 18 months from the Wellcome Trust Sanger Institute’s Copy Number Variation Project and similar research, which show that structural variants may play a role in susceptibility to certain conditions, such as mental retardation and autism.

The project depends on large-scale implementation of several new sequencing platforms. Using standard DNA sequencing technologies, the effort would cost more than $500 million. But leaders of the 1000 Genomes Project expect the costs to be far lower - in the range of $30 million to $50 million. The Project consists of pilot and production phases. In year one, three pilot projects will determine how to produce the project’s detailed map of human genetic variation most efficiently and cost-effectively.

Two genomes a day

The data produced by the 1000 Genomes Project during its two-year production phase, on average equivalent to more than two human genomes every 24 hours, will pose a major challenge for leading experts in the fields of bioinformatics and statistical genetics.

“The scale of this project is immense. At 6 trillion DNA bases, the 1000 Genomes Project will generate 60-fold more sequence data over its three-year course than have been deposited into public DNA databases over the past 25 years,” said Gil McVean, of the University of Oxford in England, one of the co-chairs of the consortium’s analysis group. “In fact, when up and running at full speed, this project will generate more sequence in two days than was added to public databases for all of the past year.”
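As a rough check on these figures, the short Python sketch below divides the 6 trillion bases quoted above by the length of the two-year production phase. It rests on two assumptions that are not stated in the article: that essentially all of the sequence is generated during the production phase, and that one human genome is about 3 billion bases long. The names used are illustrative only.

```python
# Back-of-the-envelope check of the "two genomes a day" figure.
TOTAL_BASES = 6 * 10**12           # 6 trillion bases, as quoted in the article
PRODUCTION_DAYS = 2 * 365          # two-year production phase, as stated in the article
GENOME_BASES = 3 * 10**9           # ASSUMPTION: ~3 billion bases per human genome


def average_daily_output(total_bases: int, days: int) -> float:
    """Average number of bases sequenced per day over the given period."""
    return total_bases / days


bases_per_day = average_daily_output(TOTAL_BASES, PRODUCTION_DAYS)
genomes_per_day = bases_per_day / GENOME_BASES

print(f"Bases per day: {bases_per_day:.2e}")
print(f"Genome equivalents per day: {genomes_per_day:.1f}")
```

Under those assumptions the average comes out at a little over 8 billion bases a day, which is in line with the "more than two human genomes every 24 hours" quoted for the production phase.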

The data will be held by and distributed from the European Bioinformatics Institute near Cambridge, UK, and the National Center for Biotechnology Information in the USA.

The 1000 Genomes Project will use samples from volunteer donors who gave informed consent for their DNA to be analysed and placed in public databases. The first thousand samples for the Project will come from those used for the HapMap and from additional samples in the extended HapMap set, which used the same collection processes. Populations from Africa, Asia, America and Europe are included.

“The scale of the 1000 Genomes Project is ambitious, but it is essential for building on the important work carried out since the Human Genome Project,” said Alan Schafer, Head of Molecular and Physiological Sciences at the Wellcome Trust. “It is clear that as humans, we differ from each other genetically by only a small fraction, yet this is enough to cause variation in human health and disease. By studying this many people, we aim to generate a comprehensive catalogue of variation that will facilitate identification of the disease related variation.”

