The DREAM Project
Posted July 01, 2009
Computational biologists strive to "reverse engineer" the underlying networks of interactions between molecules by studying how molecular activities change as cells grow and respond to stimuli. Although many researchers claim success in applying their favorite techniques to their favorite problems, they rarely compare different techniques on a single, unfamiliar system.
According to Gustavo Stolovitzky of IBM and Andrea Califano of Columbia University, co-organizers of a recent conference at the Academy, the time is ripe for systematic evaluation of reverse engineering methods. Califano and Stolovitzky are spearheading a project they call DREAM, a Dialogue on Reverse Engineering Assessment Methods. On March 9 and 10, 2006, they brought together a diverse group of researchers to clarify the goals and procedures for the new project. The event featured an evening symposium followed by an all-day, closed-door workshop to discuss ideas.
Computational Biology & Medical Informatics
Gustavo Stolovitzky's research area at IBM.
DIMACS Workshop on Strategies for Reverse Engineering Biological Circuits
To be held on September 7 and 8, 2006, this will be the inaugural meeting for the DREAM project.
Multiscale Analysis of Genetic and Cellular Networks (MAGNet)
Based at Columbia University, one of the founding supporters of the DREAM project.
Other critical assessment projects and related sites
Biomolecular Interaction Network Database (BIND)
Unleashed Informatics offers commercial and open access versions of their databases.
Competitive Evaluation of Prediction Algorithms
A modeling competition organized to advance the algorithms and software for modeling chemical, biological, and medical data, with special emphasis on the prediction of physico-chemical properties and biological activities from molecular descriptors derived from the chemical structure. CoEPrA will also provide a reference database of modeling datasets that can be used to validate and compare new classification and regression algorithms.
Critical Assessment of Microarray Data Analysis
Aims to establish the state-of-the-art in microarray data mining.
Critical Assessment of Prediction of Interactions
A community-wide experiment on the comparative evaluation of protein-protein docking for structure prediction.
Database of Interacting Proteins (DIP)
The DIP database, based at UCLA, catalogs experimentally determined interactions between proteins.
The ENCODE Project: ENCyclopedia Of DNA Elements
An ongoing program sponsored by the National Human Genome Research Institute for testing and comparing existing methods to rigorously analyze a defined portion of the human genome sequence.
Genetic Analysis Workshop
A collaborative effort among genetic epidemiologists to evaluate and compare statistical genetic methods.
Genome Annotation Assessment Project
A program at the Berkeley Drosophila Genome Project that uses known sample genome regions in Drosophila melanogaster to obtain an in-depth and objective assessment of the current state of the art in gene and functional site predictions in genomic DNA.
Human Protein Reference Database (HPRD)
A centralized platform to visually depict and integrate information pertaining to domain architecture, post-translational modifications, interaction networks and disease association for each protein in the human proteome.
MIPS Mammalian Protein-Protein Interaction Database (MPPI)
A collection of manually curated protein-protein interaction data assembled from the scientific literature.
Molecular Interactions Database (MINT)
MINT focuses on experimentally verified protein interactions mined from the scientific literature. The curated data can be analyzed in the context of the high throughput data and viewed graphically.
Software for checking the stereochemical quality of a proposed protein structure.
Protein Structure Prediction Center
Hosts results of the CASP competitions.
A consortium of researchers at the Institute for Bioinformatics and Genomics, at the University of California Irvine, the Beckman Institute of the California Institute of Technology, and the bioengineering department at Johns Hopkins University that is assembling a database of cell-signaling pathways.
Tools for biological networks
A Collection of Artificial Gene Networks
A project by Pedro Mendes's group that generates artificial gene expression data.
Complex Pathway Simulator (Copasi)
A software application developed by the Mendes group at the Virginia Bioinformatics Institute and the Kummer group at EML Research for simulation and analysis of biochemical networks.
A platform for visualizing molecular networks and integrating them with other information.
A system for automatically extracting, analyzing, visualizing, and integrating molecular pathway data from the research literature.
A graphical tool for building logical representations of genetic regulatory networks.
Tool developed in Mark Gerstein's lab at Yale for comparing topological parameters of networks and comparing sub-networks.
Some vendors of high-throughput experimental tools
Manufacturer of gene chips.
Provider of high-density oligomer arrays.
Uses FACS-like techniques to sort multicellular organisms.
Suggested reading from the organizers
Basso, K., A. A. Margolin, G. Stolovitzky et al. 2005. Reverse engineering of regulatory networks in human B cells. Nat. Genet. 37: 382-390.
Davidson, E. H. & D. H. Erwin. 2006. Gene regulatory networks and the evolution of animal body plans. Science 311: 796-800.
Gardner, T. S., D. di Bernardo, D. Lorenz & J. J. Collins. 2003. Inferring genetic networks and identifying compound mode of action via expression profiling. Science 301: 102-105.
Rual, J. F., K. Venkatesan, T. Hao et al. 2005. Towards a proteome-scale map of the human protein-protein interaction network. Nature 437: 1173-1178.
Segal, E., M. Shapira, A. Regev et al. 2003. Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat. Genet. 34: 166-176.
Kryshtafovych, A., C. Venclovas, K. Fidelis & J. Moult. 2005. Progress over the first decade of CASP experiments. Proteins 61 Suppl 7: 3-7.
Moult, J. 2006. Rigorous performance evaluation in protein structure modelling and implications for computational biology. Philos. Trans. R. Soc. Lond. B Biol. Sci. 361: 453-458.
Moult, J. 2005. A decade of CASP: progress, bottlenecks, and prognosis in protein structure prediction. Curr. Opin. Struct. Biol. 15: 285-289.
Venclovas, C., A. Zemla, K. Fidelis & J. Moult. 2003. Assessment of progress over the CASP experiments. Proteins 53 Suppl. 6: 585-595.
Venclovas, C., A. Zemla, K. Fidelis & J. Moult. 2001. Comparison of performance in successive CASP experiments. Proteins Suppl. 5: 163-170.
Brazhnik, P., A. de la Fuente & P. Mendes. 2002. Gene networks: how to put the function in genomics. Trends Biotechnol. 20: 467-472.
Mendes, P., W. Sha & K. Ye. 2003. Artificial gene networks for objective comparison of analysis algorithms. Bioinformatics 19 Suppl 2: ii122-ii129.
Boulton, S. J., A. Gartner, J. Reboul et al. 2002. Combined functional genomic maps of the C. elegans DNA damage response. Science 295: 127-131.
Cusick, M. E., N. Klitgord, M. Vidal & D. E. Hill. 2005. Interactome: gateway into systems biology. Hum. Mol. Genet. 14 Spec No. 2: R171-R181.
Ge, H., Z. Liu, G. M. Church & M. Vidal. 2001. Correlation between transcriptome and interactome mapping data from Saccharomyces cerevisiae. Nat. Genet. 29: 482-486.
Giot, L., J. S. Bader, C. Brouwer et al. 2003. A protein interaction map of Drosophila melanogaster. Science 302: 1727-1736.
Gunsalus, K. C., H. Ge, A. J. Schetter et al. 2005. Predictive models of molecular machines involved in Caenorhabditis elegans early embryogenesis. Nature 436: 861-865.
Ito, T., T. Chiba, R. Ozawa et al. 2001. A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc. Natl. Acad. Sci. USA 98: 4569-4574.
Li, S., C. M. Armstrong, N. Bertin et al. 2004. A map of the interactome network of the metazoan C. elegans. Science 303: 540-543.
Reboul, J., P. Vaglio, J. F. Rual et al. 2003. C. elegans ORFeome version 1.1: experimental verification of the genome annotation and resource for proteome-scale protein expression. Nat. Genet. 34: 35-41.
Rual, J. F., K. Venkatesan, T. Hao et al. 2005. Towards a proteome-scale map of the human protein-protein interaction network. Nature 437: 1173-1178.
Scholtens, D., M. Vidal & R. Gentleman. 2005. Local modeling of global interactome networks. Bioinformatics 21: 3548-3557.
Sönnichsen, B., L. B. Koski, A. Walsh et al. 2005. Full-genome RNAi profiling of early embryogenesis in Caenorhabditis elegans. Nature 434: 462-469.
Uetz, P., L. Giot, G. Cagney et al. 2000. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403: 623-627.
Vidal, M. 2001. A biological atlas of functional maps. Cell 104: 333-339.
Vidal, M. 2005. Interactome modeling. FEBS Lett. 579: 1834-1838.
Vidal, M. 2006. Time for a human interactome project? The Scientist 20(3): 46. (subscription required)
Walhout, A. J., R. Sordella, X. Lu et al. 2000. Protein interaction mapping in C. elegans using proteins involved in vulval development. Science 287: 116-122.
Cheong, R., A. Bergmann, S. L. Werner et al. 2006. Transient IκB kinase activity mediates temporal NF-κB dynamics in response to a wide range of tumor necrosis factor-α doses. J. Biol. Chem. 281: 2945-2950.
Janes, K. A., J. G. Albeck, S. Gaudet et al. 2005. A systems model of signaling identifies a molecular basis set for cytokine-induced apoptosis. Science 310: 1646-1653.
Diego Di Bernardo
Bansal, M., G. Della Gatta & D. Di Bernardo. 2006. Inference of gene regulatory networks and compound mode of action from time course gene expression profiles. Bioinformatics 22: 815-822.
Di Bernardo, D., M. J. Thompson, T. S. Gardner et al. 2005. Chemogenomic profiling on a genome-wide scale using reverse-engineered gene networks. Nat. Biotechnol. 23: 377-383.
Bertone, P., V. Stolc, T. E. Royce et al. 2004. Global identification of human transcribed sequences with genome tiling arrays. Science 306: 2242-2246.
Borneman, A. R., J. A. Leigh-Bell, H. Yu et al. 2006. Target hub proteins serve as master regulators of development in yeast. Genes Dev. 20: 435-448.
Martone, R., G. Euskirchen, P. Bertone et al. 2003. Distribution of NF-κB-binding sites across human chromosome 22. Proc. Natl. Acad. Sci. USA 100: 12247-12252.
Zhu, H., M. Bilgin, R. Bangham et al. 2001. Global analysis of protein activities using proteome chips. Science 293: 2101-2105.
Edwards, A. M., B. Kus, R. Jansen et al. 2002. Bridging structural biology and genomics: assessing protein interaction data with known complexes. Trends Genet. 18: 529-536.
Jansen, R., H. Yu, D. Greenbaum et al. 2003. A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 302: 449-453.
Yu, H., X. Zhu, D. Greenbaum et al. 2004. TopNet: a tool for comparing biological sub-networks, correlating protein properties with topological statistics. Nucleic Acids Res. 32: 328-337.
Andrea Califano, PhD
Andrea Califano is a professor of biomedical informatics at Columbia University, where he leads several cross-campus activities in computational and systems biology. Califano is also codirector of the Center for Computational Biochemistry and Biosystems, chief of the bioinformatics division, and director of the Genome Center for Bioinformatics.
Califano completed his doctoral thesis in physics at the University of Florence, where he studied the behavior of high-dimensional dynamical systems. From 1986 to 1990, he was on the research staff in the Exploratory Computer Vision Group at the IBM Thomas J. Watson Research Center, where he worked on several algorithms for machine learning, including the interpretation of two- and three-dimensional visual scenes. In 1997 he became the program director of the IBM Computational Biology Center, and in 2000 he cofounded First Genetic Trust, Inc., to pursue translational genomics research and infrastructure-related activities in the context of large-scale patient studies with a genetic component.
Gustavo Stolovitzky, PhD
Gustavo Stolovitzky is manager of the Functional Genomics and Systems Biology Group at the IBM Computational Biology Center in IBM Research. The group is involved in several projects, including DNA chip analysis and gene expression data mining, the reverse engineering of metabolic and gene regulatory networks, modeling cardiac muscle, describing emergent properties of the myofilament, modeling p53 signaling pathways, and performing massively parallel signature sequencing analysis.
Stolovitzky received his PhD in mechanical engineering from Yale University and worked at The Rockefeller University and at the NEC Research Institute before coming to IBM. He has served as Joliot Invited Professor at Laboratoire de Mecanique de Fluides in Paris and as visiting scholar at the physics department of The Chinese University of Hong Kong. Stolovitzky is a member of the steering committee at the Systems Biology Discussion Group of the New York Academy of Sciences.
Diego di Bernardo, PhD
Diego di Bernardo works as a research scientist at the Telethon Institute of Genetics and Medicine (TIGEM) in Italy and as a lecturer at the European School of Molecular Medicine (SEMM). His research in computational biology places particular emphasis on the application of biomedical engineering and applied mathematics to physiology and molecular biology. Currently, he is working on the analysis of ncRNAs (RNA that does not code for any protein), the role of transcription factor binding sites, and the identification of gene regulatory networks through reverse engineering.
Di Bernardo received his PhD degree from the school of medicine of the University of Newcastle, United Kingdom; his thesis describes a novel computational model for the analysis of the electrocardiogram. From June 2001 to May 2002, he was a researcher at the Wellcome Trust Sanger Center in Cambridge. He went on to join the department of biomedical engineering of Boston University before becoming principal investigator at the Telethon Institute of Genetics and Medicine in Naples, Italy.
Mark Gerstein, PhD
Mark Gerstein is the Albert Williams Associate Professor of Biomedical Informatics at Yale University. His main research lies in the analysis of experimental data from areas in the biological sciences, including structural biology, genomics and proteomics. He also develops computational tools and resources for processing molecular sequences and structures. The lab has created a number of web servers for the generation of molecular motions, microarray data analysis, genomic DNA sequence tiling, large-scale tiling array experiments, annotation of pseudogenes, and collaborative structural genomics. More specifically, Gerstein's work includes three research foci: comparative genomics, expression analysis, and macromolecular geometry.
Gerstein received his PhD in biophysics and chemistry from Cambridge. He completed his postdoctoral training at Stanford University. Gerstein received multiple young investigator awards (from Navy & IBM, PhRMA, Donaghue, and Keck foundations).
Andre Levchenko, PhD
Andre Levchenko is an assistant professor of biomedical engineering at the Johns Hopkins University. His research centers on cell transduction and cell-cell communication, using both computational and experimental techniques.
After earning a PhD from Columbia University and working at Memorial Sloan-Kettering Cancer Center on cancer cell drug resistance, Levchenko was a postdoc at Caltech working on problems in computer science and cell biology. He joined Johns Hopkins in 2001. Levchenko serves as a reviewer for several scientific journals (including Science and Nature Genetics) and is a panel reviewer for the National Science Foundation and the National Institutes of Health.
Pedro Mendes, PhD
Pedro Mendes is an associate professor of biochemistry at Virginia Tech, where he is the principal investigator at the Biochemical Networks Modeling Group, a division of the Virginia Bioinformatics Institute. His research is centered broadly around computer simulation and analysis of biochemical networks. This work comprises three components: development of simulation software, modeling of gene expression in the context of metabolic networks, and bioinformatic support for metabolic profiling. He completed his PhD at the University of Wales Aberystwyth.
John Moult, PhD
John Moult directs a group focusing on the computational modeling of biological systems at the Center for Advanced Research in Biotechnology (CARB), one of the five centers in the University of Maryland Biotechnology Institute. He is also president of the Critical Assessment of Protein Structure Prediction (CASP) project. His research focuses on understanding the nature of protein folding pathways, the complexity of folding, and the role of chaos in folding. Moult earned his DPhil at the University of Oxford.
Mike Snyder, PhD
Mike Snyder is a professor at the Yale University Department of Molecular Biophysics and Biochemistry as well as at the Department of Molecular, Cellular, and Developmental Biology. He is also a member of the Yale Comprehensive Cancer Center.
In his laboratory, Snyder uses a combination of molecular, cellular, genetic, and genomic approaches to study cell structure and division in eukaryotes. His research focuses on four areas: control of the cell cycle, cell polarity and morphogenesis, chromosome segregation and the microtubule cytoskeleton, and large-scale analysis of the yeast genome.
Snyder received his PhD in biology from the California Institute of Technology and completed his postdoctoral training at Stanford University School of Medicine. He has been a member of multiple scientific advisory committees, such as the National Institutes of Health Study Section and the American Cancer Society. In addition, Snyder has served as editor-in-chief of Functional and Integrative Genomics and currently serves on the editorial board of several scientific journals.
Marc Vidal, PhD
Marc Vidal received his PhD in 1991 from Gembloux University (Belgium) for work performed at Northwestern University. He trained as a postdoctoral fellow at Massachusetts General Hospital Cancer Center from 1992 to 1996. He started his own research group in 1997 as a staff scientist and joined Dana-Farber Cancer Institute (DFCI) as an assistant professor in 2000. He is now director of the Center for Cancer Systems Biology at DFCI and associate professor of genetics in the department of Cancer Biology at Harvard Medical School.
Mike Yaffe, PhD
Mike Yaffe is the Howard S. and Linda Stern Associate Professor of Biology at the Center for Cancer Research and the division of biological engineering at the Massachusetts Institute of Technology. Research in his lab is aimed at understanding the regulation of signaling pathways that control the cell cycle. In particular, he studies how modifications of proteins are involved in transmitting environmental signals to the cell cycle machinery and uses basic principles of biochemistry to identify all members of a particular signaling pathway and place them in a systems-level network.
Yaffe received his PhD in biophysical chemistry as well as his medical degree from Case Western Reserve University. He then served as resident and fellow in general surgery and critical care medicine at Harvard University. Before joining the MIT faculty, he was a fellow at the Division of Signal Transduction and instructor in medicine and surgery at Harvard.
The DREAM project aspires to bring together experimental biologists and computational biologists to better explore the power and the limits of network modeling in biology. The participants in the first stage of planning for the DREAM project include the following, about half of whom attended the March 9-10, 2006, meeting at the New York Academy of Sciences:
- Gary Bader (Memorial Sloan-Kettering Cancer Center)
- Joel Bader (Johns Hopkins University)
- Diego Di Bernardo (Telethon Institute of Genetics and Medicine)
- Hamid Bolouri (Institute for Systems Biology)
- Harmen Bussemaker (Columbia University)
- Andrea Califano (Columbia University)
- Jim Collins (Boston University)
- Eric Davidson (California Institute of Technology)
- Tim Gardner (Boston University)
- Mark Gerstein (Yale University)
- Alexander Hartemink (Duke University)
- Trey Ideker (University of California, San Diego)
- Andre Levchenko (Johns Hopkins University)
- Pedro Mendes (Virginia Tech)
- John Moult (University of Maryland)
- Andrey Rzhetsky (Columbia University)
- Benno Schwikowski (Institut Pasteur)
- Eran Segal (Weizmann Institute of Science)
- Ron Shamir (Tel Aviv University)
- Mike Snyder (Yale University)
- Gustavo Stolovitzky (IBM)
- Marc Vidal (Harvard Medical School)
- Mike Yaffe (Massachusetts Institute of Technology)