Educated Guesses: The Second Dialogue on Reverse Engineering Assessment and Methods (DREAM)

Reported by
Don Monroe

Posted February 27, 2008


On December 3 and 4, 2007, the New York Academy of Sciences hosted the second meeting of the Dialogue on Reverse-Engineering Assessment and Methods (DREAM), which the Academy has nurtured from its inception. This ongoing process aims to assess the ability of scientists—and their computer servants—to infer networks from experimental data, by comparing their predictions to "gold-standard" networks whose structure is thought to be known.

The conference also featured plenary and invited talks, as well as contributed talks and posters, illuminating various aspects of the reverse-engineering challenge.

Web Sites

DREAM2 Challenges
This Web site includes the original description and the blinded data for the five challenges, and now also the gold-standard networks and the competition results.

The ENCODE Project
An Encyclopedia of DNA elements.

The ENFIN network
This network is committed to providing a Europe-wide integration of computational approaches in systems biology.

Tools & Databases

An interactive tool for building, visualizing, and simulating genetic regulatory networks from the Institute for Systems Biology and the California Institute of Technology.

A modeling tool of biochemical networks from the Systems Biology Institute.

An open-source bioinformatics software platform for visualizing molecular interaction networks and integrating these interactions with gene expression profiles and other state data.

An open source MATLAB toolbox for managing, transforming, visualizing, and modeling data, in particular the high-throughput data encountered in Systems Biology.

A chemical kinetics stochastic simulation software package from the Institute for Systems Biology.

Gene-expression analysis and visualization software from Tel Aviv University.

Information hyperlinked over proteins, from Memorial Sloan-Kettering Cancer Center.

Innate Immune Database (IIDB)
A repository of genomic annotations and experimental data for over 2000 genes associated with immune response behavior in the mouse genome, from the Institute for Systems Biology.

MetaReg combines formalization of existing qualitative models with probabilistic modeling and integration of high throughput experimental data, from Tel Aviv University.

MouseFunc I
A critical assessment of quantitative gene function assignment from genomic datasets in M. musculus.

A method for predicting in vivo kinase-substrate relationships, that augments consensus motifs with context for kinases and phosphoproteins.

Pathway Commons
A point of access to public biological pathway information maintained by Memorial Sloan-Kettering Cancer Center and the University of Toronto.

A database of phosphorylation sites.

From the Institute for Systems Biology, enables the analysis of DNA sequences using mouse-specific position weight matrices.

A Process Modeling Tool from Magdeburg for constructing and manipulating complex technical and biological systems.

A computational model of mechanisms of transcriptional regulation in E. coli from Universidad Nacional Autónoma de México.

Search tool for the retrieval of interacting genes/proteins from the European Molecular Biology Lab.

An open, public space for content editing dedicated to biological pathways.

Journal Articles

Regulating transcription

Ambesi-Impiombato A, Bansal M, Liò P, di Bernardo D. 2006. Computational framework for the prediction of transcription factor binding sites by multiple data integration. BMC Neurosci. 7 Suppl 1:S8 Full Text

Birney E, Stamatoyannopoulos JA, Dutta A, et al. 2007. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447: 799-816. Full Text

Borneman AR, Leigh-Bell JA, Yu H, et al. 2006. Target hub proteins serve as master regulators of development in yeast. Genes Dev. 20: 435-448. Full Text

Borneman AR, Gianoulis TA, Zhang ZD, et al. 2007. Divergence of transcription factor binding sites across related yeast species. Science 317: 815-819.

Faith JJ, Hayete B, Thaden JT, et al. 2007. Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol. 5: e8. Full Text

Gat-Viks I, Shamir R. 2007. Refinement and expansion of signaling pathways: the osmotic response network in yeast. Genome Res. 17: 358-367. Full Text

Gat-Viks I, Tanay A, Raijman D, Shamir R. 2006. A probabilistic methodology for integrating knowledge and experiments on biological networks. J. Comput. Biol. 13: 165-181.

Korbel JO, Urban AE, Affourtit JP, et al. 2007. Paired-end mapping reveals extensive structural variation in the human genome. Science 318: 420-426.

Steinfeld I, Shamir R, Kupiec M. 2007. A genome-wide analysis in Saccharomyces cerevisiae demonstrates the influence of chromatin modifiers on transcription. Nat. Genet. 39: 303-309.

Van Deun K, Marchal K, Heiser WJ, et al. 2007. Joint mapping of genes and conditions via multidimensional unfolding analysis. BMC Bioinformatics 8: 181. Full Text

Beyond transcription

Chen L, Vitkup D. 2006. Predicting genes for orphan metabolic activities using phylogenetic profiles. Genome Biol. 7: R17. Full Text

Eungdamrong NJ, Iyengar R. 2007. Compartment-specific feedback loop and regulated trafficking can result in sustained activation of Ras at the Golgi. Biophys. J. 92: 808-815. Full Text

Fuhrer T, Chen L, Sauer U, Vitkup D. 2007. Computational prediction and experimental verification of the gene encoding the NAD+/NADP+-dependent succinate semialdehyde dehydrogenase in Escherichia coli. J. Bacteriol. 189: 8073-8078.

Linding R, Jensen LJ, Ostheimer GJ, et al. 2007. Systematic discovery of in vivo phosphorylation networks. Cell 129: 1415-1426.

Ma'ayan A, Jenkins SL, Neves S, et al. 2005. Formation of regulatory patterns during signal propagation in a Mammalian cellular network. Science 309: 1078-1083.

Ptacek J, Devgan G, Michaud G, et al. 2005. Global analysis of protein phosphorylation in yeast. Nature 438: 679-684. (PDF, 456 KB) Full Text

Rangamani P, Iyengar R. 2007. Modelling spatio-temporal interactions within the cell. J. Biosci. 32: 157-167. (PDF, 143 KB) Full Text

Network inference from dynamics

Albeck JG, MacBeath G, White FM, et al. 2006. Collecting and organizing systematic sets of protein data. Nat. Rev. Mol. Cell Biol. 7: 803-812.

Aldridge BB, Burke JM, Lauffenburger DA, Sorger PK. 2006. Physicochemical modelling of cell signalling pathways. Nat. Cell Biol. 8: 1195-1203.

Bansal M, Belcastro V, Ambesi-Impiombato A, di Bernardo D. 2007. How to infer gene networks from expression profiles. Mol. Syst. Biol. 3: 78. Full Text

Bansal M, di Bernardo D. 2007. Inference of gene networks from temporal gene expression profiles. IET Syst. Biol. 1: 306-312.

Cline MS, Smoot M, Cerami E, et al. 2007. Integration of biological networks and gene expression data using Cytoscape. Nat. Protoc. 2: 2366-2382.

Cosentino C, Curatola W, Montefusco F, et al. 2007. Linear matrix inequalities approach to reconstruction of biological networks. IET Syst. Biol. 1: 164-173.

Di Cara A, Garg A, De Micheli G, et al. 2007. Dynamic simulation of regulatory networks using SQUAD. BMC Bioinformatics. 8: 462. Full Text

Janes KA, Gaudet S, Albeck JG, et al. 2006. The response of human epithelial cells to TNF involves an inducible autocrine cascade. Cell 124: 1225-1239. Full Text

Krawitz P, Shmulevich I. 2007. Entropy of complex relevant components of Boolean networks. Phys. Rev. E. Stat. Nonlin. Soft Matter Phys. 76(3 Pt 2): 036115.

Lähdesmäki H, Hautaniemi S, Shmulevich I, Yli-Harja O. 2006. Relationships between probabilistic Boolean networks and dynamic Bayesian networks as models of gene regulatory networks. Signal Processing 86: 814-834. Full Text

Mendoza L, Xenarios I. 2006. A method for the generation of standardized qualitative dynamical systems of regulatory networks. Theor. Biol. Med. Model. 3:13. Full Text

Parisi F, Wirapati P, Naef F. 2007. Identifying synergistic regulation involving c-Myc and sp1 in human tissues. Nucleic Acids Res. 35: 1098-1107. Full Text

Price ND, Shmulevich I. 2007. Biochemical and statistical network models for systems biology. Curr. Opin. Biotechnol. 18: 365-370. Full Text

Reva B, Antipin Y, Sander C. 2007. Determinants of protein function revealed by combinatorial entropy optimization. Genome Biol. 8: R232.

Vilar JM, Jansen R, Sander C. 2006. Signal processing in the TGF-β superfamily ligand-receptor network. PLoS Comput. Biol. 2: e3. Full Text

Patterns of expression

Anastassiou D. 2007. Computational analysis of the synergy among multiple interacting genes. Mol. Syst. Biol. 3: 83. Full Text

Chua HN, Sung WK, Wong L. 2007. An efficient strategy for extensive integration of diverse biological data for protein function prediction. Bioinformatics 23: 3364-3373.

Chua HN, Sung WK, Wong L. 2007. Using indirect protein interactions for the prediction of Gene Ontology functions. BMC Bioinformatics 8 Suppl 4: S8. Full Text

Lemmens K, Dhollander T, De Bie T, et al. 2006. Inferring transcriptional modules from ChIP-chip, motif and microarray data. Genome Biol. 7: R37. Full Text

Margolin AA, Nemenman I, Basso K, et al. 2006. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context. BMC Bioinformatics 7 Suppl 1: S7. Full Text

Palomero T, Lim WK, Odom DT, et al. 2006. NOTCH1 directly regulates c-MYC and activates a feed-forward-loop transcriptional network promoting leukemic cell growth. Proc. Natl. Acad. Sci. 103: 18261-18266. Full Text

Ravasz E, Somera AL, Mongru DA, et al. 2002. Hierarchical organization of modularity in metabolic networks. Science 297: 1551-1555.

Shen-Orr SS, Milo R, Mangan S, Alon U. 2002. Network motifs in the transcriptional regulation network of Escherichia coli. Nat. Genet. 31: 64-68. (PDF, 1.59 MB) Full Text

Varadan V, Anastassiou D. 2006. Inference of disease-related molecular logic from systems-based microarray analysis. PLoS Comput. Biol. 2: e68. Full Text


Davis J, Goadrich M. 2006. The relationship between precision-recall and ROC curves. 23rd International Conference on Machine Learning (ICML), Pittsburgh, PA, USA, 26th–28th June. (PDF, 139 KB)


Andrea Califano, PhD

Columbia University

Andrea Califano is professor of biomedical informatics at Columbia University, where he leads several cross-campus activities in computational and systems biology. Califano is also codirector of the Center for Computational Biochemistry and Biosystems, chief of the bioinformatics division, and director of the Genome Center for Bioinformatics.

Califano completed his doctoral thesis in physics at the University of Florence and studied the behavior of high-dimensional dynamical systems. From 1986 to 1990, he was on the research staff in the Exploratory Computer Vision Group at the IBM Thomas J. Watson Research Center, where he worked on several algorithms for machine learning, including the interpretation of two- and three-dimensional visual scenes. In 1997 he became the program director of the IBM Computational Biology Center, and in 2000 he cofounded First Genetic Trust, Inc., to pursue translational genomics research and infrastructure-related activities in the context of large-scale patient studies with a genetic component.

Gustavo Stolovitzky, PhD

IBM Computational Biology Center

Gustavo Stolovitzky is manager of the Functional Genomics and Systems Biology Group at the IBM Computational Biology Center in IBM Research. The Functional Genomics and Systems Biology group is involved in several projects, including DNA chip analysis and gene expression data mining, the reverse engineering of metabolic and gene regulatory networks, modeling cardiac muscle, describing emergent properties of the myofilament, modeling P53 signaling pathways, and performing massively parallel signature sequencing analysis.

Stolovitzky received his PhD in mechanical engineering from Yale University and worked at The Rockefeller University and at the NEC Research Institute before coming to IBM. He has served as Joliot Invited Professor at Laboratoire de Mecanique de Fluides in Paris and as visiting scholar at the physics department of The Chinese University of Hong Kong. Stolovitzky is a member of the steering committee at the Systems Biology Discussion Group of the New York Academy of Sciences.


Michael Snyder, PhD

Yale University

Michael Snyder is Lewis B. Cullman Professor of Molecular, Cellular, and Developmental Biology at Yale University. He is also a member of the Yale Comprehensive Cancer Center. His work focuses on topics including the control of cell division and cell morphology in yeast, the characterization of proteomes, analysis of regulatory circuits in yeast, characterization of the human genome, and sex-specific gene expression in mammals. After completing his PhD at the California Institute of Technology, he held postdoctoral fellowships at Caltech and Stanford, before joining Yale in 1986.

Ron Shamir, PhD

Tel Aviv University

Ron Shamir is Sackler Professor of Bioinformatics at the School of Computer Science at Tel Aviv University. His group develops algorithms and tools for analyzing biological data, with particular emphasis on systems biology and the study of gene regulation, human genetics, and genome rearrangements. His research interests include computational molecular biology, the design and analysis of algorithms, and algorithmic graph theory. Shamir completed his PhD in operations research at the University of California, Berkeley.


Dimitris Anastassiou, PhD

Columbia University

Irene Cantone

Telethon Institute of Genetics and Medicine

Barbara Di Camillo

University of Padova

Hon Nian Chua, PhD

Institute for Infocomm Research, Singapore

Neil Clarke, PhD

Genome Institute of Singapore

Diego Di Bernardo, PhD

Telethon Institute of Genetics and Medicine

Timothy S. Gardner, PhD

Boston University

Ravi Iyengar, PhD

Mount Sinai School of Medicine

Rune Linding, PhD

Samuel Lunenfeld Research Institute and MIT

Daniel Marbach

Ecole Polytechnique Fédérale de Lausanne

Kathleen Marchal, PhD

Katholieke Universiteit Leuven

Adam Margolin

Columbia University

Fabio Parisi

Ecole Polytechnique Fédérale de Lausanne

Vipul Periwal, PhD

National Institute of Diabetes and Digestive and Kidney Diseases

Fritz Roth, PhD

Harvard Medical School

Chris Sander, PhD

Memorial Sloan-Kettering Cancer Center

Ilya Shmulevich, PhD

Institute for Systems Biology

Peter Sorger, PhD

Harvard Medical School

Dennis Vitkup, PhD

Columbia University

Ioannis Xenarios, PhD

Swiss Institute of Bioinformatics

Don Monroe

Don Monroe is a science writer based in Murray Hill, New Jersey. After getting a PhD in physics from MIT, he spent more than fifteen years doing research in physics and electronics technology at Bell Labs. He writes on physics, technology, and biology.


  • DREAM challenged participants to reverse engineer five different "known" networks, based on prescribed data.
  • The five gold-standard networks included both experimental and simulated data for networks of widely varying size.
  • 36 teams submitted a total of 110 predictions for various challenges.
  • The best predictions were very good for some networks, but disappointing for others.
  • Although inferring "the network" is an inspiring goal, it may be more practical to evaluate results based on predictions for observable experiments.

How well are reverse engineers doing?

From the earliest stages of the DREAM project, the organizers imagined a competition in which different teams would use the same blinded data to infer the networks that had generated them. Perhaps only in this way can the community know whether the networks that their methods produce can be trusted. The idea was inspired in part by other competitions, notably the CASP assessment of algorithms for protein-structure prediction.

The protein-folding challenge, however, begins with a precise amino-acid sequence and ends with a three-dimensional structure that is experimentally well defined. For reverse engineering of networks, both the specification of the data and the evaluation of the results are much harder.

CASP and MouseFunc have served as models for DREAM.

A more analogous task, perhaps, is the MouseFunc project, which aims to evaluate researchers' ability to deduce the function of mouse genes. Like DREAM, said MouseFunc co-organizer Fritz Roth of Harvard Medical School, who also helped to design the DREAM assessments, "The idea was to have one benchmark dataset and everyone do their prediction on the same dataset." The goal was to predict which annotation terms apply to each gene, based on a variety of data types. (Interestingly, Roth said, protein sequence and protein–protein interaction data turned out to be more useful than expression profiles for this task.)

A major problem for DREAM is identifying gold-standard networks whose structure can be taken as known. The best-understood networks are those created by people, but many researchers worried that these networks would hold little interest for the larger biological community. As a result, the organizers selected a range of "challenges" that span both large and small biological networks as well as mathematical networks, and, in between, a synthetic network implemented in yeast.

Participation in DREAM challenges

This second DREAM conference unveiled the results of the first round of predictions, in which 36 teams submitted 110 predictions for five challenges. In each case, teams submitted a list of predictions, such as transcription-factor targets or pairs of interacting proteins, ranked in order of their confidence in the prediction. The organizers then determined whether the predicted links were correct, based on the secret gold-standard list.
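The scoring of a ranked submission against a gold standard can be sketched in a few lines of Python. This is only an illustration of the general idea, not the organizers' evaluation code: the function names, the six-gene list, and the gold-standard set are all hypothetical, and the sketch assumes every candidate appears exactly once in the ranking.

```python
def rank_curves(ranked, gold_positives):
    """Return (recall, precision, false-positive rate) after each ranked call."""
    n_pos = sum(1 for item in ranked if item in gold_positives)
    n_neg = len(ranked) - n_pos  # assumes both classes occur in the list
    tp = fp = 0
    points = []
    for item in ranked:
        if item in gold_positives:
            tp += 1
        else:
            fp += 1
        points.append((tp / n_pos, tp / (tp + fp), fp / n_neg))
    return points


def auroc(points):
    """Trapezoidal area under the ROC curve built from rank_curves() output."""
    area, prev_fpr, prev_recall = 0.0, 0.0, 0.0
    for recall, _precision, fpr in points:
        area += (fpr - prev_fpr) * (recall + prev_recall) / 2
        prev_fpr, prev_recall = fpr, recall
    return area


# Hypothetical six-gene submission, most confident prediction first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
gold = {"g1", "g2", "g5"}  # hypothetical gold-standard positives
points = rank_curves(ranked, gold)
roc_area = auroc(points)
```

In this toy case the ROC area is 7/9, because seven of the nine positive-negative pairs are ranked in the right order. Precision starts at 1.0 and falls to 0.5 by the end of the list.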

Comparing the accuracy of the predictions is a subtle task. For DREAM2, the results were scored using a metric defined as the negative base-10 logarithm of the product of two p-values: one associated with the area under the ROC curve and one with the area under the precision-recall curve. For example, if both areas had a p-value of 0.05, the conventional significance threshold, the DREAM2 metric would be −log10(0.05 × 0.05), or about 2.6; better classification yields larger values.
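The arithmetic of the metric is easy to reproduce. The short Python sketch below assumes the two p-values have already been obtained; how the organizers derived them is not described here, and the function name is hypothetical.

```python
import math


def dream2_score(p_auroc, p_aupr):
    """Negative base-10 logarithm of the product of the two p-values."""
    return -math.log10(p_auroc * p_aupr)


# The worked example from the text: both p-values at 0.05.
example = dream2_score(0.05, 0.05)
print(round(example, 1))  # 2.6
```

Because the logarithm of a product is a sum, the score is also the sum of the two individual −log10 p-values. A score of about 25 therefore corresponds to a product of p-values near 10^-25.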

Aris Floratos, Ethan Fuchs, and Bernd Jagla of Columbia University provided critical support to the project.

BCL6-targets challenge

The first DREAM challenge required participants to determine which of a set of 200 genes were targets for the transcriptional repressor BCL6. They were provided with gene expression data generated under normal conditions, when BCL6 was itself suppressed by a treatment, or was augmented by exogenous BCL6 that resists degradation, or both. Teams were also allowed to use sequence data.

The gold standard list of BCL6 targets was developed by Gustavo Stolovitzky, Andrea Califano, Kai Wang, and Adam Margolin at Columbia University, with the help of Riccardo Dalla-Favera. Among the genes that showed an appropriate expression signature, the 53 gold-standard positive targets were also required to be positive in a chromatin-immunoprecipitation (ChIP-chip) experiment, requiring a false discovery rate less than 10⁻⁴. The remaining decoy targets (gold-standard negatives) were confidently judged to be negatives by these tests.

"Is recognizing what the organizers said was a target the same thing as predicting true targets?"

Several teams did very well in recapitulating the gold-standard targets, with a DREAM metric of about 25. However, Neil Clarke, who represented GenomeSingapore, one of the best-predictor teams from the Genome Institute of Singapore, was somewhat circumspect about this success. "We recognized what the organizers said was a target. Is that the same thing as saying that we are the best at predicting true targets? How do we know?" he wondered.

The TargetSeeker team (also from the Genome Institute of Singapore) was also one of the best predictors for this challenge. However, the two teams had relatively little overlap in their predictions (other than the correct ones).

The protein–protein subnetwork challenge

In DREAM Challenge 2, participants were asked to determine which pairs of proteins among a set of 47 interact with one another. Each team submitted an ordered list of pairs out of the 47×48/2 = 1128 possible pairs, ranked in order of their confidence that they interact.
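The count of candidate pairs can be checked directly. The formula 47×48/2 matches the number of unordered pairs when a protein is allowed to pair with itself. That reading is an inference from the arithmetic rather than something the challenge description above states, and the protein names in this Python sketch are placeholders.

```python
from itertools import combinations, combinations_with_replacement

proteins = [f"P{i}" for i in range(1, 48)]  # 47 placeholder protein names

# Unordered pairs, allowing a protein to be paired with itself: n(n+1)/2.
pairs_with_self = list(combinations_with_replacement(proteins, 2))

# Unordered pairs of two distinct proteins: n(n-1)/2.
pairs_distinct = list(combinations(proteins, 2))

print(len(pairs_with_self))  # 1128
print(len(pairs_distinct))   # 1081
```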

The list of gold-standard-positive interactions was determined by consistent response in yeast 2-hybrid experiments, which probe direct and persistent interactions between tagged proteins through the combined effect of their tags on transcription. The idea of this challenge was originally conceived by Joel Bader of Johns Hopkins University. The data were compiled by Haiyuan Yu from Marc Vidal's lab at the Dana Farber Cancer Institute, with invaluable intellectual input from Fritz Roth of Harvard Medical School. Gold-standard-negative interactions consistently failed to show evidence of interaction in the 2-hybrid assay. Inclusion also required that evidence of the interaction not be found in a literature search. Inconsistent interactions, and those already documented in the literature, were excluded from scoring.

For this challenge, only one of the five submitting groups, the Netminer team from the Institute for Infocomm Research, Singapore, got a reasonable DREAM2 metric of about 5. To do this, they quickly adapted methodologies developed for protein-function prediction to predict protein interactions.

The five-gene network challenges

Challenge 3 involved a synthetic network, in which five genes were introduced into yeast chromosomes (not plasmids). This scheme was conceived and implemented by Pia Cosma and her student Irene Cantone of the Telethon Institute of Genetics and Medicine. Diego di Bernardo, also from the Telethon Institute, collaborated on this effort. Teams were given two distinct types of time-series data for these modified organisms.

In the first data set, expression of the five genes was measured by qPCR (quantitative polymerase chain reaction) at regular intervals following a switch from glucose to galactose growth medium. In the second set, time series of microarray data were provided for the expression of 588 genes, including the five extra ones, as they respond to the cell cycle.

In each case, expression data were collected every 20 minutes, and teams were asked to infer, from these time series, which pairs among the five genes interact. Separate competitions were held for the two data sets, and for four types of network prediction within each set: the interactions could be directed (specifying which gene affects which) or undirected, and the sign of the interaction (excitatory or inhibitory) could be specified or not.
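The four prediction types differ only in how much detail each edge carries, as the following Python sketch shows. The gene labels, edges, and helper names are hypothetical and do not describe the actual synthetic network.

```python
# Hypothetical signed, directed prediction: (regulator, target) -> sign,
# where +1 means excitatory and -1 means inhibitory.
signed_directed = {
    ("A", "B"): +1,
    ("B", "A"): -1,
    ("B", "C"): -1,
    ("C", "D"): +1,
}


def drop_sign(edges):
    """Directed, unsigned view: keep who-affects-whom, forget the sign."""
    return set(edges)


def drop_direction(edges):
    """Undirected, unsigned view: A->B and B->A collapse into one pair."""
    return {frozenset(pair) for pair in edges}


directed_unsigned = drop_sign(signed_directed)
undirected_unsigned = drop_direction(signed_directed)

print(len(directed_unsigned))    # 4
print(len(undirected_unsigned))  # 3
```

An undirected but signed view is harder to define when the two directions carry opposite signs, as A and B do in this toy example, so the sketch leaves that case out.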

Somewhat surprisingly, given the simplicity of the network, the predictions were only fair, with a DREAM2 metric of about 4.3 for the unsigned, unidirectional predictions from qPCR data. Among the better predictions were the evolutionary algorithm described by Daniel Marbach of the Ecole Polytechnique Fédérale de Lausanne (EPFL) and the algorithm described by Fabio Parisi, also of EPFL, which uses penalized regression and imposes a sign constraint on the regulation.

The in-silico-network challenges

DREAM Challenge 4 concerned three networks that exist solely on computers, generated under the guidance of Pedro Mendes of Virginia Tech and the University of Manchester. Two of these are simulated transcriptional regulatory networks, while the third is much more realistic.

The first two networks each contain 50 mutually interacting genes. Steady-state expression data are provided under individual perturbation of each of the 50, either partial knockdown or full knockout. In addition, time-series data in response to 23 different perturbations are given. The third network aspires to capture some of the complexity of real biological systems that have many levels of interaction. It includes 16 metabolites, 23 proteins, and 20 genes in mutual interaction. As with networks 1 and 2, data on knockdown and null mutants for each of the 20 genes are provided, as are time-series data in response to 22 different perturbations. As for Challenge 3, participants had the option of describing the direction and sign of the interactions.

For the first two networks, competitors did rather well. The best predictors, led by Diego di Bernardo and represented at the meeting by Irene Cantone of the Telethon Institute of Genetics and Medicine, achieved a DREAM metric of nearly 38 on InSilico1, by applying the NIR algorithm developed at Boston University. In contrast, only three predictions were submitted for the complex InSilico3, and none of the DREAM metrics exceeded about 2.0. Since transcriptional networks are often modulated by signaling and metabolism, this mediocre success is rather worrisome for real biological networks.

The genome-scale network challenge

The final challenge has the most data. Challenge 5 aims to assess the ability of algorithms to deal with a regulatory network whose size approaches the entire genome of a simple organism. The data, assembled by Boris Hayete and Tim Gardner (both then at Boston University), include expression profiles for 3456 genes under 300 different conditions. The identities of both the genes and the experiments were disguised, and a small amount of noise was added to thwart numerical searches for the publicly available part of the data.

The expression data for this challenge included both published and unpublished work for the bacterium E. coli. The gold-standard interactions were extracted from the well-established RegulonDB database.

For this challenge, many teams made very good predictions. The best results, with a DREAM metric over 40, came from John Watkinson and Dimitris Anastassiou of Columbia University, who augmented their statistical analysis of pair correlations in expression with calculations of the synergy between triplets of genes.
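To give a feel for what "synergy" among a triplet of genes can mean, the Python sketch below uses one information-theoretic definition: the information two genes carry jointly about a third, minus the sum of what each carries alone. Whether this matches the winning team's exact formulation is an assumption; the binary expression levels and function names are a hypothetical toy case, not challenge data.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sample lists."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


def synergy(g1, g2, target):
    """Joint information about the target minus the two individual parts."""
    joint_info = mutual_information(list(zip(g1, g2)), target)
    return joint_info - (
        mutual_information(g1, target) + mutual_information(g2, target)
    )


# Toy binary levels: the target is "on" only when exactly one regulator is on.
g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
target = [a ^ b for a, b in zip(g1, g2)]

print(synergy(g1, g2, target))  # 1.0
```

In this extreme case neither gene alone says anything about the target, so an analysis of pair correlations would miss the relationship entirely, while the triplet analysis recovers a full bit of information.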


The participation in this first DREAM competition was impressive, and in many cases the results were equally impressive. However, the practical experience of constructing data sets and evaluating the results raised important questions for future competitions.

"What we are assessing is whether we know how to assess reverse engineering."

One important issue concerns the current criterion of evaluating network connections, rather than experimental behavior. Although everyone agreed that finding the underlying networks is desirable in principle, many thought it was an unrealistic way to evaluate algorithms. "Requiring someone to come up with the same network assumes that the network that the organizers have inferred is the only one that's consistent with the data," noted Neil Clarke. "If the goal is really to predict data, that's what you should measure."

Nonetheless, understanding what can be confidently learned about biological networks is increasingly critical, and the DREAM conference is moving towards that goal. "Rather than an exercise in assessment, what we are assessing is whether we know how to assess reverse engineering," admitted DREAM co-organizer Gustavo Stolovitzky. "It seems that we are not fully aware of all the implications of this exercise yet."


At a microscopic level, organisms are ruled by interacting systems of biomolecules. Historically, scientists painstakingly elucidated chains of molecular events using experiments that reveal individual interactions, although they recognized that members of different pathways frequently interact. In recent years, researchers have built richer, interconnected networks to mathematically summarize their knowledge of these interactions. This systems biology enterprise, largely stimulated by high-throughput tools like microarrays that measure mRNA levels as an indicator of gene expression, is a vital and increasingly important activity in both basic biology and medicine.

How seriously should we take the networks that systems biology describes?

A nagging concern, however, is how accurately these networks represent the biology. For complex systems like biological networks, there are practical limits on how well even massive amounts of data can uniquely define the underlying structure and yield useful predictions of measurable events. Indeed, although its advocates call this process "reverse engineering," the topology and the detailed molecular interactions of the "inferred" networks will likely never be known with precision.

On December 3 and 4, 2007, the New York Academy of Sciences hosted the second meeting of the Dialogue on Reverse-Engineering Assessment and Methods (DREAM), which the Academy has nurtured from its inception. (For more information, see the related volume of the Annals of the New York Academy of Sciences: Reverse Engineering Biological Networks: Opportunities and Challenges in Computational Methods for Pathway Inference.) This ongoing process aims to assess the ability of scientists—and their computer servants—to infer networks from experimental data, by comparing their predictions to "gold-standard" networks whose structure is thought to be known. The conference also featured plenary and invited talks, as well as contributed talks and posters, illuminating various aspects of the reverse-engineering challenge.

Diverse networks

The centerpiece of the second DREAM meeting was a set of five "challenges," in which participants tried to replicate various types of known networks from specified data. The five challenges included identifying targets of the transcriptional repressor BCL6, determining which proteins of a group interact, and inferring the topology of a variety of networks, including a five-gene synthetic network in yeast, several more complex, computer-generated networks, and a documented gene regulatory network in a bacterium.

To ensure a fair comparison of different techniques for reverse engineering networks, the DREAM organizers carefully limited the data supplied, and tried to disguise it so that participants could not leverage other kinds of data. This blinded procedure does not take advantage of all available information, however, especially biological wisdom that does not fit easily into a formal mathematical framework. Some speakers instead advocated incorporating prior biological knowledge such as known feedback loops into the network from the earliest stages of the process. But others felt that, although such information might improve the networks, it would compromise the primary DREAM goal of assessing methods.

Determining the most revealing experimental conditions is a crucial issue for reverse engineering. The blinded competition, however, demanded that the organizers provide the data, so the competitors could not differentiate themselves by devising perturbations to best clarify network features.

Transcriptional regulation—in which proteins produced from mRNA in turn act to modulate the transcription of other genes into mRNA—is the poster child of systems biology. Researchers exploit uniform and commercially accessible high-throughput data to construct complex transcriptional networks based on simple models of regulation. Nonetheless, recent studies reveal important complexities in transcription regulation. In addition, other types of interaction must ultimately be integrated into the description. Researchers have made significant progress in elucidating some types of networks, such as signaling networks driven by post-translational modifications of proteins. Other networks, like those governed by metabolic interactions or the various mechanisms associated with microRNA, are at an earlier stage of understanding.

Diverse algorithms

The purpose of DREAM is not to produce the best possible network, but to evaluate the best tools for producing networks. The choice of tools depends in part on the nature of the available data. Dynamic techniques aim to exploit the detailed time evolution of biological responses like mRNA concentration in response to perturbations. The underlying model is generally a system of differential equations, and the modeling aims to determine the parameters of these equations.

Many algorithms analyze the correlations between the steady-state levels of biomolecules, such as mRNA, under various conditions. These static techniques use statistical methods to try to distinguish the direct interactions between nodes from those mediated by other nodes. Their results are generally embodied in the topology of a (possibly directed) graph.

For both static and dynamic models, however, the experimental data are typically insufficient to specify a unique network. Researchers generally must discard many apparent interactions because their effects are unimportant, but in so doing they may also discard some real interactions. Developing metrics that quantify this tradeoff is a subtle and challenging issue, especially for biological networks, which are often sparse.
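The tradeoff between discarding spurious interactions and keeping real ones is commonly summarized with precision and recall. The following is a minimal Python sketch of that bookkeeping, not the scoring code used for DREAM; the gene names, the gold-standard edges, and the ranked prediction list are all hypothetical.

```python
def precision_recall(ranked_edges, true_edges, top_k):
    """Score the top_k highest-confidence predicted edges against a gold standard."""
    predicted = set(ranked_edges[:top_k])
    true_positives = len(predicted & true_edges)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(true_edges) if true_edges else 0.0
    return precision, recall


# Hypothetical sparse gold standard: 3 real edges among 5 genes.
gold = {("A", "B"), ("B", "C"), ("D", "E")}
# Hypothetical predictions, ordered from most to least confident.
ranked = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]

for k in (1, 3, 6):
    p, r = precision_recall(ranked, gold, k)
    print(f"top {k}: precision={p:.2f} recall={r:.2f}")
```

Accepting more of the ranked list recovers more true edges but admits more false ones. In a sparse network the false candidates vastly outnumber the true ones, which is why the choice of cutoff matters so much.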

Diverse results

Researchers are still grappling with the best ways to find the best ways to find networks.

In the end, 36 teams made a total of 110 predictions for the five challenges. The match between these predictions and the "known" networks varied widely, both between teams and between challenges. For example, all teams did poorly at identifying the most complex in silico network, which was governed by transcriptional, signaling, and metabolic interactions. The networks inferred from the data differed significantly from the real network, which is precisely known. What is not known is whether the data given are, by themselves, sufficient to distinguish the networks.

By contrast, many teams did very well at identifying the targets of the transcriptional repressor BCL6 from expression and sequence data. For this real data, however, the "gold-standard" result is itself derived in the context of a specific understanding of the biological mechanisms. Although the organizers did additional experiments to validate the results, the team that best predicted the targets used hints about the organizers' thinking process to better tune their predictions. They challenged the organizers to consider that, rather than identifying the underlying network, predicting the observable results of experiments may be a more objective way to assess reverse engineering.

At this early stage, the DREAM process is still searching for the best ways to find networks, and each challenge has shed some light on the problem.


  • Even after pre-mRNA is transcribed, its subsequent processing, export from the nucleus, survival in the cytoplasm, translation, and post-translational modifications all affect its ultimate impact.
  • Binding sites for transcription factors have diverged much more rapidly during yeast evolution than genes did.
  • Sequence-based analysis of chromatin immunoprecipitation, ChIP-seq, provides more sensitive and detailed information about binding sites than array-based ChIP-chip does.
  • Chromatin immunoprecipitation identifies DNA sequences to which a transcription factor binds, but customary significance criteria may discount many true binding sites.
  • Some transcription factors act as continuous global regulators for thousands of genes, rather than on-off switches for a few.

The core of the system

Regulatory networks, in which transcription factors modulate the expression of genes, are the cornerstone of systems biology. In part, this situation reflects the availability, for many organisms, of microarrays that simultaneously monitor the transcription of thousands of genes. In addition, beginning with early work on bacteriophages, the mechanisms by which some proteins act as factors to modulate transcription, and thus to help regulate protein synthesis, have been explored in great detail. Nonetheless, there is still much to learn.

Keynote speaker Michael Snyder of Yale University has investigated many aspects of transcriptional regulation. In one study, he and his colleagues compared the sequences of binding sites between yeast species. They found surprisingly large differences, even between closely related species. "These binding sites have diverged incredibly rapidly between these related species. They have very similar gene content, so we would conclude that gene regulation actually diverges much more rapidly than gene content," Snyder concluded. Because of this rapid divergence, interspecies sequence similarity in binding domains may not be as useful as it has been for protein-coding regions of the DNA, he suggested. "You're going to miss a lot of interesting regulatory information, because it's not conserved."

"Gene regulation diverges much more rapidly than gene content."

Even when a motif is present, Snyder warned, it is a poor predictor of binding activity. He speculates that binding frequently involves multiple transcription factors, so researchers may need to study combinations of binding sites.

Snyder's team has explored the subnetwork associated with pseudohyphal growth in the yeast Saccharomyces cerevisiae in great detail. Of the six key transcription factors, two appear to act as master regulators of the entire response. However, the network indicates that, rather than driving all the others, these factors are "target hubs" that receive input from all of the others. Snyder suggested that this surprising downstream location may reflect the biological role of this subnetwork in integrating information about whether to initiate the response, and that master regulators may well lie upstream in other networks.

To validate the binding of a transcription factor to a particular section of DNA, researchers frequently turn to chromatin immunoprecipitation, or ChIP, in which an antibody-tagged transcription factor pulls down bound DNA. In the ChIP-chip technique, the DNA is identified by its binding to complementary sequences on an array on a chip.

Although ChIP-chip provides a wealth of information, Snyder warned that the experiments probably need to be repeated for different cell types, and even under varying conditions. "Most of the binding information is new, even if you're in the same cell type under two different conditions."

In the future, Snyder sees greater promise in the ChIP-seq technique, in which researchers determine the detailed sequence of the pulled-down DNA fragments and map them onto the known genome. "The data turn out to be quite a bit cleaner and higher resolution, and also more sensitive," he said. "In the long run, I think the sequencing technologies will probably replace the array technologies."

A wealth of targets

Adam Margolin of Columbia University argues that some transcription factors have vastly more targets than are predicted using standard significance criteria for interpreting chromatin immunoprecipitation data. Because there are so many targets whose expression changes, he observed, the overall range of activity changes increases. Since the significance of a particular activity change is judged relative to this spread, many of the real targets are likely to be rejected as statistically non-significant.

Thousands of genes show small expression changes when the transcription factor NOTCH1 is inhibited.

Margolin proposed an algorithm that explicitly accounts for the bimodal distribution, as well as a systematic variation with total activity. The non-target genes have a much narrower spread of activity changes. The implication is profound: although the standard method flags fewer than 200 genes as likely targets for the oncogene product MYC, the new method predicts more than 8000, even using a conservative 5% false-discovery rate. A similarly dramatic increase is found for NOTCH1, and many genes are predicted as targets for both transcription factors.

Margolin has validated individual targets far down on this long list. "I think this binding is convincing, but it remains unclear whether it matters," he commented. However, looking at expression for thousands of these putative targets of NOTCH1 showed small but highly significant expression changes across the group. One might have expected to see a few targets with dramatic changes, Margolin observed. "This is not what we see in the distribution, suggesting that this whole space of genes is coordinately regulated," he said. He likened this transcription factor to a gas pedal that continuously modifies a whole constellation of expression changes.

Interactions between chromatin modifiers and transcription factors

As part of his keynote talk, Ron Shamir of Tel Aviv University explored the interactions between chromatin modifications and transcription factors. It is well known that the wrapping of DNA around nucleosomes affects its accessibility for transcription, as well as for binding by activators or repressors. This effect can be modulated by molecules that tend to open or close the chromatin, either directly or by modifying the tails of the histone proteins at the core of the nucleosomes.

Shamir and his colleagues mined published data to identify transcription factors whose activity changes in the presence of various chromatin modifiers. "The techniques, mathematically, were extremely simple," he said. Nonetheless, the researchers identified more than 250 candidate interactions.


  • Post-translational modifications, such as phosphorylation by kinases, can both transmit and analyze cellular signals.
  • Kinases tend to be very context-specific, with different targets depending on detailed tissue type and other conditions.
  • If the spatial segregation of interacting biomolecules is ignored, misleading networks may be proposed.
  • Metabolic networks are well understood functionally but many metabolic activities have not been connected to a DNA sequence.
  • Integration of multiple data types is critical for constructing robust networks.

Looking in context

Signaling networks typically involve the addition of chemical groups, such as the phosphates added by kinases, to specific protein domains. The interaction of multiple domains in a protein, said Rune Linding of the Massachusetts Institute of Technology, is "how the cell performs very fast, very high-throughput computational work," forming what he calls "the logic gates of the cell." In addition, he said, "25% of drug development is going into kinase inhibitors."

Signaling proteins such as kinases are the logic gates of the cell.

The number of known, in vivo phosphorylation sites—now about 24,000 in humans—is growing rapidly, but it is rarely known which of the 500-plus kinases act on them. "There is a widening gap in our knowledge of the networks that are mediated by the phosphorylation," Linding said, because "we don't have very good ways of experimentally determining these networks." Simply reducing the activity of a kinase, for example, may not have a large effect, and the kinase–target complexes don't persist long enough to be found through precipitation experiments.

In addition, Linding warned, in vitro studies may have limited value because they lack biological context, such as compartmentalization of proteins to different regions of the cell, or the need for nearby anchoring proteins and scaffolds, as well as temporal and tissue-specific expression. Without this context, the association—or lack of association—between a kinase and a protein could be misleading, he claimed.

To address this need, Linding and his colleagues developed the NetworKIN algorithm. This scheme first flags known binding motifs to identify likely families of kinases that target them. It then turns to the protein database called STRING, developed at the European Molecular Biology Laboratory. This database uses a Bayesian framework to integrate information about known and predicted protein interactions. Armed with this information, NetworKIN chooses the most likely kinase within the family as the probable upstream protein that interacts with the binding site.

Linding also described a technique to speed experimental validation of the interactions. In "multiple-reaction monitoring," users program a mass-spectrometer to pass only molecules that have specified masses before a reaction and after fragmentation. This can quickly confirm the NetworKIN predictions, because using "the prior knowledge of how the protein would behave in the mass spec, you can program [the spectrometer] to only look for peptides that fragment in that way," Linding said.

Windows on phosphorylation

Keynote speaker Michael Snyder of Yale University and his team have developed a protein-chip technology for large-scale mapping of which proteins are phosphorylated by which kinases, at least in vitro. So far, he said, they have found about 4200 phosphorylation events targeting 1325 yeast proteins. Many of the targets are transcription factors, Snyder said. "Biologically, one might expect that."

"Many phosphorylation targets are transcription factors."

"This network has many, many properties in common with that of the transcription factor binding network: it's scale-free, like the transcription factor binding, and it has similar clustering properties," Snyder said. "Many of the motifs that you find in this network are similar to what you would find in a transcription network as well. Things like feed-forward loops are very, very common in the phosphorylation network, just like they are in the transcription network." Combining interaction, transcription, and phosphorylation data showed strong enrichment of other motifs as well, Snyder observed, such as a kinase phosphorylating a protein, and both of them interacting with a third protein.

Andrea Califano of Columbia University described a scheme to identify post-transcriptional interactions computationally, using transcription data. The MINDy algorithm developed by his group looks for the "conditional mutual information" between two transcripts, depending on the expression of a third transcript (similar in some sense to the "synergy" described by Dimitris Anastassiou, also of Columbia).

This "modulator" effect occurs frequently. For example, among 340 validated targets of the MYC oncoprotein, 205 were modulated by a single species, casein kinase 2. Another modulator, serine threonine kinase 38, has not been widely studied but is a potent modulator of MYC. Califano said his group has now validated that this kinase directly phosphorylates MYC, in a way that depends on signaling activity in the cell. Up until now, he said, "the interface between signaling and transcription processes has been very much unmapped."

The MINDy algorithm can thus identify some protein–protein interactions using transcription data, but will also flag indirect interactions. A comparison with Linding's NetworKIN results shows substantial overlap, Califano said, but perhaps 80% of the interactions identified by MINDy are probably indirect.
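The idea of conditional mutual information can be illustrated with a toy calculation. The Python sketch below is not the MINDy implementation: it assumes expression has already been reduced to binary high/low calls, uses a simple plug-in estimate, and omits the significance testing that a real analysis would need. The function names and the eight-sample data set are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written in terms of raw counts
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


def modulator_effect(tf, target, modulator):
    """I(tf; target | modulator high) minus I(tf; target | modulator low)."""
    high = [(a, b) for a, b, m in zip(tf, target, modulator) if m == 1]
    low = [(a, b) for a, b, m in zip(tf, target, modulator) if m == 0]
    return mutual_information(*zip(*high)) - mutual_information(*zip(*low))


# Hypothetical data: the target follows the transcription factor only
# in the samples where the modulator is highly expressed.
tf        = [0, 1, 0, 1, 0, 1, 0, 1]
modulator = [1, 1, 1, 1, 0, 0, 0, 0]
target    = [0, 1, 0, 1, 0, 0, 1, 1]

print(modulator_effect(tf, target, modulator))
```

A large difference between the two conditional estimates flags the third transcript as a candidate modulator. As the text notes, such a flag does not by itself distinguish a direct physical interaction from an indirect one.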

Geometry lessons

Ravi Iyengar of the Mount Sinai School of Medicine reminded participants of the profound effects of spatial separation of macromolecules. The importance of geometry is well appreciated for structures like the long, thin dendrites and axons of neurons. Iyengar noted one case where a chemical signal propagates correctly only if the diameter of the dendrite is in a narrow range.

Comparison of numerical simulations to FRET experiment that estimated cAMP levels in live cells.

In modeling biological networks, Iyengar said, "without spatial representation, I don't think that we'll get very far." Even in cells that are geometrically simpler than neurons, the spatial separation of molecules strongly influences their interactions. Iyengar has used the partial differential equations available in tools like Virtual Cell to model this spatial variation, but it is sometimes sufficient to account for distinct pools of reactants, for example one in the nucleus and another in the cytoplasm. He showed that lumping the pools together can produce a network model in which molecules misleadingly appear to interact. "If you didn't know this spatial organization existed, you would never know why this connection never exists."

Metabolic networks and RNA interference

Metabolic networks, said Dennis Vitkup of Columbia University, are "like a hidden jewel" among biological networks. In many ways the current understanding of metabolic networks is a mirror image of the transcriptional networks probed by microarrays: the functions and interactions of species are often well understood, but not their genetic identity. In fact, a large fraction (perhaps 30%–40%) of metabolic functions are "orphans" in that no DNA sequence has been associated with that function.

A large fraction of metabolic activities are "orphans." We do not know a single gene responsible for these metabolic reactions.

Vitkup has been laboring to fill in some of the missing pieces, for example, by identifying sequences for orphans. He emphasizes that, because metabolites produced by one organism are often taken up by another, metabolic networks provide a window into entire ecosystems. In addition, transcriptional regulatory networks are sensitive to metabolism. Pedro Mendes of Virginia Tech also included metabolic interactions in the most challenging in silico network he built for the fourth DREAM challenge.

One important class of interaction, which was not discussed much at the DREAM symposium, is RNA interference. Over the past decade, it has become increasingly clear that various short RNA sequences play an important regulatory role in both plants and animals. The detailed mechanisms are still being clarified, but small RNAs can enhance the degradation of specific mRNAs and can also inhibit or enhance their translation.

Together with glycosylation, ubiquitination, and other post-translational modifications of proteins, these mechanisms allow great complexity in regulation, far beyond that embodied in transcription factor activity. Since these mechanisms are likely to be poorly understood for some time, network models need to deal gracefully with the indirect effect of unrecognized regulatory pathways.

Bringing it together

Many speakers emphasized the importance of integrating different types of experimental data—as well as less formal biological knowledge—to build robust network models. Ilya Shmulevich and his colleagues at the Institute for Systems Biology, for example, are exploring ways to combine expression data with additional sources of evidence, such as CpG islands, nucleosome positioning, regulatory scores, even ChIP-chip data. Such data fusion is "very natural in the Bayesian setting" that his team operates in, Shmulevich observed.

In a completely different approach, Daniel Marbach of the Ecole Polytechnique Fédérale de Lausanne said that his evolutionary algorithm also gracefully incorporates various types of information. Other researchers are also working to develop the software and databases that fuse different types of experimental data to get the fullest possible picture of the associated networks. "We are already starting to learn, what we will probably learn better over the years, how to integrate all these networks," said IBM's Gustavo Stolovitzky.


  • A time delay between a change in the level of a transcription factor and a change in gene transcription supports a causal linkage between them.
  • Incorporating prior biological knowledge can help researchers build better networks, especially for feedback loops, which are hard to infer from raw data.
  • Most models will have many parameters whose values are not determined from experiments, but often this will not matter.
  • Researchers use a variety of dynamical models, both discrete and continuous, saturating and non-saturating, and a variety of learning algorithms to determine the parameters.

Using the time dimension

Although many researchers mine static expression data to learn about interactions, others aim to exploit the dynamic response of biological systems to sudden perturbations. Ilya Shmulevich of the Institute for Systems Biology uses extensive microarray data obtained at various times after a change in conditions. He said the analysis is "based on the assumption that a transcription factor is correlated with a putative target, with some time delay," which for mammals is about 40 minutes. Many mechanisms contribute to this delay: the lifetime of the messenger RNA, the time to translate, fold, and otherwise modify the protein, its diffusion back into the nucleus, its subsequent degradation, and the time to transcribe the target gene. Shmulevich and his colleagues used this analysis to build a network describing the Toll-like-receptor response in macrophages in mice, augmenting the dynamic data with sequence information on binding motifs.
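The time-delay assumption lends itself to a simple illustration: shift the target's time series against the transcription factor's and look for the lag with the strongest correlation. The Python sketch below shows only that general idea, not the analysis Shmulevich's group performed; the two series and the function names are hypothetical, and the sampling interval is left unspecified.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def best_lag(tf_series, target_series, max_lag):
    """Return (lag, correlation) maximizing corr(tf[t], target[t + lag])."""
    scores = []
    for lag in range(max_lag + 1):
        tf_part = tf_series[: len(tf_series) - lag]
        target_part = target_series[lag:]
        scores.append((lag, pearson(tf_part, target_part)))
    return max(scores, key=lambda s: s[1])


# Hypothetical series: the target repeats the factor's profile two samples later.
tf_levels     = [0, 1, 3, 2, 0, 1, 4, 2, 1, 0]
target_levels = [1, 0, 0, 1, 3, 2, 0, 1, 4, 2]

lag, corr = best_lag(tf_levels, target_levels, max_lag=3)
print(lag, round(corr, 3))
```

If the samples were taken 20 minutes apart, a best lag of two samples would correspond to the roughly 40-minute delay mentioned above. That sampling interval is an assumption made here for illustration only.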

"The expression of a transcription factor is correlated with that of a putative target, with some time delay."

In his dissection of signaling involving the ErbB family of receptors, Peter Sorger of the Massachusetts Institute of Technology based the network model on known biology of this pathway. "There's a lot of information about the constituents and the way in which they interact," he observed. This prior knowledge is especially important for this pathway, in which much of the signaling occurs through homo- and heterodimers, so that even with "28 gene products, that gives rise to 200–300 species."

The resulting system of equations has hundreds of parameters, Sorger said, and many of their values are not determined from experiments. "Multiple sets of parameters give nearly equivalently good fits to the data," he observed, so "we're going to have to work with models that have a substantial degree of inestimability." Still, although some parameters may never be well determined, they may also have little impact on the biological behavior. In other cases, the observations may require a specific value for a combination of parameters, but not for the individual values, which can be traded off against each other.
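A deliberately tiny example shows how data can pin down a combination of parameters without fixing the individual values. The Python sketch below is not taken from Sorger's ErbB model: it assumes a hypothetical one-species decay whose rate is the product of two constants, here called k_bind and k_cat.

```python
import math


def decay_curve(k_bind, k_cat, times, x0=1.0):
    """Hypothetical model in which only the product k_bind * k_cat sets the decay rate."""
    rate = k_bind * k_cat
    return [x0 * math.exp(-rate * t) for t in times]


times = [0.0, 0.5, 1.0, 2.0, 4.0]

# Two very different parameter sets that share the same product.
fit_a = decay_curve(k_bind=2.0, k_cat=0.5, times=times)
fit_b = decay_curve(k_bind=0.25, k_cat=4.0, times=times)

max_gap = max(abs(a - b) for a, b in zip(fit_a, fit_b))
print(max_gap)
```

The two curves coincide, so no amount of time-course data from this model could separate the two constants. Only their product is estimable, which is the kind of tradeoff between parameters described above.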

The modeling of the ErbB system, Sorger said, helped to explain the response of cancers to the drug Iressa, which targets this pathway. Previous studies had shown that patients with a particular mutation responded particularly well to the drug. The modeling, however, recapitulates disappointing recent findings that other conditions, such as receptor expression levels, also change the drug's effectiveness.

Sorger also noted that many molecules need to be partitioned into distinct populations to reflect their spatial separation. However, rather than requiring a full spatial description with partial differential equations, his team represents this situation by coupling multiple ordinary differential equations, with a separate variable representing each population.
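Representing spatial separation with coupled ordinary differential equations can be sketched in a few lines. The Python example below is an illustration under stated assumptions, not Sorger's model: it treats one molecular species as two well-mixed pools (labeled cytoplasm and nucleus here), uses hypothetical first-order transport rates, and integrates with a simple Euler scheme.

```python
def simulate_two_pools(cyto0, nuc0, k_import, k_export, dt, steps):
    """Euler integration of one species split into two well-mixed pools."""
    cyto, nuc = cyto0, nuc0
    for _ in range(steps):
        flux_in = k_import * cyto   # cytoplasm -> nucleus
        flux_out = k_export * nuc   # nucleus -> cytoplasm
        cyto += dt * (flux_out - flux_in)
        nuc += dt * (flux_in - flux_out)
    return cyto, nuc


# Hypothetical rates: import is twice as fast as export.
cyto, nuc = simulate_two_pools(1.0, 0.0, k_import=0.2, k_export=0.1,
                               dt=0.01, steps=10000)
print(round(cyto, 3), round(nuc, 3))
```

Each pool gets its own variable, so a reaction that occurs only in the nucleus can be written in terms of the nuclear pool alone. A lumped model with a single variable could not make that distinction.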

How to begin?

"You always have to have a biological question," said Ioannis Xenarios of the Swiss Institute of Bioinformatics. Although Xenarios and his colleagues use saturating differential equations to model network behavior, they begin network construction, like Sorger, by looking to the known biology. This prior knowledge lets them build known feedback loops into the network topology. They then identify possible steady-state points, and use a Boolean analysis to explore the dynamical behavior near these points.

Imaging of a FACS-sorted cell population shows the heterogeneous response to treatment.

Like many speakers, Xenarios views network generation as part of a cycle of research activities, a loop that is completed by experimental validation. He emphasized the power of single-cell techniques in clarifying aspects of pathways that may be obscured in populations. In particular, his team used automated confocal microscopy to probe individual cells that fluorescence-activated cell sorting has identified as having a specific chemical marker. In many cases, conclusions drawn from a population of cells will distort the network that researchers infer.
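The Boolean analysis of steady states that Xenarios mentioned can be illustrated on a toy network. The Python sketch below is not his group's software: it simply enumerates every on/off state of a hypothetical two-gene toggle switch and keeps the states that the update rules leave unchanged.

```python
from itertools import product


def fixed_points(update_rules):
    """Enumerate Boolean states that every update rule maps to itself."""
    names = sorted(update_rules)
    steady = []
    for values in product([False, True], repeat=len(names)):
        state = dict(zip(names, values))
        if all(update_rules[name](state) == state[name] for name in names):
            steady.append(state)
    return steady


# Hypothetical toggle switch: two genes that repress each other.
rules = {
    "geneA": lambda s: not s["geneB"],
    "geneB": lambda s: not s["geneA"],
}

for state in fixed_points(rules):
    print(state)
```

Exhaustive enumeration is feasible only for small networks, since the number of states doubles with every added gene. Larger networks need more selective search strategies.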

Chris Sander of the Memorial Sloan-Kettering Cancer Center also uses differential equations in which the time variation of expression of each gene varies sigmoidally with the combined expression of other genes. This saturating response "represents, in an empirical way, our ignorance" of the biology that is not included in the model, he said. The resulting networks are mathematically similar to artificial neural networks, and Sander and his colleagues adapted the learning algorithms used for those systems to fit the networks to the data.
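A saturating model of this general kind can be written down compactly. The Python sketch below is an assumption-laden illustration rather than Sander's actual equations: it takes the rate of change of each gene to be a sigmoid of the weighted sum of all expression levels, minus a linear decay term that is added here for stability. The weights describe a hypothetical two-gene loop.

```python
import math


def sigmoid(u):
    """Saturating response between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-u))


def step(x, weights, decay, dt):
    """One Euler step of dx_i/dt = sigmoid(sum_j w_ij * x_j) - decay * x_i."""
    new_x = []
    for i, row in enumerate(weights):
        drive = sum(w * xj for w, xj in zip(row, x))
        new_x.append(x[i] + dt * (sigmoid(drive) - decay * x[i]))
    return new_x


# Hypothetical network: gene 0 activates gene 1, gene 1 represses gene 0.
weights = [[0.0, -4.0],
           [4.0, 0.0]]
x = [0.5, 0.5]
for _ in range(5000):
    x = step(x, weights, decay=1.0, dt=0.01)
print([round(v, 3) for v in x])
```

The weight matrix plays the same role as the connection weights of a recurrent neural network. That correspondence is what allows neural-network learning algorithms to be adapted to fit such a model to data.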

"Reverse engineering, taken literally, is just wrong."

Sander cautioned against taking the network models too seriously. "The networks that we're trying to study aren't reality in the way that the Platonic ideal of a chair is sort of an abstract reality. Reverse engineering, taken literally, is just wrong, because there is no engineer." Instead, he said, researchers should use their networks as tools to generate ideas and new experiments. "That's the game. Not reverse engineering: model construction."

Pruning the network

Diego Di Bernardo of the Telethon Institute of Genetics and Medicine used time-series data to look for targets of the transcription factor p63, whose absence causes fatal skin problems in mice. Although time-series information is more detailed than correlation data, he stressed, there are usually far fewer molecules tracked. As a result, when applied to large networks, "We have a dimensionality problem," with far fewer conditions than nodes.

One approach is to reduce the effective network size by analyzing only clusters, Di Bernardo observed. His team used a different approach, enforcing sparseness in a network model of linear differential equations by using singular-value decomposition. They confirmed that the genes that this scheme identified as the most likely targets for p63 were statistically more likely to respond to short-RNA suppression of p63 levels.

Fabio Parisi of the Ecole Polytechnique Fédérale de Lausanne (EPFL) also enforced sparseness on a dynamical model. He stressed, however, that even if the underlying system of differential equations is sparse, the corresponding discrete-time dynamics may not be. He employed adaptive ridge regression to enforce sparseness of the discrete dynamics.
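The point about discrete-time dynamics can be checked numerically. For a linear system dx/dt = Ax, the state one time step later is exp(A) x, and exp(A) can be dense even when A is sparse. The Python sketch below demonstrates this for a hypothetical three-gene cascade using a truncated Taylor series; it does not implement the adaptive ridge regression that Parisi used.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(a, terms=30):
    """Matrix exponential by truncated Taylor series (fine for small matrices)."""
    n = len(a)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = mat_mul(term, a)
        term = [[v / k for v in row] for row in term]  # now A^k / k!
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result


# Hypothetical cascade: gene 0 -> gene 1 -> gene 2, each with unit decay.
continuous = [[-1.0, 0.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0, 1.0, -1.0]]
discrete = mat_exp(continuous)  # maps x(t) to x(t + 1) for a unit time step

print(continuous[2][0], round(discrete[2][0], 4))
```

Gene 0 has no direct effect on gene 2 in the continuous model, yet the corresponding entry of the discrete map is clearly nonzero (analytically, half of e to the power -1, or about 0.18). Over a finite time step the indirect path through gene 1 shows up as an apparent direct link.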

In addition, Parisi argued that most transcription factors are either activators or repressors of all of their targets, at least in lower organisms. Imposing the constraint that all regulatory interactions have the same sign simplifies network generation, he said. Using this method, the team was one of the best predictors for the DREAM Challenge # 3, the five-gene synthetic network.

Evolving a network

Since evolution created biological networks, researchers could "take advantage of the natural process."

Daniel Marbach, also of EPFL, used a novel technique to create new networks. He noted that since evolution created the biological networks in the first place, researchers could "take advantage of the natural process and build a stochastic optimization method by mimicking this natural process of evolution." He implemented a fairly literal evolutionary algorithm, with a sequence that includes both coding regions and noncoding regions, which undergoes various types of mutation. The ability of the resulting network to match observed behavior determined its likelihood to contribute to the next generation.

Marbach suggested that this framework allows researchers to "integrate prior knowledge at a more fundamental level" than other algorithms. This approach was also among the best performers for the five-gene synthetic-network challenge, DREAM Challenge # 3. However, Marbach admitted that it does not appear practical to extend it to very large networks.
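The mutate-and-select loop at the heart of any evolutionary algorithm can be sketched briefly. The Python example below is far simpler than the method Marbach described: it has no genome with coding and noncoding regions, it evolves only the weights of a hypothetical linear one-gene model, and the population size, mutation size, and observations are all illustrative assumptions.

```python
import random


def fitness(weights, observations):
    """Negative squared error of a linear model against observed responses."""
    error = 0.0
    for inputs, observed in observations:
        predicted = sum(w * x for w, x in zip(weights, inputs))
        error += (predicted - observed) ** 2
    return -error


def evolve(observations, n_weights, generations=200, pop_size=20, seed=0):
    """Keep the better-fitting half each generation and mutate copies of it."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(n_weights)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: the better-fitting half survives unchanged.
        population.sort(key=lambda w: fitness(w, observations), reverse=True)
        survivors = population[: pop_size // 2]
        # Mutation: each survivor leaves one offspring with small random changes.
        offspring = [[w + rng.gauss(0, 0.1) for w in parent] for parent in survivors]
        population = survivors + offspring
    return max(population, key=lambda w: fitness(w, observations))


# Hypothetical observations generated from regulatory weights (1.0, -0.5).
observations = [((1, 0), 1.0), ((0, 1), -0.5), ((1, 1), 0.5), ((2, 1), 1.5)]
best = evolve(observations, n_weights=2)
print([round(w, 2) for w in best])
```

Every generation requires scoring the whole population against the data, so the cost grows quickly with network size. That is consistent with Marbach's caution about extending the approach to very large networks.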

Although dynamical data provides important insight into molecular interactions, there is still no agreement on how best to exploit it, or how to capture the dynamical information in the models.


  • Positive or negative correlations between the steady-state levels of various biomolecules under different conditions reveal aspects of the underlying regulation.
  • Choosing the right sets of conditions or perturbations is crucial to clarifying which molecules affect one another.
  • Mutual information measures seek to isolate correlations arising from direct interactions from those induced by background changes.
  • The significance of a particular correlation depends strongly on the network context, and accounting for this improves predictions.
  • Large conditional mutual information or synergy between two genes, in terms of how they jointly affect an outcome such as the expression of a third gene, suggests that they lie on a common pathway.
  • Biology features scale-free, highly clustered, small-world networks that have recurring motifs, so mathematically generated networks for assessing algorithms should include these attributes.

Building on prior knowledge

The surge of interest in systems biology was largely launched by the availability of high-throughput data on mRNA expression levels from microarray analysis. Rather than relying solely on this data to infer large networks, however, keynote speaker Ron Shamir of Tel Aviv University suggested "using prior knowledge from the literature in order to help this kind of analysis." But he noted that data described in the literature often lacks the formal structure needed for network models, and is of uneven quality. "Choosing the right data is critical in such situations," he said.

"Choosing the right data is critical."

Shamir and his colleagues model expression in terms of discrete levels, and analyze only steady-state expression. They then assign probabilities to different types of logical interactions. Their procedure, Shamir said, "looks pretty much in the spirit of Bayesian modeling, even though we assume that the underlying logic is discrete. The probability comes in our level of confidence in the logic."

In addition to establishing a network describing the osmotic stress response in yeast, the team sought to expand the network by looking for missing interactions. This is much more challenging, because there are "millions of possible expansions," Shamir noted, so "we have to be very, very careful about overfitting." Nonetheless, by restricting possible expansions to modules controlled by the same transcription factors with the same logic, this "was a relatively clean process, in spite of the huge space of possible solutions." Many of the predictions were experimentally validated.

Mutual information

Although prior information can be powerful, it is also important to understand how good networks can be when they are derived exclusively from high-throughput data. Andrea Califano of Columbia University reviewed the ARACNe algorithm that he and his coworkers developed for this purpose a few years ago. Rather than simply relying on the correlation in expression of two genes, this technique isolates the mutual information between them, by correcting for background correlations. The method exploits the fact that information transfer always decreases along a chain of interactions. "By using this inequality you can actually eliminate the vast majority of indirect interactions, and get networks that are realistic," Califano commented.
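The inequality Califano referred to, the data processing inequality, suggests a simple pruning rule: in any triplet of mutually connected genes, the weakest link is the likeliest to be indirect. The Python sketch below applies that rule to a hypothetical table of mutual-information values. It is not the ARACNe code, and the additive tolerance parameter is an assumption made for illustration.

```python
from itertools import combinations


def prune_indirect(mi, tolerance=0.0):
    """Drop the weakest edge of every fully connected gene triplet."""
    genes = sorted({g for pair in mi for g in pair})
    to_remove = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(e in mi for e in edges):
            weakest = min(edges, key=lambda e: mi[e])
            others = [mi[e] for e in edges if e != weakest]
            if mi[weakest] < min(others) - tolerance:
                to_remove.add(weakest)
    return {e: v for e, v in mi.items() if e not in to_remove}


# Hypothetical chain X -> Y -> Z: X and Z are related only through Y.
mi_table = {
    frozenset(("X", "Y")): 0.8,
    frozenset(("Y", "Z")): 0.7,
    frozenset(("X", "Z")): 0.4,
}

pruned = prune_indirect(mi_table)
print(sorted(tuple(sorted(e)) for e in pruned))
```

All triplets are judged against the original table, so the result does not depend on the order in which they are visited.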

Califano described several transcription factors for which ARACNe found both known and novel targets, including MYC in B cells and PBX1 in addiction-prone rats. In both cases, a large fraction of the targets were validated in the lab.

Trying to find targets for NOTCH1, however, showed "some of the limitations of the approach," Califano said, since the simple-minded use of ARACNe yields "absolute nonsense." He noted that this protein affects transcription only after γ-secretase cleaves off a small peptide that is the true transcription factor. Nonetheless, he and his colleagues developed a "metagene" that combines the levels of several transcripts to predict the activity of NOTCH1 in the nucleus. Using this metagene as the hub in ARACNe was "very successful," he observed.

A larger context

Other researchers described different ways to mine the correlation data. Tim Gardner's team at Boston University, for example, used a variable threshold for assigning significance to correlations, depending on the local network context. By analogy, Gardner joked, a tuft of hair that is disappointingly small on a balding head becomes much more significant in a bowl of soup.

The team tested their algorithm, called CLR (for context-likelihood relatedness), on a compendium of more than 3000 interactions from the RegulonDB network for the bacterium Escherichia coli. They found that the context-weighted method significantly improved the precision of their results relative to other algorithms (including ARACNe). But Gardner acknowledged that "some of this success may reflect the simplicity of the network we looked at." Furthermore, none of the techniques maintained high precision when the researchers increased the sensitivity enough to identify a significant fraction of the interactions.
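One way such context weighting might work is sketched below. The sketch assumes that each pairwise score is converted to a z-score against the background of all other scores involving each of the two genes, and that the two z-scores are then combined. This formulation, the function name clr_scores, and the toy matrix are assumptions made for illustration and are not taken from the talk.

```python
from math import sqrt
from statistics import mean, pstdev


def clr_scores(mi):
    """Context-weighted scores from a symmetric matrix of pairwise values.

    mi is a list of lists; the diagonal is ignored. Each value is compared
    with the background distribution of its row and of its column.
    """
    n = len(mi)
    mu, sd = [], []
    for i in range(n):
        background = [mi[i][j] for j in range(n) if j != i]
        mu.append(mean(background))
        sd.append(pstdev(background))

    def z(i, j):
        # How far mi[i][j] stands above gene i's own background.
        if sd[i] == 0:
            return 0.0
        return max(0.0, (mi[i][j] - mu[i]) / sd[i])

    return [
        [0.0 if i == j else sqrt(z(i, j) ** 2 + z(j, i) ** 2) for j in range(n)]
        for i in range(n)
    ]


# Toy matrix (made-up values): genes 0, 1, and 2 all score highly with
# one another, while genes 3 and 4 are quiet except for each other.
mi = [
    [0.00, 0.60, 0.55, 0.10, 0.10],
    [0.60, 0.00, 0.55, 0.10, 0.10],
    [0.55, 0.55, 0.00, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.00, 0.40],
    [0.10, 0.10, 0.10, 0.40, 0.00],
]
scores = clr_scores(mi)
print(round(scores[0][1], 2), round(scores[3][4], 2))
```

In this toy matrix the raw value for the pair (3, 4) is lower than for the pair (0, 1), yet its context-weighted score is higher, because it stands out against a much quieter background.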

Gardner reviewed earlier work from his group that used the network response as a filter, to distinguish the direct targets of a drug from the many other genes whose expression the drug changed indirectly. A major challenge, however, is to get enough independent measurements to constrain the network topology. "You want to establish as many physiologically distinct conditions as possible," Gardner observed. In addition, like other researchers at the conference, his group enforces sparseness by assessing a "cost" for extra connections in the model network.

An earlier static algorithm developed by Diego Di Bernardo and Gardner used a greedy search to solve a set of linear equations while enforcing sparseness. Using this algorithm, called NIR (for Network Identification by Reverse Engineering), Di Bernardo and his colleagues, including Irene Cantone at the Telethon Institute of Genetics and Medicine, achieved the best performance in DREAM2 Challenge # 4, the computer-generated network InSilico1.

Working together

Dimitris Anastassiou of Columbia University extended the CLR algorithm. He introduced the idea of synergy between two genes, with respect to some outcome, as the change in the mutual information when both genes are expressed compared with the sum of their individual effects. (This idea is related in spirit to, but is not the same as, the "conditional mutual information" described by Andrea Califano.) When applied to expression data, the relevant outcome is the expression of a candidate "partner" gene. High synergy (which can be positive or negative) identifies triplets that are very likely to be part of a common pathway. This algorithm was the best predictor for DREAM2 Challenge # 5, the genome-scale network derived from RegulonDB.
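The definition can be made concrete with a small sketch. It assumes discretized (here binary) expression values and estimates mutual information directly from sample frequencies. The function names and the toy data are hypothetical, and a real analysis would need far more samples and a more careful estimator.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information in bits, estimated from paired discrete samples."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum(
        (count / n) * log2((count / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), count in joint.items()
    )


def synergy(gene1, gene2, outcome):
    """Information the pair carries jointly, minus the individual sum."""
    pair = list(zip(gene1, gene2))
    together = mutual_information(pair, outcome)
    separately = mutual_information(gene1, outcome) + mutual_information(gene2, outcome)
    return together - separately


# Toy data (made up): the outcome is "on" only when exactly one gene is on,
# so neither gene alone says anything, but the pair determines it fully.
g1 = [0, 0, 1, 1]
g2 = [0, 1, 0, 1]
exclusive_or = [0, 1, 1, 0]
positive = synergy(g1, g2, exclusive_or)

# Redundant case: both genes are copies of the outcome, so the pair adds
# nothing beyond what either gene already provides.
negative = synergy(g1, g1, g1)
print(positive, negative)
```

The first case gives a synergy of one bit and the second a synergy of minus one bit, which matches the remark that high synergy can be positive or negative.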

Kathleen Marchal of Katholieke Universiteit Leuven also described the combinatorial influences of multiple transcription factors. Her team combines published expression data with genome-wide screening for regulatory sequence motifs. They use a hierarchical algorithm to identify modules whose regulation changes significantly under different conditions. At the module level, she observed, regulatory complexity is lower than it is at the level of individual genes. But she noted a role for "connector" genes that are shared by multiple modules whose other regulators are different.

As illustrated by these examples, techniques for extracting networks from static expression data continue to improve. In addition, many researchers are developing ways to use the dynamic response to perturbations to further illuminate the underlying interactions.

Generating artificial networks

Assessing reverse-engineering algorithms requires gold-standard networks whose connections are completely known. Such knowledge is hard to come by in real biological networks, and even harder to keep under wraps once it is found. An alternative is to artificially generate networks on computers. Pedro Mendes of Virginia Tech used this approach to generate the three networks used for DREAM2 Challenge # 4.

Barbara Di Camillo, of the University of Padua, described a tool called netsim that generates transcriptional networks that reproduce key features of biological networks. One feature is the distribution of the connectivity degree: how likely a node is to be connected to various numbers of neighbors. "Biology networks are known to have a scale-free distribution of connectivity degree," Di Camillo observed, meaning that this probability varies as a power of the number of connections. The power law implies that most nodes are connected to at most a few other nodes, but that a handful of "hub" nodes have many, many neighbors.
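A short numerical sketch shows what the power law implies. It assumes that the probability of a node having k connections is proportional to k raised to the power minus gamma. Both the exponent of 2.5 and the cap of 1000 connections are arbitrary values chosen for illustration, not figures from the talk.

```python
def power_law_distribution(gamma, k_max):
    """Normalized probabilities P(k) proportional to k ** -gamma, k = 1..k_max."""
    weights = {k: k ** -gamma for k in range(1, k_max + 1)}
    total = sum(weights.values())
    return {k: weight / total for k, weight in weights.items()}


GAMMA = 2.5   # assumed exponent, for illustration only
K_MAX = 1000  # assumed largest possible degree, for illustration only

p = power_law_distribution(GAMMA, K_MAX)
few_neighbors = sum(p[k] for k in range(1, 4))      # degree 1, 2, or 3
hubs = sum(p[k] for k in range(100, K_MAX + 1))     # degree 100 or more
print(f"at most 3 neighbors: {few_neighbors:.3f}, hubs: {hubs:.5f}")
```

Under these assumptions more than nine nodes in ten have three or fewer neighbors, while hubs with a hundred or more connections are rare but not absent.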

Biological networks also typically have the "small-world" property, in which two nodes are separated by only a few sequential links. Finally, biological networks tend to have a high clustering coefficient, so that neighbors of a node are more likely to be connected to one another than are random pairs of nodes.
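Both quantities are easy to compute on a toy graph. The sketch below uses a made-up four-node network stored as an adjacency dictionary. It assumes an undirected, connected graph and the common convention that a node with fewer than two neighbors has a clustering coefficient of zero.

```python
from collections import deque
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed here for nodes with 0 or 1 neighbor
    linked = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)


def mean_path_length(adj):
    """Average number of links on the shortest path between node pairs."""
    total, pairs = 0, 0
    for source in adj:
        distance = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            u = queue.popleft()
            for v in adj[u]:
                if v not in distance:
                    distance[v] = distance[u] + 1
                    queue.append(v)
        total += sum(distance.values())
        pairs += len(distance) - 1
    return total / pairs


# Toy network (made up): a triangle A-B-C with one extra node D hanging off A.
network = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
for name in sorted(network):
    print(name, len(network[name]), round(clustering_coefficient(network, name), 3))
print(round(mean_path_length(network), 3))
```

Here B and C have a clustering coefficient of one because their two neighbors are linked, A has one third, and the average separation is about 1.33 links.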

Random networks, scale-free networks, and geometric networks can each produce only some of these properties, Di Camillo said. "No one of these models is able to simultaneously reproduce all of the observed features." In contrast, hierarchical modular networks simultaneously display all three properties. These networks can be built iteratively, by grouping first nodes, and then clusters of nodes, into known network motifs, while enforcing the scale-free condition.


  • Even with a presumed-correct gold-standard network, evaluating the performance of reverse-engineering algorithms is not simple.
  • Many biological networks are sparse, so true positive pairs, which actually interact, are much rarer than negatives, which do not.
  • Algorithms generally have a tradeoff, identifying more true interactions only at the cost of including a greater fraction of false ones.
  • The "ROC" curve compares the true-positive rate to the false-positive rate, while the "precision-recall" curve compares the fraction of correct positive predictions (precision) with the true-positive rate (recall).
  • The area under these curves gives global measures of the effectiveness of an algorithm.
  • DREAM2 adopted a single metric that combines the effectiveness as measured by both types of curves.

Many at once

The goal of reverse engineering is to reconstruct the underlying biological network. The DREAM project aims to assess success by reconstructing networks whose connections are perfectly known, the "gold standard." Comparing the inferred network with the gold standard defines true positives (TP) and true negatives (TN), but also incorrectly predicted links, or false positives (FP), and incorrectly predicted absences of links, or false negatives (FN). One challenge is how to rank different schemes. There is always a tradeoff: biasing the classification to deliver more true positives will also generate a larger fraction of false positives.

The DREAM team evaluated this tradeoff using two distinct types of curve: the "ROC" curve (unhelpfully named after the "receiver operating characteristic" used in classifying radar signals) and the "precision-recall" curve. In both types of plot, one axis is the "recall" or "true positive rate," which is the fraction of actual positives that the algorithm correctly identifies (TP/(TP+FN)).

In the ROC curve, this true positive rate is plotted on the vertical axis, while the horizontal axis is the false positive rate (FP/(FP+TN)), the fraction of actual negatives that the system identifies as positive. (This is equal to one minus the specificity, which is the fraction of actual negatives that the system correctly identifies.)

One problem with the ROC curve occurs when interactions are sparse, as is common in biological networks. In this case, actual positives make up a small fraction of the total. Predicting that every possible interaction is negative then appears—misleadingly—to be an effective strategy, like predicting no rain in Los Angeles.

In the precision-recall curve, the true-positive rate (now called recall) is on the horizontal axis, while the vertical axis is "precision," the fraction of positive classifications that are correct (TP/(TP+FP)). The precision-recall plot can be more useful for sparse interactions, although the differences are subtle.
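A worked example with made-up counts shows why the two plots can tell different stories for a sparse network. The counts below are purely illustrative: 10 real interactions among 1000 candidate pairs, with a hypothetical classifier that recovers 8 of them at the cost of 50 false alarms.

```python
def rates(tp, fp, tn, fn):
    """Return (recall, false-positive rate, precision) from the four counts.

    Assumes at least one actual positive, one actual negative, and one
    positive prediction, so that no denominator is zero.
    """
    recall = tp / (tp + fn)               # true-positive rate
    false_positive_rate = fp / (fp + tn)
    precision = tp / (tp + fp)
    return recall, false_positive_rate, precision


# Illustrative counts only: 10 actual positives, 990 actual negatives.
recall, fpr, precision = rates(tp=8, fp=50, tn=940, fn=2)
print(f"recall={recall:.2f}  FPR={fpr:.3f}  precision={precision:.3f}")
```

On the ROC plot this classifier looks strong, with a recall of 0.8 at a false-positive rate of about 0.05. The precision of about 0.14 reveals that most of its predicted interactions are wrong.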

In both cases, the horizontal and vertical axes range from zero to one. It is customary to compute the area under each curve. An ideal classifier would capture all true interactions and no false ones, so the area would be one, the entire area in the square.
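The area calculation can be sketched for a ranked list of predictions. The sketch assumes the predictions are sorted from most to least confident and marked true or false against the gold standard. It uses simple trapezoids and starts the precision-recall curve at zero recall with the precision of the first prediction. Both are simplifying assumptions, and this is not the DREAM evaluators' procedure.

```python
def curve_points(ranked_labels):
    """ROC and precision-recall points from predictions ranked best-first.

    ranked_labels holds True for a real interaction and False otherwise.
    Assumes at least one True and one False entry.
    """
    positives = sum(ranked_labels)
    negatives = len(ranked_labels) - positives
    tp = fp = 0
    roc, pr = [(0.0, 0.0)], []
    for is_real in ranked_labels:
        if is_real:
            tp += 1
        else:
            fp += 1
        roc.append((fp / negatives, tp / positives))
        pr.append((tp / positives, tp / (tp + fp)))
    pr.insert(0, (0.0, pr[0][1]))  # assumed convention for the starting point
    return roc, pr


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Toy rankings (made up): a perfect ordering and an imperfect one.
perfect_roc, perfect_pr = curve_points([True, True, False, False])
mixed_roc, mixed_pr = curve_points([True, False, True, False])
print(trapezoid_area(perfect_roc), trapezoid_area(perfect_pr))
print(trapezoid_area(mixed_roc), round(trapezoid_area(mixed_pr), 3))
```

The perfect ordering gives an area of one under both curves, while the imperfect ordering falls to 0.75 under the ROC curve and about 0.79 under the precision-recall curve.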

If some interactions are misclassified, the area falls below one. For each curve, the DREAM evaluators calculated an area, which sometimes necessitated careful interpolation. They then computed the associated p-value, which is the probability that a random classifier could do as well. Finally, they computed a combined metric as minus the base-10 logarithm of the product of the two p-values. For example, if both p-values were at the common significance threshold of 0.05, the metric would be −log10(0.05×0.05) = 2.6; more significant results would yield a higher value for the DREAM2 metric.
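The arithmetic of the combined metric is simple to reproduce. In the sketch below, the function name combined_metric is a hypothetical label, and the p-values are example inputs rather than competition results.

```python
from math import log10


def combined_metric(p_value_roc, p_value_pr):
    """Minus the base-10 logarithm of the product of the two p-values."""
    return -log10(p_value_roc * p_value_pr)


at_threshold = combined_metric(0.05, 0.05)
more_significant = combined_metric(0.001, 0.0001)
print(round(at_threshold, 1), round(more_significant, 1))
```

Two p-values of 0.05 give 2.6, as in the example above, while p-values of 0.001 and 0.0001 would give 7.0.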

This metric is less useful when there are only a few predictions. For this reason, the evaluators also tracked the precision after the first, second, and subsequent correctly predicted interactions.
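This tracking can be sketched for a ranked list of predictions. The function name and the toy ranking are hypothetical, and the sketch assumes predictions are ordered from most to least confident.

```python
def precision_at_each_correct(ranked_labels):
    """Precision measured at the rank of each correct prediction, in order."""
    precisions = []
    correct = 0
    for rank, is_real in enumerate(ranked_labels, start=1):
        if is_real:
            correct += 1
            precisions.append(correct / rank)
    return precisions


# Toy ranking (made up): correct predictions at ranks 1, 3, and 6.
ranking = [True, False, True, False, False, True]
precisions = precision_at_each_correct(ranking)
print([round(value, 2) for value in precisions])
```

For this ranking the precision is 1.0 at the first correct prediction, 0.67 at the second, and 0.5 at the third, so even a very short list of predictions yields a meaningful score.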

Should reverse engineering algorithms be evaluated by comparing predicted network topology to that of gold standards, or by comparing the predicted behavior of these networks to observations?

Is there an objective way to assess algorithms that incorporate informal biological knowledge?

How can researchers choose the sets of experimental conditions that are most informative with respect to the network?

If researchers seek to predict the results of experiments instead of network models, how could their predictions be assessed quantitatively?

What are the best ways to quantify which features of a network model are known most confidently?

When is the DREAM goal of determining "the" network unrealistic, because more than one network explains the available data?

How can researchers know when a heterogeneous population of cells is skewing the network they infer?

Can reverse engineering succeed when critical biological mechanisms of regulation are still not understood?

How many transcription factors have thousands of targets, like some oncogenes do?