    Single-Cell Level Systems Biology

    Speakers: Grégoire Altan-Bonnet (Memorial Sloan-Kettering Cancer Center), Narendra Maheshri (Massachusetts Institute of Technology), Johan Paulsson (Harvard Medical School), and Chris Wiggins (Columbia University)
    Organizers: Andrea Califano (Columbia University), Manuel Duval (Network Therapeutics Inc.), Aris Economides (Regeneron Pharmaceuticals), Gustavo Stolovitzky (IBM Research), and Jennifer Henry (The New York Academy of Sciences)
    Presented by the Systems Biology Discussion Group
    Reported by Kahn Rhrissorrakrai | Posted April 17, 2012


    The remarkable advances in our ability to measure an organism's gene and protein expression states at a very granular level have challenged researchers to discover more nuanced models of cellular dynamics. On January 17, 2012, the Systems Biology Discussion Group gathered at the New York Academy of Sciences for the Single-Cell Level Systems Biology symposium. Speakers discussed the use of computational models for understanding single-cell dynamics and focused in particular on how the field should handle the technical and biological noise, or variability, when measuring expression levels of cellular components.

    Grégoire Altan-Bonnet, from Memorial Sloan-Kettering Cancer Center, began the evening by describing new ways of using flow cytometry, e.g., Fluorescence-Activated Cell Sorting (FACS), to characterize signal transduction pathways. With the aim of understanding how kinases and STAT proteins (Signal Transducer and Activator of Transcription proteins) are activated, his group used their FACS system to perform single-cell phosphorylation profiling (or phospho-profiling). They assessed the variability of measured levels of labeled proteins, such as CD8, in large populations of T-cells, and they identified individual cells with significant differences in expression levels. His group found a 10^5 range in the activation threshold (the protein expression level needed to begin a reaction) of CD8 to its ligand, SHP-1, in an isogenic population of T-cells. This amount of variability, or noise, suggested that a model of T-cell activity based on average expression levels would be insufficient.

    Others had previously found that the broad range in interleukin 2 receptor α (IL-2Rα) levels in CD8+ T-cells revealed different CD8+ T-cell fates. To understand the precise relationship between this variability and cell fate, his group developed the software package ScatterSlice to analyze dose-dependent phosphorylation responses. They discovered that T-cells from the same source showed differential sensitivity to the ligand IL-2 because of the varying levels of the receptor IL-2Rα. The group implemented a Bayesian inference method to model these differences. This Bayesian method provided the necessary 'wiggle room' in the model parameters to account for the observed variability.

    Altan-Bonnet then discussed how his studies of "noise" in cell-signaling dynamics could be applied to network structure inference, a method of probing how different cell-signaling components and pathways interact. He looked at the fluctuation in protein kinase C (PKC) activity upon activation by phorbol 12-myristate 13-acetate (PMA) by measuring expression variation of the mitogen-activated kinases MEK and ERK. His single-cell analysis used the propagation of noise to calculate the correlation of protein fluctuations. With this method, he was able to test multiple models of the ERK activation cascade to discover which model recapitulates the observed behavior. Through this systems analysis, his group found that ERK activation likely involved an additional pathway member plus a separate activation pathway leading from PMA through an intermediary to ERK.

    Next, Johan Paulsson from Harvard Medical School described the advantages and pitfalls of relying heavily on a single fluctuation model of the cell. Researchers normally generate schematics of molecular interactions and try to fit equations such that fluctuations and variations in the data are recapitulated. While this approach has found success, Paulsson identified its two major problems: 1) the data are not of sufficient resolution to distinguish between specific models, and 2) the uncertainty introduced in the underlying assumptions goes unaddressed. His alternative strategy was to make a few, very specific assumptions and to leave the remainder of parameters in a "cloud" with no assumptions. While using this "cloud" approach prevents discovering what is precisely occurring within it, it has the feature of eliminating unlikely hypotheses based on measurements of input and output signals, like protein and mRNA expression levels. This method, like Altan-Bonnet's Bayesian analysis, captured the observed fluctuations without numerous assumptions.

    Paulsson was then able to infer mechanisms of the conditional independences of measured factors. For instance, in his model of the cell, the effect of a cellular network could be seen in the dynamics of network elements, such as proteins, that were accounted for by conditions placed on network mRNA. They also found that they could use fluctuations in mRNA transcript levels to differentiate between extrinsic and intrinsic system variables, i.e., variables that can be explained by an element external to a particular signaling or transcription network or variables that cannot. If, for example, two factors co-vary, there is an extrinsic factor equal to their covariance over time. With respect to cellular dynamics, models that only consider intrinsic factors break down because of the enormous number of possible interactions in the cell. His group controlled for this dimensionality by designing experiments to distinguish extrinsic and intrinsic variance. These experiments included calculating the conditional average of 100 copies of a fluorescent reporter as the extrinsic value while evaluating one reporter many times for the intrinsic value. To avoid systematic biases in experimental design, he advocated making measurements by two methods. Here they compared changes in the system when an upstream component was tagged or untagged, thereby revealing whether the tag affected the system. In fact, when they compared the protein degradation of tagged and untagged substrates in E. coli, they found that the tag did indeed affect the system. The team discovered that fluorescing spots observed in E. coli and believed to be novel were actually artifacts of avidity-induced fusions between fluorescent proteins.
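
    The covariance argument above can be illustrated with a small Python sketch. It is not Paulsson's analysis: it assumes a hypothetical two-reporter setup in which each simulated cell carries one shared (extrinsic) fluctuation plus an independent (intrinsic) fluctuation per reporter, all Gaussian, with made-up parameter values. Under those assumptions the covariance between the two reporters estimates the extrinsic variance, and what remains of the total variance is intrinsic.

```python
import random


def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def decompose_noise(reporter_a, reporter_b):
    """Split reporter variance into extrinsic (shared) and intrinsic parts."""
    extrinsic = covariance(reporter_a, reporter_b)
    total = 0.5 * (covariance(reporter_a, reporter_a)
                   + covariance(reporter_b, reporter_b))
    return extrinsic, total - extrinsic


def simulate_cells(n_cells, extrinsic_sd, intrinsic_sd, mean=100.0, seed=0):
    """Hypothetical data: one shared fluctuation per cell, one private per reporter."""
    rng = random.Random(seed)
    reporter_a, reporter_b = [], []
    for _ in range(n_cells):
        shared = rng.gauss(0.0, extrinsic_sd)  # same value for both reporters
        reporter_a.append(mean + shared + rng.gauss(0.0, intrinsic_sd))
        reporter_b.append(mean + shared + rng.gauss(0.0, intrinsic_sd))
    return reporter_a, reporter_b


# Assumed values: extrinsic variance 4.0 (sd 2.0), intrinsic variance 1.0 (sd 1.0).
reporter_a, reporter_b = simulate_cells(20000, extrinsic_sd=2.0, intrinsic_sd=1.0)
extrinsic_var, intrinsic_var = decompose_noise(reporter_a, reporter_b)
print(f"extrinsic variance ~ {extrinsic_var:.2f}")
print(f"intrinsic variance ~ {intrinsic_var:.2f}")
```

    With real measurements, the two simulated lists would be replaced by paired reporter intensities from the same cells; the additive Gaussian model is only a convenience for the illustration.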

    By treating most cellular processes as a "cloud" that cannot be known, Paulsson was able to represent it with a single function and use only specific parameters for a few variables, like mRNA or protein expression levels, to describe the cellular dynamics. (Image courtesy of Johan Paulsson)

    Previous work has suggested that nature has found a nearly optimal solution for representing information flow through noisy channels, such as through signal transduction pathways. This solution led Chris Wiggins of Columbia University to ask two questions: what is the informational capacity of a regulatory cascade with limited numbers of cascade components, and given oscillatory driving (regular fluctuations in cascade input levels), what is the best way to get an optimal response? Both questions were difficult to answer mathematically because the cascade response probability distribution, or likelihood that any specific response will be observed, was unknown. Wiggins's group overcame this limitation by using a "master equation" to describe how probability flows forward in time through the creation/destruction of pathway components, and a Fourier transform of the equation could provide the rates of change. This approach was appealing because it yielded simple eigenfunctions that could serve as a basis for modeling regulatory cascades.
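
    To make the idea of a master equation concrete, the Python sketch below integrates one for the simplest possible case: a single component created at a constant rate and destroyed at a rate proportional to its copy number. This is an assumed toy model, not the cascade Wiggins studied, and it uses plain forward-Euler time stepping rather than the spectral approach described above. The rates, the copy-number cutoff `n_max`, and the step size are illustrative choices. For this toy model the steady state is known to be a Poisson distribution with mean equal to the creation rate divided by the destruction rate, which gives a check on the numerics.

```python
import math


def master_equation_step(p, birth, death, dt):
    """One forward-Euler step of dp_n/dt for a birth-death process.

    p[n] is the probability of having n copies; the state space is
    truncated at n_max = len(p) - 1 with no creation out of the last state.
    """
    n_max = len(p) - 1
    new_p = []
    for n in range(n_max + 1):
        inflow = 0.0
        if n > 0:
            inflow += birth * p[n - 1]            # creation: n-1 -> n
        if n < n_max:
            inflow += death * (n + 1) * p[n + 1]  # destruction: n+1 -> n
        outflow_rate = death * n + (birth if n < n_max else 0.0)
        new_p.append(p[n] + dt * (inflow - outflow_rate * p[n]))
    return new_p


def solve_to_steady_state(birth, death, n_max=60, dt=0.005, t_end=20.0):
    """Start with zero copies and step the distribution forward in time."""
    p = [0.0] * (n_max + 1)
    p[0] = 1.0
    for _ in range(round(t_end / dt)):
        p = master_equation_step(p, birth, death, dt)
    return p


def poisson_pmf(n, mean):
    return math.exp(-mean) * mean ** n / math.factorial(n)


birth_rate, death_rate = 10.0, 1.0  # assumed, arbitrary units
p_steady = solve_to_steady_state(birth_rate, death_rate)
mean_copies = sum(n * pn for n, pn in enumerate(p_steady))
max_error = max(abs(pn - poisson_pmf(n, birth_rate / death_rate))
                for n, pn in enumerate(p_steady))
print(f"mean copy number ~ {mean_copies:.4f}")
print(f"largest deviation from Poisson ~ {max_error:.2e}")
```

    Direct time stepping like this becomes expensive as more components are added, which is the kind of cost a spectral expansion in eigenfunctions is meant to avoid.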

    The group developed SpecMark, named after the software's spectral Markov method, to perform the above analyses on simple regulatory relationships. SpecMark is several orders of magnitude faster than the more commonly used directed Markov Chain Monte Carlo method while both methods have comparable accuracy. Next, the group looked at the bimodal outputs from regulatory cascades, such as those governing the expression of the bicoid and hunchback maternal effect genes in Drosophila, to determine whether information increased by moving from a simple high/low cascade activation level model to one with more intermediate levels; it did not. For these regulatory cascades, they used a Markovian approximation to understand the propagation of noise as cascade length grows, while assuming that immediate interactions between cascade components were local and, as Paulsson did, that these transitions could be approximated with a single function. Wiggins's group found that noise did propagate but that it did not dramatically increase with length. Additionally, bimodal expression levels were rarely seen in cases of down-regulation, though they were more apparent in up-regulating cascades, where there was higher informational capacity (a greater ability to continue to propagate the signal) because the system was not required to make as many components.

    For oscillatory driving, Wiggins found that measuring the downstream cascade component, or child, was more informative in cases where the upstream component, or parent, was regulated by an oscillating input. His group found parent and child levels were synchronous with slow oscillations, but as oscillation input speed increased, asynchrony grew, resulting in a lack of coordination of different steps in the transcriptional regulation process. Furthermore, optimal input oscillation frequency appeared as a function of the quantities of cascade components present. Thus single cells are limited in their ability to transduce information because of low numbers of these components. To account for this limit, it would be necessary to solve for the probability distribution of different transcriptional outcomes. Their method is able to solve for the distribution while accounting for low cascade component quantities.

    Spectral methods, such as SpecMark, are much faster than directed simulations, such as MCMC, at describing the response of pathway components in a regulatory cascade, here the relationship between n and m. (Image courtesy of Chris Wiggins)

    Closing the evening, Narendra Maheshri, from the Massachusetts Institute of Technology, discussed fluctuations in gene expression as they related to two epigenetic switches: a feedback loop mediated by a trans-acting transcription factor (TF) binding to a promoter, and a cis-encoded switch based on multiple factors binding a single promoter. First, he probed whether cis-encoded epigenetic switches could lead to bimodal gene expression based on two-state (ON and OFF) switching rates. His group looked at the FLO gene family of yeast adhesin proteins. FLO11, with its two transcriptional states, was fluorescently labeled and recorded over 10 minutes so that the group could monitor the switching of transcriptional states. They found an exponential distribution of the waiting times (or delays) between changing from one transcriptional state to another, which supported a two-state model for inferring state transitions. The complexity of the FLO11 promoter then prompted questions about how activators turn on gene expression. Investigating these issues, they determined that activators could be placed into three classes: those that stabilize the active state, those that destabilize the inactive state, and those that produce intermediate, weak activation. Maheshri suggested these three states were the result of non-coding RNAs mediating loop and chromatin structures to facilitate state changes.
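
    The link between a two-state model and exponential waiting times can be checked with a short simulation. The Python sketch below is an illustration under assumed conditions, not the group's analysis: a hypothetical promoter switches ON with constant rate `k_on` and OFF with constant rate `k_off` (arbitrary units, values made up), time advances in small steps, and the time spent in each state before switching is recorded. For an exponential distribution the mean waiting time is the inverse of the switching rate and the coefficient of variation is 1, so those two quantities are reported.

```python
import random
import statistics


def simulate_dwell_times(k_on, k_off, n_switches, dt=0.01, seed=1):
    """Record how long a two-state promoter stays OFF and ON before switching.

    In each small step dt the promoter switches with probability rate * dt,
    where the rate depends only on the current state (no memory).
    """
    rng = random.Random(seed)
    state_on = False
    dwell = 0.0
    off_times, on_times = [], []
    while len(off_times) + len(on_times) < n_switches:
        dwell += dt
        rate = k_off if state_on else k_on
        if rng.random() < rate * dt:
            (on_times if state_on else off_times).append(dwell)
            state_on = not state_on
            dwell = 0.0
    return off_times, on_times


k_on, k_off = 0.5, 2.0  # assumed switching rates, arbitrary units
off_times, on_times = simulate_dwell_times(k_on, k_off, n_switches=4000)

mean_off = statistics.mean(off_times)
cv_off = statistics.stdev(off_times) / mean_off
mean_on = statistics.mean(on_times)
cv_on = statistics.stdev(on_times) / mean_on
print(f"OFF: mean wait ~ {mean_off:.2f} (1/k_on = {1 / k_on:.2f}), CV ~ {cv_off:.2f}")
print(f"ON:  mean wait ~ {mean_on:.2f} (1/k_off = {1 / k_off:.2f}), CV ~ {cv_on:.2f}")
```

    A model with hidden intermediate steps between OFF and ON would typically give waiting times with a coefficient of variation below 1, which is why an exponential shape is read as support for two states.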

    He went on to describe how loops could give rise to "all-or-none" responses through non-linear expression resulting from cooperative binding of TFs. The bursting state model yields a negative binomial distribution that captures the normally inactive state of gene expression and the relatively infrequent bursts of activity. For a bimodal pattern, burst timing (periods of transcriptional activity) must be on the same time scale as protein lifetimes, since the transition from the active to the inactive state is contingent on protein degradation occurring before the next burst. For the reverse process, the transition from inactive to active, the burst strength must be high enough to trigger the positive feedback. Using fluorescence in situ hybridization, Maheshri and colleagues experimentally verified this model in the Tet system for controlling transcriptional activation and confirmed three points: that having more binding sites was noisier than having a single site; that a single binding site in a promoter with a highly variable response to binding (a so-called "noisy" promoter) could yield a bimodal response; and that, if the activator is unstable, stabilization of the TF would change a bimodal response to a graded one. Maheshri's group thus demonstrated the feasibility of generating a bimodal response from a system without the need for external factors to actively maintain different transcriptional states, thereby revealing additional mechanisms nature can use to generate multiple activity states.
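
    The statement that bursting produces a negative binomial distribution can be explored numerically. The Python sketch below is a simplified stand-in, not the model Maheshri presented: it leaves out feedback entirely and assumes that bursts arrive at a constant rate, that each burst adds a geometrically distributed number of proteins, and that each protein is degraded at rate 1 (so time is measured in protein lifetimes). All parameter values are made up. Under these assumptions the steady-state copy number is negative binomial with mean `burst_rate * mean_burst` and variance-to-mean ratio `1 + mean_burst`, and the simulation reports both quantities.

```python
import math
import random


def simulate_bursty_expression(burst_rate, mean_burst, t_end, seed=2):
    """Gillespie-style simulation of bursty production with first-order decay.

    Returns the time-averaged mean and variance of the protein copy number.
    """
    rng = random.Random(seed)
    fail = mean_burst / (1.0 + mean_burst)  # geometric burst sizes on 0, 1, 2, ...
    t, n = 0.0, 0
    weight = weighted_n = weighted_n2 = 0.0
    while t < t_end:
        total_rate = burst_rate + n          # degradation rate is 1 per molecule
        wait = min(rng.expovariate(total_rate), t_end - t)
        weight += wait
        weighted_n += n * wait
        weighted_n2 += n * n * wait
        t += wait
        if t >= t_end:
            break
        if rng.random() * total_rate < burst_rate:
            # Burst: draw a geometric number of new proteins with mean mean_burst.
            n += int(math.log(1.0 - rng.random()) / math.log(fail))
        else:
            n -= 1                           # one protein is degraded
    mean = weighted_n / weight
    variance = weighted_n2 / weight - mean ** 2
    return mean, variance


burst_rate, mean_burst = 2.0, 5.0  # assumed: 2 bursts per lifetime, 5 proteins each
mean_n, var_n = simulate_bursty_expression(burst_rate, mean_burst, t_end=20000.0)
fano = var_n / mean_n
print(f"mean ~ {mean_n:.2f} (expected {burst_rate * mean_burst:.2f})")
print(f"variance/mean ~ {fano:.2f} (expected {1 + mean_burst:.2f})")
```

    Lowering `burst_rate` well below 1 while keeping `mean_burst` large gives a distribution concentrated near zero with occasional high excursions, which matches the description of a normally inactive state punctuated by infrequent bursts.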

    Presentations available from:
    Grégoire Altan-Bonnet, PhD (Memorial Sloan-Kettering Cancer Center)
    Narendra Maheshri, PhD (Massachusetts Institute of Technology)
    Johan Paulsson, PhD (Harvard Medical School)
    Chris Wiggins, PhD (Columbia University)
