High-resolution temporal profiling of E. coli transcriptional response
Miano, A., Rychel, K., Lezia, A. et al. Nature Communications 14, 7606 (2023). https://doi.org/10.1038/s41467-023-43173-7
Understanding how cells dynamically adapt to their environment is a primary focus of biology research. Temporal information about cellular behavior is often limited by both the small number of time points sampled and the methods used to analyze the data. Here, we apply unsupervised machine learning to a data set containing the activity of 1805 native promoters in E. coli measured every 10 minutes in a high-throughput microfluidic device via fluorescence time-lapse microscopy. Specifically, this data set reveals the transcriptome dynamics of E. coli exposed to different heavy metal ions. We use a bioinformatics pipeline based on Independent Component Analysis (ICA) to generate insights and hypotheses from this data. We discovered three primary, time-dependent stages of promoter activation in response to heavy metal stress (fast, intermediate, and steady). Furthermore, we uncovered a global strategy E. coli uses to reallocate resources from stress-related promoters to growth-related promoters following exposure to heavy metal stress.
Illustration of the different steps of our data analysis pipeline. The analysis starts with raw fluorescence data, which are processed with a background signal removal algorithm, normalized by promoterless strains, and smoothed by median filtering. The log2 is taken to convert to fold change, and the data are formatted as a matrix of genes versus conditions (heavy metal inductions). ICA is applied to this matrix to obtain the M (promoter coefficients) and A (activity coefficients) matrices, respectively.
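The preprocessing steps in this caption can be sketched for a single promoter trace as follows. This is a minimal illustration, not the authors' code: the function names, the choice to normalize by dividing by the promoterless signal, the filter window of three samples, and the small positive floor that keeps the logarithm defined are all assumptions. The ICA step itself is not shown.

```python
import math
from statistics import median


def median_filter(values, window=3):
    """Sliding median; the window shrinks at the edges (window should be odd)."""
    half = window // 2
    return [
        median(values[max(0, i - half): i + half + 1])
        for i in range(len(values))
    ]


def preprocess_trace(raw, background, promoterless, window=3, floor=1e-9):
    """Return a log2 fold-change trace for one promoter (hypothetical helper).

    raw          -- raw fluorescence at each time point
    background   -- estimated background signal at each time point
    promoterless -- signal of a promoterless control strain at each time point
    """
    # 1. Background removal (assumed here to be a simple subtraction).
    corrected = [max(r - b, floor) for r, b in zip(raw, background)]
    # 2. Normalization by the promoterless strain (assumed to be a ratio).
    ratio = [c / max(p, floor) for c, p in zip(corrected, promoterless)]
    # 3. Median filtering to suppress isolated spikes.
    smoothed = median_filter(ratio, window)
    # 4. log2 to express the signal as a fold change.
    return [math.log2(s) for s in smoothed]


def build_matrix(traces_by_promoter):
    """Stack per-promoter traces into rows of a promoters-by-conditions matrix."""
    names = sorted(traces_by_promoter)
    return names, [traces_by_promoter[name] for name in names]
```

Each row of the resulting matrix corresponds to one promoter and each column to one condition, which is the layout the caption describes as the input to ICA.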
By applying ICA to this data set, we demonstrated the importance of time-dependent analysis in providing insights into the dynamic nature of gene expression in response to environmental stressors. By splitting different heavy metal inductions into separate 40-minute time windows, we were able to apply ICA to time-series transcriptomic data for the first time. We observed the richest response for the zinc inductions. Specifically, we found four different iModulons that distinguish fast responders (genes that are activated at the start of the induction window), intermediate responders (genes that are maximally active in the middle of the induction window), steady responders (genes whose expression steadily increases throughout the induction window), and partially steady responders (genes whose expression steadily increases over time until they are repressed in the last window). The zinc data demonstrate the ability of this analysis method to resolve the activation sequence of promoters involved in the same metabolic pathway. We detect the activation of promoter narZ as an early responder and promoter nrfE as a late responder; these promoters are involved in the first and second steps, respectively, of the dissimilatory nitrate reduction to ammonium pathway. This result is a clear example of the power of this platform when used for metabolic pathway reconstruction, which is a topic of great interest in systems biology [70].
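The windowing idea can be illustrated with a short sketch. With one measurement every 10 minutes, a 40-minute window holds four samples. The function names and the labelling rule below are hypothetical: in the study the responder classes emerge from the iModulons found by ICA, whereas this heuristic only labels the shape of a per-window activity profile, and it assumes at least four windows per induction.

```python
def split_into_windows(trace, sample_minutes=10, window_minutes=40):
    """Cut a trace into consecutive, non-overlapping time windows.

    Trailing samples that do not fill a whole window are dropped.
    """
    n = window_minutes // sample_minutes  # samples per window (4 by default)
    return [trace[i:i + n] for i in range(0, len(trace) - n + 1, n)]


def window_means(trace, sample_minutes=10, window_minutes=40):
    """Mean activity in each window."""
    windows = split_into_windows(trace, sample_minutes, window_minutes)
    return [sum(w) / len(w) for w in windows]


def classify_profile(means):
    """Label a per-window activity profile (illustrative heuristic only)."""
    if len(means) < 4:
        raise ValueError("need at least four windows")
    peak = means.index(max(means))
    last = len(means) - 1
    # True when activity strictly increases from the first window up to the peak.
    rising_to_peak = all(
        b > a for a, b in zip(means[:peak], means[1:peak + 1])
    )
    if peak == 0:
        return "fast"              # most active at the start
    if peak == last:
        return "steady" if rising_to_peak else "unclassified"
    if peak == last - 1 and rising_to_peak:
        return "partially steady"  # rises, then drops in the last window
    return "intermediate"          # most active in the middle
```

For example, a profile whose window means keep increasing would be labelled "steady", while one that increases and then falls only in the final window would be labelled "partially steady".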
Lastly, this study expands our understanding of the recovery process of E. coli following the removal of stressors such as heavy metals. Our findings indicate a marked shift in the cellular functions of enriched promoters during and after heavy metal induction. We quantified a transition from the activation of promoters associated with stress defense mechanisms and detoxification processes to the activation of promoters involved in ribosome biosynthesis, tRNA synthesis, mRNA processing and decay, amino acid biosynthesis, and replication.
While the library of 1805 E. coli promoters employed in this study represents the most comprehensive collection currently available and allows for substantial insights into gene expression dynamics, we acknowledge that it encompasses less than half of the total known promoters in E. coli. This limitation means that our analysis might not capture all of the regulatory mechanisms at play, and we advocate for future studies to incorporate a larger promoter library to offer a more complete view of E. coli gene expression.