Genetically diverse uropathogenic Escherichia coli adopt a common transcriptional program in patients with UTIs

  1. Anna Sintsova
  2. Arwen E Frick-Cheng
  3. Sara Smith
  4. Ali Pirani
  5. Sargurunathan Subashchandrabose
  6. Evan S Snitkin
  7. Harry Mobley (corresponding author)
  1. University of Michigan, United States
  2. Texas A&M University, United States

Abstract

Uropathogenic Escherichia coli (UPEC) is the major causative agent of uncomplicated urinary tract infections (UTIs). A common virulence genotype of UPEC strains responsible for UTIs is yet to be defined, due to the large variation of virulence factors observed in UPEC strains. We hypothesized that studying UPEC functional responses in patients might reveal universal UPEC features that enable pathogenesis. Here we identify a transcriptional program shared by genetically diverse UPEC strains isolated from 14 patients during uncomplicated UTIs. Strikingly, this in vivo gene expression program is marked by upregulation of translational machinery, providing a mechanism for the rapid growth within the host. Our analysis indicates that switching to a more specialized catabolism and scavenging lifestyle in the host allows for the increased translational output. Our study identifies a common transcriptional program underlying UTIs and illuminates the molecular underpinnings that likely facilitate the fast growth rate of UPEC in infected patients.

https://doi.org/10.7554/eLife.49748.001

Introduction

Urinary tract infections (UTIs) are among the most common bacterial infections in humans, affecting 150 million people each year worldwide (Flores-Mireles et al., 2015). A high incidence of recurrence and frequent progression to a chronic condition exacerbate the negative impact of UTIs on patients’ quality of life and healthcare costs (Foxman, 2010). Despite the magnitude of the problem, treatment remains limited by a strain’s susceptibility to available antibiotics, which are often ineffectual (Albert et al., 2004; Nickel, 2005; Sihra et al., 2018).

The major causative agent of uncomplicated UTIs is uropathogenic Escherichia coli (UPEC), which is responsible for upwards of 70% of all cases (Flores-Mireles et al., 2015). The majority of our insights into UPEC pathogenesis have been obtained through in vitro assays, cell culture systems, and animal models (Alteri et al., 2009; Alteri and Mobley, 2015; Sivick and Mobley, 2010; Subashchandrabose and Mobley, 2015). While these studies have identified virulence and fitness factors that are important for UPEC infection, how their findings translate to human infection is not clear. As a result, we do not yet have a complete understanding of UPEC physiology in the human urinary tract. Moreover, the genetic heterogeneity of UPEC isolates, which carry diverse and functionally redundant virulence systems including iron acquisition, adherence, and toxins, further complicates our understanding of uropathogenesis (Johnson et al., 1998; Johnson et al., 2001; Köhler and Dobrindt, 2011; Schreiber et al., 2017; Takahashi et al., 2006). The different constellations of virulence factors and diverse genetic backgrounds raise the question of whether different UPEC strains vary in their strategies for pathogenesis.

Since defining conserved UPEC characteristics has proven elusive to comparative genomics strategies, we hypothesized that comparing functional responses in the context of the host may uncover disease-defining features. To that end, we examined UPEC gene expression directly from 14 patients with documented significant bacteriuria and presenting with uncomplicated UTI and compared it to the gene expression of the identical strains cultured to mid-exponential phase in filter-sterilized pooled human urine. Despite the genetic diversity of the pathogen and the human hosts, we identified a remarkably conserved gene expression program that is specific to human infection and strongly supports previous findings of extremely rapid UPEC growth rate during UTI (Bielecki et al., 2014; Burnham et al., 2018; Forsyth et al., 2018). Importantly, we show that this transcriptional program is recapitulated in the mouse model of infection and propose a mechanism by which the fast growth rate can be achieved. Based on extensive analysis, we propose a model where UPEC shut down all non-essential metabolic processes and commit all available resources to rapid growth during human UTI. Critically, our discovery of a common transcriptional program of UPEC in patients significantly expands our understanding of bacterial adaptation to the human host and provides a platform to design universal therapeutic strategies.

Results

Study design

To better understand UPEC functional responses to the human host, we isolated and sequenced RNA from the urine (stabilized immediately after collection) of fourteen otherwise healthy women diagnosed with UPEC-associated urinary tract infection. To identify infection-specific responses, we cultured the same fourteen UPEC isolates in vitro in filter-sterilized human urine (mid-exponential phase, 2 hr time point in Figure 1—figure supplement 1), and isolated and sequenced RNA from these cultures (study design and quality control are described in detail in the Materials and methods section). Phylogenetic analysis showed a high degree of genetic diversity, as we identified strains belonging to three distinct phylogroups, 13 different sequence types, and 13 distinct serogroups (Figure 1—figure supplement 2, Table 1, Table 2). The majority of UPEC isolates (10 of 14) belonged to the B2 phylogroup, which is consistent with previously published studies (Foxman, 2010; Schreiber et al., 2017). Although the majority of patients (10 of 14) had a previous history of UTIs, we found no relationship between patients’ previous UTI history and bacterial genotype (Figure 1—figure supplement 2). Moreover, the 14 clinical isolates showed a wide array of antibiotic resistance phenotypes (Figure 1—figure supplement 2).

Table 1
Sequence type for 14 clinical UPEC isolates
https://doi.org/10.7554/eLife.49748.002
Strain  Sequence type  adk  fumC  gyrB  icd  mdh  purA  recA
HM01    69             21   35    27    6    5    5     4
HM03    101            43   41    15    18   11   7     6
HM06    131            53   40    47    13   36   28    29
HM07    641*           9    6     33*   131  24   8     7
HM14    Novel          6    4     4     16   24   13    14
HM17    73             36   24    9     13   17   11    25
HM43    Novel*         40*  14    19    36   17   10    203
HM54    404*           14*  14    10    14   17   7     74
HM56    538            13   40    19    13   36   28    30
HM57    73             36   24    9     13   17   11    25
HM60    648            92   4     87    96   70   58    2
HM66    80             13   24    19    14   23   1     10
HM68    998            13   52    156   14   17   25    17
HM86    127            13   14    19    36   23   11    10
Table 2
In silico determined serotypes for 14 clinical UPEC strains
https://doi.org/10.7554/eLife.49748.003
Strain  H_type  O_type
HM01    H4      O25
HM03    H21     NA
HM06    H4      O25
HM07    H45     O45
HM14    H10     O8
HM17    H1      O6
HM43    H23     NA
HM54    H5      O75
HM56    H4      O13/O135
HM57    H1      O2/O50
HM60    H10     O102
HM66    H7      O7
HM68    H6      O2/O50
HM86    H31     O6

Virulence factor expression is observed both during urine culture and human infection

We first assessed the virulence genotype of the fourteen UPEC strains by looking at the presence or absence of a comprehensive list of known virulence factors, including adhesins, toxins, iron acquisition proteins, and flagella (Johnson et al., 2001; Johnson and Stell, 2000; Köhler and Dobrindt, 2011; Schreiber et al., 2017; Subashchandrabose and Mobley, 2015; Tarchouna et al., 2013) (Figure 1A). As previously reported (Schreiber et al., 2017), B1 strains appear to carry fewer virulence factors overall when compared to B2 strains, suggesting that UTIs can be established by UPEC strains with vastly diverse virulence genotypes. We then compared the levels of gene expression of these virulence factors following culture in filter-sterilized urine (Figure 1B, Figure 1—figure supplement 3) to that during infection. As expected, we detected expression of genes involved in iron acquisition during both in vitro urine culture and human UTI (Figure 1B). However, we also observed high strain-to-strain variability in gene expression, especially for hma, iutA, iucC and fyuA, which is consistent with previous reports (Subashchandrabose et al., 2014).

Figure 1 with 5 supplements
Clinical UPEC isolates carry a highly variable set of virulence factors.

Phenotypic and genotypic information about the strains can be found in Figure 1—figure supplement 1, Figure 1—figure supplement 2, Table 1, and Table 2. (A) Clinical UPEC isolates were examined for the presence of 40 virulence factors. Virulence factors were identified based on homology using BLAST searches (≥80% identity, ≥90% coverage). The heatmap shows presence (black) or absence (white) of virulence factors across 14 UPEC strains. Hierarchical clustering based on presence/absence of virulence factors shows separate clustering of B1 isolates. (B) Log2 TPM for iron acquisition genes (top panel) and adhesins (bottom panel) in urine and patient samples. Gene expression of other virulence factors is shown in Figure 1—figure supplement 3. Correlations of virulence factor expression among in vitro and patient samples are shown in Figure 1—figure supplement 4. (C) Log2 TPM of fim (top panel) and flg (bottom panel) operons across the 14 UPEC strains during in vitro urine culture and human UTI.

https://doi.org/10.7554/eLife.49748.004
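The presence/absence call described in the Figure 1 legend (BLAST hits with ≥80% identity and ≥90% coverage) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `BlastHit` record layout, the definition of coverage as aligned length divided by query length, and all numeric values in the example are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """One BLAST hit of a virulence gene query against a strain genome (hypothetical layout)."""
    strain: str
    gene: str
    percent_identity: float  # percent identical positions in the alignment
    alignment_length: int    # number of aligned query positions
    query_length: int        # full length of the virulence gene query


def is_present(hit, min_identity=80.0, min_coverage=90.0):
    """Apply the identity and coverage cutoffs quoted in the Figure 1 legend."""
    # Assumption: coverage is the aligned fraction of the query, in percent.
    coverage = 100.0 * hit.alignment_length / hit.query_length
    return hit.percent_identity >= min_identity and coverage >= min_coverage


def presence_matrix(hits, strains, genes):
    """Return {strain: {gene: bool}}; True when at least one hit passes both cutoffs."""
    matrix = {s: {g: False for g in genes} for s in strains}
    for hit in hits:
        if hit.strain in matrix and hit.gene in matrix[hit.strain] and is_present(hit):
            matrix[hit.strain][hit.gene] = True
    return matrix


# Invented hits, for illustration only (not values from the study).
hits = [
    BlastHit("HM01", "fyuA", 99.1, 2000, 2022),  # passes both cutoffs
    BlastHit("HM01", "hma", 85.0, 1500, 2000),   # 75% coverage, fails
    BlastHit("HM03", "fyuA", 78.0, 2022, 2022),  # identity below 80%, fails
]
matrix = presence_matrix(hits, ["HM01", "HM03"], ["fyuA", "hma"])
```

A matrix of this shape is what a presence/absence heatmap and the hierarchical clustering in panel A would take as input.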

Most of the adhesin genes were expressed at very low levels both during in vitro culture and infection, with the exception of fim genes (Figure 1B). Interestingly, we observed high variability in fim and flg operon expression between patients (Figure 1C). In the majority of the cases, we detected high levels of fim operon expression (9/14) and low levels of flg operon expression (12/14). However, in the sample collected from patient HM07, we observed high levels of both fim and flg expression, potentially indicating a mixed population of both motile and adherent bacteria present in the sample. Overall, the variability in the expression of adhesin and motility machinery might suggest different stages of infection.

Other virulence factors examined were expressed at either similar or lower levels during human UTI compared to in vitro urine cultures (Figure 1—figure supplement 3). Notably, virulence factor carriage varies greatly between UPEC strains and we did not discern any infection-specific gene expression among the virulence factors we examined (Figure 1—figure supplement 4).

The UPEC core genome exhibits a common gene expression program during clinical infection

Since patient samples contained fewer bacterial reads than in vitro controls, we first performed a rigorous quality assurance analysis, which indicated that we possessed sufficient sequencing depth for downstream analyses (Table 3, Table 4, Figure 2—figure supplement 1, Figure 2—figure supplement 2, see Materials and methods for details). Next, to perform a comprehensive comparison of gene expression between the different clinical UPEC strains, we identified a set of 2653 genes present in all 14 UPEC strains in this study as well as the reference E. coli MG1655 strain (hereafter referred to as the core genome). We then compared the gene expression correlation of the core genome to that of the accessory genome (i.e., 2219 genes that were present in at least two but not all of the clinical UPEC strains) for all 14 isolates cultured in vitro in filter-sterilized urine. As expected for bacterial strains cultured under identical conditions, we saw high correlation of gene expression between any two isolates cultured in vitro irrespective of whether these genes were part of the core or accessory genome (Figure 2A). Remarkably, we also observed a high degree of gene expression correlation for the core genome, but not the accessory genome, across all 14 patient samples (Figure 2B). This suggested that the expression of core genes is conserved during human UTI, while expression of the accessory genome might be more reflective of the specific conditions during each infection. Furthermore, the gene expression correlation within urine samples (Figure 2C, Figure 2D, median correlation 0.92, URINE:URINE), and within patient samples (Figure 2C, Figure 2D, median correlation 0.91, PATIENT:PATIENT) was considerably higher than the gene expression correlation between in vitro urine and patient samples (Figure 2C, Figure 2D, median correlation 0.73, URINE:PATIENT). The gene expression correlation between in vitro and patient samples remained low, even when we directly compared identical strains (i.e., HM56 cultured in vitro in urine vs. HM56 isolated from the patient) (Figure 2C, Figure 2D, median of 0.74, URINE:PATIENT:matched). This analysis suggested that UPEC adopt an infection-specific gene expression program that is distinct from UPEC undergoing exponential growth in urine in vitro. Finally, we independently confirmed this observation using principal component analysis (PCA), which revealed that patient samples form a tight cluster, distinct from in vitro cultures (Figure 2E), demonstrating the common transcriptional state of UPEC during human UTI.
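The within-group and between-group comparison above rests on pairwise Pearson correlations of normalized core genome expression, summarized by their median. The following Python sketch shows one way such medians could be computed; the function names and the four-gene toy profiles are hypothetical and stand in for the 2653-gene expression vectors used in the study.

```python
import math
from itertools import combinations, product
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def median_within(samples):
    """Median r over all unordered pairs of samples in one group (e.g. URINE:URINE)."""
    return median(pearson(samples[a], samples[b]) for a, b in combinations(samples, 2))


def median_between(group_a, group_b):
    """Median r over all cross-group pairs (e.g. URINE:PATIENT)."""
    return median(pearson(x, y) for x, y in product(group_a.values(), group_b.values()))


# Toy log-expression profiles over four genes (invented values).
urine = {"HM01": [1.0, 2.0, 3.0, 4.0], "HM03": [1.1, 2.1, 2.9, 4.2]}
patient = {"HM01": [4.0, 1.0, 3.0, 2.0], "HM03": [4.2, 0.9, 3.1, 2.0]}

within_urine = median_within(urine)
urine_vs_patient = median_between(urine, patient)
```

In this toy example the two urine profiles track each other closely while the cross-group pairs do not, which mirrors the direction of the reported medians (0.92 and 0.91 within groups versus 0.73 between groups) without reproducing them.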

Table 3
Summary of alignment statistics (% mapped).
https://doi.org/10.7554/eLife.49748.016
Sample    Total reads    Mapped reads    % Mapped    % Mapped to CDS    % Mapped to misc_RNA    % Mapped to rRNA    % Mapped to tRNA    % Mapped to sRNA    % Mapped to tmRNA
HM01 | UR172884191648032695.374.915.510.010.2610.25.49
HM01 | UTI18496607371704020.180.443.3600.513.422.45
HM03 | UR21354719209275419877.774.7800.369.495.21
HM03 | UTI16544044805907648.780.182.4500.862.231.35
HM06 | UR233598472284737497.878.723.9600.336.33.23
HM06 | UTI5799351947090928.176.942.6200.361.550.87
HM07 | UR213122242098047398.475.26.0200.1910.324.79
HM07 | UTI708046882097350373.714.1400.62.080.77
HM14 | UR219273022153381798.276.135.3300.159.975.16
HM14 | UTI159447621296821881.380.512.2100.462.251.5
HM17 | UR197902151936029497.877.414.2900.137.023.32
HM17 | UTI2387458518425837.774.354.1400.732.731.6
HM43 | UR185414841823982698.476.545.0300.219.074.76
HM43 | UTI5830685981385591480.382.7600.373.952.38
HM54 | UR216125812116254497.974.964.130.010.127.174.06
HM54 | UTI1800084363019983577.333.050.010.521.540.98
HM56 | UR174941351713084797.977.934.0900.097.143.56
HM56 | UTI254087551493594858.879.412.5900.581.981.17
HM57 | UR192530781896674898.577.074.8500.088.263.86
HM57 | UTI1056298169267950.971.484.200.652.631.5
HM60 | UR158980451565191698.576.354.1400.097.474.05
HM60 | UTI76149837764255170.693.7600.71.841.04
HM66 | UR171840181673606697.474.154.9300.129.535.28
HM66 | UTI25954183798590.365.412.7100.461.420.67
HM68 | UR158416391556271198.278.312.8400.146.033.67
HM68 | UTI6541393124010893.773.114.800.834.582.73
HM86 | UR150196691460634697.276.064.0900.166.993.54
HM86 | UTI10667404641379460.178.332.800.773.081.62
Table 4
Summary of alignment statistics (raw counts).
https://doi.org/10.7554/eLife.49748.017
Sample    CDS    misc_RNA    rRNA    tRNA    sRNA    tmRNA
HM01 | UR123459339079001504434351680592905367
HM01 | UTI29898891247441431913312698591056
HM03 | UR16274560999727447618119858851090263
HM03 | UTI64617811974332469006179905109081
HM06 | UR1798517490428743761601439268738927
HM06 | UTI362318112342823170157287340864
HM07 | UR1577698612622361773936321655371005391
HM07 | UTI15460608676130126814370816065
HM14 | UR163934711148443863262521461801110769
HM14 | UTI104410622864905059823291189194198
HM17 | UR1498623783064748248651358261642452
HM17 | UTI13700477622715134945027329443
HM43 | UR1396027691683621374501653607867656
HM43 | UTI65418102250032930200321597194030
HM54 | UR158639338734141662253261517844858505
HM54 | UTI4873058192289353329329732161939
HM56 | UR1334957670131378156971222601609922
HM56 | UTI118608353868455286723295607175048
HM57 | UR14617905919256157150691567276732845
HM57 | UTI662515389101360572434013929
HM60 | UR1194973164730662136011169464633959
HM60 | UTI54021528718115361140627958
HM66 | UR1240969382558351193231595303884439
HM66 | UTI52232216103661137534
HM68 | UR121870244423122222226938831571220
HM68 | UTI1755457115276161997011005265627
HM86 | UR11110009597368551234241021292517105
HM86 | UTI50238031798234649276197828103919
Figure 2 with 3 supplements
Core genome expression in patients is highly correlated.

The analysis details are described in Materials and methods, and figure supplements. (A)-(B) Histogram of Pearson correlation coefficients among all samples cultured in vitro (A) or isolated from patients (B) based either on core genome or accessory genome comparisons. Accessory genome includes genes that were found in at least two but fewer than 14 of the clinical isolates. (C) Correlations among in vitro and patient samples measured by Pearson correlation coefficient of normalized gene expression plotted according to hierarchical clustering of samples. (D) Pearson correlation coefficient among all samples cultured in vitro (URINE | URINE, median = 0.92), among all samples isolated from patients (PATIENT | PATIENT, median = 0.91), between samples cultured in urine and samples isolated from patients (URINE | PATIENT, median = 0.73), and between matching urine/patient samples (e.g., HM14 | URINE vs. HM14 | PATIENT) (URINE | PATIENT:matched, median = 0.74). (E) Principal component analysis of normalized gene expression of 14 clinical isolates in patients and in vitro urine cultures shows distinct clustering of in vitro and patient isolates.

https://doi.org/10.7554/eLife.49748.010
Figure 2—source data 1

Genes differentially expressed between B1 and B2 phylogroup strains during in vitro culture in urine.

https://doi.org/10.7554/eLife.49748.014
Figure 2—source data 2

Genes differentially expressed between B1 and B2 phylogroup strains during human UTI.

https://doi.org/10.7554/eLife.49748.015

We also performed PCA on in vitro (Figure 2—figure supplement 3A,B) and patient samples (Figure 2—figure supplement 3C,D) separately, to ascertain whether there was any discernible effect of bacterial phylogroup (Figure 2—figure supplement 3A,C) or patients’ previous history of UTI (Figure 2—figure supplement 3B,D) on gene expression. Interestingly, B1 and B2 strains did cluster separately and a number of genes were expressed differentially in B1 and B2 backgrounds (Figure 2—source data 1, Figure 2—source data 2), suggesting that variation in gene regulatory elements between phylogroups has a small but discernible role in gene expression both in vitro and during infection. However, we found that patients’ history of UTI had no effect on bacterial gene expression.

Taken together, our data indicate diverse UPEC strains adopt a specific and conserved transcriptional program for their core genes during human infection.

UPEC show increased expression of replication and translation machinery during UTI

Differential expression analysis of the infection and in vitro transcriptomes identified 492 differentially expressed genes (log2 fold change greater than 2 or less than −2, adjusted p values < 0.05) (Figure 3A, Figure 3—source data 1, Figure 3—source data 2). Interestingly, pathway analysis (Table 5) and manual curation of the differentially expressed gene list (Figure 3—source data 1) revealed that expression of ribosomal subunits (r-proteins), and enzymes involved in rRNA and tRNA modification, purine and pyrimidine metabolism, and ribosome biogenesis is significantly higher in patients compared to in vitro cultures (Figure 3B). Together with previous studies (Bielecki et al., 2014; Burnham et al., 2018; Forsyth et al., 2018), these data strongly suggest that replication rates during infection are significantly higher than during mid-exponential growth in urine in vitro.
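The cutoffs stated above (log2 fold change beyond ±2 and adjusted p value below 0.05) amount to a simple three-way classification of each gene. The Python sketch below applies them to a table of results; it assumes the fold changes and adjusted p values have already been produced by a differential expression tool, and the gene labels and numbers are invented placeholders rather than results from the study.

```python
def classify(log2_fold_change, adjusted_p, lfc_cutoff=2.0, alpha=0.05):
    """Label a gene 'up', 'down', or 'unchanged' using the cutoffs quoted in the text."""
    # Genes without an adjusted p value (None) are treated as not significant.
    if adjusted_p is None or adjusted_p >= alpha:
        return "unchanged"
    if log2_fold_change > lfc_cutoff:
        return "up"
    if log2_fold_change < -lfc_cutoff:
        return "down"
    return "unchanged"


# Placeholder results: gene label -> (log2 fold change patient vs. urine, adjusted p).
results = {
    "gene_a": (3.1, 0.001),   # strong, significant increase
    "gene_b": (-2.7, 0.010),  # strong, significant decrease
    "gene_c": (2.5, 0.200),   # large change but not significant
    "gene_d": (1.2, 0.001),   # significant but below the fold-change cutoff
}

labels = {gene: classify(lfc, padj) for gene, (lfc, padj) in results.items()}
n_up = sum(1 for label in labels.values() if label == "up")
n_down = sum(1 for label in labels.values() if label == "down")
```

A log2 fold change of 2 corresponds to a four-fold difference in expression, so both cutoffs must be met for a gene to count. The Figure 3 legend reports 149 upregulated and 343 downregulated genes under these cutoffs, which together give the 492 differentially expressed genes.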

Figure 3 with 1 supplement
Patient-associated transcriptional signature is consistent with rapid bacterial growth.

(A) The DESeq2 R package was used to compare gene expression in in vitro urine cultures to that in patients. Each UPEC strain was considered an independent replicate (n = 14). Genes were considered up-regulated (down-regulated) if the log2 fold change in expression was higher than 2 (lower than −2) (vertical lines), and the P value was < 0.05 (horizontal line). Using these cutoffs, we identified 149 upregulated genes and 343 downregulated genes. GO/pathway analysis showed that a large proportion of these genes belonged to one of four functional categories (see legend). For each category, only the genes that met the significance cutoff are shown. The sugar transporters upregulated in UTI patients are shown in the figure supplement. (B) Mean normalized expression for genes belonging to differentially expressed functional categories/pathways. The number of up- or down-regulated genes belonging to each category is indicated next to the category name.

https://doi.org/10.7554/eLife.49748.019
Figure 3—source data 1

Genes upregulated during human UTI.

https://doi.org/10.7554/eLife.49748.021
Figure 3—source data 2

Genes downregulated during human UTI.

https://doi.org/10.7554/eLife.49748.022
Table 5
GO modules differentially expressed in UTI patients.
https://doi.org/10.7554/eLife.49748.018
GO id       Annotated  Significant  Expected  P value  Term
GO:0006518  89         24           16.63     0.03134  peptide metabolic process
GO:0016052  76         36           14.2      0.00403  carbohydrate catabolic process
GO:0044262  75         29           14.01     0.0022   cellular carbohydrate metabolic process
GO:0015980  70         20           13.08     0.02632  energy derivation by oxidation of organic compounds
GO:0043043  69         19           12.89     0.04306  peptide biosynthetic process
GO:0046395  65         25           12.14     0.00556  carboxylic acid catabolic process
GO:0006412  63         18           11.77     0.03421  translation
GO:0008643  55         30           10.28     0.02488  carbohydrate transport
GO:1903825  39         12           7.29      0.04583  organic acid transmembrane transport
GO:0008033  38         13           7.1       0.0159   tRNA processing
GO:1905039  38         12           7.1       0.03786  carboxylic acid transmembrane transport
GO:0046365  38         21           7.1       0.04177  monosaccharide catabolic process
GO:0034219  37         20           6.91      0.0005   carbohydrate transmembrane transport
GO:0042710  35         11           6.54      0.04746  biofilm formation
GO:0044010  34         11           6.35      0.03879  single-species biofilm formation
GO:0006400  34         11           6.35      0.03879  tRNA modification
GO:0072329  32         15           5.98      0.02795  monocarboxylic acid catabolic process
GO:0009401  30         11           5.6       0.01501  phosphoenolpyruvate-dependent sugar phosphotransferase system
GO:0010608  29         10           5.42      0.03121  posttranscriptional regulation of gene expression
GO:0034248  26         9            4.86      0.03925  regulation of cellular amide metabolic process
GO:0006417  26         9            4.86      0.03925  regulation of translation
GO:0015749  24         13           4.48      0.03338  monosaccharide transmembrane transport
GO:0051248  23         9            4.3       0.01728  negative regulation of protein metabolic process
GO:0044275  22         11           4.11      0.04263  cellular carbohydrate catabolic process
GO:0032269  22         8            4.11      0.03829  negative regulation of cellular protein metabolic process
GO:0015807  19         7            3.55      0.04819  L-amino acid transport
GO:0017148  18         8            3.36      0.01044  negative regulation of translation
GO:0034249  18         8            3.36      0.01044  negative regulation of cellular amide metabolic process
GO:1902475  17         7            3.18      0.02607  L-alpha-amino acid transmembrane transport
GO:0009409  14         8            2.62      0.00144  response to cold
GO:0042255  14         9            2.62      0.00021  ribosome assembly
GO:0019321  14         8            2.62      0.03705  pentose metabolic process
GO:0046835  13         6            2.43      0.02143  carbohydrate phosphorylation
GO:0006526  12         8            2.24      0.00034  arginine biosynthetic process
GO:0042542  10         5            1.87      0.02449  response to hydrogen peroxide
GO:0019323  10         7            1.87      0.02539  pentose catabolic process

We also observed infection-specific downregulation of pathways involved in amino acid biosynthesis and sugar metabolism, and a general switch from expression of sugar transporters to that of amino acid transporters (Figure 3B, Figure 3—source data 2), with the exception of four sugar transporters that were expressed at higher levels in patients: ptsG, fruA, fruB, and gntU (Figure 3—figure supplement 1). Downregulation of sugar catabolism genes and upregulation of amino acid transporters suggest a metabolic switch to a more specific catabolic program as well as a scavenger lifestyle, as elaborated below.

A shift in metabolic gene expression during UTI to optimize growth potential

During our analysis, we observed that 99% of the core genome (on average 2621/2653 genes) was expressed during in vitro culture, in contrast to only 94% in patient samples (2507/2653 genes). Patient samples also contained a higher proportion of genes expressed at low levels when compared to in vitro samples (Figure 2—figure supplement 2). Moreover, we noted that the majority of differentially expressed genes were downregulated in patients (343/492 differentially expressed genes). On the other hand, 30% of all upregulated genes (48/149) were ribosomal proteins. Together, these data gave us the first indication that UPEC may undergo a global gene expression reprogramming during urinary tract infection.

Bacterial growth laws postulate that bacteria dedicate a fixed amount of cellular resources to the expression of ribosomes and metabolic machinery. As a consequence, higher growth rates are achieved by allocating resources to ribosome expression at the expense of metabolic machinery production (Basan, 2018; Basan et al., 2015; Molenaar et al., 2009; Scott et al., 2010; Scott and Hwa, 2011; You et al., 2013). However, this resource reallocation between ribosomal and metabolic gene expression has not yet been measured in vivo.

First, we wanted to determine what proportion of the total transcriptome is dedicated to core genome expression. We hypothesized that during infection transcription could shift from the core genome to the accessory genome, which is enriched for virulence factors. However, we found that approximately 50% of total reads mapped to the core genome regardless of whether the bacteria were isolated from the patients or cultured in vitro (Figure 4A). Therefore, our data indicated that a fixed proportion of cellular resources were being dedicated to expression of conserved ribosomal and metabolic machinery, regardless of external environment.

Figure 4 with 1 supplement
UPEC optimize growth potential via resource reallocation during UTI.

(A) Percentage of reads that aligned to the core genome (2653 genes) out of total mapped reads. (B) Percentage of core genome reads that mapped to r-proteins (ribosomal subunit proteins, 48 genes). (C) Percentage of core genome reads that mapped to catabolic genes (defined as genes regulated by Crp and present in the core genome, 277 genes). (D) Percentage of core genome reads that mapped to amino acid biosynthesis genes (54 genes). The equivalent analysis of the Subashchandrabose et al. (2014) dataset is shown in the figure supplement.

https://doi.org/10.7554/eLife.49748.023

We next looked at r-protein expression. Remarkably, we found that almost 25% of core genome reads mapped to r-proteins during infection, while this number was only 7% during exponential growth in urine (Figure 4B). These findings support the idea of extremely fast UPEC growth during UTI. Furthermore, this increase in r-protein expression correlated with a marked decrease in the proportion of core genome reads dedicated to the expression of catabolic genes (20% in vitro, 11% in patients, Figure 4C) and amino acid biosynthesis genes (5% in vitro, 1% in patients, Figure 4D). We then performed the same analysis on our previously published dataset (Subashchandrabose et al., 2014), and found a consistent trend of increased r-protein production, and decreased catabolic enzyme expression during human UTI (Figure 4—figure supplement 1, Table 6, Table 7). Thus, our data, which are consistent across multiple data sets, highlight a dramatic and conserved resource reallocation from metabolic gene expression to replication and translational gene expression during human UTI. We postulate that this resource reallocation is required to facilitate the rapid growth rate of UPEC in the host, which has been previously documented (Burnham et al., 2018; Forsyth et al., 2018).
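The resource allocation measure used above is the share of core genome reads that fall on a given gene set (r-proteins, catabolic genes, or amino acid biosynthesis genes). A minimal Python sketch of that calculation follows; the function name, the gene identifiers, and the read counts are hypothetical, and the toy numbers were chosen only to reproduce a 25% versus 7% contrast of the kind reported in Figure 4B.

```python
def percent_of_core_reads(counts, gene_set, core_genes):
    """Percentage of reads mapped to the core genome that fall on genes in gene_set.

    counts     -- dict mapping gene name to mapped read count for one sample
    gene_set   -- set of genes of interest (e.g. r-protein genes)
    core_genes -- set of genes defining the core genome
    """
    core_total = sum(counts.get(gene, 0) for gene in core_genes)
    if core_total == 0:
        raise ValueError("no reads mapped to the core genome")
    # Only genes that are part of the core genome contribute to the numerator.
    in_set = sum(counts.get(gene, 0) for gene in gene_set if gene in core_genes)
    return 100.0 * in_set / core_total


# Toy example with invented gene labels and counts.
core = {"rp1", "rp2", "cat1", "cat2", "other"}
r_proteins = {"rp1", "rp2"}

patient_counts = {"rp1": 150, "rp2": 100, "cat1": 60, "cat2": 50, "other": 640, "accessory1": 500}
urine_counts = {"rp1": 40, "rp2": 30, "cat1": 120, "cat2": 80, "other": 730, "accessory1": 500}

patient_pct = percent_of_core_reads(patient_counts, r_proteins, core)
urine_pct = percent_of_core_reads(urine_counts, r_proteins, core)
```

Because the denominator is restricted to core genome reads, reads assigned to accessory genes (here `accessory1`) do not affect the percentage, which keeps the measure comparable across genetically different strains.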

Table 6
Summary of alignment statistics (% mapped) for Subashchandrabose et al. (2014).
https://doi.org/10.7554/eLife.49748.025
Sample    Total    Mapped reads    % Mapped    Mapped to CDS    Mapped to misc_RNA    Mapped to rRNA    Mapped to tRNA    Mapped to tmRNA
HM46 | UR841954388144752596.742.410.0560.550.010.01
HM26 | UTI2025325210009684.9416.750.2421.240.090.16
HM46 | UTI633384181078379817.036.930.1240.30.10.1
HM27 | LB674224986506561596.52.250.0455.60.020.01
HM27 | UTI672587481830817127.229.250.1345.490.080.2
HM26 | UR622429785999453896.392.310.0860.580.010.01
HM65 | LB734513467122133896.962.53051.410.010
HM69 | LB13769075813364972797.073.490.0567.260.010.01
HM69 | UTI725092143850655953.116.520.1342.090.040.21
HM46 | LB780180267559029796.892.780.0656.90.010.01
HM27 | UR981851809468353496.432.820.03610.010.01
HM26 | LB709198966867179896.832.020.0655.740.020.01
HM65 | UR760240087355593996.752.49055.040.010
HM65 | UTI734465765969671881.286.19040.30.040
HM69 | UR671127506483431196.612.450.0452.920.010.01
Table 7
Summary of alignment statistics (raw counts) for Subashchandrabose et al. (2014).
https://doi.org/10.7554/eLife.49748.026
Sample    CDS    misc_RNA    rRNA    tRNA    tmRNA
HM46 | UR1960841369014931260473025604
HM26 | UTI16766323662126419491605
HM46 | UTI7477021294843458811028911281
HM27 | LB14636272608136173268117175088
HM27 | UTI16934482424583290041442736287
HM26 | UR1387110488473634562065325837
HM65 | LB180185803661219072631
HM69 | LB46645797188189896218138287949
HM69 | UTI251173351962162066801707081355
HM46 | LB20994934235643011663111358549
HM27 | UR26732833118557757240101528399
HM26 | LB13857663897138278745110815724
HM65 | UR182803904048661156751
HM65 | UTI3697360024059705240552
HM69 | UR1587484263223430817047377686

Increase in r-protein transcripts is an infection-specific response

Doubling time during exponential growth in urine is longer than the doubling time during exponential growth in rich media, such as LB (Plank and Harvey, 1979). Thus, we wanted to determine whether the differences between the infection-specific and in vitro transcriptomes are due to longer doubling times of UPEC cultured in urine. For that purpose, one of the clinical strains, HM43, was cultured in LB and in a new batch of filter-sterilized urine. Using the growth curves shown in Figure 5A, we estimated the doubling time of HM43 during exponential growth in LB to be approximately 33 min and the doubling time in urine to be 54 min. In addition, we sequenced RNA from 3-hour-old LB cultures, 3-hour-old urine cultures and from the urine of CBA/J mice, 48 hr after transurethral inoculation with HM43 (Table 8, Table 9).
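A doubling time can be estimated from a growth curve by fitting a straight line to log2 of the optical density against time over the exponential phase; the doubling time is the reciprocal of the slope. The Python sketch below shows this calculation under that assumption. The text does not state which method the authors used, and the optical density values here are synthetic, generated from the reported doubling times of 33 and 54 min rather than taken from Figure 5A.

```python
import math


def doubling_time(times_min, od_values):
    """Estimate doubling time (min) from exponential-phase optical density readings.

    Fits log2(OD) = intercept + slope * t by ordinary least squares;
    the doubling time is 1 / slope.
    """
    logs = [math.log2(od) for od in od_values]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_log = sum(logs) / n
    slope = (
        sum((t - mean_t) * (v - mean_log) for t, v in zip(times_min, logs))
        / sum((t - mean_t) ** 2 for t in times_min)
    )
    return 1.0 / slope


# Synthetic exponential-phase readings (not measured data).
times = [0, 30, 60, 90, 120]
od_lb = [0.05 * 2 ** (t / 33) for t in times]
od_urine = [0.05 * 2 ** (t / 54) for t in times]

lb_estimate = doubling_time(times, od_lb)
urine_estimate = doubling_time(times, od_urine)
```

With real measurements, only time points from the exponential phase should be passed in, since lag and stationary phase readings flatten the fitted slope and inflate the estimate.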

Table 8
Summary of alignment statistics (% mapped) for mouse UTI study.
https://doi.org/10.7554/eLife.49748.028
Sample    Total reads    Mapped reads    % Mapped    Mapped to CDS    Mapped to misc_RNA    Mapped to rRNA    Mapped to tRNA    Mapped to sRNA    Mapped to tmRNA
HM43 | LB | rep1639666466281394698.273.015.4900.211.036.41
HM43 | LB | rep2378339573709086398.0471.595.9100.211.636.69
HM43 | UR | rep1431799464229300697.95638.900.0619.9611.94
HM43 | UR | rep2441769524309384097.5553.6410.940.010.0327.817.9
HM43 | mouse4431453736901748.3376.722.7500.246.114
Table 9
Summary of alignment statistics (raw counts) for mouse UTI study.
https://doi.org/10.7554/eLife.49748.029
Sample    CDS    misc_RNA    rRNA    tRNA    sRNA    tmRNA
HM43 | LB | rep145862961344923232712395069292614028787
HM43 | LB | rep22655454621925392047439643120752482416
HM43 | UR | rep12664407137652812182648884396685049595
HM43 | UR | rep2231154564714597296214049119799137714978
HM43 | mouse2831120101419558994225533147467
Figure 5
Increased expression of ribosomal subunit transcripts is a host-specific response.

(A) Growth curve for the HM43 strain cultured in LB and filter-sterilized urine. (B) Percentage of HM43 core genome reads that mapped to ribosomal subunit proteins under different conditions (URINE: in vitro culture in filter-sterilized urine, LB: in vitro culture in LB, MOUSE: mice with UTI, PATIENT: human UTI). (C) Percentage of HM43 core genome reads that mapped to catabolic genes under different conditions.

https://doi.org/10.7554/eLife.49748.027

We then determined the proportion of r-protein transcripts in the HM43 transcriptomes isolated from urine and LB cultures. Consistent with our previous experiments, this proportion was very small in urine culture (4%). Interestingly, while the proportion of r-protein transcripts was approximately three times larger in LB cultures compared to urine, it was still significantly lower compared to what we observed during infection (Figure 5B). In contrast, the bacterial transcriptome during mouse infection exhibited r-protein expression that was similar to that during human infection (Figure 5B). Additionally, the proportion of the transcriptome dedicated to catabolic gene expression was highest during urine cultures and lowest during mouse and human infections, indicating a negative correlation between the expression of r-protein and sugar catabolism genes (Figure 5C). Overall, we show that exponential growth in rich medium alone cannot recapitulate the transcriptional signature observed during human infection. Taken together, our data suggest that the resource reallocation described in this study is an infection-specific response.

Environment-responsive regulators facilitate patient-specific gene expression program

We next sought to identify potential regulators involved in resource reallocation that facilitate the infection-specific UPEC gene expression program. To do so, we performed gene set enrichment analysis (GSEA) on E. coli co-regulated genes (regulons). This analysis allowed us to identify regulons enriched in differentially expressed genes. We identified 22 transcription factors whose regulon expression was statistically different between infection and in vitro cultures (Table 10). 18/22 regulons were expressed at higher levels during in vitro culture, and eight representative regulons are shown in Figure 6. Overall, we found that these regulons accounted for 50% of the differentially expressed genes that were significantly down-regulated. In contrast, only 6% of upregulated genes belonged to the four regulons that were expressed at higher levels during infection. These included genes involved in the SOS response, as well as purine synthesis (Table 10).

Table 10
GSEA results.

Gene sets found to be enriched in differentially expressed genes. For example, Lrp, Repressor indicates the gene set repressed by Lrp (data obtained from RegulonDB 9.4). Expression indicates whether regulon expression was higher in patients or during in vitro culture in urine. Regulon size: number of genes in the gene set; Matched size: number of genes found in the data set; NES: normalized enrichment score; FDR: false discovery rate.

https://doi.org/10.7554/eLife.49748.031
Regulator | Function | Expression (higher in) | Regulon size | Matched size | NES | FDR
Lrp | Repressor | Urine | 85 | 27 | 2.29079978 | 0
NarL | Repressor | Urine | 87 | 65 | 2.24435801 | 0
Lrp | Activator | Urine | 38 | 19 | 2.21269565 | 0
MetJ | Repressor | Urine | 15 | 14 | 2.12885223 | 0.00083422
Crp | Activator | Urine | 425 | 277 | 2.12150402 | 0.00066738
CsgD | Activator | Urine | 13 | 12 | 2.01197693 | 0.00250267
GadX | Activator | Urine | 23 | 15 | 1.89350304 | 0.00929563
ModE | Activator | Urine | 31 | 28 | 1.87289606 | 0.0108449
YdeO | Activator | Urine | 18 | 14 | 1.81975146 | 0.02002136
Fur | Repressor | Urine | 110 | 66 | 1.76658693 | 0.02752936
PhoP | Activator | Urine | 45 | 33 | 1.7607379 | 0.0256334
RcsB | Activator | Urine | 58 | 28 | 1.70667558 | 0.03781812
Hns | Repressor | Urine | 144 | 62 | 1.69880665 | 0.03657748
GadE | Activator | Urine | 70 | 38 | 1.69400478 | 0.03515655
RcsA | Activator | Urine | 42 | 24 | 1.68615633 | 0.03448122
NarP | Activator | Urine | 32 | 29 | 1.65675898 | 0.04045982
NarP | Repressor | Urine | 33 | 26 | 1.6406359 | 0.04279074
FhlA | Activator | Urine | 30 | 15 | 1.62536048 | 0.04514074
FliZ | Repressor | Urine | 20 | 15 | 1.60948953 | 0.04750681
LexA | Repressor | Patients | 59 | 43 | −1.696072 | 0.03586007
Cra | Repressor | Patients | 59 | 50 | −1.7121855 | 0.04267527
PurR | Repressor | Patients | 31 | 31 | −1.752299 | 0.04410253
FadR | Activator | Patients | 12 | 11 | −1.9871524 | 0.00342544
Differential regulon expression suggests role for multiple regulators in resource reallocation.

Regulon expression for 8 out of 22 regulons enriched for genes downregulated in the patients. Expression of each gene in the regulon during in vitro culture (blue) or during UTI (red) is shown along the x-axis. Histograms show proportion of genes in the regulon expressed at any given level.

https://doi.org/10.7554/eLife.49748.030

In support of our previous data, the expression of catabolic genes controlled by the Crp regulator was lower in patients compared to urine cultures. In conjunction with the previously described role for Crp in resource reallocation (You et al., 2013), our in vivo findings strongly suggest that catabolite repression plays an important role in bacterial growth rate during UTI. Interestingly, other regulators identified in this analysis (NarL, ModE, MetJ, GadE, YdeO) are known sensors of environmental cues, suggesting that the infection-specific gene expression program may be driven by additional environmental signals. Taken together, we propose a model where simultaneous sensing of multiple environmental cues in the urinary tract leads to the global down-regulation of multiple metabolic regulons during infection. The cellular resources (e.g., RNA polymerase) that are freed as a result are then allocated to the transcription of genes (for example, r-proteins), which are required to maintain rapid growth rate.

Discussion

UPEC causes one of the most prevalent bacterial infections in humans; consequently, the virulence mechanisms of UPEC infection have been well-characterized. However, while we know that these virulence strategies (e.g., iron acquisition, adhesion, immune evasion) are essential for establishing infection, UPEC strains can differ dramatically in the specific factors that are utilized. Additionally, our data indicate that the expression of virulence factors can change from patient to patient, suggesting that the need for a specific factor might vary during the course of the infection.

In this study, we set out to uncover universal bacterial features during human UTIs, regardless of the stage of the infection or patient history. To do so, we performed transcriptomic analysis on bacterial RNA isolated directly from the urine of 14 patients and compared it to the gene expression of identical strains cultured to mid-exponential phase in sterile urine. Our analysis focused on the core genome as opposed to the more commonly studied accessory genome, which contains the majority of the classical virulence factors. This allowed us to identify a remarkably conserved gene expression signature shared by all 14 UPEC strains during UTI.

Although frequently overlooked, bacterial metabolism is an essential component of bacterial pathogenesis. Since the core genome is enriched for metabolic genes, we anticipated that our study would illuminate the UPEC metabolic state during human infection. Our data revealed an infection-specific increase in ribosomal protein expression in all 14 UPEC isolates, which was suggestive of bacteria undergoing rapid growth. These data strongly support the previous findings of Bielecki et al. (2014), who found a gene expression profile consistent with rapid growth in elderly patients with UTIs. Furthermore, while we did observe increased r-protein expression in exponentially growing UPEC cultured in LB, these transcripts were dramatically more abundant in the context of infection (human and mouse). Thus, the finding that UPEC maintain a conserved gene expression program during UTI and grow faster in the host than under in vitro conditions is consistent across multiple studies and patient cohorts (Bielecki et al., 2014), and supports recent studies that have documented a very rapid UPEC growth rate measured directly in patients (Burnham et al., 2018; Forsyth et al., 2018).

Importantly, our analysis reveals how this growth rate can be achieved. We found that regardless of external environment, ~50% of total gene expression is allocated to the core genome, consisting of metabolic and replication machinery, which mediate bacterial growth potential. When the infection-specific transcriptome was compared to that of UPEC cultured to mid-exponential phase in urine, we observed that elevated levels of ribosomal transcripts correlated with decreased levels of metabolic gene expression. We propose that this reallocation of resources within the core genome drives the rapid growth rate of UPEC during infection.

This resource reallocation is equivalent to what has been described as the bacterial ‘growth law’. Based on in vitro studies, the growth law proposes that increases in ribosomal gene expression occur at the expense of a cell’s metabolic gene expression (Basan, 2018; Scott et al., 2010). Our analysis of UPEC gene expression directly from patients is consistent with this hypothesis. In addition, regulatory network analysis revealed that multiple metabolic regulons exhibit decreased transcript levels in patients, suggesting an actively regulated process. In contrast, synthesis of ribosomal RNA (rRNA) coordinates the expression of ribosomal proteins by a translational feedback regulation mechanism (Jin et al., 2012; Jin and Cabrera, 2006; Nomura et al., 1984). rRNA synthesis is proposed to be regulated by the competition for RNA polymerase between transcription of rRNA operons and that of other genes, with some studies suggesting that mid-log growing cells might require almost all RNA polymerase to be dedicated to rRNA synthesis (Jin et al., 2012; Jin and Cabrera, 2006). Thus, decreased metabolic gene expression could allow the cell to shift its allocation of RNA polymerase towards rRNA synthesis and, as a result, ribosomal protein expression. Although we cannot exclude other mechanisms, we propose that the reallocation of RNA polymerase molecules from metabolic genes to rRNA and ribosomal protein genes is a common feature adopted by diverse UPEC to promote rapid growth during UTI.

Three recent studies have attempted to characterize UPEC gene expression in patients with UTIs (Bielecki et al., 2014; Hagan et al., 2010; Subashchandrabose et al., 2014). These studies focused on the importance of virulence factor expression in specific strains and have demonstrated changes in gene expression between infection and in vitro cultures. It should be noted that all of these studies, as well as our own, were performed using bacterial RNA isolated from patient urine (that was immediately stabilized upon collection). As a result, we cannot exclude the possibility that gene expression of UPEC residing in the bladder may differ from UPEC isolated from patient urine. However, the fact remains that we and others (Bielecki et al., 2014) report that patients with different histories of UTIs all harbored a population of actively dividing bacteria in a remarkably specific metabolic state, which we have also recapitulated in a mouse model of infection in this study.

These findings raise a number of interesting questions. Firstly, how is a rapid growth rate beneficial to UPEC? For example, a rapid growth rate could be necessary to counter host defenses such as micturition or epithelial cell shedding. Additionally, how does this growth rate influence the tempo and mode of bacterial evolution, especially with regard to genomic integrity and the acquisition of antibiotic resistance? Finally, what are the external cues that launch the infection-specific transcriptional response? It has been noted previously that filtered urine lacks some proteins that are present in unfiltered urine (Greene et al., 2015); thus, it would be interesting to see whether supplementation of filtered urine with specific proteins/metabolites could recapitulate the in vivo phenotype. While our study was not designed to identify infection-specific metabolites, our regulatory network analysis suggests that multiple environmental cues might reinforce the suppression of metabolic gene expression. We suggest that identifying and targeting these environmental cues is a promising approach to limit UPEC growth during UTI and gain the upper hand on this pathogen.

Materials and methods

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM01 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM03 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM06 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM07 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM14 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM17 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM43 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM54 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM56 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM57 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM60 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM66 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM68 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM86 | This study | | Strain isolation described in Study Design section below
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM26 | (Subashchandrabose et al., 2014) | |
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM27 | (Subashchandrabose et al., 2014) | |
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM46 | (Subashchandrabose et al., 2014) | |
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM65 | (Subashchandrabose et al., 2014) | |
Strain, strain background (Escherichia coli) | Uropathogenic Escherichia coli HM69 | (Subashchandrabose et al., 2014) | |
Strain, strain background (Mus musculus) | CBA/J | | |
Commercial assay or kit | MICROBEnrich Kit | Thermo Fisher | AM1901 |
Commercial assay or kit | RNeasy kit | Qiagen | 74104 |
Commercial assay or kit | Turbo DNase kit | Thermo Fisher | AM2238 |
Commercial assay or kit | iScript cDNA synthesis kit | Bio Rad | 1708890 |
Commercial assay or kit | ScriptSeq Complete Gold Kit (Epidemiology) | Illumina | Discontinued |
Commercial assay or kit | ScriptSeq Complete Kit (Bacteria) | Illumina | Discontinued |
Commercial assay or kit | PowerUP SYBR Green Master Mix | Bio Rad | A25779 |
Commercial assay or kit | Dynabeads mRNA DIRECT Purification kit | Thermo Fisher | 61011 |
Chemical compound, drug | RNAprotect | Qiagen | 76526 |
Software, algorithm | Trimmomatic | (Bolger et al., 2014) | 0.36 |
Software, algorithm | Bowtie2 | (Langmead and Salzberg, 2012) | 2.3.4 |
Software, algorithm | samtools | (Li, 2011) | 1.5 |
Software, algorithm | HTseq | (Anders et al., 2015) | 0.9.1 |
Software, algorithm | Get_homologues | (Contreras-Moreira and Vinuesa, 2013) | 20170807 |
Software, algorithm | DESeq2 | (Love et al., 2014) | 1.22.2 |

Study design

Sample collection was previously described (Subashchandrabose et al., 2014). Briefly, a total of 86 female participants, presenting with symptoms of lower UTI at the University of Michigan Health Service Clinic in Ann Arbor, MI in 2012, were enrolled in this study. The participants were compensated with a $10 gift card to a popular retail store. Clean catch midstream urine samples from participants were immediately stabilized with two volumes of RNAprotect (Qiagen) to preserve the in vivo transcriptional profile. De-identified patient samples were assigned unique sample numbers and used in this study. Of the 86 participants, 38 were diagnosed with UPEC-associated UTIs (Subashchandrabose et al., 2014). Of these, 19 samples gave us sufficient RNA yield of satisfactory quality. Five were used for a pilot project (Subashchandrabose et al., 2014); the remaining 14 were used in this study.

Genome sequencing and assembly

The genomic DNA from clinical strains of E. coli was isolated with a CTAB/phenol-chloroform-based protocol. Library preparation and sequencing were performed on a PacBio RS system at the University of Michigan Sequencing Core. De novo assemblies were performed with the canu de novo assembler (Koren et al., 2017) with all parameters set to their defaults and the correction phase turned on. Finished genome assemblies of reference strains (MG1655, CFT073, UTI89, EC958) were downloaded from NCBI and converted to fastq reads using ArtificialFastqGenerator v1.0. Trimmomatic 0.36 (Bolger et al., 2014) was used for trimming adapter sequences. Variants were identified by (i) mapping filtered reads to the reference genome sequence CFT073 (NC_004431) using the Burrows-Wheeler short-read aligner (bwa-0.7.17) (Li and Durbin, 2009), (ii) discarding polymerase chain reaction duplicates with Picard (picard-tools-2.5.0), and (iii) calling variants with SAMtools (samtools-1.2) (Li, 2011) and bcftools (Li, 2011). Variants were filtered from raw results using GATK’s (GenomeAnalysisTK-3.3-0 [Van der Auwera et al., 2013]) VariantFiltration (QUAL > 100; MQ > 50; DP ≥ 10 reads supporting the variant; and FQ < 0.025). In addition, a custom Python script was used to filter out single-nucleotide variants that were <5 base pairs (bp) from indels. Positions that fell in the following regions were masked (substituted with N): (i) phage and repeat regions of the reference genome (identified using Phaster and Nucmer; MUMmer3.23 [Kurtz et al., 2004]), (ii) low-MQ and low-FQ regions, and (iii) base positions that did not pass the hard filters (QUAL > 100; DP ≥ 10), which were individually masked in each sample. Recombinant regions identified by Gubbins 2.3.1 (Croucher et al., 2015) were filtered out, and a maximum likelihood tree was constructed in RAxML 8.2.8 (Stamatakis, 2014) using a general time-reversible model of sequence evolution from the Gubbins-filtered alignment.
Bootstrap analysis was performed with the number of bootstrap replicates determined using the bootstrap convergence test and the autoMRE convergence criteria (-N autoMRE). Bootstrap support values were overlaid on the best scoring tree identified during rapid bootstrap analysis (-f a).
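
The indel-proximity filter mentioned above is described only as a custom script, so the following is a minimal sketch of one way such a filter could work, not the authors' implementation. The function name `filter_snvs_near_indels` is hypothetical, and each indel is assumed to be represented by a single reference coordinate (its start position); a real filter would likely also account for indel length.

```python
from bisect import bisect_left

def filter_snvs_near_indels(snv_positions, indel_positions, min_distance=5):
    """Keep SNV positions that are at least `min_distance` bp from every indel.

    Assumption: each indel is summarized by one coordinate (its start).
    """
    indels = sorted(indel_positions)
    kept = []
    for pos in snv_positions:
        i = bisect_left(indels, pos)
        # The nearest indel is either just before or at/after the insertion point.
        neighbours = indels[max(i - 1, 0): i + 1]
        if all(abs(pos - indel) >= min_distance for indel in neighbours):
            kept.append(pos)
    return kept

# Hypothetical coordinates: the SNV at 100 is 1 bp from the indel at 101 and is dropped.
print(filter_snvs_near_indels([100, 106, 200], [101]))
```

Sorting the indel coordinates once and using a binary search keeps the check cheap even for genome-scale variant lists.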

Phylogroup, MLST, and serogroup typing

Phylogroups were assigned using an in-house script based on the presence and absence of primer target sequences and typing scheme (Clermont et al., 2013). MLST schemes from pubmlst (Jolley et al., 2018) were downloaded using ARIBA’s pubmlstget tool and sequence types were determined by running ARIBA (Hunt et al., 2017) against this pubmlst database. Serogroups were determined using SerotypeFinder (Joensen et al., 2015).

Bacterial culture conditions

Human urine was pooled from four age-matched healthy female volunteers. Overnight cultures of clinical isolates were washed once in human urine, then 250 μl of overnight culture was added to 25 ml of filter-sterilized human urine and cultured statically at 37°C for 2 hours. Six milliliters of this culture were stabilized with RNAprotect (Qiagen) and used for RNA purification.

Antibiotic resistance profiling

As described in Subashchandrabose et al. (2014), identity and antibiotic resistance profiles of UPEC isolates were determined using a VITEK2 system (BioMerieux).

RNA isolation and sequencing

The RNA isolation protocol was previously described (Subashchandrabose et al., 2014). Briefly, samples were treated with proteinase K and total RNA was isolated using the Qiagen RNeasy minikit. The Turbo DNase kit (Ambion) was used to remove contaminating DNA. Bacterial content of patient samples was enriched using the MICROBEnrich kit (Ambion), which depletes RNA of eukaryotic mRNA and rRNA. Library preparation and sequencing were performed by the University of Michigan sequencing core. The ScriptSeq Complete Kit (Bacteria) library kit was used both to deplete samples of bacterial rRNA and to construct stranded cDNA libraries from the rRNA-depleted RNA (Table 3, Table 4). While the original in vitro samples submitted for sequencing were not treated with the MICROBEnrich kit, we have since performed extensive testing with two different clinical UTI strains (HM86 and HM56) to show that treatment with the kit does not affect the measured gene expression (Figure 1—figure supplement 5, Supplementary file 1). All samples were sequenced using Illumina HiSeq2500 (single end, 50 bp read length).

RT-PCR validation of MICROBEnrich-treated samples

Clinical strains HM56 and HM86 were cultured overnight in LB broth at 37°C. The next morning, the culture was spun down, and the pellet washed once with PBS. Pooled filter-sterilized human urine was then inoculated with the washed bacteria at a ratio of 1:100 and incubated with shaking at 37°C for five hours. Cultures were then treated with bacterial RNAprotect (Qiagen), and pellets were collected and stored at −80°C. The bacterial pellets were treated with both lysozyme and proteinase K, and then total RNA was extracted using the RNeasy kit (Qiagen). Genomic DNA was removed using the Turbo DNA free kit (ThermoFisher). The extracted RNA was then halved. One half was treated using the MICROBEnrich kit (ThermoFisher), which should only remove eukaryotic mRNA and eukaryotic rRNA. The second half of the RNA remained untreated. Both the MICROBEnrich-treated and untreated samples were reverse-transcribed into cDNA using the iScript cDNA synthesis kit (Biorad), with 1 μg RNA as template. Real-time quantitative reverse transcription PCR (qRT-PCR) was performed in a QuantStudio 3 PCR system (Applied Biosystems) in technical triplicate, using SYBR green (ThermoFisher). Samples were normalized to gapA transcript levels by subtracting the Ct values of gapA from the Ct values of monitored genes. This value is reported as ΔCt.
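
As an illustration of the ΔCt normalization described above, the sketch below subtracts the gapA Ct from the Ct of a monitored gene. The function name `delta_ct` and all Ct values are hypothetical, and averaging the technical triplicates before subtraction is an assumption, since the text does not state how replicates were combined.

```python
from statistics import mean

def delta_ct(target_cts, reference_cts):
    """ΔCt = Ct of the monitored gene minus Ct of the reference gene (gapA).

    Assumption: technical replicates are averaged before subtraction.
    """
    return mean(target_cts) - mean(reference_cts)

# Hypothetical Ct values from technical triplicates, for illustration only.
treated = delta_ct([24.1, 24.3, 24.2], [18.0, 18.1, 17.9])
untreated = delta_ct([24.0, 24.2, 24.1], [17.9, 18.0, 18.1])
print(f"MICROBEnrich-treated ΔCt: {treated:.2f}; untreated ΔCt: {untreated:.2f}")
```

Similar ΔCt values for the treated and untreated halves of the same RNA would be consistent with the kit having no effect on the measured expression of that gene.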

Characterization of virulence factors’ gene expression

We compiled a literature search-based list of virulence factors belonging to different functional groups. Sequences for each virulence factor gene were extracted from reference UPEC genomes (either CFT073 or UTI89). Presence or absence of each virulence factor within clinical genomes was determined using BLAST (with percent identity ≥80%, percent coverage ≥90%, and e-value ≤10⁻⁶). Hierarchical clustering of strains based on presence or absence of virulence factors was performed using Python’s scipy.cluster.hierarchy.linkage function with default parameters. Heatmaps of virulence factors’ gene expression in urine and in patients show normalized transcripts per million (TPMs) (same as for correlation analysis and PCA, see below).
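
The presence/absence call described above can be sketched as a threshold check on parsed BLAST hits. This is an illustrative sketch only: the dictionary keys (`gene`, `pident`, `align_len`, `query_len`, `evalue`), the function names, and the definition of coverage as alignment length divided by query length are all assumptions rather than details taken from the study.

```python
def is_present(hit, min_identity=80.0, min_coverage=90.0, max_evalue=1e-6):
    """Return True if one BLAST hit meets all three presence thresholds."""
    # Assumed coverage definition: aligned length as a percentage of query length.
    coverage = 100.0 * hit["align_len"] / hit["query_len"]
    return (hit["pident"] >= min_identity
            and coverage >= min_coverage
            and hit["evalue"] <= max_evalue)

def presence_matrix(hits_by_strain, genes):
    """Map each strain to a 0/1 vector over `genes` (1 = virulence gene present)."""
    matrix = {}
    for strain, hits in hits_by_strain.items():
        found = {h["gene"] for h in hits if is_present(h)}
        matrix[strain] = [1 if g in found else 0 for g in genes]
    return matrix

# Hypothetical hits for a single strain.
hits = {"HM01": [
    {"gene": "fimH", "pident": 95.0, "align_len": 900, "query_len": 903, "evalue": 1e-50},
    {"gene": "hlyA", "pident": 75.0, "align_len": 3000, "query_len": 3075, "evalue": 1e-30},
]}
print(presence_matrix(hits, ["fimH", "hlyA"]))
```

The rows of such a binary matrix are the kind of input that could then be passed to a hierarchical clustering routine.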

RNAseq data processing

A custom bioinformatics pipeline was used for the analysis (Sintsova, 2019; copy archived at https://github.com/elifesciences-publications/rnaseq_analysis). Raw fastq files were processed with Trimmomatic (Bolger et al., 2014) to remove adapter sequences and analyzed with FastQC to assess sequencing quality. Mapping was done with bowtie2 aligner (Langmead and Salzberg, 2012) using default parameters. Alignment details can be found in Table 3 and Table 4. Read counts were calculated using HTseq htseq-count (union mode) (Anders et al., 2015).

Quality control

Since some of our clinical samples yielded lower numbers of bacterial reads than desired (Table 3), we performed a comprehensive quality assurance to determine if the sequencing depth of our clinical samples was sufficient for our analysis (see Saturation curves and Gene expression ranges analysis below, Figure 2—figure supplement 1, Figure 2—figure supplement 2). Overall, all patient samples except for HM66 passed quality control (see gene expression ranges analysis, Figure 2—figure supplement 2). While we elected to keep all of the strains in our subsequent analysis, this observation explains why the patient HM66 sample appears as an outlier in Figure 2.

Saturation curves

We created saturation curves for each of our sequencing files to assess whether we had sufficient sequencing depth for downstream analysis. Each sequencing file was subsampled to various degrees, and the number of genes detected in each subsample (y-axis) was plotted against the number of reads in the subsample (x-axis). As expected, all of the in vitro samples reached saturation (Figure 2—figure supplement 1, blue lines). Unfortunately, 6 of our 14 clinical samples did not reach saturation (Figure 2—figure supplement 1, red lines), which prompted us to investigate further (see Gene expression ranges analysis). Additionally, dropping the six samples that did not reach saturation from our analysis did not affect any of the results.
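
A saturation curve of this kind can be sketched by randomly subsampling mapped reads and counting how many distinct genes are hit. The sketch below is a simplified illustration under stated assumptions: the input is a list with one gene identifier per mapped read, a gene counts as "detected" if at least one read maps to it, and the function name `saturation_curve` and the subsampling fractions are hypothetical.

```python
import random

def saturation_curve(read_gene_ids, fractions=(0.1, 0.25, 0.5, 0.75, 1.0), seed=0):
    """Return (n_reads, n_genes_detected) pairs for random subsamples of the reads."""
    rng = random.Random(seed)  # fixed seed so the curve is reproducible
    points = []
    for frac in fractions:
        n = int(len(read_gene_ids) * frac)
        subsample = rng.sample(read_gene_ids, n)  # sampling without replacement
        points.append((n, len(set(subsample))))
    return points

# Toy input: 100 reads spread unevenly over three genes.
reads = ["geneA"] * 50 + ["geneB"] * 30 + ["geneC"] * 20
for n_reads, n_genes in saturation_curve(reads):
    print(n_reads, n_genes)
```

A sample has reached saturation when adding more reads no longer increases the number of detected genes, that is, when the curve plateaus.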

Core genome identification

The core genome for the 14 clinical isolates and MG1655 was determined using get_homologues (Contreras-Moreira and Vinuesa, 2013). We explored multiple parameter values and their effect on the final core genome; in the end, we set cutoffs of 90% sequence identity and 50% sequence coverage (similar results were obtained when using different cutoffs). The intersection of the three algorithms employed by get_homologues contained 2653 gene clusters.

Gene expression ranges analysis

Because six of our clinical samples had low sequencing depth, we were concerned that we would not be able to detect genes expressed at low levels in those samples. To evaluate whether we were losing information about low-level expression, we compared the number of genes in the core genome that were expressed at different levels (1000 TPM, 100 TPM, 10 TPM and 1 TPM) between clinical samples that reached saturation (Figure 2—figure supplement 2A) and those that did not (Figure 2—figure supplement 2B). Only one of the clinical samples (HM66) seemed to lack genes expressed in the range of 1–10 TPM. Thus, we conclude that all but one sample (HM66) had sufficient coverage for downstream analysis.
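
One way to make this comparison concrete is to count core genome genes per order-of-magnitude TPM range for each sample. The sketch below assumes half-open ranges starting at 1, 10, 100 and 1000 TPM; the function name `count_genes_by_tpm_range` and the example values are hypothetical, and the exact binning used in the study is not specified here.

```python
def count_genes_by_tpm_range(tpms, edges=(1, 10, 100, 1000)):
    """Count genes per TPM range: [1,10), [10,100), [100,1000) and >=1000."""
    labels = [f"{lo}-{hi}" for lo, hi in zip(edges, edges[1:])] + [f">={edges[-1]}"]
    counts = dict.fromkeys(labels, 0)
    for tpm in tpms:
        # Walk from the highest edge down and stop at the first range the value reaches.
        for i in range(len(edges) - 1, -1, -1):
            if tpm >= edges[i]:
                counts[labels[i]] += 1
                break
    return counts

# Hypothetical TPM values for six genes; the value below 1 TPM is not counted.
print(count_genes_by_tpm_range([0.5, 2, 9.9, 10, 150, 5000]))
```

A sample with an empty or nearly empty lowest range, compared with samples that reached saturation, would indicate that lowly expressed genes are being missed.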

Pearson correlation coefficient calculation and PCA analysis

For PCA and correlation analysis, transcripts per million (TPM) were calculated for each gene, and the TPM distribution was then normalized using inverse rank transformation. Pearson correlation and PCA were performed using the Python sklearn library. Jupyter notebooks used to generate the figures are available at https://github.com/ASintsova/HUTI-RNAseq.
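
The two normalization steps above can be sketched as follows. The TPM calculation follows the standard definition. For the second step, the sketch assumes that "inverse rank transformation" refers to a rank-based inverse normal transform; the 0.5 rank offset, the tie handling, and the function names are illustrative assumptions rather than details from the authors' notebooks.

```python
from statistics import NormalDist

def tpm(counts, lengths_bp):
    """Transcripts per million from raw read counts and gene lengths (bp)."""
    rates = [c / l for c, l in zip(counts, lengths_bp)]  # reads per base
    total = sum(rates)
    return [r / total * 1e6 for r in rates]

def inverse_rank_normalize(values):
    """Rank-based inverse normal transform (ties broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Map rank to a quantile in (0, 1), then to a standard normal score.
        out[i] = NormalDist().inv_cdf((rank - 0.5) / n)
    return out

# Hypothetical counts and lengths for three genes.
values = tpm([10, 200, 30], [1000, 2000, 1500])
print(values)
print(inverse_rank_normalize(values))
```

Because the transform depends only on ranks, it makes samples with very different sequencing depths comparable before correlation or PCA.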

Differential expression analysis

Differential expression analysis was performed using the DESeq2 R package (Love et al., 2014). Genes with a log2 fold change of greater than 2 or less than −2 and an adjusted p value (Benjamini-Hochberg adjustment) of less than 0.05 were considered to be differentially expressed. DESeq2 normalized counts were used to generate Figure 3 and Figure 6. Pathway analysis was performed using the R package topGO (Alexa and Rahnenfuhrer, 2018).
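
The thresholds quoted above amount to a simple filter on each row of a differential expression results table. The sketch below is written in Python for consistency with the other examples, although DESeq2 itself is an R package; the function name and argument names are hypothetical, and missing adjusted p values are treated as not significant.

```python
import math

def is_differentially_expressed(log2_fold_change, padj, lfc_cutoff=2.0, alpha=0.05):
    """Apply the |log2FC| > 2 and adjusted p < 0.05 thresholds to one result row."""
    if padj is None or math.isnan(padj):
        return False  # genes without an adjusted p value are not called
    return abs(log2_fold_change) > lfc_cutoff and padj < alpha

# Hypothetical rows: (gene, log2 fold change, adjusted p value).
rows = [("geneA", 3.1, 0.001), ("geneB", -2.4, 0.03), ("geneC", 1.5, 0.0001)]
print([gene for gene, lfc, padj in rows if is_differentially_expressed(lfc, padj)])
```

Both conditions are strict inequalities, so a gene with a log2 fold change of exactly 2 is not counted as differentially expressed.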

RNA sequencing of HM43 from the mouse model of UTI

Forty CBA/J mice were infected using the ascending model of UTI as previously described (Hagberg et al., 1983). Briefly, 40 six-week-old female mice were transurethrally inoculated with 10⁸ CFU of UPEC isolate HM43. At 48 hr post infection, urine was collected from each mouse directly into bacterial RNAprotect (Qiagen). All collected urine was pooled together and pelleted, and immediately placed in the −80°C freezer. This collection was repeated every 45 minutes five more times, resulting in six collected pellets consisting of bacterial and eukaryotic cells.

For in vitro controls, UPEC strain HM43 was cultured overnight in LB. The next morning, the culture was spun down, and the pellet washed twice with PBS. LB or pooled human urine was then inoculated with the washed bacteria at a ratio of 1:100 and incubated with shaking at 37°C for 3 hr. Cultures were then treated with bacterial RNAprotect (Qiagen), pellets collected and stored at −80°C.

All the pellets were treated with both lysozyme and proteinase K, and then total RNA was extracted using the RNeasy kit (Qiagen). Genomic DNA was removed using the Turbo DNA free kit (ThermoFisher). Eukaryotic mRNA was depleted using Dynabeads covalently linked with oligo dT (ThermoFisher). The in vitro samples underwent the same treatment with Dynabeads to reduce any potential biases this procedure might introduce to the downstream sequencing. The supernatant was collected from this treatment, and the RNA was concentrated and re-purified using the RNA Clean and Concentrator kit (Zymo). Library preparation and sequencing were performed by the University of Michigan sequencing core. The ScriptSeq Complete Gold Kit (Epidemiology) library kit was used both to deplete samples of bacterial and eukaryotic rRNA and to construct stranded cDNA libraries from the rRNA-depleted RNA. These were sequenced using Illumina HiSeq2500 (single end, 50 bp read length). RNAseq analysis was performed as described above; alignment statistics are shown in Table 8 and Table 9.

Analysis of RNAseq data from Subashchandrabose et al. (2014)

Sample collection and RNA isolation are described in Subashchandrabose et al. (2014). Briefly, RNA samples were treated with proteinase K and total RNA was isolated using the Qiagen RNeasy minikit. The Turbo DNase kit (Ambion) was used to remove contaminating DNA. Bacterial content of patient samples was enriched using the MICROBEnrich kit (Ambion). The depleted RNA was used to generate sequencing libraries using the Ovation Prokaryotic RNA-Seq system (NuGen) and the Encore next-generation sequencing library system (NuGen). The libraries were sequenced using an Illumina HiSeq2000 (paired-end, 100 bp) by the Genome Resource Center at the Institute for Genome Sciences, University of Maryland, Baltimore, MD. RNAseq analysis was performed as described above; alignment statistics are shown in Table 6 and Table 7.

Estimation of HM43 doubling time

For both LB and urine, OD curves were measured using the Bioscreen-C Automated Growth Curve Analysis System (Growth Curves USA) eight separate times. For each time point, the mean value of the eight replicates was used for doubling time estimation. The equation below was used to estimate doubling time during logarithmic growth in LB or urine, where DT is doubling time, C2 is the final OD, C1 is the initial OD, and t is the time elapsed between the C1 and C2 measurements.

$$DT = \frac{t \cdot \log(2)}{\log(C_2) - \log(C_1)}$$

DT was calculated for every two measurements taken between 30 and 180 min, and the mean of these values is reported.
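
As a worked illustration of the doubling time estimate, DT = t·log(2) / (log(C2) − log(C1)), the sketch below applies the formula to pairs of OD readings and averages the results within the 30 to 180 min window. Interpreting "every two measurements" as consecutive pairs is an assumption, and the function names and OD values are hypothetical.

```python
import math

def doubling_time(c1, c2, elapsed_min):
    """DT = t * log(2) / (log(C2) - log(C1)) for initial OD c1 and final OD c2."""
    return elapsed_min * math.log(2) / (math.log(c2) - math.log(c1))

def mean_doubling_time(times_min, ods, start=30, end=180):
    """Average DT over consecutive measurement pairs taken within [start, end] min."""
    window = [(t, od) for t, od in zip(times_min, ods) if start <= t <= end]
    dts = [doubling_time(od1, od2, t2 - t1)
           for (t1, od1), (t2, od2) in zip(window, window[1:])]
    return sum(dts) / len(dts)

# Hypothetical mean OD readings: OD doubles every 30 min in this toy series.
times = [0, 30, 60, 90, 120]
ods = [0.05, 0.1, 0.2, 0.4, 0.8]
print(mean_doubling_time(times, ods))
```

Because the formula uses a ratio of logarithms, the base of the logarithm does not affect the result as long as the same base is used throughout.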

Regulon analysis

Regulon gene sets were extracted from RegulonDB 9.4 (Gama-Castro et al., 2016) using custom Python scripts (available at https://github.com/ASintsova/HUTI-RNAseq). Gene set enrichment analysis was performed using the Python GSEAPY library.

Data access

Jupyter notebooks, as well as all the data used to generate the figures in this paper, are available on GitHub: https://github.com/ASintsova/HUTI-RNAseq.

Data availability

Sequencing data have been deposited in GEO under accession code GSE128997.

The following data sets were generated
    Sintsova A, Frick-Cheng A, Smith S, Pirani A, Snitkin E, Mobley H (2019) NCBI Gene Expression Omnibus, ID GSE128997. Genetically diverse uropathogenic Escherichia coli adopt a common transcriptional program in patients with urinary tract infections.
The following previously published data sets were used

References

    Alexa A, Rahnenfuhrer J (2018) topGO: Enrichment Analysis for Gene Ontology. R package version 2.34.0.
    Hagberg L, Engberg I, Freter R, Lam J, Olling S, Svanborg Edén C (1983) Ascending, unobstructed urinary tract infection in mice caused by pyelonephritogenic Escherichia coli of human origin. Infection and Immunity 40:273–283.
    Johnson DE, Lockatell CV, Russell RG, Hebel JR, Island MD, Stapleton A, Stamm WE, Warren JW (1998) Comparison of Escherichia coli strains recovered from human cystitis and pyelonephritis infections in transurethrally challenged mice. Infection and Immunity 66:3059–3065.
    Nickel JC (2005) Practical management of recurrent urinary tract infections in premenopausal women. Reviews in Urology 7:11–17.

Article and author information

Author details

  1. Anna Sintsova

    Department of Microbiology and Immunology, University of Michigan, Ann Arbor, United States
    Contribution
    Conceptualization, Data curation, Formal analysis, Validation, Investigation, Visualization, Writing—original draft
    Competing interests
    No competing interests declared
    ORCID: 0000-0003-4075-6366
  2. Arwen E Frick-Cheng

    Department of Microbiology and Immunology, University of Michigan, Ann Arbor, United States
    Contribution
    Validation, Investigation, Visualization, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-7202-4701
  3. Sara Smith

    Department of Microbiology and Immunology, University of Michigan, Ann Arbor, United States
    Contribution
    Validation, Investigation
    Competing interests
    No competing interests declared
  4. Ali Pirani

    Department of Microbiology and Immunology, University of Michigan, Ann Arbor, United States
    Contribution
    Data curation, Formal analysis, Validation
    Competing interests
    No competing interests declared
  5. Sargurunathan Subashchandrabose

    Department of Veterinary Pathobiology, Texas A&M University, College Station, United States
    Contribution
    Investigation, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  6. Evan S Snitkin

    Department of Microbiology and Immunology, University of Michigan, Ann Arbor, United States
    Contribution
    Conceptualization, Formal analysis, Supervision, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  7. Harry Mobley

    Department of Microbiology and Immunology, University of Michigan, Ann Arbor, United States
    Contribution
    Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Writing—review and editing
    For correspondence
    hmobley@med.umich.edu
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-9195-7665

Funding

National Institute for Health Research (R01 DK094777)

  • Anna Sintsova
  • Arwen E Frick-Cheng
  • Sara Smith
  • Ali Pirani
  • Sargurunathan Subashchandrabose
  • Evan S Snitkin
  • Harry Mobley

American Urological Association Foundation (Research Scholar Fellow)

  • Anna Sintsova

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Ethics

Human subjects: All procedures involving human samples were performed in accordance with the protocol (HUM00029910) approved by the Institutional Review Board at the University of Michigan. This protocol is compliant with the guidelines established by the National Institutes of Health for research using samples derived from human subjects.

Animal experimentation: Mouse infection experiments were conducted according to the protocol PRO00007111 approved by the University Committee on Use and Care of Animals at the University of Michigan. This protocol is in complete compliance with the guidelines for humane use and care of laboratory animals established by the National Institutes of Health.

Version history

  1. Received: June 27, 2019
  2. Accepted: October 4, 2019
  3. Version of Record published: October 21, 2019 (version 1)

Copyright

© 2019, Sintsova et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Anna Sintsova
  2. Arwen E Frick-Cheng
  3. Sara Smith
  4. Ali Pirani
  5. Sargurunathan Subashchandrabose
  6. Evan S Snitkin
  7. Harry Mobley
(2019)
Genetically diverse uropathogenic Escherichia coli adopt a common transcriptional program in patients with UTIs
eLife 8:e49748.
https://doi.org/10.7554/eLife.49748
