Characterization of the aggregates formed during recombinant protein expression in bacteria
© Schrödel and de Marco. 2005
Received: 10 February 2005
Accepted: 31 May 2005
Published: 31 May 2005
The first aim of the work was to analyze in detail the complexity of the aggregates formed upon overexpression of recombinant proteins in E. coli. A sucrose step gradient succeeded in separating aggregate subclasses of a GFP-GST fusion protein with specific biochemical and biophysical features, providing a novel approach for studying recombinant protein aggregates.
The total lysate separated into 4 different fractions whereas only the one with the lowest density was detected when the supernatant recovered after ultracentrifugation was loaded onto the sucrose gradient. The three further aggregate sub-classes were otherwise indistinctly precipitated in the pellet. The distribution of the recombinant protein among the four subclasses was strongly dependent on the DnaK availability, with larger aggregates formed in DnaK- mutants. The aggregation state of the GFP-GST recovered from each of the four fractions was further characterized by examining three independent biochemical parameters. All of them showed an increased complexity of the recombinant protein aggregates starting from the top of the sucrose gradient (lower mass aggregates) to the bottom (larger mass aggregates). These results were also confirmed by electron microscopy analysis of the macro-structure formed by the different aggregates. Large fibrils were rapidly assembled when the recombinant protein was incubated in the presence of cellular extracts, but the GFP-GST fusion purified soon after lysis failed to undergo amyloidation, indicating that other cell components probably participate in the active formation of large aggregates. Finally, we showed that aggregates of lower complexity are more efficiently disaggregated by a combination of molecular chaperones.
An additional analytical tool is now available to investigate the aggregation process and separate subclasses by their mass. It was possible to demonstrate the complexity of the aggregation pattern of a recombinant protein expressed in bacteria and to characterize biochemically the different aggregate subclasses. Furthermore, we have obtained evidence that the cellular environment plays a role in the development of the aggregates and the problem of the artifact generation of aggregates has been discussed using in vitro models. Finally, the possibility of separating aggregate fractions with different complexities offers new options for biotechnological strategies aimed at improving the yield of folded and active recombinant proteins.
The concept of protein aggregation suggests a non-physiological process resulting in the formation of large structures, often chaotic, in which the proteins have lost their original function/activity. Nevertheless, the collapse of the native conformation can also produce very regular structures, as in the case of amyloid fibrils. Such a process can originate from sensitive protein intermediates during folding as well as from partially denatured proteins that lost their native conformation as a consequence of stress conditions.
Cells possess a sophisticated quality control system to prevent the accumulation of protein aggregates. Molecular chaperones are engaged to promote the correct (re)-folding of misfolded molecules that otherwise undergo protease degradation. Misfolded proteins escaping the quality control may form aggregates that can be trapped in precipitates (aggresomes in eukaryotic cells, inclusion bodies in bacteria) to limit their interference with the cell physiology. Inclusion bodies also have a storage function and parts of the trapped proteins are in a dynamic equilibrium with their soluble fraction. Under pathological conditions aggregates develop into structures that hinder the cell functions, as in the case of neurodegenerative diseases.
In bacteria the stress-dependent development of aggregates has been exploited to study the function of the chaperone network. Aggregation has been reversed in vivo and the identification of the chaperone combinations necessary for the re-folding of the proteins from aggregates was performed using in vitro conditions [4–7]. Nevertheless, the biophysical features of the aggregates have never been investigated. Heat shock is the most studied stress factor but recombinant protein expression can also dramatically modify the cell balance. In fact, the exploitation of highly efficient polymerases increases the rate of protein synthesis so that as much as 50% of the totally accumulated protein can be represented by the recombinant one and the cell folding machinery can become limiting. The optimization of some growth parameters, like the use of low growth temperatures and non-saturating amounts of expression inducer as well as the over-expression of chaperones by means of short heat shock, ethanol stress or recombinant co-expression [8, 9], has often improved the yields of recombinant soluble proteins. Nevertheless, in most of the cases part or all of the recombinant protein expressed in bacteria is recovered as precipitates in the inclusion bodies.
Both amorphous and organized inclusion bodies have been isolated. Their composition varies from almost homogeneous to cases in which 50% of the material is represented by contaminants [11, 12]. The structural heterogeneity of the inclusion bodies has recently been shown [13, 14] and it could be a consequence of the variable aggregation pattern that a single protein can undergo under different conditions. Proteins trapped in the inclusion bodies can be re-solubilised in vivo by impairing the de novo protein synthesis because the block of new protein production makes available larger amounts of chaperones and foldases for refolding precipitated proteins. The temporal separation between recombinant expression of chaperones and target proteins has also been successfully used to improve the yield of soluble recombinant proteins. These results suggest a model in which soluble proteins are in a dynamic equilibrium with aggregates. In conclusion, modifications of the cell conditions can modulate the aggregation rate and the protein aggregation process can be reversed by conditions favorable for the folding machinery.
This dynamic view, in which proteins can pass from the soluble to the insoluble state and back, suggests the presence of different degrees of aggregation complexity. Soluble aggregates of recombinant proteins have been described [16, 17] and in a recent paper we have shown that the GFP-GST fusion protein expressed in bacteria forms aggregates with an estimated mass ranging from a few hundred kDa to more than 1000 kDa. The separation of the aggregates using a blue native gel electrophoresis followed by SDS-PAGE indicated an almost continuous distribution with few regions of concentrated accumulation. This kind of analysis allows for precise identification of aggregate patterns and comparison among different samples but is not suitable for the further characterization of the aggregates. Therefore, we present here an alternative protocol to separate sub-classes of aggregates using a sucrose step gradient and the results concerning the biophysical organization and biochemical specificities of such aggregates.
Preliminary experiments showed that the recombinant GFP-GST produced in bacteria grown at temperatures higher than 30°C was mainly recovered in the pellet after ultracentrifugation of the lysates. Lowering the growth temperature, however, progressively increased the proportion of the fusion protein recovered in the supernatant. At 20°C roughly half of the total GFP-GST was in the supernatant (data not shown).
The recombinant protein from the four fractions was purified by metal affinity chromatography and both fluorescence and SDS-PAGE analysis indicated that the entire recombinant protein was bound and specifically eluted (data not shown). Protein amount determined by Bradford indicated that, on average, 39% of the total GFP-GST accumulated in the fraction 1, 14%, 22% and 25% in the other three, respectively, from the top to the bottom.
After ultracentrifugation of the lysate, the supernatant was loaded onto the sucrose gradient and the GFP-GST migrated exclusively to the interface between 0% and 30% sucrose (Fig. 1A, tube number 1). We knew from the preliminary experiments that bacteria grown at 30°C produced only insoluble GFP-GST. The fusion protein present in the total lysate from such bacteria was distributed almost exclusively in the fractions 3 and 4 and the fluorescence was almost undetectable (Fig. 1A, tube number 3).
The role of chaperones in limiting the protein aggregation has been widely demonstrated and DnaK has a key role in the chaperone network [4–7]. The sucrose step gradient demonstrated what kind of aggregate pattern modifications occur when the DnaK concentrations vary. No GFP-GST was recovered anymore in the upper fraction when DnaK- mutant bacteria were grown at 20°C and non-fluorescent aggregates largely accumulated in the lower fractions and even on the bottom of the tube (Fig. 1A, tube number 4). In contrast, both soluble GFP-GST and stronger fluorescence were detected after separation of a lysate from bacteria over-expressing DnaK grown at 30°C (Fig. 1A, tube number 5), suggesting that DnaK can improve the GFP-GST stability.
This first set of experiments showed the complexity of the aggregation pattern. In fact, the previously non-characterized insoluble fraction recovered in the pellet was distributed in three classes according to mass and it was possible to separate soluble and insoluble recombinant protein by means of a sucrose gradient. Noteworthy is also the fact that fluorescence can be found in all four fractions (Fig. 1A), indicating that even in the insoluble aggregates of a larger mass at least part of the trapped recombinant protein conserved a native-like structure. This is in agreement with the report that part of the protein present in the inclusion bodies conserves its secondary structure. Aggregate sub-classes with different complexity and protease resistance have previously been identified in inclusion bodies and also in that case a protein fraction was still active [13, 14, 21]. Our data confirm the structural heterogeneity of the proteins trapped in the aggregates.
Table 1. Biophysical characterization of the different aggregate fractions separated by sucrose gradient. The 4 fractions were analysed for their aggregation index, their elution profile using size exclusion chromatography (SEC) and calculating the ratio between aggregated and monodispersed protein, and their binding to the dye ThioflavinT, indicative of amyloid formation. The results refer to one experiment representative of three repetitions.
Columns: Aggregation index (Abs 280/340 nm); SEC index (monodispersed/aggregated protein); ThioflavinT (Abs 482 nm)
The 4 GFP-GST fractions were also subjected to SEC and the ratio between the areas of the peaks corresponding to the monodispersed and the aggregated protein was calculated (SEC index). Such an index confirmed an increasing state of aggregation from sucrose fraction 1 to 4 (Table 1). Surprisingly, the SEC experiments showed that both aggregated and functional forms of the fusion protein were present in the three fractions corresponding to the insoluble GFP-GST as well as in the (soluble) fraction 1. Soluble aggregates have been described before and are probably common when fusion proteins are expressed [16, 17]. It was not possible to separate monodispersed GFP-GST from soluble aggregates by means of sucrose gradients of decreasing concentrations (data not shown).
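As an illustration of the SEC index described above, the following minimal sketch integrates two regions of a chromatogram and returns the ratio of monodispersed to aggregated peak area. The function names, the trapezoidal integration and the elution windows are assumptions made for the example; they are not taken from the original protocol, and real windows would have to be chosen from the calibrated column.

```python
from typing import Sequence, Tuple


def trapezoid_area(x: Sequence[float], y: Sequence[float]) -> float:
    """Integrate y over x with the trapezoidal rule."""
    return sum(
        (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0 for i in range(len(x) - 1)
    )


def sec_index(
    volume_ml: Sequence[float],
    absorbance: Sequence[float],
    aggregate_window: Tuple[float, float],
    monomer_window: Tuple[float, float],
) -> float:
    """Return peak area (monodispersed) / peak area (aggregated).

    Both windows are (start_ml, end_ml) elution ranges. They are
    hypothetical inputs supplied by the user, not values from the paper.
    """

    def window_area(window: Tuple[float, float]) -> float:
        lo, hi = window
        points = [(v, a) for v, a in zip(volume_ml, absorbance) if lo <= v <= hi]
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        return trapezoid_area(xs, ys)

    aggregated = window_area(aggregate_window)
    monodispersed = window_area(monomer_window)
    if aggregated == 0:
        raise ValueError("no signal in the aggregate window; ratio undefined")
    return monodispersed / aggregated


if __name__ == "__main__":
    # Synthetic trace: a small early peak (aggregates) and a larger late peak.
    volume = [7, 8, 9, 10, 11, 12, 13, 14, 15]
    signal = [0, 2, 0, 0, 0, 0, 4, 0, 0]
    print(sec_index(volume, signal, (7, 9), (12, 14)))
```

With this definition a smaller value corresponds to a larger share of aggregated material in the fraction.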
We finally tried to characterize the aggregates according to their specific structure. ThioflavinT (ThT) is a dye that preferentially binds to amyloid-like fibrils. We measured an increasing binding when aggregates of higher complexity were used (Table 1). In contrast, there was no significant binding of any aggregate to 8-anilino-1-naphthalenesulfonic acid (ANSA), which has been used as a marker of amorphous aggregates. This suggests that the aggregates formed by GFP-GST probably have a regular structure involving β-sheets rather than being a chaotic complex held together by hydrophobic interactions. Instead, a micellar organization has been proposed for the soluble aggregates [17, 22].
In the case of the GFP-GST fractions we showed that the degree of amyloidation detected by ThT-binding progressively increased from fraction 1 to fraction 4 (Table 1). The capacity to form fibrils is sequence specific and it seems a generic feature of polypeptide chains. The development into fibrils is characterized by a lag phase during which the aggregation seeds are formed, followed by a period of rapid growth. Once formed, the fibrils act as aggregation seeds, speeding up the process. Therefore, it could be expected that larger aggregate networks have the possibility to develop faster into structures of higher complexity. In order to test this hypothesis, the GFP-GST from the four sucrose gradient fractions was recovered immediately after centrifugation and mounted for electron microscopy analysis.
Fibrils are the end product of GFP-GST aggregation but the different classes of aggregates separated by sucrose gradient can be considered as dynamic intermediates that can either develop into larger structures or be reversed into lower-complexity aggregates. Both the initial complexity and the incubation time of polypeptides prone to aggregation are crucial for the building of the aggregates. We wished to demonstrate the importance of these factors in a control experiment. GFP-GST was separated into fractions by sucrose gradient and fractions 1 and 4 were mounted for electron microscopy only after 24 hours of incubation in the presence of the co-migrated cell components. Both samples gave rise to similar large fibrils (Figure 2B), indicating that the incubation period was sufficient for both, independent of their initial aggregation state, to reach the rapid growth phase that leads to fibril formation.
This experiment underlines once more the importance of time as a parameter in studies dealing with aggregation and questions the meaning of some in vitro experiments. In fact, the fibril maturation outside the bacterial cell could have peculiar features. For instance, the lack of spatial constraints or limitations in the disaggregation processes could enable the formation of fibrils whose length is hardly compatible with the size of E. coli cells (Figure 2B). The experiments described in the two last paragraphs will show the impact of cell components in promoting aggregation and disaggregation.
Finally, the presence of aggregation seeds smaller than 40 nm in diameter shows that it is not possible to discriminate between soluble and aggregated fractions by the use of simplified methods in high-throughput protocols as, for instance, the exploitation of a 0.65 μm pore size filter.
Therefore, these results strongly suggest that the co-presence of other molecules is necessary to trigger the process of regular aggregation of the recombinant protein, probably by facilitating the formation of aggregation seeds. Chaperones can play a role in the aggresome formation and GroEL has been claimed to be actively involved in bacterial inclusion body formation. Our data can only confirm that GroEL co-migrates with the aggregates of larger mass (Fig. 1B). Finally, we are looking for an analytical method to determine if the process of cell lysis is crucial for the development of the aggregates.
Both in vivo and in vitro experiments illustrated the co-operative action of chaperone networks in disaggregating misfolded proteins [4–7] but the features of the real aggregates that are the target of the chaperones in the cells have never been investigated. We used the aggregates from fractions 3 and 4 to test if they could be a substrate for chaperone-dependent refolding and if the different structural complexity had a role in the refolding kinetics.
The preferential disaggregation of subclasses of aggregates with lower complexity observed in vitro is reminiscent of previous works indicating that specific subclasses of the proteins trapped in the inclusion bodies are preferentially refolded under physiological conditions [3, 13] and that the reversibility is increasingly difficult and dependent on the size of the aggregates. The limit of this experiment is that it is difficult to scale up and the small amount of the protein used was insufficient for undertaking further biophysical analysis. The aggregation index gives only relative values and, therefore, we can state that the degree of aggregation decreased but cannot conclude that the disaggregated protein was also correctly folded. Nevertheless, the results suggest that it would be of biotechnological interest to separate the aggregate subclasses and use the lower complexity aggregates in refolding protocols.
There is increasing evidence that aggregates are heterogeneous in size and complexity [2, 12–16, 26]. The aggresomes are actively built in eukaryotic cells and the physiological meaning of the process would be the packing of disorganized aggregates that could interfere with the normal cell functions by non-specifically binding to other cell components [33, 34]. The possibility to recover functional proteins from the insoluble aggregates would indicate that at least in bacteria they can function as a reserve in dynamic equilibrium with soluble fractions.
In this paper we present data supporting the idea of a progressive maturation of recombinant GFP-GST aggregates into amyloid fibrils. Furthermore, it seems that the process is facilitated by some other cell components since the fibril maturation was much slower when the recombinant protein was separated from the other cell components soon after the lysis (Fig. 3). For instance, GroEL has been reported to have an active role in inclusion body formation and could specifically co-migrate with the larger aggregates (Fig. 1B). Conversely, the combination of DnaK, DnaJ, GrpE and ClpB could disaggregate large insoluble structures (Figures 4 and 5A).
It seems that the aggregation process of recombinant proteins is considerably more complicated than normally accepted and our separation protocol turned out to be a useful tool for characterizing the aggregates. Furthermore, such an aggregation process shares many features with the maturation of pathological amyloids in eukaryotic cells and, therefore, the bacterial system (experimentally easy to modify) could be considered as a model to integrate the results obtained using in vitro systems and to study the impact of chemical and biophysical parameters on the aggregation development. We simplified the work by using a fluorescent construct but any protein for which antibodies are available could be used for following the aggregation development.
A fusion construct His-GST-GFP cloned in a Gateway destination vector (Invitrogen, kindly provided by D. Waugh) was transformed and expressed in the following bacterial strains: BL 21 (DE3), BL 21 (DE3) RIL codon plus, GK2 (dnaK-), BL 21 (DE3) co-expressing the chaperone combinations GroELS and GroELS/DnaK/DnaJ/GrpE/ClpB, respectively (kindly provided by B. Bukau). Bacteria were grown at 37°C until the OD600 reached 0.4, then the cultures were adapted to different temperatures (20°C, 25°C, 30°C, 37°C), induced at an OD600 of 0.6 with 0.1 mM IPTG and grown for a further 20 h. The bacteria were pelleted by centrifugation (6000 g × 15 min), washed in 10 mL of PBS and finally stored at -20°C.
The pellet was resuspended in 10 mL of lysis-buffer (50 mM potassium phosphate buffer, pH 7.8, 0.5 M NaCl, 5 mM MgCl2, 1 mg/mL lysozyme, 10 μg/mL DNase), sonicated in a water bath (Branson 200) for 5 min and the lysate was incubated for 30 min on a shaker at room temperature. The supernatant was recovered after ultracentrifugation (35 min at 150000 × g).
Fractions from sucrose gradients were recovered using a bent Pasteur pipette and affinity purified using a HiTrap chelating affinity column (Amersham Biosciences) pre-equilibrated with 20 mM Tris HCl, pH 7.8, 500 mM NaCl, 15 mM imidazole. The His-tagged recombinant protein was eluted in 20 mM Tris, pH 7.8, 125 mM NaCl, and 250 mM imidazole. Protein quantification was based on the absorbance at 280 nm.
Total cell lysates or supernatants from ultracentrifugation of total cell lysates (1 mL) were loaded onto 14 × 95 mm Ultra-Clear centrifuge tubes (Beckman) prepared with a step gradient formed by four layers of 20 mM TrisHCl buffer, pH 8, containing 80%, 70%, 50%, 30%, and 0% sucrose, respectively. The tubes were centrifuged for 15 hours at 180,000 × g at 4°C using a SW40Ti rotor and an L-70 Beckman ultracentrifuge. The protein fractions were recovered from the interfaces between two sucrose layers, affinity purified as described above and used for further analysis. The samples for gel filtration were concentrated and the buffer replaced with 50 mM TrisHCl, pH 8.0, 150 mM NaCl using a Vivapore concentrator (Vivascience) and then separated by gel filtration using a Superose 12 HR 10/30 column (Amersham).
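Centrifugation conditions in this section are given as relative centrifugal force (× g), whereas a rotor is set in rpm. The sketch below converts between the two from first principles. The radius used in the example (158.8 mm, taken here as the maximum radius of a swinging-bucket rotor of this type) is an assumption that should be checked against the rotor manual; the function names are illustrative only.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def rcf_from_rpm(rpm: float, radius_mm: float) -> float:
    """Relative centrifugal force (multiples of g) at a given radius."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad/s
    return omega ** 2 * (radius_mm / 1000.0) / STANDARD_GRAVITY


def rpm_from_rcf(rcf: float, radius_mm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    if rcf < 0 or radius_mm <= 0:
        raise ValueError("rcf must be non-negative and radius positive")
    omega = math.sqrt(rcf * STANDARD_GRAVITY / (radius_mm / 1000.0))
    return omega * 60.0 / (2.0 * math.pi)


if __name__ == "__main__":
    ASSUMED_RADIUS_MM = 158.8  # assumption: verify for the rotor in use
    speed = rpm_from_rcf(180_000, ASSUMED_RADIUS_MM)
    print(f"{speed:.0f} rpm for 180,000 x g at {ASSUMED_RADIUS_MM} mm")
```

Because the force scales with radius, the same rpm gives a lower RCF at the top of the tube than at the bottom, so it matters which radius a quoted value refers to.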
The aggregation rate of the proteins was analysed according to Nominé et al. using an AB2 Luminescence Spectrometer (Aminco Bowman Series 2) equipped with SLM 4 software. The excitation was induced at 280 nm and the emission scan was recorded between 260 and 400 nm.
Amyloid aggregates were estimated according to their binding to the specific dye thioflavin-T (ThT), as described by LeVine, and protein surface hydrophobicity was determined using the fluorescent probe 8-anilino-1-naphthalenesulfonic acid (ANSA).
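The aggregation index reported in Table 1 is labelled as a 280/340 nm ratio. A minimal sketch of how such a ratio could be extracted from an emission scan is given below, assuming the index is the signal at 280 nm (light scattered at the excitation wavelength) divided by the signal at 340 nm (intrinsic protein fluorescence). This definition, the linear interpolation and the function names are assumptions for illustration and should be checked against the cited method.

```python
from typing import Sequence


def intensity_at(
    wavelengths_nm: Sequence[float],
    intensities: Sequence[float],
    target_nm: float,
) -> float:
    """Linearly interpolate the intensity at target_nm.

    wavelengths_nm must be sorted in ascending order and cover target_nm.
    """
    if len(wavelengths_nm) < 2 or len(wavelengths_nm) != len(intensities):
        raise ValueError("need two or more points and equal-length inputs")
    if not wavelengths_nm[0] <= target_nm <= wavelengths_nm[-1]:
        raise ValueError(f"{target_nm} nm is outside the scanned range")
    for i in range(len(wavelengths_nm) - 1):
        x0, x1 = wavelengths_nm[i], wavelengths_nm[i + 1]
        if x0 <= target_nm <= x1:
            y0, y1 = intensities[i], intensities[i + 1]
            if x1 == x0:
                return y0
            return y0 + (y1 - y0) * (target_nm - x0) / (x1 - x0)
    raise ValueError("wavelengths are not sorted in ascending order")


def aggregation_index(
    wavelengths_nm: Sequence[float],
    intensities: Sequence[float],
    scatter_nm: float = 280.0,
    emission_nm: float = 340.0,
) -> float:
    """Ratio of the signal at scatter_nm to the signal at emission_nm."""
    emission = intensity_at(wavelengths_nm, intensities, emission_nm)
    if emission == 0:
        raise ValueError("zero emission signal; ratio undefined")
    return intensity_at(wavelengths_nm, intensities, scatter_nm) / emission


if __name__ == "__main__":
    # Synthetic scan, not measured data.
    scan_nm = [260, 280, 300, 320, 340, 360]
    scan_signal = [1, 50, 5, 20, 100, 60]
    print(aggregation_index(scan_nm, scan_signal))
```

Under this assumed definition, stronger scattering relative to fluorescence gives a higher index, which is why the value is only a relative measure of aggregation.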
Circular dichroism (CD) spectra were recorded between 250 and 190 nm using suprasil precision cells (Hellma) and a Jasco J-710 instrument.
Western blots were performed as previously described using anti-GST primary antibodies. For dot blotting the proteins were transferred onto a PVDF membrane using a Bio-Rad Criterion blotter. The primary rabbit antibodies were a gift from Dr. Bukau and were purified from sera using Protein G Plus/Protein A Agarose (Oncogene) to minimize the background. Peroxidase-conjugated secondary antibodies for chemiluminescent detection were purchased from Dianova and the detection was performed using the SuperSignal® West Femto Maximum Sensitivity Substrate (Pierce), following the supplier's instructions. Blots were reused after stripping the bound antibodies with the Western Blot Recycling Kit (Alpha Diagnostic Int.).
Protein samples were purified by affinity chromatography and equal amounts fixed by using the "single-droplet" parafilm protocol. 5 μL of each protein sample were pipetted on a grid (Agar Scientific) and incubated 1 min at room temperature. Excess fluid was removed using filter paper, the unbound protein was washed and the grids were placed on a 50 μL drop of 1% uranyl acetate with the section side downwards. Finally, the grids were dried, placed in the grid-chamber and stored in desiccators before the samples were observed with a CM120 BioTwin electron microscope (Philips).
The conditions for the chaperone-dependent disaggregation of GST-GFP in vitro were chosen according to Mogk et al. and the process was monitored using the fluorimetric assay described above. 1 μM of aggregated protein was resuspended in 50 mM Tris HCl, pH 7.5, 20 mM MgCl2, 150 mM KCl, 2 mM DTT, in the presence of 1 μM ClpB, 1 μM DnaK, 0.2 μM DnaJ, 0.1 μM GrpE, 3 mM phosphoenolpyruvate, and 20 ng/mL of pyruvate kinase. The reaction was started by the addition of 2 mM NaATP.
The authors wish to thank Dr. M. Lopez de la Paz for the assistance with the electron microscopy, Dr. B. Bukau for having provided the chaperone vectors and antisera, Dr. A. Mogk for his refolding protocol, and Dr. D. Waugh for his GFP-GST construct.
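The final concentrations listed above can be turned into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The sketch below does this for the chaperones and ATP. The final concentrations are those given in the text, while the stock concentrations, the 100 μL reaction volume and all names are hypothetical values chosen only to make the example run.

```python
from typing import Dict

# Final concentrations from the text, expressed in micromolar.
FINAL_UM: Dict[str, float] = {
    "ClpB": 1.0,
    "DnaK": 1.0,
    "DnaJ": 0.2,
    "GrpE": 0.1,
    "ATP": 2000.0,  # 2 mM NaATP
}

# Hypothetical stock concentrations (micromolar); not from the paper.
HYPOTHETICAL_STOCK_UM: Dict[str, float] = {
    "ClpB": 50.0,
    "DnaK": 50.0,
    "DnaJ": 20.0,
    "GrpE": 10.0,
    "ATP": 100000.0,
}


def pipetting_volumes(
    total_ul: float,
    final_um: Dict[str, float],
    stock_um: Dict[str, float],
) -> Dict[str, float]:
    """Volume of each stock (in microlitres) from C1*V1 = C2*V2.

    The remaining volume is reported under 'buffer_and_substrate'.
    """
    volumes: Dict[str, float] = {}
    for name, final in final_um.items():
        stock = stock_um[name]
        if stock <= final:
            raise ValueError(f"stock of {name} is not more concentrated than target")
        volumes[name] = final * total_ul / stock
    remainder = total_ul - sum(volumes.values())
    if remainder < 0:
        raise ValueError("stocks are too dilute for this reaction volume")
    volumes["buffer_and_substrate"] = remainder
    return volumes


if __name__ == "__main__":
    for component, volume in pipetting_volumes(
        100.0, FINAL_UM, HYPOTHETICAL_STOCK_UM
    ).items():
        print(f"{component}: {volume:.2f} uL")
```

Adding ATP last, as a separate step, matches the statement that the reaction was started by its addition.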
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.