A 120 base pair (bp) DNA fragment containing a 60 bp tract of poly(CA) · poly(TG), when incubated under specific conditions and analyzed by polyacrylamide gel electrophoresis, can give rise to a series of bands. While some of these bands have been shown to correspond to multistranded complexes [5,6,7], a series of closely spaced bands initially named 'Form X' [5] drew our attention for two reasons. First, they migrated near the regular double-stranded form of the fragment, suggesting that they might correspond to double-stranded, not multistranded, structures. Second, they were bound with high affinity by proteins HMG1 and HMG2, two abundant non-histone nuclear proteins for which no double-stranded DNA substrate with such a high affinity was known.
These bands are shown in Figure 1a, where they are labeled X and migrate on this polyacrylamide gel between the two single strands of the DNA fragment. These structures are stable and can be purified by electroelution from a polyacrylamide gel ([5]; Fig. 1b lane 1). The study of their thermal stability (Fig. 1b) shows that they dissociate upon moderate heating to give the regular double-stranded fragment, which dissociates in turn at higher temperature to give its two single strands. Form X, therefore, contains both strands of the fragment in equimolar amounts. Figure 1c shows the affinity of proteins HMG1 and HMG2 for Form X DNA. Under conditions where neither double-stranded nor single-stranded DNA is bound, Form X is entirely complexed with HMG1/2 even at a ratio of competitor DNA to Form X higher than 10^6. A detailed analysis of the relative affinities of HMG1/2 for different DNA substrates (double-stranded or single-stranded DNA, loops, minicircles, cruciform, Form X) will be presented elsewhere (C.G. and F.S., in preparation).
To study the detailed structure of Form X it was necessary to purify it in sufficient amounts. We have never been able to induce its formation directly, even by incubating the double-stranded fragment in the presence of a wide variety of agents or by varying the pH between 6.0 and 9.0. However, once it had been shown that the formation of multistranded structures by fragments containing a tract of poly(CA) · poly(TG) was correlated to some extent with the opening of the double helix [6,7,8,9], we found that the most efficient way of producing Form X was to dissociate the DNA strands by thermal denaturation and to let them reassociate in the presence of protein HMG1 or HMG2. In this manner, we obtain the complexes of Form X with HMG1/2, which can then be dissociated by SDS, and Form X can be purified (Fig. 1d, lane 2). In the absence of HMG1/2, the same process gives no Form X, or barely detectable amounts of it.
To investigate the structure of Form X, we tested its sensitivity to single-strand specific nucleases (S1 and P1 nucleases) and to chemicals which react specifically with non-B regions of DNA (diethylpyrocarbonate, hydroxylamine, permanganate). These experiments (not shown) clearly revealed a change of conformation limited to the repetitive region of the fragments, but did not allow us to determine its exact structure. For example, the hypothesis that Form X might correspond to four-stranded structures could not be ruled out at that stage. In addition, structures containing staggered single-stranded loops resulting from shifted reassociation in the repetitive region had to be considered [7, 8]. Starting with this hypothesis, in an attempt to measure the length of such loops, we set out to determine the linking number of DNA in Form X, i.e. the number of times one strand turns around the other strand in such a structure. Indeed, if Form X contains unpaired regions, the linking number is expected to decrease by one unit for each unpaired turn of double helix (10.5 bp).
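As an illustration of this expectation, the short calculation below estimates the linking number deficit for a given length of unpaired DNA. It is only a sketch under stated assumptions: it uses the 10.5 bp helical repeat quoted above, assumes that each unpaired turn removes exactly one unit of linking number, and uses loop lengths chosen purely for illustration. The function name and the example lengths are hypothetical and do not come from the experiments.

```python
HELICAL_REPEAT_BP = 10.5  # bp per turn of double helix, as quoted in the text


def linking_number_deficit(unpaired_bp, helical_repeat=HELICAL_REPEAT_BP):
    """Expected decrease in linking number for a given unpaired length.

    Assumption: each unpaired turn of double helix removes one unit of
    linking number, so the deficit is simply unpaired_bp / helical_repeat.
    """
    if unpaired_bp < 0:
        raise ValueError("unpaired_bp must be non-negative")
    return unpaired_bp / helical_repeat


if __name__ == "__main__":
    # Illustrative loop lengths only; 60 bp is the full repetitive tract.
    for loop_bp in (10.5, 21, 42, 60):
        deficit = linking_number_deficit(loop_bp)
        print(f"{loop_bp:>5} bp unpaired -> linking number lower by about {deficit:.1f}")
```

Under these assumptions, a loop spanning the whole 60 bp repetitive tract would lower the linking number by a little less than six units, a difference that should be readily detectable against a ladder of topoisomers.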
To test this hypothesis, Form X obtained with a 258 bp fragment containing the same 60 bp tract of poly(CA) · poly(TG) as above was circularized, and the products obtained were compared to a series of marker topoisomers obtained by circularization of the regular linear fragment in the presence of increasing amounts of ethidium bromide [10]. To clearly resolve all the topoisomers, the analysis was performed on polyacrylamide gels in the absence or in the presence of chloroquine [11] and is shown in Figure 2. Circularized Form X (labeled Xc) does not migrate like any of the marker topoisomers. It can also be seen that, unlike linear Form X (XL), circular Form X is extremely stable and is not modified by incubation at 100°C (nor by alkaline pH, result not shown). Neither is it modified by incubation with calf thymus topoisomerase I or with human topoisomerase II, suggesting that it contains no superhelical stress. Since circular Form X does not correspond to any band in the series of topoisomer markers, we then considered that Form X might contain a large number of negative supercoils, larger than in the most supercoiled of the marker topoisomers, and that the corresponding torsional stress was absorbed and constrained by a change of conformation strictly limited to the poly(CA) · poly(TG) region and stabilized by supercoiling of the small circles. For example, a change of conformation from B-DNA to Z-DNA would have been consistent with this hypothesis.
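To give a sense of how much torsional stress such a transition could absorb, the sketch below estimates the change in twist if a tract flipped from B-DNA to Z-DNA. Several assumptions are made that are not taken from the experiments: Z-DNA is given a left-handed helical repeat of 12 bp per turn (a textbook value), the whole 60 bp tract is assumed to convert (an illustrative extreme), and the B-Z junctions are ignored. The function names are hypothetical.

```python
B_REPEAT_BP = 10.5  # right-handed helical repeat, value quoted in the text
Z_REPEAT_BP = 12.0  # left-handed Z-DNA repeat; textbook value, assumed here


def relaxed_linking_number(circle_bp, b_repeat=B_REPEAT_BP):
    """Approximate linking number of a relaxed circle of circle_bp base pairs."""
    return circle_bp / b_repeat


def twist_change_b_to_z(tract_bp, b_repeat=B_REPEAT_BP, z_repeat=Z_REPEAT_BP):
    """Change in twist (in turns) when tract_bp flips from B-form to Z-form.

    Right-handed turns are lost and left-handed turns are gained, so the
    result is negative. The contribution of B-Z junctions is ignored.
    """
    if tract_bp < 0:
        raise ValueError("tract_bp must be non-negative")
    return -(tract_bp / b_repeat + tract_bp / z_repeat)


if __name__ == "__main__":
    print(f"Relaxed 258 bp circle: Lk0 of about {relaxed_linking_number(258):.1f}")
    print(f"60 bp tract, B to Z: twist change of about {twist_change_b_to_z(60):.1f} turns")
```

With these values, a complete B-to-Z transition of the 60 bp tract would absorb roughly eleven negative turns, out of about 25 turns in a relaxed 258 bp circle, which is why such a transition could in principle have accounted for a topoisomer lying beyond the most supercoiled marker.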
This was shown not to be the case when Form X was inserted in a large DNA fragment, leading to the formation of 2384 bp circles in which the linking number was measured by two-dimensional agarose gel electrophoresis, the first dimension in the absence of chloroquine and the second dimension in the presence of chloroquine [11] (Figure 3). No difference in migration is visible between circles containing Form X and circles containing the regular form of the fragment. If the circles containing Form X are preincubated at 100°C before electrophoresis, exactly the same distribution of topoisomers is obtained (not shown). Therefore the presence of Form X on a DNA circle of that size does not modify its electrophoretic mobility, and no change in linking number can be detected in Form X relative to regular DNA. In addition, as observed with 258 bp minicircles (Fig. 2), recutting the 2384 bp circles containing Form X shows that Form X is completely stable and resistant to heating at 100°C when contained in a covalently closed circle (Fig. 3d, lanes 3 and 4). Therefore the hypothesis of a global change of conformation of poly(CA) · poly(TG) induced or stabilized by supercoiling had to be ruled out. This experiment also shows unambiguously that Form X contains two ends only, since ligation of Form X with the vector yields almost exclusively monomeric circles, even in the presence of an excess of vector. Therefore Form X can only be a two-stranded structure.
At this stage of the work, Form X appeared paradoxical: a stable non-B double-stranded DNA structure, with no visible change in DNA linking number. This puzzle was resolved by ligating hairpin oligonucleotides to the ends of linear Form X. Figure 4 shows that Form X remains stable in molecules with closed ends, and resists heating to 100°C and treatment at alkaline pH (lanes 4 and 5), as do circular molecules, but unlike open linear molecules. It should also be noted that when a hairpin oligonucleotide is added at one end only, Form X is less stable and can be dissociated by heat treatment (lane 7), although not as easily as when it is contained in an open linear fragment.
These results show that the two DNA strands in Form X do not simply run side by side as in regular DNA, but are somehow associated in a knot. A simple model appears if one considers the process used for producing Form X. During their reassociation, the two repetitive DNA strands do not necessarily pair in perfect register, but can also pair with a shift in the repetitive poly(CA) · poly(TG) region. In such a case, one should expect a pause when the reassociation process reaches the sides of the repetitive region. We suggest that during this pause one of the single strands at one end can insert into the fork formed by the two single strands at the opposite end, possibly through interactions between the short single-stranded repetitive sequences remaining on both sides of the central double-stranded region (Fig. 5). Then, complete pairing of the non-repetitive sequences at both ends yields a loop at the base of which two duplexes cross, with one of the strands of one duplex passing between the two strands of the other duplex, and reciprocally, to form a structure which is schematically represented in Figure 5. Several parameters of such a structure can vary, including the location of the junction within the repetitive nucleotide sequence, the size of the loop, and the DNA linking number inside the loop. This probably explains the number of different bands shown by Form X on a polyacrylamide gel, where up to seven bands can be seen depending on the gel concentration.
The role of HMG1/2 in the formation of Form X can be envisioned at two stages. On the one hand, HMG1/2 increases the flexibility of DNA [12, 13], which should facilitate the formation of a loop in the central region. On the other hand, the known affinity of HMG1/2 for DNA junctions should help stabilize the transient junction before complete pairing of the non-repetitive terminal regions.
The possibility that two DNA duplexes might associate to form such junctions has been proposed previously [14,15,16,17,18,19], and such structures have been termed 'hemicatenanes' to reflect the fact that each duplex is linked to the other duplex by only one of its strands, although from an etymological point of view the term 'semicatenane', formed with two Latin roots, would seem more correct. We have continued to use the term 'Form X', which we used before we began to determine its structure. Semicatenanes have been considered to explain the association of meiotic chromosomes [16, 17], although a double Holliday junction structure has so far been preferred. Semicatenanes have also been proposed as intermediates in the replication of the genome of SV40 virus, to explain some intermediates which seem not to correspond to fully catenated structures [15, 18]. The possibility of producing such structures in vitro, together with their remarkable stability, should now allow the study of their characteristics, which should, in turn, facilitate the investigation of their significance in vivo. It should be noted that the repetitive sequence is required only because of the particular process used to prepare Form X, and that there should be no theoretical objection to the existence of such structures with non-repetitive sequences.
The remarkable stability of this structure makes it possible to consider several experiments. For example, it should be possible to insert such structures into vectors and to introduce them into living cells, which would allow the study of their evolution and of their effect on the biological activity of the DNA molecule in which they are inserted. Equally interesting would be the study of the enzymes, and more generally of the proteins, able to interact with these structures. The fact that proteins HMG1 and HMG2, two of the most abundant non-histone proteins, bind to Form X with very high affinity (C.G. and F.S., in preparation) already suggests that this structure might play a role in the function of the genome. In addition, the formation of DNA loops has often been proposed to explain several chromosomal structures and many regulation processes (see e.g. [20,21]), and the semicatenated loop described here is certainly among the most stable of all DNA loops observed so far.