GPY/F is a highly conserved domain of IN that mediates multimerization (Ebina, Chatterjee et al. 2008).
The C-terminal domains in the INs of LTR-retrotransposons and retroviruses are not well conserved. However, close examination of the C-termini did identify one motif that exists in a wide variety of INs (Malik and Eickbush 1999). This module, termed the GPY/F motif, is present in the INs of a diverse set of LTR-retrotransposons in the Metaviridae family (formerly Ty3/gypsy) and in the gamma class of retroviruses (Fig. 1A) (Malik and Eickbush 1999; Jern, Sperber et al. 2005). The function of this motif has not been studied. The IN of Tf1, as a recombinant protein, is highly soluble and possesses robust catalytic activity (Hizi and Levin 2005). These properties motivated us to consider the IN of Tf1 as a potential model to study the function of the GPY/F domain.
1. The IN of Tf1 possesses structural features typical of other INs.
Since little was known about the structure of Tf1 IN, our initial experiments tested whether it was sufficiently similar to other INs to serve as a model. IN purified from bacteria was subjected to partial proteolysis with trypsin to determine whether it possessed the three-domain architecture found in other INs. N-terminal sequencing of the resulting protein fragments showed that the cleavages occurred at amino acids 110 and 354. This confirmed that IN was composed of N-terminal, central core, and C-terminal domains (Fig. 1B). When the full Tf1 IN is aligned with other INs, its conserved residues and the sizes of its domains closely resemble those of the IN of Moloney murine leukemia virus (M-MuLV) (Fig. 1B). The cleavage between the central and C-terminal domains occurred in the middle of the GPY/F motif. This indicated that the GPY/F region assembles into two stable segments separated by less structured residues.

Figure 1. The GPY/F motif of IN. A. An alignment of INs from the Metaviridae family of transposons and the gamma class of retroviruses shows the conserved residues of the motif (yellow). B. A scaled diagram of INs showing the three-domain architecture reveals similarities between the INs of Tf1 and M-MuLV.
To validate whether the domains of Tf1 IN possessed the functions associated with the domains of other INs, and to study the function of the GPY/F motif, a series of recombinant proteins consisting of different sections of Tf1 IN were expressed in bacteria and purified (Fig. 2). The central core domain of HIV-1 IN contains the catalytic module that is sufficient to support strand cleavage and joining as measured with the disintegration assay. However, in the IN of M-MuLV the domains are larger, and in this case both the central and C-terminal domains are necessary for catalytic activity. We tested which portions of Tf1 IN were required for catalytic activity using the same disintegration assay described in the previous section. The central domain by itself lacked activity. Sequential deletions revealed that the N-terminal domain and the GPY/F motif were necessary for activity.

Figure 2. Sections of Tf1 IN produced as recombinant proteins.

Figure 3. Gel filtration of IN proteins revealed that the GPY/F fragment formed multimers.
In solution, the INs of HIV-1, M-MuLV, and avian sarcoma virus (ASV) exist in a dimer-tetramer equilibrium. In initial experiments to test the propensity of Tf1 IN to multimerize, we tested full-length IN for interactions with the individual portions of the protein. Using a precipitation procedure and our recombinant proteins, we found that the N-terminal domain, the central core, and the 71 amino acid GPY/F fragment all bound to the full-length IN. To test directly for stable multimers we performed gel filtration with Superdex 200. In a buffer of 50 mM HEPES, pH 7.5, 0.5 M NaCl, and 1% (v/v) glycerol, IN at 1 mg/ml eluted as a single peak with an observed molecular weight of 126.5 kDa, the size predicted for a dimer. The central core by itself was also observed to form a stable dimer. This ability of the central core and the full-length proteins to dimerize was typical of other INs.
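The assignment of an oligomeric state from a gel filtration peak can be illustrated with a short calculation. The sketch below is not the analysis used in this study; it assumes the common approach of fitting log10(molecular weight) against elution volume for a set of standards, then dividing the apparent molecular weight by the monomer mass. The function names, the example standards, and the monomer mass of 63 kDa (taken as roughly half of the observed 126.5 kDa) are all illustrative assumptions.

```python
import math


def fit_calibration(standards):
    """Least-squares fit of log10(MW) = intercept + slope * elution_volume.

    `standards` is a list of (elution_volume_ml, mw_kda) pairs for proteins
    of known size. Returns (intercept, slope).
    """
    xs = [volume for volume, _ in standards]
    ys = [math.log10(mw) for _, mw in standards]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def apparent_mw(elution_volume_ml, intercept, slope):
    """Apparent molecular weight (kDa) of a peak from the calibration line."""
    return 10 ** (intercept + slope * elution_volume_ml)


def oligomer_state(apparent_kda, monomer_kda):
    """Nearest whole number of subunits consistent with the apparent size."""
    return max(1, round(apparent_kda / monomer_kda))


if __name__ == "__main__":
    # Hypothetical calibration standards (elution volume in ml, MW in kDa).
    # These values are made up for illustration and are not from this study.
    example_standards = [(5.0, 1000.0), (10.0, 100.0), (15.0, 10.0)]
    intercept, slope = fit_calibration(example_standards)
    print("apparent MW at 12 ml:", apparent_mw(12.0, intercept, slope))

    # Observed value from the text; the monomer mass is an assumption.
    print("subunits:", oligomer_state(126.5, 63.0))
```

Apparent sizes from gel filtration depend on molecular shape as well as mass, so an assignment made this way is an estimate that is best confirmed by an independent method such as cross-linking.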
2. The GPY/F fragment formed dimers, trimers, and tetramers.
To investigate the contribution of the C-terminal domains to the multimerization of IN, the GPY/F fragment and the CHD were subjected to gel filtration with Superdex 75 (Fig. 3A). The CHD had an estimated molecular weight of 11.7 kDa, indicating it was a monomer. The profile produced by the GPY/F fragment included three major peaks (Fig. 3B). The apparent sizes of these species corresponded to monomer, dimer, and trimer. To test whether the GPF residues in the center of the motif contributed to multimerization, single amino acid substitutions were generated. Both substitutions, GPF to APF and GPF to GAF, completely disrupted multimerization of the GPY/F fragment (Figs. 3C and 3D). These data indicate that the GPF residues play a central role in promoting multimerization. In separate experiments to test for multimers, we treated the GPY/F fragment with the chemical cross-linker dithiobis(succinimidyl propionate). Gel electrophoresis of the cross-linked sample indicated that the protein was in an equilibrium of monomers, dimers, trimers, and tetramers.
3. The DNA binding of Tf1 IN.
The C-terminal domains of INs are known to bind DNA without sequence specificity. To map which sections of Tf1 IN interact with DNA, each of the individual domains was assayed for DNA binding. Labeled oligonucleotides were mixed with the individual domains and the mixtures were cross-linked by UV. The full-length IN, CH-, core, and the GPY/F fragment had substantial DNA binding activity. These DNA binding activities correspond well to what has been described for other INs. Interestingly, CH- bound significantly more DNA than the full-length IN. This indicated that the inhibitory activity of the CHD may act by blocking DNA binding.
In additional experiments, the single amino acid substitutions in the GPF residues did not reduce DNA binding. This result indicates that other sequences in the GPY/F fragment mediate the DNA binding activity, and that the contribution of the GPF residues appears to be specific to promoting multimerization.
4. The function of the GPY/F motif.
The contribution of the GPF residues to catalysis was tested by generating recombinant IN with the substitutions GPF to APF and GPF to GAF. Both of these mutations abolished the strand transfer and disintegration activities. This requirement of the GPF residues for catalysis, together with their contribution to multimerization, indicated that the principal function of the GPY/F motif was to promote multimerization of the C-terminus. Our finding that the GPF to APF and GPF to GAF mutations did not reduce DNA binding indicated that multimerization was not required for DNA binding. The widespread conservation of the GPY/F motif supports the conclusion that these amino acids play a critical role in the function of IN.
