research advances

Achievements and milestones

SBKB [doi:10.1038/fa_sbkb.2010.38]

Achievements in protein structure determination in the past five years have provided clues to the evolutionary, structural and functional relationships among proteins that are not evident from sequence data alone.


The Protein Structure Initiative (PSI) was created to expand the impact and value of the Human Genome Project, and other genome sequencing projects, using three-dimensional (3D) protein structure analysis. The research highlighted here represents milestones in the field, providing key clues to evolutionary, structural and functional relationships among proteins that are not evident from sequence data alone.

Several of these papers highlight the development of new technologies to deal rapidly and cost-effectively with the volume of data that is emerging. Structural genomics has been revolutionized by the development of high-throughput structure determination pipelines. This was first demonstrated in 2002 by the investigation of the entire proteome (1,877 genes) of a single organism, Thermotoga maritima. By establishing high-throughput instrumentation, protein-production platforms and advances in structural methods, researchers can now use this highly productive approach to tackle more complex systems 1.

An example of a study that uses a high-throughput pipeline is the analysis of a crystallized protein that is one of a very large family (more than 600 members) of novel, ocean metagenome-specific proteins identified by clustering of the dataset from the Global Ocean Sampling expedition of Craig Venter and colleagues. Despite the lack of sequence similarity, the crystal structure is similar to RNA-binding Sm proteins, although its pentameric nature is unique 2. Database searches for homology suggest a cyanophage origin for this putative RNA/DNA-processing molecule 2.

Using high-throughput technology, researchers are able to focus on large groups of unique structural motifs, including those that have never been studied experimentally but some of which were predicted to have important or novel functions. For example, crystal structures of three members of the previously uncharacterized protein family Pfam PF08000 provide the first evidence of the existence of pleckstrin homology (PH)-like domains in bacteria. PH domains are found in proteins involved in a wide variety of targeting functions, from intracellular signaling to cytoskeletal organization. PH domains are ubiquitous in eukaryotic proteins and, until recently, were thought to be limited to eukaryotes. The structural genomics data indicate that the PH domain superfamily may have existed before prokaryotes and eukaryotes diverged, which points to evolutionary conservation of intracellular signaling 3.

One way to discover the function of the many newly discovered genes is to solve the 3D structures of the proteins they encode. Until recently, researchers have been hampered by the lack of ready access to both X-ray diffraction and NMR spectroscopy instruments. The Compact Light Source (CLS) is the first laboratory-scale synchrotron light source that uses inverse Compton scattering to generate X-rays. The CLS appeals to crystallographers because of its potential as a locally available synchrotron X-ray source that provides rapid turnaround from crystallization trials to X-ray screening and data collection 4.

Functional analysis of molecules goes hand in hand with data from crystallographic studies. For example, PSI JCSG researchers have been focusing on the Mre11 nuclease, which plays a central role in the repair of cytotoxic and mutagenic DNA double-strand breaks. After the first crystal structure of a bacterial Mre11 was determined, biochemical characterization revealed that it acts as an endonuclease on single-stranded DNA and as an exonuclease on double-stranded DNA. The ability of Mre11 to discriminate between substrates during DNA repair may have implications for the design of cancer therapies that disrupt its activity 5.

Proteins that are widely distributed attract much research interest, particularly those that deploy novel mechanisms in host defense. Clustered regularly interspaced short palindromic repeats (CRISPRs), with their CRISPR-associated (CAS) proteins, protect microbial cells from invasion by phages or plasmids. Over 40 CAS protein families and multiple subtypes have been identified in prokaryotic genomes and seem to provide effective phage immunity. Crystallography data and functional studies suggest, for the first time, that CAS2 proteins are sequence-specific endoribonucleases. They may have a role in CRISPR-mediated anti-phage defense by degrading phage or cellular mRNAs 6.

Many bacteria pathogenic to plants or animals use a type-III secretion apparatus to inject effector proteins into host cells. Effectors alter cell signaling and host responses induced upon infection and often target the ubiquitin pathway. Among the bacterial proteins recently characterized is a new class of E3 ubiquitin ligases that are delivered by the type-III secretion apparatus of Gram-negative pathogens into host cells. Shigella flexneri, which causes dysentery in humans, has effectors that are members of the IpaH family of E3 ubiquitin ligases. Analysis of IpaH crystals has demonstrated that the structure of the IpaH carboxy-terminal domain represents a unique, all-helical fold that carries the catalytic activity for ubiquitin transfer 7.

The type-VI protein secretion system (T6SS) has emerged recently as important for the pathogenicity of several Gram-negative bacterial species. T6SS mediates cytotoxicity in phagocytic cells and is required for secretion of toxins by the bacterium. One of the common components of the system is the Hcp protein, originally described as a hemolysin co-regulated protein. Homologs of Vibrio cholerae hcp genes have been found in all bacteria with a characterized T6SS and are also present in the serotype O1 strains of V. cholerae. The secretion apparatus functions during chronic infections: Hcp1 can be detected in the pulmonary secretions of cystic fibrosis patients, and Hcp1-specific antibodies can be detected in their sera. The structure of Hcp1 from Pseudomonas aeruginosa was determined by PSI MCSG researchers. Type-VI secretion systems are widely distributed among bacterial pathogens and may play a general role in mediating host interactions 8.

Keeping a constant source of energy available is crucial to bacterial metabolism. Inorganic polyphosphate (polyP) is a linear polymer of tens to hundreds of phosphate residues linked by high-energy bonds. PolyP plays numerous and vital roles in metabolism and regulation in prokaryotes and eukaryotes and is environmentally ubiquitous and abundant. In prokaryotes, the synthesis and use of polyP are catalyzed by two families of polyP kinases, PPK1 and PPK2. Crystal structure and functional studies of PPK2 enzymes have shown that members of the PPK2 family function preferentially as polyP-dependent nucleotide kinases; by the combined action of two PPK2 enzymes, AMP can be converted to ATP. These results suggest that the PPK2s represent a molecular mechanism by which bacteria can use polyP as an intracellular energy reserve under conditions of stress 9.

In virology, researchers have been studying the structure of the non-structural protein 1 (NS1) from influenza A and B viruses. NS1 has a key role in virulence: it counters host antiviral defenses by binding double-stranded RNA (dsRNA) and various human host proteins involved in the innate immune response. NMR and crystallographic data have revealed highly conserved surface tracks of basic and hydrophilic residues that interact with dsRNA and are quite different from previously described models. At the center of this dsRNA-binding epitope is a deep pocket that provides a target for the development of antivirals against both influenza A and B 10. In related work, PSI NESG researchers also determined the crystal structure of the C-terminal domain of the influenza A NS1 protein in a complex with the F2F3 region of one of its human host target proteins, the 30 kDa subunit of the cleavage and polyadenylation specificity factor (CPSF30) 11. The structure revealed a binding pocket for CPSF30. Mutation of residues within the pocket abolishes binding, but does not disrupt the overall fold. This structural information was used to design mutant viruses that produce NS1 proteins that do not bind CPSF30. These mutant viruses are attenuated as a result of their inability to block the processing of the mRNA for interferon 11. The structure of the NS1:CPSF30 complex identifies the CPSF30-binding pocket on NS1 as a second potential target site for the development of antiviral drugs.

Development of novel therapeutics is just one of the outcomes of structural genomics research. These milestone papers show how PSI can deliver novel biological insights that are only evident from detailed molecular 3D structure and function analysis.

Catherine Whitlock

References:
  1. S. A. Lesley, P. Kuhn, A. Godzik, A. M. Deacon, I. Mathews et al. Structural genomics of the Thermotoga maritima proteome implemented in a high-throughput structure determination pipeline. Proc. Natl Acad. Sci. USA 99, 11664-11669 (2002). doi:10.1073/pnas.142413399

  2. D. Das, P. Kozbial, H. L. Axelrod, M. D. Miller, D. McMullan et al. Crystal structure of a novel Sm-like protein of putative cyanophage origin at 2.60 Å resolution. Proteins 75, 296-307 (2009). doi:10.1002/prot.22360

  3. Q. Xu, A. Bateman, R. D. Finn, P. Abdubek, T. Astakhova et al. Bacterial pleckstrin homology domains: a prokaryotic origin for the PH domain. J. Mol. Biol. 396, 31-46 (2010). doi:10.1016/j.jmb.2009.11.006

  4. J. Abendroth, M. S. McCormick, T. E. Edwards, B. Staker, R. Loewen et al. X-ray structure determination of the glycine cleavage system protein H of Mycobacterium tuberculosis using an inverse Compton synchrotron X-ray source. J. Struct. Funct. Genomics 11, 91-100 (2010). doi:10.1007/s10969-010-9087-6

  5. D. Das, D. Moiani, H. L. Axelrod, M. D. Miller, D. McMullan et al. Crystal structure of the first eubacterial Mre11 nuclease reveals novel features that may discriminate substrates during DNA repair. J. Mol. Biol. 397, 647-663 (2010). doi:10.1016/j.jmb.2010.01.049

  6. N. Beloglazova, G. Brown, M. D. Zimmerman, M. Proudfoot, K. S. Makarova et al. A novel family of sequence-specific endoribonucleases associated with the clustered regularly interspaced short palindromic repeats. J. Biol. Chem. 283, 20361-20371 (2008). doi:10.1074/jbc.M803225200

  7. A. U. Singer, J. R. Rohde, R. Lam, T. Skarina, O. Kagan et al. Structure of the Shigella T3SS effector IpaH defines a new class of E3 ubiquitin ligases. Nature Struct. Mol. Biol. 15, 1293-1301 (2008). doi:10.1038/nsmb.1511

  8. J. D. Mougous, M. E. Cuff, S. Raunser, A. Shen, M. Zhou et al. A virulence locus of Pseudomonas aeruginosa encodes a protein secretion apparatus. Science 312, 1526-1530 (2006). doi:10.1126/science.1128393

  9. B. Nocek, S. Kochinyan, M. B. Proudfoot, G. Brown, E. Evdokimova et al. Polyphosphate-dependent synthesis of ATP and ADP by the family-2 polyphosphate kinases in bacteria. Proc. Natl Acad. Sci. USA 105, 17730-17735 (2008). doi:10.1073/pnas.0807563105

  10. C. Yin, J. A. Khan, G. V. T. Swapna, R. M. Krug, L. Tong et al. Conserved surface features form the double-stranded RNA-binding site of non-structural protein 1 (NS1) from influenza A and B viruses. J. Biol. Chem. 282, 20584-20592 (2007). doi:10.1074/jbc.M611619200

  11. K. Das, L.-C. Ma, R. Xiao, B. Radvansky, J. Aramini et al. Structural basis for suppression by influenza A virus of a host antiviral response. Proc. Natl Acad. Sci. USA 105, 13092-13097 (2008). doi:10.1073/pnas.0805213105
