Structural genomics sheds light on protein functions and remote homologs across the insect tree of life
