Nat. 15, 550 (2014). Ran, J.-H., Shen, T.-T., Wang, M.-M. & Wang, X.-Q. RepeatModeler2 for automated genomic discovery of transposable element families. Nucleic Acids Res. PLoS ONE 8, e80870 (2013). In fact, the Jurassic is often referred to as the "Age of Cycads." 201916 to Yang Liu, No. 54) Raven, P. H., Evert, R. F. & Eichhorn, S. E. Biology of Plants 7th edn (Macmillan, 2005). Distmat from the EMBOSS (v.6.5.7.0) package was then used to calculate the K value of the retrotransposons' 5′- and 3′-LTR sequences. (c) Distribution of outer dense fiber protein and other key flagellar proteins across representative embryophytes. Nat. 37) and the most differentiated nucleotide diversity (π) and heterozygosity ratios characterizing the window between 18 and 50 Mb on chromosome 8 (Fig. Environ. As in other gymnosperm genomes, a large portion (76.14%) of the C. panzhihuaensis genome consists of ancient repetitive elements (Supplementary Note 4). 10, 2368–2386 (2008). The minimum, first quartile (Q1), median, third quartile (Q3) and maximum values are indicated in the box plot, in that order, after excluding outliers. Fertilized ovules accumulated a high level of abscisic acid and expressed the genes related to cell wall organization and biogenesis, indicating their activity in embryo development, seed coat formation, and seed maturation and dormancy40 (Supplementary Note 10.1–10.5). 4f and Supplementary Fig. & Nicholls, K. J. 14, 2938–2943 (2000). Biol. https://db.cngb.org/codeplot/datasets/public_dataset?id=PwRftGHfPs5qG3gE, http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced, https://plantcode.online.uni-marburg.de/tapscan/, https://github.com/qiao-xin/DupGen_finder, Extended Data Fig.
Gibberellin, which is reported to regulate integument development in the ovules of flowering plants39, accumulated in the late stage of the pollinated ovule in Cycas. Genetic differentiation (FST) and nucleotide diversity (π) were calculated within non-overlapping 100-kb windows using VCFtools105 (v.0.1.13). We also found gene families related to integument development (for example, those involved in cutin, suberine and wax biosynthesis), with increased expression levels at the late stage of the pollinated ovule. Kim, D., Langmead, B. Yang Liu, S.W., L.L., S.D., T.W., J.M. By contrast, Gnetum, conifers and angiosperms, which develop non-flagellated spermatozoa, lost many flagellar structural genes (Supplementary Note 12). J.H., J.M., G.C. Cycadophyta (cycads): a division of gymnosperms [1] comprising plants with leaves and habit similar to those of palm trees, although some species are quite small. All the candidate LTR elements were first identified using LTR_FINDER and LTR_retriever. BMC Evol. 60, 1622 (2021). Plant 13, 5971 (2020). Nature Plants 38). and Yang Liu conceived the study. As one of the four extant gymnosperm groups (cycads, Ginkgo, conifers and gnetophytes), cycads hold an important evolutionary position for understanding the origin and early evolution of seed plants. The qualitative study was carried out using a self-constructed database that was built using the reference standards. Some genes were then filtered out if their SwissProt functional annotation (e value < 1 × 10^-5) did not identify them as homologues. 1a)6. Among these, 106 of the new orthogroups and 55 of the expanded orthogroups are associated with seed development in Arabidopsis37, including the regulation of development during early embryogenesis, seed dormancy and germination, and seed coat formation, as well as in immunity and stress response of the seed (Supplementary Note 6). Nature 574, 679–685 (2019). Extended Data Fig. 195–205 (Springer, 2002). Lepiniec, L. et al. 4a).
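The windowed diversity calculation mentioned above (nucleotide diversity in non-overlapping 100-kb windows) can be illustrated with a minimal sketch. It assumes biallelic SNPs given as (1-based position, alternate-allele count) pairs and a fixed number of sampled chromosomes; the function name `window_pi` and this input layout are hypothetical, and the per-site estimator shown is the textbook pairwise-difference formula, not necessarily the exact computation performed by VCFtools.

```python
from collections import defaultdict

WINDOW_SIZE = 100_000  # 100-kb non-overlapping windows, as stated in the text


def window_pi(sites, n_chrom, window=WINDOW_SIZE):
    """Return {window_start: pi} for biallelic SNPs.

    sites   -- iterable of (pos, alt_count); pos is 1-based (assumed layout)
    n_chrom -- number of sampled chromosomes (2 x diploid individuals)
    """
    if n_chrom < 2:
        raise ValueError("need at least two chromosomes")
    per_window = defaultdict(float)
    for pos, alt_count in sites:
        if not 0 <= alt_count <= n_chrom:
            raise ValueError(f"alt_count out of range at position {pos}")
        # Probability that two chromosomes drawn without replacement differ.
        site_pi = 2 * alt_count * (n_chrom - alt_count) / (n_chrom * (n_chrom - 1))
        # Positions 1..window fall in window 0, window+1..2*window in window 1, etc.
        per_window[(pos - 1) // window] += site_pi
    # Average over the full window length, monomorphic sites contributing zero.
    return {idx * window + 1: total / window for idx, total in per_window.items()}
```

Dividing by the full window length treats every uncalled or monomorphic site as contributing zero, which is a simplification; a production analysis would normally restrict the denominator to callable sites.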
Nucleic Acids Res. Our understanding of biotic evolution relies heavily on phylogenetic and dating reconstructions that provide insight into the periods of major diversification []. In the last decade, the advent of molecular dating approaches has fostered an explosion of studies constructing time-calibrated trees for diverse plant clades such as bryophytes [], ferns [3,4], gymnosperms [5-8] and angiosperms [9-11]. and S. Wu wrote the manuscript. 31, 9093 (2003). Gene numbers colored in light orange (y-axis min-max: 0-30). Our phylogenomic analyses based on 15 genomes and 1 transcriptome revealed 2,469 gymnosperm-wide duplications in 9,545 gene families and indicate that this WGD event dates to the most recent common ancestor (MRCA) of extant gymnosperms (Fig. 38, e199 (2010). 3c). Development 131, 5341–5351 (2004). P values were calculated from a mixed linear model association of SNPs. The genomic architecture of the sex-determining region and sex-related metabolic variation in Ginkgo biloba. Conifers, in the division Pinophyta or Coniferophyta, are the most numerous of the gymnosperms; woody and with vascular tissue, these are cone-bearing trees and shrubs. Conifers can be found growing in all parts of . In the strict sense, Bryophyta consists of the mosses only. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. a, Inference of the number of gene families with duplicated genes surviving after WGD events mapped on a phylogenetic tree depicting the relationships among 16 vascular plants included in this study. S.Z., H.L., X.G. 5c) and may play specific roles in initiating embryogenesis in gymnosperms. Biotechnol. Suyama, M., Torrents, D. & Bork, P. PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nanovesicles are secreted during pollen germination and pollen tube growth: a possible role in fertilization. Chen, N.
Using RepeatMasker to identify repetitive elements in genomic sequences. Evol. 8, 446–452 (2003). Adv. The genome and transcriptome data, genome assemblies and annotations can be found at https://db.cngb.org/codeplot/datasets/public_dataset?id=PwRftGHfPs5qG3gE. Adaptive innovation of green plants by horizontal gene transfer. The initial cross-linked long-distance physical interactions were then represented by chimeric fragments, which were processed into paired-end sequencing libraries. Sequence of the sugar pine megagenome. Jiao, Y. et al. Twenty-five fossil calibrations and two secondary calibrations were used. The * denotes the C. panzhihuaensis-specific TPS genes. 2a), surpassing those of Ginkgo14. Sentieon DNASeq variant calling workflow demonstrates strong computational performance and accuracy. STAG (https://github.com/davidemms/STAG) was also used to construct the species tree with default settings using low-copy genes (one to four copies). 62, 485–514 (2011). The intra- and intergenomic syntenic analyses were conducted using MCScanX100, with the default settings. 4). (a) Comparison of the longest 10% of introns and genes in the representative land plants. Danecek, P. et al. Leebens-Mack, J. H. et al. Bryophytes are a group of land plants, sometimes treated as a taxonomic division, that contains three groups of non-vascular land plants (embryophytes): the liverworts, hornworts and mosses. The software DISCOVISTA88 was used to summarize the conflicts among different analytical methods and datasets regarding several focal phylogenetic relationships. 7c). c, Phylogeny of SSPs in some representative species of land plants. 45, 401–415 (1970). The genome comprises 10.5 Gb assembled in 5,123 contigs (N50 = 12 Mb), with 95.3% of these contigs anchored to the largest 11 pseudomolecules, corresponding to the 11 chromosomes (n = 11) of the C. panzhihuaensis karyotype11 (Supplementary Note 3 and Extended Data Fig.
SSCG, single-copy genes; LCG, low-copy genes; MT, mitochondrial genes; PT, plastid genes; AA, amino acid sequences; NT, nucleotide sequences; NT12, codon 1st+2nd positions; ASTRAL, coalescent tree inference method using ASTRAL; CONCAT, maximum likelihood tree inferred with IQ-TREE based on concatenated datasets; STAG, species tree inference using software STAG with low-copy genes (one to four copies); Original, original organellar nucleotide sequences; RNA Editing, organellar genes with RNA editing site modified. BMC Evol. 833522) and from Ghent University (Methusalem funding, BOF.MET.2021.0005.01). Following Wu et al.95, we applied two basic requirements for the determination of a reliable duplication event: (1) genes from at least one common species are present in both child branches; and (2) the bootstrap values of the parental node and one of the child nodes are both ≥50%. Our data suggest that RNL genes occur widely in gymnosperms. The abbreviated name given before the protein ID represents the species name: CYCAS: Cycas panzhihuaensis, Gb: Ginkgo biloba, ELO: Encephalartos longifolius, SEGI: Sequoiadendron giganteum, GMON: Gnetum montanum, PICABI: Picea abies, PITA: Pinus taeda. Nature Plants thanks James Clugston and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Xie, T. et al. 4 Ancestral polyploidy events in extant gymnosperms. These facets include reproductive biology, anatomy, cytology . Plant J. Bioinformatics 30, 1236–1240 (2014). These are the nonvascular plants or bryophytes (mosses, liverworts and hornworts), the seedless vascular plants (clubmosses and ferns, including horsetails and whisk ferns), gymnosperms (conifers, cycads, Ginkgo and gnetophytes), and angiosperms, or flowering plants. 172, 2403–2415 (2016). Brenner, E. D., Stevenson, D. W. & Twigg, R. W. Cycads: evolutionary innovations and the role of plant-derived neurotoxins. Draft genome of the living fossil Ginkgo biloba. Mol.
The SSPs analysed include germin-like protein (GLP), legumin-like SSP (l-SSP), vicilin-like SSP (v-SSP) and v-AMP. One possible explanation is that cycads may have acquired enhanced resistance to pathogens and herbivores through encoding diversified resistance-related genes and the biosynthesis of diversified secondary compounds4,8. Its fan-shaped leaves, unique among seed plants because they feature a dichotomous venation pattern, turn yellow in autumn and fall from . Pink boxes in a–c represent the most differentiated regions between the sex chromosomes. Extended Data Fig. and M. Lisby contributed substantially to revisions. Folk, R. A. et al. For genome sequencing, the genomic DNA was extracted with the QIAGEN Genomic kit following the manufacturer's instructions65. Sci. 176, 1410–1422 (2018). See Supplementary Note 16 for details on verification and phylogenetic analysis of the cytotoxin gene. Ming, R., Bendahmane, A. Biol. 33, 290–295 (2015). In addition, the genome contains almost equal proportions of copia and gypsy long terminal repeat (LTR) elements, in contrast to other gymnosperm genomes, in which gypsy repeats are more frequent14,15 (Supplementary Note 6). To search for genome-wide duplications, we used DupGen_finder (https://github.com/qiao-xin/DupGen_finder) to identify duplicated genes that were classified into five different categories: WGD duplicates, tandem duplicates, proximal duplicates, transposed duplicates and dispersed duplicates. J. Syst. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Sci. BMC Bioinform. The assembled MSY had an almost 80-Mb difference in length from the corresponding region on the X chromosome, which agrees with the heteromorphy of the Cycas sex chromosomes.
Polytrichum and Marchantia are examples of bryophytes in which the sporophyte remains dependent on the gametophyte. 202105 to Y.G. Opin. Stanke, M. et al. This work is part of the 10KP project (https://db.cngb.org/10kp/) and was also supported by China National GeneBank (CNGB; https://www.cngb.org/). Phylogenetic analysis revealed that the cupin family can be subdivided into two groups: the germin-like and seed storage protein (SSP)-encoding genes. To maximize the opportunity of identifying transposable elements, a combination of de novo and homology-based approaches was performed following the Repeat Library Construction-Advanced pipeline (http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced). The sample extracts were then analysed using the ultra-high-performance liquid chromatography system Vanquish (ThermoFisher Scientific) and Q Exactive HF-X (ThermoFisher Scientific). USA 116, 10874–10882 (2019). Many copies of these genes were found to be highly expressed in cambium or apical meristem of C. panzhihuaensis (Supplementary Note 6). The cytotoxin protein sequences of Cycas were used as queries to perform BLASTP searches against the NCBI nr protein sequence database using the cut-off e value = 1 × 10^-5 and max_target_seqs = 20,000. For example, genes encoding endochitinases and chitinases as defences against chitin-containing fungal pathogens are expanded as tandem repeats in the C. panzhihuaensis and most gymnosperm genomes compared with other land plants. Both fit and mcf toxins are known for their insecticidal properties, and fit- or mcf-producing bacteria are often used in pest biocontrol62,63,64. performed WGD analysis.
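The e-value cut-off used in the BLASTP search above can be illustrated with a short sketch that filters BLAST tabular output. It assumes the default 12-column tabular layout (`-outfmt 6`), in which the e value is the eleventh column; the function name `filter_hits` and the sample rows are hypothetical and are not taken from the study.

```python
MAX_EVALUE = 1e-5  # cut-off stated in the text


def filter_hits(lines, max_evalue=MAX_EVALUE):
    """Keep (query, subject, evalue) for hits at or below the e-value cut-off.

    lines -- iterable of tab-separated BLAST rows in the default tabular
             layout (assumed): qseqid sseqid pident length mismatch gapopen
             qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        evalue = float(fields[10])  # eleventh column holds the e value
        if evalue <= max_evalue:
            kept.append((fields[0], fields[1], evalue))
    return kept


if __name__ == "__main__":
    demo = [
        "q1\ts1\t98.5\t200\t3\t0\t1\t200\t5\t204\t1e-50\t380",
        "q1\ts2\t40.0\t80\t48\t2\t1\t80\t10\t89\t0.002\t35",
    ]
    print(filter_hits(demo))
```

Filtering after the search, rather than only through the search option, makes it easy to tighten the threshold later without rerunning BLAST.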
The RNL family plays a critical role in downstream resistance signal transduction in angiosperms, and the broad occurrence of the RNL family in gymnosperms suggests that this signalling pathway may have been established no later than the origin of seed plants. Gene duplications and phylogenomic conflict underlie major pulses of phenotypic evolution in gymnosperms. No, Cycas are not bryophytes. They are classified under gymnosperms. Extended Data Fig. JCYJ20151015162041454 to Huan Liu). Injection of the synthesized C. panzhihuaensis fitD protein resulted in significantly higher mortality in larvae of both the diamondback moth (Plutella xylostella) and cotton bollworm (Helicoverpa armigera) (Fig. I, II, III, IV, V and VI indicate internal branches for which the pie charts depicting gene tree incongruence are complemented by histograms (lower panel) showing quartet support for the main topology (q1), the first alternative topology (q2) and the second alternative topology (q3). Tandem repeats were predicted using Tandem Repeats Finder (v.4.07)68 with the following parameters: Match=2, Mismatch=7, Delta=7, PM=80, PI=10, Minscore=50 and MaxPeriod=2,000. Science 338, 1093–1097 (2012). & Sanmartín, I. Commun. The male-specific region of the Y chromosome of C. panzhihuaensis contains a MADS-box transcription factor expressed exclusively in male cones that is similar to a system reported in Ginkgo, suggesting that a sex determination mechanism controlled by MADS-box genes may have originated in the common ancestor of cycads and Ginkgo. Nostoc strains isolated from a cycad, the bryophyte Anthoceros and a lichen all infect Gunnera. 3a). 5a). The number of gene families with retained gene duplicates reconciled on a particular branch of the species tree is shown above the branch across the phylogeny (Methods).
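The Tandem Repeats Finder settings listed above map onto the seven positional arguments of the `trf` command line (Match, Mismatch, Delta, PM, PI, Minscore, MaxPeriod). The sketch below only assembles such a command. The input file name, the `TrfParams` and `trf_command` names, and the choice of the `-d` (write a data file) and `-h` (suppress HTML output) options are assumptions for illustration, not details reported in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrfParams:
    """Parameter values as listed in the text."""
    match: int = 2
    mismatch: int = 7
    delta: int = 7
    pm: int = 80
    pi: int = 10
    minscore: int = 50
    max_period: int = 2000


def trf_command(fasta_path, params=TrfParams(), executable="trf"):
    """Build the argument list for a Tandem Repeats Finder run.

    fasta_path -- path to the genome FASTA (hypothetical name in the demo)
    executable -- name of the TRF binary on PATH (assumed to be "trf")
    """
    positional = [
        params.match, params.mismatch, params.delta,
        params.pm, params.pi, params.minscore, params.max_period,
    ]
    # -d writes a machine-readable data file; -h suppresses HTML reports.
    return [executable, fasta_path, *map(str, positional), "-d", "-h"]


if __name__ == "__main__":
    print(" ".join(trf_command("genome.fa")))
```

The resulting list can be passed to `subprocess.run`; keeping the parameters in a frozen dataclass records the exact settings alongside the results.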
b, Evolutionary analyses and phylogenetic profiles depicting the gains (light green), losses (light red), expansions (light yellow) and contractions (light blue) of orthogroups, according to the reconstruction of the ancestral gene content at key nodes and the dynamic changes of the lineage-specific gene characteristics. Cycads are ancient seed-bearing plants that appeared before the age of dinosaurs, during the Permian period, almost 280 million years ago. Nucleic Acids Res. Several TPS subfamilies (TPS-a to TPS-h) are known in plants60, among which the TPS-d family is unique to gymnosperms, and three of the four types of TPS-d were found in C. panzhihuaensis, with remarkable expansions of TPS-d2 compared with Ginkgo and most other gymnosperms (Supplementary Note 15). State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China, Yang Liu,Sibo Wang,Linzhou Li,Ting Yang,Tong Wei,Hongli Wang,Min Liu,Yan Xu,Hongping Liang,Jin Yu,Yuqing Cai,Zhaowu Zhang,Yannan Fan,Weixue Mu,Sunil Kumar Sahu,Guangyi Fan,Huanming Yang,Jian Wang,Xin Liu,Xun Xu&Huan Liu, Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China, Yang Liu,Shanshan Dong,Yiqing Gong,Shuchun Liu,Xiaoan Lang,Leilei Yang,Na Li,Sadaf Habib,Nan Li&Shouzhou Zhang, State Key Laboratory of Grassland Agro-Ecosystems, College of Ecology, Lanzhou University, Lanzhou, China, State Environmental Protection Key Laboratory of Regional Eco-process and Function Assessment, Chinese Research Academy of Environmental Sciences, Beijing, China, Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China, Xiuyan Feng,Jinling Huang,Jian Liu&Xun Gong, Key Laboratory of Plant Stress Biology, State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng, China, Jianchao Ma,Guanxiao Chang&Jinling Huang, Department of Biology, East 
Carolina University, Greenville, NC, USA, College of Biology and Environment, Nanjing Forestry University, Nanjing, China, College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China, Hongli Wang,Yan Xu,Hongping Liang,Jin Yu,Yuqing Cai&Zhaowu Zhang, School of Life Sciences, Sun Yat-sen University, Guangzhou, China, Sichuan Cycas panzhihuaensis National Nature Reserve, Panzhihua, China, Global Biodiversity Conservancy, Chonburi, Thailand, Department of Entomology, China Agricultural University, Beijing, China, Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA, Bernard Goffinet,Sumaira Zaman&Jill L. Wegrzyn, Guangdong Provincial Key Laboratory for Plant Epigenetics, Longhua Institute of Innovative Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China, Department of Biotechnology and Biomedicine, Technical University of Denmark, Lyngby, Denmark, Shenzhen Agricultural Genome Research Institute, Chinese Academy of Agricultural Sciences, Shenzhen, China, College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China, State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China, Chengdu University of Traditional Chinese Medicine, Chengdu, China, Department of Plant Biotechnology and Bioinformatics, Ghent University, VIB UGent Center for Plant Systems Biology, Gent, Belgium, College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China, Hainan Institute of Zhejiang University, Sanya, China, The College of Life Sciences, Sichuan University, Chengdu, China, Key Laboratory of Orchid Conservation and Utilization of National Forestry and Grassland Administration at College of Landscape Architecture, Fujian Agriculture and Forestry University, Fuzhou, China, State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese 
Academy of Sciences, Beijing, China, College of Life Sciences, South China Agricultural University, Guangzhou, China, National Key Laboratory of Plant Molecular Genetics, Chinese Academy of Sciences Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China, Department of Biology, University of Copenhagen, Copenhagen, Denmark, Florida Museum of Natural History, University of Florida, Gainesville, FL, USA, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa, Department of Biology, University of Florida, Gainesville, FL, USA. Jones, D. L. Cycads of the World: Ancient Plants in Today's Landscape 2nd edn (Smithsonian Institution Press, 2002). The elution product containing pure protein was washed three times with Tris-NaCl buffer and concentrated using Centricon (Millipore PM10). RepeatClassifier was then used to classify the candidate LTRs. Numbers above branches represent bootstrap scores from IQ-TREE. Roodt, D. et al. These results confirm that Cycas possesses an XY sex determination system positioned on chromosome 8. a, Manhattan plot of GWAS analysis of sex differentiation in 31 male and 31 female Cycas samples. 186, 577–592 (2010). The shift from swimming to non-motile sperm is a major innovation in land plant evolution, and C. panzhihuaensis and G. biloba exhibit an ancestral gene content that is part of the shift from producing flagellate to non-flagellate sperm cells. Biol. For instance, those genes encoding egg cell-secreted proteins that prevent attraction of multiple pollen tubes48 originated in the MRCA of living seed plants. and Y.L. d, Expression levels of SSP in different tissues of C. panzhihuaensis. (b) Expression of CYCAS_034085 on MSY and CYCAS_010388 on chromosome 2 in male microsporophyll and in the ovule. J. Syst.
Bryophytes are characteristically limited in size and prefer moist habitats, although they can survive in drier environments. All seed plants produce pollen and deliver their sperm through the growth of a pollen tube, whereas all non-seed land plants (that is, bryophytes, lycophytes and ferns) rely on free-swimming motile sperm for sexual reproduction, as do the ancestors of land plants1,4 (Extended Data Fig. 40, e49 (2012). Bioinformatics 30, 2114–2120 (2014). FUS3 and LEC2 are shared by all living seed plants; the Cycas and other gymnosperm genomes contain genes composing a new clade of B3 domain proteins, that is, the FUS3/LEC2-like clade, which is sister to the clade of FUS3 and LEC2 (Extended Data Fig. 88, 7682 (2011). Birney, E., Clamp, M. & Durbin, R. GeneWise and genomewise. Bioinformatics 5, 4.10.1–4.10.14 (2004). An integrated phylogenomic approach and a method to analyse synteny as described previously35,94,95 were used to identify the WGD events in seed plant evolution. The gymnosperms consist of the conifers, the cycads, the gnetophytes and the sole extant species of the division Ginkgophyta, Ginkgo biloba. The diversification pattern for cycads was analysed with Bayesian analysis of macroevolutionary mixtures (www.bamm-project.org) following Condamine et al.93. Sahu, S. K., Thangaraj, M. & Kathiresan, K. DNA extraction protocol for plants with high levels of secondary metabolites and polysaccharides without using liquid nitrogen and phenol. 23, 352–354 (1985). Unlike other extant seed plants, cycads and Ginkgo retain flagellated sperm, an ancestral trait shared with bryophytes, lycophytes and ferns 7. The origin of the seed marked one of the most important events of plant evolution10. The origin of seed plants is marked by the emergence of key traits including the seed, pollen and secondary growth of xylem and phloem36. The variant call format and VCFtools. Yang, D.-Q.
Cycads comprise many more living species57 than Ginkgo, which was once diverse in the Mesozoic but includes only one extant species58. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. 67, 940–964 (2018). Biol. 4g and Supplementary Fig. 30, 177–190 (2013). In addition, we identified a novel TPS subfamily in Cycas, with three copies in C. panzhihuaensis and eight copies in Cycas debaoensis (Extended Data Fig. Here, we report a high-quality, chromosome-level genome assembly of Cycas panzhihuaensis based on sequencing of the haploid megagametophyte using a combination of MGI-SEQ short-read, Oxford Nanopore long-read and Hi-C sequencing methods (Supplementary Note 2). The accuracy of the Hi-C-based chromosomal assembly was assessed using Juicebox's chromatin contact matrix. The study was supported by the National Cycad Conservation Center at Fairy Lake Botanical Garden. TPM values were calculated using the eXpress program, which was incorporated in the Trinity89 package. Nanodrop and Qubit (Invitrogen) were used to quantify the DNA. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. To better understand the dynamic changes in gene regulation and regulatory programmes during ovule pollination and fertilization, we performed a weighted correlation network analysis (WGCNA) and identified 11 co-expression modules at different developmental stages of the C. panzhihuaensis ovule and seed (Fig. ISRN Mol. Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. 4e). Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. 66, 393–413 (2015). 285, 20181012 (2018). Evidence for an ancient whole genome duplication in the cycad lineage. Plant 7, 573–577 (2014). The sliding window of the inner rings 1-4 is 1 Mb.
The transcriptome sequencing reads from 339 cycad species were generated in the current study. Sultana, M., Mukherjee, K. K. & Gangopadhyay, G. in Reproductive Biology of Plants (eds Johri, B. M. & Srivastava, P. S.) 118–132 (Springer Science & Business Media, 2014). Stull, G. W. et al. (a) Sketch of the Cycas sperm. All three types of immune receptors, CC-NBS-LRR (CNL), TIR-NBS-LRR (TNL) and RPW8-NBS-LRR (RNL), show patterns of expansion in C. panzhihuaensis and other gymnosperms, compared with non-seed plants (Supplementary Note 14). Crane, P. R. An evolutionary and cultural biography of ginkgo. Genet. c, Genome alignment of the MSY scaffolds with the corresponding female-specific region on chromosome 8. The maximum likelihood tree was generated by RAxML with the PROTCATGTR model and 1,000 bootstrap replicates. The longest gene, CYCAS_013063, encoding a kinesin-like protein KIF3A, covers 2.1 Mb in the C. panzhihuaensis genome; the longest intron is approximately 1.5 Mb and was detected in CYCAS_030563, a gene that encodes a photosystem II CP43 reaction centre protein. Transcriptomes and PCR amplification from genomic DNA indicated that these genes occur in many Cycas species (Supplementary Note 16). Quartet support for the six internal branches I, II, III, IV, V and VI is indicated on the left panel as bar charts. 85) and ASTRAL86 were used to align the sequences and to infer the species tree for cycads as aforementioned. Press, 1997). The modules are enriched in seed nutrition metabolic processes (M2, M6 and M8), membrane biosynthesis (M9, which may relate to the development of the integument) and genes synthesizing callose, a major component of the pollen tube (M4) (Supplementary Note 10). The BN kinship matrix and the first five components calculated from the principal component analysis104 (v.1.91.4beta3) were included as random effects. J.) T.W., S.L., X.W. 9, 559 (2008).
Genes from MSY and autosomes are marked on the right, and those from Selaginella and Physcomitrium are used as outgroups. The Ole e 1-like gene families, which encode proteins that accumulate in the pollen tube cell wall and play a role in pollen germination and pollen tube growth49, are remarkably expanded in the MRCA of extant seed plants compared with non-seed plants (Supplementary Note 6). J. Genetics 204, 1613–1626 (2016). Red lines represent alignments >5 kb on the forward strand, and blue lines represent those on the reverse strand. 4d,e and Extended Data Fig. Cell. 67, 735–740 (2018). USA 99, 10742–10747 (2002). Heteromorphic chromosomes have been reported to be associated with sex determination in Cycas54. The abbreviated name given before the protein ID represents the species name: CYCAS: Cycas panzhihuaensis, Gb: Ginkgo biloba, SEGI: Sequoiadendron giganteum, GMON: Gnetum montanum, PICABI: Picea abies, PITA: Pinus taeda, ATH: Arabidopsis thaliana, DEBAO: Cycas debaoensis, AMTR: Amborella trichopoda, OS: Oryza sativa, AFILI: Azolla filiculoides, SACU: Salvinia cucullata, SELMO: Selaginella moellendorffii, PPATEH: Physcomitrella patens, MARPO: Marchantia polymorpha. The domains and gene ontology terms of the gene models were identified by InterProScan80 (using data from Pfam, PRINTS, SMART, ProDom and PROSITE). Soltis, D. et al. One thousand plant transcriptomes and the phylogenomics of green plants. Rev. The cellulose synthase superfamily in fully sequenced plants and algae. Wan, T. et al. Because of its presence in the MSY and its exclusive expression pattern in males, we named CYCAS_034085 MADS-Y, a potential sex determination gene. This sex-associated region is also the most differentiated between male and female Cycas genomes, with the largest fixation index (FST; Supplementary Fig. Stevens, K. A. et al. & Jiao, Y. Plant Biol. Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis.
Finally, maximum likelihood trees were calculated using RAxML99 with the GTRGAMMA model and bootstrap support was estimated based on 100 replicates. Plant 8, 489–492 (2015). Smith, S. A., Moore, M. J., Brown, J. W. & Yang, Y. 8 The phylogeny and expression level of TPS. The asterisk indicates a significant difference (two-sided Student's t-test, P < 0.05, n = 3 biologically independent experiments), whereas the error bar represents the standard error. USA 117, 9451–9457 (2020). The authors declare no competing interests. 2019HJ2096001006 to Shouzhou Zhang and Yongbo Liu), the Major Science and Technology Projects of Yunnan Province (Digitalization, development and application of biotic resource, No.