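Abstract: |
Rosa roxburghii is a plant used as both medicine and food that has shown great potential in inflammation and cancer research. To further promote research on and application of R. roxburghii, a complete full-length transcriptome database is needed, which would help reveal its complex molecular mechanisms and the biosynthetic pathways of its bioactive compounds. To explore functional gene information in R. roxburghii, a third-generation library construction method was chosen: a mixed sample of six tissues was subjected to full-length transcriptome sequencing on the PacBio Sequel II platform using single-molecule real-time (SMRT) sequencing, and the data were analyzed with bioinformatics methods. The results were as follows: (1) A total of 25 003 non-redundant isoform sequences were obtained, with an average length of 2 471 bp. A total of 24 357 CDSs were predicted, with an average length of 1 727 bp, most of which were 300 to 3 000 bp long. (2) Alignment against seven databases, including GO and KEGG, annotated 24 859 genes (99.42%), and 99 transcripts related to flavonoid biosynthesis were identified. (3) A total of 1 930 genes were assigned to 82 transcription factor families, and 55 transcription factors may be involved in regulating flavonoid biosynthesis in R. roxburghii. (4) A total of 95 lncRNAs and 12 588 SSR loci were identified; SSR primers were then designed with Primer 3, yielding 10 545 primer pairs. These results enrich the gene database of R. roxburghii and provide a theoretical basis for molecular marker development, studies of growth and development, stress resistance, and secondary metabolite biosynthesis, as well as genetic improvement and breeding. |
Key words: Rosa roxburghii, single-molecule real-time sequencing technology, full-length transcriptome, bioinformatics analysis |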
DOI:10.11931/guihaia.gxzw202411033 |
CLC number: S567.9 |
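Funding: Guizhou Provincial Higher Education Engineering Research Center for Rosa roxburghii Fermentation Technology (黔教技[2022]008号); Liupanshui Normal University Scientific Research Cultivation Project (LPSSY2023KJZDPY07); Guizhou Provincial Department of Science and Technology Project (黔科合基础MS[2025]099); Liupanshui Normal University High-Level Talent Research Start-up Fund (LPSSYKYJJ202205); Guizhou Provincial College Students' Innovation Training Program (S2024109771628). |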
|
Full-length transcriptome sequencing analysis and identification of putative genes for biosynthesis of flavonoid in Rosa roxburghii |
HE Bin, ZHANG Yangli, WU Yuhan, TANG Dahai, YANG Qunying, LIU Linya, HUANG Yacheng*
|
School of Biological Science and Technology, Liupanshui Normal University, Liupanshui, 553004, Guizhou, China
|
Abstract: |
Rosa roxburghii, a medicinal and edible plant native to Southwest China, is rich in bioactive compounds, including flavonoids, vitamin C, and polysaccharides, which exhibit anti-inflammatory, anticancer, and antioxidant properties. To advance genetic research on and application of R. roxburghii, this study aimed to construct a comprehensive full-length transcriptome database and to identify key genes involved in flavonoid biosynthesis. Mixed samples from six tissues (flower, leaf, stem, young bark, mature bark, and fruit) were sequenced on the PacBio Sequel II platform using single-molecule real-time (SMRT) sequencing. Bioinformatics tools were used for transcriptome assembly, functional annotation, and structural characterization. The results were as follows: (1) A total of 25 003 non-redundant isoforms were obtained, with an average length of 2 471 bp. Among these, 24 357 coding sequences (CDSs) were predicted, averaging 1 727 bp, with 91.84% ranging from 300 to 3 000 bp. (2) Functional annotation against seven databases (GO, KEGG, Nr, Swiss-Prot, TrEMBL, KOG, and Pfam) annotated 24 859 isoforms (99.42%). Notably, 99 transcripts were linked to flavonoid biosynthesis pathways, including those encoding phenylalanine ammonia-lyase (PAL), chalcone synthase (CHS), and flavonol synthase (FLS). (3) A total of 1 930 transcription factors (TFs) from 82 families were identified, of which 55 TFs (e.g., WRKY, MYB, and bHLH) potentially regulate flavonoid biosynthesis. (4) Structural analysis predicted 95 long non-coding RNAs (lncRNAs) and 12 588 simple sequence repeats (SSRs), from which 10 545 SSR primer pairs were designed. This study establishes the first high-quality full-length transcriptome database for R. roxburghii, enhancing the genetic resources available for this species. The identification of flavonoid-related genes and transcription factors provides insight into the molecular mechanisms underlying bioactive compound synthesis. Furthermore, the SSR markers developed here offer tools for genetic diversity studies, molecular breeding, and germplasm conservation. These findings lay a foundation for future research on metabolic engineering, functional genomics, and the genetic improvement of R. roxburghii, supporting its industrial development as a health-promoting crop in Guizhou Province and beyond. |
Key words: Rosa roxburghii, single-molecule real-time sequencing technology, full-length transcriptome, bioinformatics analysis |
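The abstract reports 12 588 SSR loci but does not name the detection tool or its thresholds. As an illustration only, a minimal sketch of how perfect SSRs can be located in a transcript sequence is given below. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical and are not taken from the study; they only resemble settings commonly used by SSR scanners.

```python
import re

# Hypothetical thresholds (motif length -> minimum number of repeats).
# The study's actual parameters are not stated in the abstract.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "AA" or "ATAT"); the shorter motif length reports them.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: one (AT)7 and one (AAG)5 repeat with short flanks.
    demo = "GGC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TTC"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

The flanking sequence around each reported locus would then be passed to a primer design tool such as Primer 3, as the abstract describes. Compound and imperfect SSRs are not handled by this sketch.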