Issue 3
Mar 2020
Ying CUI, Zelong XU, Jianzhong LI. Identification of nucleosome positioning using support vector machine method based on comprehensive DNA sequence feature[J]. JOURNAL OF MECHANICAL ENGINEERING, 2020, 37(3): 496-501. doi: 10.7507/1001-5515.201911064

Identification of nucleosome positioning using support vector machine method based on comprehensive DNA sequence feature

doi: 10.7507/1001-5515.201911064
  • Corresponding author: LI Jianzhong, Email: lijzh@hit.edu.cn
  • Received Date: 23 Nov 2019
  • Rev Recd Date: 22 Feb 2020
  • Publish Date: 17 Mar 2020
  • In this article, a model of nucleosome sequences was constructed from z-curve theory and the position weight matrix (PWM). The nucleosome sequence dataset was transformed into three-dimensional coordinates, the PWM of the nucleosome sequences was calculated, and a similarity score was obtained. Integrating the two yielded a nucleosome feature model based on comprehensive DNA sequence features, named CSeqFM. The Euclidean distance between each candidate nucleosome sequence or linker sequence and the CSeqFM model was taken as the feature dataset, which was then used to train and test a support vector machine (SVM) with ten-fold cross-validation. For S. cerevisiae, the sensitivity, specificity, accuracy and Matthews correlation coefficient (MCC) of nucleosome positioning identification were 97.1%, 96.9%, 94.2% and 0.89, respectively, and the area under the receiver operating characteristic curve (AUC) was 0.9801. Compared with another z-curve method, our method performed better on every evaluation metric. The CSeqFM method was then applied to three other species: C. elegans, H. sapiens and D. melanogaster. The AUCs for all three species were higher than 0.90, and CSeqFM was also more stable and effective than the iNuc-STNC and iNuc-PseKNC methods, which further demonstrates that the CSeqFM method is reliable and identifies nucleosome positioning well.
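
The feature construction described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the z-curve coordinates and the PWM score are integrated, so the sketch simply concatenates them and uses the mean feature vector of the training nucleosome sequences as the reference model. The function names, the pseudocount, the uniform background frequency of 0.25, and the toy sequences are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def z_curve(seq):
    """Cumulative z-curve coordinates (x, y, z) for each prefix of seq."""
    x = y = z = 0
    coords = []
    for base in seq.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        x += 1 if base in "AG" else -1  # purine vs. pyrimidine
        y += 1 if base in "AC" else -1  # amino vs. keto
        z += 1 if base in "AT" else -1  # weak vs. strong hydrogen bonding
        coords.append((x, y, z))
    return coords

def build_pwm(seqs, pseudocount=1.0, background=0.25):
    """Log-odds PWM from equal-length sequences (assumed uniform background)."""
    pwm = []
    for i in range(len(seqs[0])):
        counts = {b: pseudocount for b in BASES}
        for s in seqs:
            counts[s[i]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def pwm_score(pwm, seq):
    """Similarity score of seq: sum of per-position log-odds."""
    return sum(col[b] for col, b in zip(pwm, seq))

def features(seq, pwm):
    """Assumed integration: flattened z-curve coordinates plus the PWM score."""
    flat = [v for point in z_curve(seq) for v in point]
    return flat + [pwm_score(pwm, seq)]

def mean_vector(vectors):
    """Component-wise mean, used here as the hypothetical reference model."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def distance_to_model(seq, pwm, model):
    """Euclidean distance between a sequence's features and the model."""
    return math.dist(features(seq, pwm), model)

# Toy data only; real nucleosome sequences are far longer.
nucleosomal = ["ACGTAC", "ACGTTC", "ACGAAC"]
pwm = build_pwm(nucleosomal)
model = mean_vector([features(s, pwm) for s in nucleosomal])
d_near = distance_to_model("ACGTAC", pwm, model)
d_far = distance_to_model("TTTTTT", pwm, model)
print(d_near < d_far)
```

In a pipeline like the one the abstract describes, distances of this kind would then serve as the input features for an SVM evaluated by ten-fold cross-validation; that step is omitted here to keep the sketch free of third-party dependencies.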

     

