Machine Translation of Mongolian and Chinese Natural Language Based on Statistical Analysis

SU Yila, WU Nier, LIU Wanwan

Citation: SU Yila, WU Nier, LIU Wanwan. Machine Translation of Mongolian and Chinese Natural Language Based on Statistical Analysis[J]. JOURNAL OF MECHANICAL ENGINEERING, 2017, 43(1): 36-42. doi: 10.11936/bjutxb2016070044

doi: 10.11936/bjutxb2016070044
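Funding: National Natural Science Foundation of China (61363052); Natural Science Foundation of Inner Mongolia Autonomous Region (2012MS0904, 2016MS0605)

    Author biography: SU Yila (born 1964), male, professor; main research interests are artificial intelligence and pattern recognition. E-mail: suyila@tsinghua.org.cn

  • CLC number: TP391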


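  • Abstract: To improve the relatively underdeveloped state of Mongolian-Chinese machine translation in Inner Mongolia, a statistical machine translation approach is adopted in which the phrase is the basic unit of translation. A word segmentation method and a word alignment method are proposed based on the maximum entropy model, and the translation is output from the reordering result. Experimental results show that the BLEU score of the improved translation system increases to some extent, and the proposed method can serve as a reference for applied Mongolian-Chinese research.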
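For orientation, the standard textbook forms of the two models named in the abstract are sketched below. This page does not reproduce the paper's own equations, so the notation is an assumption drawn from the general statistical machine translation literature: $f$ is the source sentence (assumed here to be Mongolian), $e$ a candidate translation (assumed Chinese), and $h_m$ and $\lambda_m$ are hypothetical feature functions and their weights.

```latex
% Source-channel (noisy-channel) model: choose the translation that
% maximizes the language model P(e) times the translation model P(f | e).
\hat{e} = \arg\max_{e} P(e \mid f) = \arg\max_{e} P(e)\, P(f \mid e)

% Maximum entropy (log-linear) model: M weighted feature functions,
% normalized over all candidate translations e'.
P(e \mid f) =
  \frac{\exp\left( \sum_{m=1}^{M} \lambda_m h_m(e, f) \right)}
       {\sum_{e'} \exp\left( \sum_{m=1}^{M} \lambda_m h_m(e', f) \right)}
```

The source-channel model is the special case of the log-linear model with two features, $\log P(e)$ and $\log P(f \mid e)$, each weighted by 1.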

     

  • Figure 1. Source-channel model

    Figure 2. Process of obtaining the translation model

    Figure 3. Alignment matrix

    Figure 4. Word reordering model

    Figure 5. Translation examples

    Table 1. Phrase extraction table

    Table 2. Distribution of the corpus by domain (%)

    Domain                                      Share/%
    Everyday expressions and phrase dialogues   45.67
    Government and legal documents              20.00
    Literature                                  34.33

    Table 3. Results without considering phrase length

    Dependency value threshold   Phrase translation probability/%   BLEU
    0.0                          100                                0.2009
    0.6                          88                                 0.2098
    0.8                          86                                 0.2177
    1.5                          81                                 0.2213
    1.7                          79                                 0.2273
    2.1                          73                                 0.2265
    3.2                          52                                 0.2010
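The trend in Table 3 can be checked with a few lines of Python. This is only a reading aid: the numbers are copied from the table, while the function names and the choice of the threshold 0.0 row as the baseline are assumptions made for illustration, not something this page states.

```python
# Rows of Table 3:
# (dependency value threshold, phrase translation probability in %, BLEU)
TABLE_3 = [
    (0.0, 100, 0.2009),
    (0.6, 88, 0.2098),
    (0.8, 86, 0.2177),
    (1.5, 81, 0.2213),
    (1.7, 79, 0.2273),
    (2.1, 73, 0.2265),
    (3.2, 52, 0.2010),
]


def best_row(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda row: row[2])


def relative_gain(baseline, score):
    """Relative change of `score` over `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


# Assumption: the threshold 0.0 row is treated as the baseline.
baseline_bleu = TABLE_3[0][2]
threshold, probability, bleu = best_row(TABLE_3)

print(f"best threshold: {threshold}, BLEU: {bleu:.4f}")
print(f"absolute gain over baseline: {bleu - baseline_bleu:.4f}")
print(f"relative gain over baseline: {relative_gain(baseline_bleu, bleu):.1f}%")
```

Under that assumption, BLEU peaks at a threshold of 1.7 and falls back to roughly the baseline value at 3.2, so in this table a stricter threshold helps only up to a point.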

    Table 4. Results when phrase length is considered

    Phrase length                2        3        4        5        6        7
    Dependency value threshold   1.3      1.5      4.8      3.1      6.9      13.9
    BLEU                         0.2189   0.2208   0.2120   0.2010   0.2190   0.2048
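A similar sketch for Table 4 picks out the phrase length with the highest BLEU and compares it with the best score in Table 3 (0.2273). The values are copied from the two tables; the helper names are hypothetical, and this page does not say whether the two tables come from directly comparable runs, so the comparison is only indicative.

```python
# Table 4: phrase length -> (dependency value threshold, BLEU)
TABLE_4 = {
    2: (1.3, 0.2189),
    3: (1.5, 0.2208),
    4: (4.8, 0.2120),
    5: (3.1, 0.2010),
    6: (6.9, 0.2190),
    7: (13.9, 0.2048),
}

# Highest BLEU reported in Table 3 (threshold 1.7, phrase length ignored).
BEST_BLEU_TABLE_3 = 0.2273


def best_length(table):
    """Return the phrase length whose BLEU score is highest."""
    return max(table, key=lambda length: table[length][1])


def lengths_beating(table, reference_bleu):
    """Return the phrase lengths whose BLEU exceeds `reference_bleu`."""
    return sorted(
        length for length, (_, bleu) in table.items() if bleu > reference_bleu
    )


length = best_length(TABLE_4)
threshold, bleu = TABLE_4[length]
print(f"best phrase length: {length} (threshold {threshold}, BLEU {bleu:.4f})")
print("lengths above the Table 3 best:", lengths_beating(TABLE_4, BEST_BLEU_TABLE_3))
```

On these figures, phrase length 3 gives the highest BLEU in Table 4, and no per-length setting exceeds the best score in Table 3.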
Publication history
  • Received: 2016-07-16
  • Published online: 2022-09-09
  • Issue date: 2017-01-01
