This paper proposes an approach for improving the quality of a statistical machine translation (SMT) system by analyzing the effect of semantic noise parameters on the corpus, leading to the selection of a more informative corpus. In some applications, such as translation systems running on mobile devices, computational resources are limited, so a compact, efficient, and informative corpus is desirable; the resulting optimized corpus then enhances the performance of the translation system. Extensive data cleaning has been carried out to reduce the impact of semantic noise. Experimental results indicate that the proposed strategies are effective. The work is motivated by our attempts to understand the factors that affect corpus quality for statistical machine translation, especially for English–Hindi systems.
Maheshwari, S. (2018). A study on effect of semantic noise parameters on corpus for English–Hindi statistical machine translation. In Advances in Intelligent Systems and Computing (Vol. 696, pp. 525–534). Springer Verlag. https://doi.org/10.1007/978-981-10-7386-1_45