Printed Romanian modelling: A corpus linguistics based study with orthography and punctuation marks included

Adriana Vlad; Adrian Mitrea; Mihai Mitrea

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2007) 4705 LNCS(PART 1) 409-423

DOI: 10.1007/978-3-540-74472-6_33

Abstract

This paper is part of a larger study by the authors describing printed Romanian as an information source. Here, the statistical investigation seeks a mathematical model of the language when orthography and punctuation marks are included in the alphabet. To obtain an accurate result, the authors processed information from multiple data sets sampled from a linguistic corpus, using the following statistical inferences: probability estimation with multiple confidence intervals, a test of the hypothesis that the probability belongs to an interval, and a test of the equality of two probabilities. The probability of a type II statistical error in the tests was taken into account. The experimental results, which are new for printed Romanian, concern the letter, digram, and trigram statistical structure of a corpus of 93 books (about 50 million characters). © Springer-Verlag Berlin Heidelberg 2007.
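To make the kinds of inference named in the abstract concrete, here is a minimal Python sketch. It is not the authors' procedure: the function names, the toy sample strings, and the choice of a normal-approximation (Wald) interval and a pooled two-proportion z statistic are all assumptions made for illustration. It also treats character occurrences as independent observations, which adjacent characters in natural text are not, so it only shows the general shape of such an analysis.

```python
import math
from collections import Counter


def ngram_counts(text, n):
    """Count overlapping n-grams; spaces and punctuation are kept as symbols."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def wald_interval(k, total, z=1.96):
    """Normal-approximation (Wald) confidence interval for the proportion k/total."""
    p = k / total
    half = z * math.sqrt(p * (1 - p) / total)
    return max(0.0, p - half), min(1.0, p + half)


def two_proportion_z(k1, n1, k2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion."""
    pooled = (k1 + k2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (k1 / n1 - k2 / n2) / se if se > 0 else 0.0


# Hypothetical toy samples; a real study would use large corpus samples.
sample_a = "Ana are mere, iar Maria are pere."
sample_b = "Mara are pere; Ana nu are mere."

counts_a = ngram_counts(sample_a.lower(), 1)
counts_b = ngram_counts(sample_b.lower(), 1)
n_a, n_b = sum(counts_a.values()), sum(counts_b.values())

low, high = wald_interval(counts_a["a"], n_a)
z = two_proportion_z(counts_a["a"], n_a, counts_b["a"], n_b)

print(f"p('a') in sample A: {counts_a['a'] / n_a:.3f}, interval [{low:.3f}, {high:.3f}]")
print(f"z statistic for equal p('a') in both samples: {z:.3f}")
print("most common digrams in sample A:", ngram_counts(sample_a.lower(), 2).most_common(3))
```

Passing `n=2` or `n=3` to `ngram_counts` gives digram or trigram counts over the same extended alphabet, since punctuation marks and spaces are counted like any other symbol.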

Citation (APA)

Vlad, A., Mitrea, A., & Mitrea, M. (2007). Printed Romanian modelling: A corpus linguistics based study with orthography and punctuation marks included. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4705 LNCS, pp. 409–423). Springer Verlag. https://doi.org/10.1007/978-3-540-74472-6_33
