Syntactic dependency-based n-grams as classification features

Grigori Sidorov; Francisco Velasquez; Efstathios Stamatatos; Alexander Gelbukh; Liliana Chanona-Hernández

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2013) 7630 LNAI(PART 2) 1-11

DOI: 10.1007/978-3-642-37798-3_1

69 Citations

82 Readers

Abstract

In this paper we introduce the concept of syntactic n-grams (sn-grams). Sn-grams differ from traditional n-grams in how neighboring elements are determined. In the case of sn-grams, neighbors are found by following syntactic relations in syntactic trees rather than by taking the words in the order in which they appear in the text. Dependency trees fit this idea directly, while constituency trees require some simple additional steps. Sn-grams can be applied in any NLP task where traditional n-grams are used. We describe how sn-grams were applied to authorship attribution, using an SVM classifier with several profile sizes. As baselines, we used traditional n-grams of words, POS tags, and characters. The results obtained with sn-grams are better than those of the baselines. © 2013 Springer-Verlag.
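The difference between the two notions of "neighbor" can be illustrated with a minimal Python sketch. It is not the authors' implementation: the example sentence, its hand-assigned head indices, and the function names `sn_grams` and `word_n_grams` are assumptions made for illustration, and only continuous head-to-dependent paths in a dependency tree are enumerated.

```python
def sn_grams(tokens, heads, n):
    """Collect word sn-grams of length n by following dependency arcs.

    heads[i] is the index of the syntactic head of tokens[i], or None
    for the root. Each sn-gram is a tuple ordered from head to dependent.
    """
    grams = []
    for i in range(len(tokens)):
        # Walk upward from token i until the path holds n tokens
        # or the root is reached.
        path = [i]
        while len(path) < n and heads[path[-1]] is not None:
            path.append(heads[path[-1]])
        if len(path) == n:
            grams.append(tuple(tokens[j] for j in reversed(path)))
    return grams


def word_n_grams(tokens, n):
    """Traditional n-grams: neighbors are adjacent words in the text."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sentence with a hand-written dependency analysis
# (an assumption for this sketch, not data from the paper).
tokens = ["the", "dog", "chased", "a", "cat"]
heads = [1, 2, None, 4, 2]  # "chased" is the root

syntactic_bigrams = sn_grams(tokens, heads, 2)
surface_bigrams = word_n_grams(tokens, 2)
```

In this toy sentence the pair ("chased", "cat") is an sn-gram because "cat" depends on "chased", yet it never occurs as a surface bigram, since "a" intervenes in the text. Counts of such tuples could then serve as features for a classifier in the same way counts of traditional n-grams do.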

Citation (APA)

Sidorov, G., Velasquez, F., Stamatatos, E., Gelbukh, A., & Chanona-Hernández, L. (2013). Syntactic dependency-based n-grams as classification features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7630 LNAI, pp. 1–11). https://doi.org/10.1007/978-3-642-37798-3_1
