Abstract
We have created two sets of labels for the poems of Hafez (1315-1390) using unsupervised learning. Our labels are the only semantic clustering alternative to the previously existing, hand-labeled, gold-standard classification of Hafez's poems for use in literary research. We have cross-referenced, measured, and analyzed the agreement of our clustering labels with Houman's chronological classes. Our features are based on topic modeling and word embeddings. We also introduce "similarity of similarities" features, an approach we call homothetic clustering, which proved effective for Hafez's small corpus of ghazals. Although all our experiments produced clusters that differ from Houman's classes, we think they are valid in their own right, provide further insights, and have proved useful as a contrasting alternative to Houman's classes. Our homothetic clusterer and its feature design and engineering framework can be used for further semantic analysis of Hafez's poetry and in other similar literary research.
Cite
Rahgozar, A., & Inkpen, D. (2019). Semantics and homothetic clustering of Hafez poetry. In LaTeCH@NAACL-HLT 2019 - 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, Proceedings (pp. 82–90). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w19-2511