Abstract
Thinking is a self-organized dynamical process and, as such, interesting to characterize. However, direct, real-time access to thought at the semantic level is still very limited; the best that can be done is to look at spoken or written expression. The question we address in this research is the following: Is there a characteristic pitch of thought? To begin answering this complex question, we look at text documents from several large corpora at the sentence level (i.e., using sentences as the units of meaning) and consider each document to be the result of a random process in semantic space. Given a large corpus of multi-sentence documents, we build a lexical association network representing associations between words in the corpus. This network is used to induce a semantic similarity metric between sentences, and each document is segmented into multi-sentence semantically coherent blocks (SCBs) with occasional connecting text between the blocks. Based on this segmentation, the process of document generation is modeled as a sticky Markov chain at the sentence level. We show that most documents, in every corpus studied, are sequences of blocks with a very consistent mean length of 6.4 sentences. This consistency suggests that a mean of 6-7 sentences may be typical for single coherent thoughts in texts. We also describe several ways of visualizing the semantic structure of documents in space and time.
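To make the "sticky" Markov chain idea concrete, here is a minimal sketch under assumptions that are ours, not the paper's. It assumes each sentence stays in the current block with a fixed probability `stay_prob`, so that block lengths are geometrically distributed with mean 1 / (1 - stay_prob). It ignores the connecting text between blocks and the similarity-based segmentation entirely. The names `mean_block_length`, `simulate_block_lengths`, and `STAY_PROB` are hypothetical.

```python
import random


def mean_block_length(stay_prob: float) -> float:
    """Expected block length if each sentence stays in the block with stay_prob."""
    return 1.0 / (1.0 - stay_prob)


def simulate_block_lengths(stay_prob: float, n_sentences: int, seed: int = 0) -> list[int]:
    """Simulate a two-outcome sticky chain and return the resulting block lengths.

    Assumption (not from the paper): the stay probability is constant, so
    block lengths follow a geometric distribution.
    """
    rng = random.Random(seed)
    lengths = []
    current = 1  # the first sentence opens the first block
    for _ in range(n_sentences - 1):
        if rng.random() < stay_prob:
            current += 1  # next sentence continues the current block
        else:
            lengths.append(current)  # block ends, a new one starts
            current = 1
    lengths.append(current)  # close the final block
    return lengths


# Stay probability that would yield the reported mean of 6.4 sentences
# under the constant-probability assumption above.
STAY_PROB = 1.0 - 1.0 / 6.4

lengths = simulate_block_lengths(STAY_PROB, n_sentences=200_000)
empirical_mean = sum(lengths) / len(lengths)
print(f"stay probability: {STAY_PROB:.5f}")
print(f"theoretical mean block length: {mean_block_length(STAY_PROB):.2f}")
print(f"simulated mean block length: {empirical_mean:.2f}")
```

Under this simplification, a mean block length of 6.4 sentences corresponds to a stay probability of about 0.84. The model in the paper may differ, for example by using state-dependent transition probabilities.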
Cite
Mei, M., Ren, Z., & Minai, A. A. (2018). Mining the Temporal Structure of Thought from Text. In Springer Proceedings in Complexity (pp. 291–298). Springer. https://doi.org/10.1007/978-3-319-96661-8_31