
Visualisation of Learners’ Contributions in Chat Conversations

by S Trausan-Matu, T Rebedea, A Dragan, C Alexandru
Blended Learning (2007)

Abstract

This paper presents a novel dialogistic, socio-cultural perspective and an associated software tool, which provide structured means of visualising and analysing Computer Supported Collaborative Learning chat conversations. The implemented tools use knowledge-based techniques and are based on Bakhtin’s dialogistic paradigm. They visualize the threading of topics and utterances in the conversation and the contributions of the participants in collaborative learning during instant messenger chats. Natural language processing based on the WordNet lexical ontology and semantic distances is used for detecting topics in the chat and their threading. The experiments with the developed application were performed with students in a course on Human-Computer Interaction at the Politehnica University of Bucharest.

Joseph Fong, Fu Lee Wang (Eds): Blended Learning, pp. 217-226, Pearson, 2007.
Workshop on Blended Learning 2007, Edinburgh, United Kingdom.
Stefan Trausan-Matu1,2, Traian Rebedea1,
Alexandru Dragan1, and Catalin Alexandru1

1 “Politehnica” University of Bucharest,
Department of Computer Science and Engineering,
Splaiul Independentei nr. 313, Bucharest, Romania
2Research Institute for Artificial Intelligence of the Romanian Academy
Calea 13 Septembrie nr.13, Bucharest, Romania
trausan@cs.pub.ro, traian@createit.ro, alexd18@yahoo.com, cata@ew.ro
Keywords: Computer Supported Collaborative Learning, Dialogism, Chat
Conversations, Ontologies, Natural Language Processing
1 Introduction
In recent years, with the omnipresence of the Internet and the
increasing number of collaborative tools such as discussion forums and instant
messaging (chat conversations), Computer Supported Collaborative Learning (CSCL)
has become an expanding and promising way of learning on the Internet, which could
supplement traditional classroom learning. However, CSCL needs particular
supporting tools, for example for tackling and reviewing complex dialog threads in
collaborative learning in small groups using chat conversations. The paper describes
such a tool, which offers visualisation means to analyse the threading of dialog topics
and the contribution of each learner in a chat conversation.
A theoretical background for developing tools for supporting CSCL using chat
conversations is the socio-cultural paradigm, stating that knowledge is socially built
in communities [7] (including the case of small groups of students that learn together).
This new paradigm is gaining ground not only due to technological advances but also
because the individual cognition perspective of classical artificial intelligence (stating
that knowledge should be considered as being in the mind of individuals) did not
fulfil all its expectations [4,14]. However, knowledge-based technology, combined
with natural language processing, has some important applications (e.g. in text mining)
and its potential should not be discarded. Therefore, the approach
presented in this paper integrates both the knowledge-based (ontology-based),
cognitive paradigm and the socio-cultural one.
Learning paradigms have also changed in a similar way, from Computer-Assisted
Instruction and Intelligent Tutoring Systems to Computer-Supported Collaborative
Learning (CSCL) [4,7]. As a consequence, learning is now conceived as discourse
building, as Sfard remarked: “rather than speaking about ‘acquisition of knowledge,’
many people prefer to view learning as becoming a participant in a certain discourse”
[6]. The way learning is considered has implications on the nature of the computer
tools designed to support it. For example, the tools described in this paper, which
offer the possibility of visualising the discourse in chat conversations, are based on
Bakhtin’s dialogistic theory [1,2], which may be seen as extending Vygotsky’s socio-
cultural ideas [12]. Knowledge-based processing techniques and the lexical ontology
WordNet (http://wordnet.princeton.edu) are used for the identification, delimitation
and visualisation of the inter-animation of the voices of the learners. In addition, an
assessment of the competence of each learner is provided.
There are chat environments for CSCL containing facilities like whiteboards and
explicit referencing. Such an environment is ConcertChat [3], used in this paper.
There are also applications that use natural language processing for abstracting (e.g.
speech acts identification [9] and summarization [10]) or knowledge extraction from
chats and forums. However, these facilities are limited, and one assumption of the
research whose results are presented here is that the limitations are due to
neglect of the socio-cultural paradigm.
The experiments for validating the developed application were performed with
final-year students at the Computer Science Department of the Bucharest
Politehnica University, in a course on Human-Computer Interaction. The chat
conversations were held using ConcertChat.
The paper continues with a section introducing the socio-cultural and Bakhtin’s
dialogism paradigms. The third section discusses the knowledge-based ideas used in
the present approach. The next section contains the description of the visualisation
tools. The paper ends with conclusions and references.
2 A Dialogical, Socio-Cultural Paradigm of Learning
The socio-cultural paradigm is based on the work of the Russian psychologist Lev
Vygotsky, who emphasized the role of socially established artefacts in
communication and learning [12]. Mikhail Mikhailovich Bakhtin added considerable
detail to the ideas of Vygotsky, analysing the role of language and discourse, and
especially of speech and dialog. Bakhtin focuses on the idea of dialogism, making it a
fundamental philosophical category, named dialogistic: “… Any true understanding is
dialogic in nature.” [11]. Moreover, Lotman considers text as a “thinking device” [13],
which is consistent with the observation that: “The semantic structure of an
internally persuasive discourse is not
finite, it is open; in each of the new contexts that dialogize it, this discourse is able to
reveal ever new ways to mean” [2].
In forums and chat conversations, group knowledge arises in discourse and is
preserved in linguistic artifacts, whose meaning is co-constructed within group
processes [5], and has a dual nature. Communities of voices, in parallel to the trend
towards unity, have an additional differential, unmerged, character: “The intersection,
consonance, or interference of speeches in the overt dialog with the speeches in the
heroes’ interior dialogs are everywhere present. The specific totality of ideas,
thoughts and words is everywhere passed through several unmerged voices, taking on
a different sound in each” [1]. This dual nature of community and individuality of
voices is also expressed by Bakhtin through the concept of polyphony, which he considers the
invention and one of the main merits of Dostoevsky’s novels [1]. The relation of
discourse and communities to music was also remarked by Tannen: “Dialogue
combine with repetition to create rhythm. Dialogue is liminal between repetitions and
images: like repetition is strongly sonorous” [8].
In chat conversations, different voices are easily recognized. However, starting
from Bakhtin’s ideas, in our approach the concept of voice is not limited to the
participants in the chat. A voice is, from this perspective, something said
by a participant at a given moment that may be reflected in many subsequent
utterances. Also, each utterance may contain an unlimited number of voices.
3 Knowledge-Based Text Processing
Ontologies like WordNet or FrameNet (http://framenet.icsi.berkeley.edu) are very
successful inheritors of knowledge representation research in artificial intelligence.
They are semantic networks or frame structures built starting from human experience
and, in fact, they are ways of sharing experience. Any collaboration using natural
language, any discourse needs to start from a common vocabulary, a shared ontology.
The word “ontology” is used in philosophy to denote the theory about what is
considered to exist. Any system in philosophy starts from an ontology, that is,
from the identification of the concepts and relations considered as fundamental.
Ontologies capture fundamental categories, concepts, their properties and relations.
One very important relation among concepts is the taxonomic one, from a more
general to a more specific concept. This relation may be used as a way of “inheriting”
properties from the more general concepts (“hypernyms”). Other important relations
are “part-whole” (“meronym”), “synonym”, “antonym”.
Viewing knowledge bases as ontologies brings important advantages for
developers of knowledge-based systems. First of all, an ontology is developed as a
coherent framework for describing reality and therefore it facilitates knowledge acquisition
and machine learning. A new concept is easy to add to such a framework by finding
one or more general concepts and defining the differences between the new
concept and the more general ones.
Ontologies are very important in text mining. For this kind of application they
offer the substrate for semantic analysis and, very importantly, the possibility of
defining a measure of semantic closeness, based on the graph with concepts from
ontologies as nodes and their relations as arcs. This semantic closeness is very
important in text analysis, for example in the retrieval of texts that do not contain a
given word but do contain a synonym or a semantically related word.
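As an illustration of such a graph-based measure, the sketch below computes a closeness value from the shortest path between two concepts. The concept graph is a tiny hand-made stand-in for an ontology, and the formula 1 / (1 + path length) is only one common choice; neither is taken from the application described in this paper.

```python
from collections import deque

# Tiny hand-made concept graph (hypothetical data, not WordNet).
# Arcs are treated as undirected relations between concepts.
ARCS = [
    ("communication", "message"),
    ("message", "email"),
    ("message", "chat"),
    ("communication", "document"),
    ("document", "wiki"),
]

def build_graph(arcs):
    """Build an adjacency map with concepts as nodes and relations as arcs."""
    graph = {}
    for a, b in arcs:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def path_length(graph, start, goal):
    """Number of arcs on the shortest path, or None if there is no path."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node] - seen:
            seen.add(neighbour)
            queue.append((neighbour, dist + 1))
    return None

def closeness(graph, a, b):
    """Assumed measure: 1 / (1 + shortest path length), 0.0 if unconnected."""
    dist = path_length(graph, a, b)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)

graph = build_graph(ARCS)
```

With this toy graph, "email" and "chat" are two arcs apart and therefore closer than "email" and "wiki", which are four arcs apart.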
4 Visualization of Users’ Competences
The approach presented here integrates Bakhtin’s socio-cultural ideas with
knowledge-based natural language processing for the visualisation of the
contributions of each learner. The procedure consists of the identification of the topics
discussed in the chat, the separation of the contributions of each participant to a topic
(the voices) and, finally, the measurement and visualisation of these contributions.
4.1 Identification of Chat Topics
The chat topics are identified in several ways in the present approach. A first method
is to detect the list of concepts (words) that appear most frequently in the
conversation, by using statistical natural language processing methods. Accordingly,
the importance of a subject is considered related to its frequency in the chat. The first
step in finding the chat subjects is to strip the text of irrelevant words (stop-words),
text emoticons (e.g. “:)”, “:D”, and “:P”), special abbreviations used while chatting
(e.g. “brb”, “np”, and “thx”) and other words considered of no use at this stage.
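The filtering and counting step can be sketched as follows. The stop-word, emoticon and abbreviation lists are short illustrative samples, and the function name candidate_concepts is hypothetical; the lists actually used by the application are not given here.

```python
import re
from collections import Counter

# Illustrative lists only; the application's real lists are not enumerated.
STOP_WORDS = {"the", "a", "an", "is", "are", "i", "we", "for", "and",
              "to", "of", "be", "about", "think", "but"}
EMOTICONS = {":)", ":D", ":P"}
CHAT_ABBREVIATIONS = {"brb", "np", "thx"}

def candidate_concepts(utterances):
    """Count the words that remain after stripping noise from each utterance."""
    counts = Counter()
    for text in utterances:
        for raw in text.split():
            # Emoticons are checked before lowercasing, since ":D" != ":d".
            if raw in EMOTICONS:
                continue
            word = re.sub(r"[^a-z0-9'-]", "", raw.lower())
            if not word or word in STOP_WORDS or word in CHAT_ABBREVIATIONS:
                continue
            counts[word] += 1
    return counts

chat = [
    "I think wikis are the best :)",
    "brb",
    "wikis and forums are useful for learning",
    "thx, forums are ok but wikis are better :P",
]
counts = candidate_concepts(chat)
most_frequent = counts.most_common(1)
```

In this small sample the most frequent remaining word is "wikis", which would make it the strongest topic candidate under the frequency criterion.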


Fig. 1. A fragment of a chat for a Human-Computer Interaction course, using the ConcertChat
referencing facilities
The resulting chat text is then tokenised and each distinct word is considered as a
candidate concept in the analysis. For each of these candidates, WordNet is used for
finding synonyms. If a concept is not found in WordNet, possible misspellings are
searched for. If a suggestion is found, the synonyms of the suggested word are retrieved. If no suggestions
are found, the word is considered as being specific to the analyzed chat and the user is
asked for details. The last stage for identifying the chat subjects consists of unifying
the candidate concepts discovered in the chat.
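The lookup, misspelling fallback and unification steps might look like the sketch below. A small dictionary stands in for WordNet, and Python's difflib is used as the spelling suggester; both are assumptions made for the sake of a self-contained example, and the names resolve and unify are hypothetical.

```python
import difflib

# Toy stand-in for WordNet: each known word maps to a synset identifier.
TOY_SYNSETS = {
    "email": "email.n", "e-mail": "email.n", "mail": "email.n",
    "forum": "forum.n", "board": "forum.n",
    "wiki": "wiki.n",
}

def resolve(word):
    """Return (synset_id, status) for a candidate word."""
    if word in TOY_SYNSETS:
        return TOY_SYNSETS[word], "found"
    # Not in the ontology: look for a likely misspelling of a known word.
    suggestions = difflib.get_close_matches(word, list(TOY_SYNSETS), n=1, cutoff=0.8)
    if suggestions:
        return TOY_SYNSETS[suggestions[0]], "corrected"
    # No suggestion: treat the word as chat-specific and ask the user.
    return None, "ask_user"

def unify(counts):
    """Merge the frequencies of candidate words that share a synset."""
    merged, unknown = {}, []
    for word, freq in counts.items():
        synset, _status = resolve(word)
        if synset is None:
            unknown.append(word)
        else:
            merged[synset] = merged.get(synset, 0) + freq
    return merged, unknown

merged, unknown = unify({"email": 3, "mail": 2, "forrum": 1, "concertchat": 4})
```

Here "email" and "mail" are merged into one concept, the misspelt "forrum" is mapped to "forum", and "concertchat" is left in the list of words for which the user would be asked for details.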
In addition to the above method for determining the chat topics, a surface analysis
technique is used. Observing that new topics are generally introduced into a
conversation using some standard expressions such as “let’s talk about email” or
“what about wikis”, a simple and efficient method is used for deducing the topics in a
conversation by searching for the moment when they are first mentioned.
The process of identifying a pattern in an utterance is done using the synset for
each word that has already been extracted from WordNet. This technique will be
improved in a future version of the application by using machine-learning methods
for detecting the patterns specific to the introduction of new topics. Another option is
to consider the extension of the simple patterns described above to more complicated
parsing rules.
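A minimal sketch of the surface technique is given below, using regular expressions for the two example expressions quoted above. It matches literal text only, whereas the application works on the synsets of the words; the pattern list and the function name first_mentions are hypothetical.

```python
import re

# Hypothetical cue patterns built from the two expressions quoted in the text.
TOPIC_PATTERNS = [
    re.compile(r"\blet'?s talk about (?P<topic>\w+)", re.IGNORECASE),
    re.compile(r"\bwhat about (?P<topic>\w+)", re.IGNORECASE),
]

def first_mentions(utterances):
    """Map each introduced topic to the index of the utterance introducing it."""
    introduced = {}
    for index, text in enumerate(utterances):
        text = text.replace("\u2019", "'")  # normalise typographic apostrophes
        for pattern in TOPIC_PATTERNS:
            match = pattern.search(text)
            if match:
                # setdefault keeps only the first mention of each topic.
                introduced.setdefault(match.group("topic").lower(), index)
    return introduced

chat = [
    "hello",
    "Let's talk about email",
    "ok",
    "what about wikis?",
    "let's talk about email again",
]
topics = first_mentions(chat)
```

For this sample, "email" is recorded as introduced by the second utterance and "wikis" by the fourth; the later repetition of "email" does not change its first mention.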
4.2 The Graphical Representation of the Conversation
The graphical representation of the chat was designed to permit the best visualization
of the conversation, to facilitate an analysis based on the polyphony theory of Bakhtin,
and to make the chat elements as easy to follow as possible. For each
participant in the chat, there is a separate horizontal line in the representation and each
utterance is placed on the line corresponding to the issuer of that utterance, taking into
account its position in the original chat file, with the timeline as the horizontal
axis. Each utterance is represented as a rectangle aligned according to the issuer on
the vertical axis and having a horizontal length that is proportional to the
size of the utterance. The distance between two utterances is
proportional to the time elapsed between them. Of course, there is a
minimum and a maximum dimension for each measure in order to restrict anomalies
that could appear in the graphical representation due to extreme cases or chat logging
errors.
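The placement rules can be sketched as a small layout routine. All scale factors and limits below are invented defaults, and the Utterance record and layout function are hypothetical names; the sketch only shows proportional sizing with clamping to a minimum and a maximum.

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str
    start: float  # seconds since the beginning of the chat
    text: str

def clamp(value, low, high):
    """Restrict value to the interval [low, high]."""
    return max(low, min(high, value))

def layout(utterances, px_per_char=2.0, px_per_second=1.5,
           min_width=10.0, max_width=200.0,
           min_gap=5.0, max_gap=80.0, row_height=40.0):
    """Return one (x, y, width) rectangle per utterance, in chat order."""
    rows = {}      # speaker -> row index, one horizontal line per participant
    rects = []
    cursor = 0.0   # right edge of the previously placed rectangle
    last_start = None
    for utt in utterances:
        row = rows.setdefault(utt.speaker, len(rows))
        if last_start is not None:
            # Gap proportional to elapsed time, clamped against long pauses.
            cursor += clamp((utt.start - last_start) * px_per_second, min_gap, max_gap)
        # Width proportional to utterance length, clamped on both sides.
        width = clamp(len(utt.text) * px_per_char, min_width, max_width)
        rects.append((cursor, row * row_height, width))
        cursor += width
        last_start = utt.start
    return rects

chat = [
    Utterance("A", 0, "hi"),
    Utterance("B", 10, "x" * 50),
    Utterance("A", 1000, "ok then"),
]
rects = layout(chat)
```

In the sample, the very short first utterance is widened to the minimum width and the long pause before the third utterance is capped at the maximum gap, which is the kind of anomaly the limits are meant to contain.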
The relationships between utterances are represented using coloured lines that
connect these utterances. The explicit references that are known due to the use of the
ConcertChat software are depicted using blue connecting lines, while the implicit
references that are deduced using the method described in this paper are represented
using red lines. The utterances that introduce a new topic in the conversation are
represented with a red margin.
The graphical representation of the chat has a scaling factor that permits an
attentive observation of the details in a conversation, as well as an overview of the
chat. The different visual elements determined by our application – such as utterances
in the same topic, topic introducing utterances and relationships between topics – can
be turned on and off in the graphical representation by use of checkboxes.
At the bottom of the graphical representation of the conversation, after the line
corresponding to the last participant in the chat, there is a special area that represents
the importance of each utterance, considered as a chat voice, in the conversation (see
figure 2). How this importance is determined is presented in Section 4.4.

Fig. 2. The threads of references in the chat
4.3 Discovering the Implicit Voices
Considering each chat utterance as being a voice that has a certain importance in the
conversation, it follows that each utterance generally contains more than a single
voice, as it includes the current voice and probably at least one referred voice. As we
are working with ConcertChat transcript files, we acknowledge the voices that are
explicitly pointed out by the chat participants during the conversation, using the
software’s referencing tool. Nevertheless, because users are often in a hurry or
simply not attentive enough, some of the utterances do not have any explicit references.
Thus, it is necessary to find a method for discovering the implicit references in an
utterance; in this way, we shall identify more relationships between the utterances in
the chat.
The method proposed here is similar to the one presented above for determining
the introduction of new chat topics. We use another list of patterns, each consisting
of a set of words (expressions) and a local subject called the referred word. If
an utterance matches one of the patterns, we first determine which word
in the utterance is the referred word (e.g. “I don’t agree with your assessment”). Then,
we search for this word in a predetermined number of the most recent previous
utterances. If the word is found in one of these utterances, then we have
discovered an implicit relationship between the two lines, the current utterance
referring to the identified utterance.
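The search for implicit references can be sketched as below. The cue patterns are hypothetical, the referred word in the quoted example is assumed to be "assessment", and the window of five previous utterances is an arbitrary default; the application's actual pattern list is not reproduced here.

```python
import re

# Hypothetical cue expressions; each captures the referred word that follows.
REFERENCE_PATTERNS = [
    re.compile(r"\bi (?:don't )?agree with your (?P<word>\w+)", re.IGNORECASE),
    re.compile(r"\bas you said about (?P<word>\w+)", re.IGNORECASE),
]

def implicit_reference(utterances, current, window=5):
    """Index of the utterance implicitly referred to by utterances[current], or None."""
    referred = None
    for pattern in REFERENCE_PATTERNS:
        match = pattern.search(utterances[current])
        if match:
            referred = match.group("word").lower()
            break
    if referred is None:
        return None
    # Scan at most `window` previous utterances, most recent first.
    word_re = re.compile(rf"\b{re.escape(referred)}\b", re.IGNORECASE)
    for index in range(current - 1, max(current - window, 0) - 1, -1):
        if word_re.search(utterances[index]):
            return index
    return None

chat = [
    "My assessment is that wikis scale badly",
    "Forums are easier to moderate",
    "I don't agree with your assessment",
]
target = implicit_reference(chat, 2)
```

With the sample chat, the third utterance is linked to the first one, because "assessment" occurs there and not in the intervening utterance; with a window of one, no link would be found.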
We have also implemented two empirical methods, which provide very good
results with any chat software. One of these empirical methods is based on
the following observation: if among three utterances there are two explicit relationships,
between the first and the second and between the second and the third, and the second utterance
is a short agreement or disagreement, then there exists an implicit relationship
between the first and the third utterance. For example, consider the following exchange,
where there are explicit references between A and B, and between B and C; there is
clearly an implicit relationship between A and C. The last utterance is
influenced by both A and B:

A – I think wikis are the best
(…)
B – I disagree REF A
(…)
C – Maybe we should talk about them anyway REF B
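The rule illustrated by the A, B, C exchange can be sketched as follows. The list of short replies and the function names are hypothetical; how the application decides that an utterance is a short agreement or disagreement is not specified here.

```python
# Illustrative set of short agreement or disagreement replies.
SHORT_REPLIES = {"i agree", "i disagree", "agreed", "yes", "no", "ok"}

def is_short_reply(text):
    """True if the utterance is only a short agreement or disagreement."""
    return text.strip().lower().rstrip(".!") in SHORT_REPLIES

def bridge_short_replies(texts, explicit_refs):
    """Derive implicit links that skip over short replies.

    texts maps an utterance id to its text; explicit_refs maps an
    utterance id to the id of the utterance it explicitly references.
    """
    implicit = {}
    for third, second in explicit_refs.items():
        first = explicit_refs.get(second)
        if first is not None and is_short_reply(texts[second]):
            implicit[third] = first
    return implicit

texts = {
    "A": "I think wikis are the best",
    "B": "I disagree",
    "C": "Maybe we should talk about them anyway",
}
links = bridge_short_replies(texts, {"B": "A", "C": "B"})
```

For the exchange above, this yields an implicit link from C to A, since B is only a short disagreement standing between them.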
4.4 Determining the Strength Value of an Utterance
Starting from the existing references within the analysed conversations, both the
explicit ones, offered by the chat environment, and the implicit ones, determined by
the program using the previously presented methodology, one can assemble a
conversation graph. This graph may be used both for determining the strength value
of each utterance in the chat, considered as a separate voice, and for
emphasizing certain subjects (threads) of the conversation.
The importance of an utterance in a conversation can be calculated from its
length and the number of key (important) words it contains. Another approach was also
investigated: an utterance is important if it influences the subsequent evolution of the
conversation. Using this definition as a starting point, we may infer that an important
utterance is one that is referenced by as many subsequent
utterances as possible.
Even if this approach could be extended to include the types of subsequent
references (implicit or explicit, agreements or disagreements), in the present case we
have preferred a more simplistic approach, without making allowances for the types
of references to the utterance.
Consequently, the importance of an utterance can be considered as a strength value
of an utterance, where an utterance is strong if it influences the future of the
conversation (such as breaking news in the field of news). When determining the
strength of an utterance, the strength of the utterances which refer to it is used. Thus,
if an utterance is referenced by other utterances which are considered important,
obviously that utterance also becomes important.
As a result, to calculate the importance of every utterance, the graph is
traversed in the opposite direction of the edges, that is, in the reverse
order of the moments at which the utterances were typed. Utterances that are not
referenced by any other utterance (the last utterance of the chat will certainly be one of them)
receive a default importance, taken as the unit. Then, traversing the graph in
the reverse order of references, each utterance receives an importance equal to
the default plus a quota (a fraction below one) of the sum of the importance of the utterances
referring to the current utterance. An alternative would be to calculate 1 plus the
number of utterances that refer to the current utterance, but this choice seemed less
suitable.
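Under these rules the strength of an utterance u can be written as s(u) = 1 + q * (sum of s(v) over all utterances v that refer to u), with 0 < q < 1. The sketch below computes it in a single backward pass; the quota value 0.5 and the function name are assumptions, since the value used in the application is not stated.

```python
def utterance_strengths(count, references, quota=0.5):
    """Compute the strength of each utterance in a chat.

    count: number of utterances, indexed in chat order.
    references: (source, target) pairs, where source refers to the
                earlier utterance target (explicitly or implicitly).
    quota: assumed fraction of the referrers' strength that is passed on.
    """
    referrers = {index: [] for index in range(count)}
    for source, target in references:
        if not target < source:
            raise ValueError("a reference must point to an earlier utterance")
        referrers[target].append(source)

    strength = [1.0] * count  # default importance: the unit
    # Reverse chat order guarantees every referrer is final before it is used.
    for index in range(count - 1, -1, -1):
        strength[index] = 1.0 + quota * sum(strength[s] for s in referrers[index])
    return strength

# Utterance 1 refers to 0; utterances 2 and 3 both refer to 1.
strengths = utterance_strengths(4, [(1, 0), (2, 1), (3, 1)])
```

In this example the unreferenced utterances 2 and 3 keep the default value, utterance 1 gains from its two referrers, and utterance 0 gains from the already increased strength of utterance 1, which is what distinguishes this scheme from simply counting direct references.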
By using this method of calculating the importance of an utterance, the utterances
which have started an important conversation within the chat, as well as those
utterances which begin new topics or mark the passage between topics, are more
easily emphasized. If the explicit relationships were always used and the implicit ones
could be correctly determined in as high a number as possible, then this method of
calculating the importance of a voice would be successful.
4.5 Assessing the Competencies of the Learners in the Conversation
In order to determine the competences of the chat users, we first identify the most
important topics in the analyzed chat conversation. The generated graphics evaluate
the competences of each user, starting from the list of subjects determined as
explained above and using other criteria such as questions, agreement, disagreement
or explicit and implicit referencing. The graphics are generated using a series of
parameters such as: implicit and explicit reference factors, bonuses for agreement,
penalties for disagreement, a minimum value for a chat utterance, and penalty factors for
utterances that agree or disagree with other utterances, as these utterances are less
original than the ones they respond to.


Fig. 3. The evolution of the competence degree
During the first step of the graphics generation, the value of each utterance is
computed relative to an abstract utterance that is built from the most important
concepts in the conversation, determined as described above. When constructing this
utterance, we take into account only the concepts whose frequency of appearance is
above a given threshold. Then, all the utterances in the chat are scored in the interval
0 to 100, by comparing each utterance with the abstract utterance. The comparison is
done using the synsets of each word contained in the utterance. Thus, this process
uses only the horizontal relations from WordNet. An utterance with a score of 0
contains none of the concepts in the abstract utterance and an utterance with a
score of 100 contains all the concepts from the abstract utterance.
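A possible reading of this scoring step is sketched below. The text fixes only the two end points (0 and 100); the linear scale in between, the toy synset table and the function names are assumptions.

```python
def abstract_utterance(concept_counts, threshold):
    """Keep the concepts whose frequency is above the given threshold."""
    return {concept for concept, freq in concept_counts.items() if freq > threshold}

def utterance_score(words, abstract, synset_of):
    """Percentage of abstract-utterance concepts covered by one utterance.

    synset_of maps a word to its concept (a stand-in for a WordNet synset
    lookup); words without an entry are compared as they are.
    """
    if not abstract:
        return 0.0
    covered = {synset_of.get(word, word) for word in words} & abstract
    return 100.0 * len(covered) / len(abstract)

counts = {"wiki": 5, "forum": 4, "email": 1}
abstract = abstract_utterance(counts, threshold=2)
synset_of = {"wikis": "wiki", "board": "forum"}  # toy synonym table

partial = utterance_score(["wikis", "are", "great"], abstract, synset_of)
full = utterance_score(["wikis", "board"], abstract, synset_of)
```

Here the abstract utterance contains "wiki" and "forum"; an utterance mentioning only wikis covers half of it, while one mentioning wikis and a synonym of forum covers all of it.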
On the Ox axis the graphics hold all the utterances in the chat and on the Oy axis
the value attributed to each participant in the conversation, representing each user’s
competence (see figure 3). Accordingly, each utterance modifies the competence value of at least
one user: the user that issued that utterance.
For each utterance in the chat, the values of the users’ competences are modified
using the following rules:
1) the user that issued the current utterance receives the score of the utterance,
possibly downgraded if that utterance is an agreement or disagreement in relation to
a previous utterance (in order to encourage originality);
2) all the users that are mentioned by name in the current utterance are rewarded with a
percentage of the utterance value, on the assumption that being mentioned in the
text of the utterance indicates some merit in its value;
3) the issuer of the utterance explicitly referred to by the current utterance is
rewarded if this utterance is an agreement and is penalized if the utterance is a
disagreement;
4) the issuer of the utterance explicitly referred to by the current utterance that is
not an agreement or a disagreement, will be rewarded with a fraction of the value of
this utterance; and
5) if the current utterance has a score of 0, the issuer will receive a minimum score
in order to differentiate between the users that actually participate in the chat and
those who do not.
All the percentages and all the other factors used for computing the competence of
each user are parameters of the process and can be easily modified in the
application interface. The process described above builds a competence function
graphic for each participant in the chat. At the start of the process, each user has a
competence of zero. It should be mentioned that the competence of a user is not a strictly
increasing function, as users are penalized for utterances that are in disagreement with
the other users’ opinions.
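The five update rules can be sketched as a single function applied to each utterance in turn. Every numeric value below is an invented default standing in for the configurable parameters, and the names Params and update are hypothetical; the penalty is applied as stated in rule 3, to the issuer of the referred utterance.

```python
from dataclasses import dataclass

@dataclass
class Params:
    """Illustrative parameter values; the application's defaults are not given."""
    reply_factor: float = 0.5          # rule 1: downgrade for (dis)agreements
    mention_share: float = 0.1         # rule 2: share for mentioned users
    agreement_bonus: float = 5.0       # rule 3: reward on agreement
    disagreement_penalty: float = 5.0  # rule 3: penalty on disagreement
    reference_share: float = 0.2       # rule 4: share for a plain reference
    minimum_score: float = 1.0         # rule 5: floor for zero-score utterances

def update(competence, issuer, score, kind, mentioned, referred_issuer, p):
    """Apply the five rules for one utterance.

    kind is 'agree', 'disagree' or 'other'; referred_issuer is the author of
    the explicitly referenced utterance, or None if there is no reference.
    """
    own = score * p.reply_factor if kind in ("agree", "disagree") else score  # rule 1
    if score == 0:
        own = p.minimum_score                                                 # rule 5
    competence[issuer] = competence.get(issuer, 0.0) + own
    for user in mentioned:                                                    # rule 2
        competence[user] = competence.get(user, 0.0) + score * p.mention_share
    if referred_issuer is not None:
        if kind == "agree":                                                   # rule 3
            delta = p.agreement_bonus
        elif kind == "disagree":
            delta = -p.disagreement_penalty
        else:                                                                 # rule 4
            delta = score * p.reference_share
        competence[referred_issuer] = competence.get(referred_issuer, 0.0) + delta
    return competence

params = Params()
competence = {}
update(competence, "ana", 60.0, "other", [], None, params)
update(competence, "bob", 40.0, "disagree", [], "ana", params)
update(competence, "cid", 0.0, "agree", ["bob"], "bob", params)
```

After these three utterances the competence of "ana" has dropped from its initial gain because "bob" disagreed with her, which shows why the competence function is not strictly increasing.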
5 Conclusions
The paper presents an application that visualizes the voices (following Bakhtin’s ideas)
of the participants in forums or chat conversations, similarly to music scores. In
addition, some other diagrammatic representations are used for viewing the influence
of a given speaker’s voice.
The application may be used for inspecting what is going on and to what degree
learners are involved in a forum discussion or a chat conversation. Moreover, the
competence of each participant may be measured, which means that learners may be
assessed in collaborative learning on the web.
The application uses the WordNet ontology. Knowledge acquisition for concepts
that are not present in this ontology is provided through dialogs with the user of the
analysis system and by caching the results. Natural language technology is used for
the identification of discussion topics, for segmentation and for identifying implicit
references.
Further work will consider more complex semantic distances (than only
synonymy). Machine learning techniques will be used for the identification of
discourse patterns. New rules for the identification of implicit links are now under
development.

Acknowledgments. The authors wish to express their appreciation to the members of
the Virtual Math Teams research project at Drexel University, whose voices are
present in different ways in the paper. The research presented here has been partially
performed under a Fulbright Scholar post-doc grant (awarded to Stefan Trausan-
Matu), the EU-NCIT EU excellence centre and the CNCSIS project K-Teams. Any
opinions, findings, or recommendations expressed are those of the authors and do not
necessarily reflect the views of the sponsors.
References
1. Bakhtin, M.M., Problems of Dostoevsky’s Poetics, Ardis, (1973)
2. Bakhtin, M.M., The Dialogic Imagination: Four Essays, University of Texas Press, (1981)
3. Holmer, T., Kienle, A., Wessner, M. “Explicit Referencing in Learning Chats: Needs and
Acceptance,” in Innovative Approaches for Learning and Knowledge Sharing, First
European Conference on Technology Enhanced Learning, EC-TEL 2006, Nejdl, W.,
Tochtermann, K., (eds.), Lecture Notes in Computer Science, 4227, Springer, (2006) 170-
184
4. Koschmann, T., Toward a Dialogic Theory of Learning: Bakhtin’s Contribution to
Understanding Learning in Settings of Collaboration, in C. Hoadley and J. Roschelle (eds.),
Proceedings of the Computer Support for Collaborative Learning 1999 Conference,
Stanford, Lawrence Erlbaum Associates, (1999)
5. Schegloff, E., Discourse As An Interactional Achievement: Some Uses Of 'Uh huh' And
Other Things That Come Between Sentences, in Tannen, D. (ed.), Georgetown University
Roundtable on Languages and Linguistics 1981; Analyzing Discourse: Text and Talk,
Georgetown University Press, Washington D.C. (1981)
6. Sfard, A., On reform movement and the limits of mathematical discourse, Mathematical
Thinking and Learning, 2(3), (2000) 157-189
7. Stahl, G., Group Cognition: Computer Support for Building Collaborative Knowledge,
MIT Press, (2006)
8. Tannen, D., Talking Voices: Repetition, Dialogue, and Imagery in Conversational
Discourse, Cambridge University Press, (1989)
9. Trausan-Matu, S., Chiru, C., Bogdan, R., Identificarea actelor de vorbire în dialogurile
purtate pe chat, in Stefan Trausan-Matu, Costin Pribeanu (Eds.), Interactiune Om-
Calculator 2004, Editura Printech, Bucuresti, (2004) 206-214.
10. Trauşan-Matu, S., Stahl, G., Sarmiento, J., Polyphonic Support for Collaborative Learning,
in Y.A. Dimitriadis et al. (Eds.): CRIWG 2006, Lecture Notes in Computer Science 4154,
Springer, (2006) 132 – 139
11. Voloshinov, V.N., Marxism and the Philosophy of Language, Seminar Press, New York, (1973)
12. Vygotsky, L., Mind in society, Cambridge, MA: Harvard University Press, (1978)
13. Wertsch, J.V., Voices of the Mind, Harvard University Press, (1991)
14. Winograd, T., Flores, F., Understanding Computers and Cognition, Norwood, N.J.: Ablex,
(1986)
