Visual Execution and Data Visualisation in Natural Language Processing
Computer (1997)
Available from kar.kent.ac.uk
Abstract
We describe GGI, a visual system that allows the user to execute an automatically generated data flow graph containing code modules that perform natural language processing tasks. These code modules operate on text documents. GGI has a suite of text visualisation tools that give the user useful views of the annotation data produced by the modules in the executable graph. GGI forms part of the GATE natural language engineering system.
Peter Rodgers, Robert Gaizauskas, Kevin Humphreys, Hamish Cunningham
Department of Computer Science, University of Sheffield, UK
{peterr,robertg,kwh,hamish}@dcs.shef.ac.uk
1. Introduction
The current relationship between visual languages and natural language processing (NLP) is restricted to translating graphical languages into natural language [1] or to visual representations of text processing languages [13]. We believe that there is a great deal of potential for expressing the execution of NLP systems visually. One reason for this is the modular nature of NLP algorithms, which means that a data flow visual language is a natural way of representing NLP programs. There is also a great need for generic tools that allow the visualisation of data associated with textual documents after they have been analysed by NLP techniques.
This paper concentrates on the visual execution of NLP tasks using data flow techniques, and on visualising the information that results. Specifically, the paper describes GGI, the GATE Graphical Interface. GGI is a tool for visualising the execution and data of programs integrated into GATE [5], a natural language engineering environment that aims to support researchers and developers of NLP systems and applications by supplying facilities for modular reuse of NLP software, management of large text collections, and visualisation of processing results (see Section 2).
While GGI provides a full user interface to GATE, including, for example, support for file management, two aspects of it are of interest here. First, GGI provides an autogenerated, customisable graph for controlling the execution of interdependent NLP modules. Second, GGI provides a class of generic visualisation tools for viewing the complex information computed about texts by NLP modules.
In GATE, execution of all modules is performed in an executable graph that is a simple form of data flow diagram in which the nodes are the modules or functions to be executed and the arcs represent data flows. We call this graph the system graph. The functions that form the nodes have a large computational granularity and are of comparable computational size to the functions seen in, e.g., ConMan [9]. This graph is less computationally expressive than is typically found in visual data flow languages [10, 18], as it contains no looping (iteration) or distributor constructs (by distributor construct we mean that the result of executing an upstream module determines which downstream module is to be executed). However, this simplicity has benefits for the modular system development architecture that GATE aims to supply.
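Because the system graph contains no loops, its modules can always be placed in an order where every module follows the modules it depends on. The following is a minimal sketch of that idea only, not GATE's own scheduler; the module names and the `system_graph` dictionary are hypothetical, and the ordering is delegated to Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical system graph: each module maps to the set of upstream
# modules whose output it consumes (names are illustrative only).
system_graph = {
    "tokeniser": set(),
    "tagger": {"tokeniser"},
    "parser": {"tokeniser", "tagger"},
}


def execution_order(graph):
    """Return one valid run order for an acyclic graph.

    graphlib raises CycleError if the graph contains a loop, which the
    system graph described above never does.
    """
    return list(TopologicalSorter(graph).static_order())


order = execution_order(system_graph)
print(order)
```

With no iteration or distributor constructs, a single pass over such an order is enough to run every module once.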
In particular, it is possible to autogenerate the data flow program (system graph) from the declaratively stated pre- and postconditions that each module in the GATE system must have. The preconditions define the data that must be present before a module can be run; the postconditions define the data that will be present after a module has been run. Together these permit the dynamic construction of the execution graph’s arcs and mean that no ‘hard-coding’ of module connections is required. At run time, actual data flow is mediated by a common database through which all modules intercommunicate. The execution graph conveys the state of this database to the user by colouring modules according to a traffic-light metaphor that indicates their executability.
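The derivation of arcs from pre- and postconditions can be sketched as follows. This is an illustrative reading of the paragraph above, not GATE's implementation: the `Module` class, the module and data type names, and in particular the mapping of colours to states are assumptions, since the exact traffic-light scheme is not specified at this point in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """Hypothetical module declaration: a name plus declared data types."""
    name: str
    pre: frozenset   # data types that must be present before the module runs
    post: frozenset  # data types that will be present after the module runs


def build_arcs(modules):
    """Add an arc producer -> consumer whenever a postcondition of one
    module supplies a precondition of another."""
    arcs = set()
    for producer in modules:
        for consumer in modules:
            if producer is not consumer and producer.post & consumer.pre:
                arcs.add((producer.name, consumer.name))
    return arcs


def colour(module, database):
    """Assumed traffic-light mapping (an illustration, not GGI's scheme)."""
    if module.post <= database:
        return "green"   # results are already in the database
    if module.pre <= database:
        return "amber"   # preconditions met, module can be run
    return "red"         # required data is still missing


modules = [
    Module("tokeniser", frozenset(), frozenset({"token"})),
    Module("tagger", frozenset({"token"}), frozenset({"pos"})),
    Module("parser", frozenset({"token", "pos"}), frozenset({"syntax"})),
]

arcs = build_arcs(modules)
database = {"token"}  # shared database state after only the tokeniser has run
colours = {m.name: colour(m, database) for m in modules}
```

Adding a new module to `modules` changes the arcs without touching any existing declaration, which is the property the autogeneration relies on.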
The autogeneration procedure means that users do not need to take directly into account the other modules in the system (or unknown modules that might be added to the system in the future) when they integrate a new module into GATE. It thus helps to realise GATE’s objective of providing a ‘plug-and-play’ architecture for natural language engineering. The executable graph is described in more detail in Section 3.
The second aspect of GGI we describe below is the set of data visualisation tools it provides. The data produced from the execution of a module can be viewed directly from the system graph. Clicking on the module brings up the list of postcondition data types, i.e., the data that the module has created. Selecting one launches an appropriate results


