Probability and Statistics in Computational Linguistics, a Brief Review

  • Geman S
  • Johnson M

Abstract

1. Introduction. Computational linguistics studies the computational processes involved in language learning, production, and comprehension. Computational linguists believe that the essence of these processes (in humans and machines) is a computational manipulation of information. Computational psycholinguistics studies psychological aspects of human language (e.g., the time course of sentence comprehension) in terms of such computational processes.

Natural language processing is the use of computers for processing natural language text or speech. Machine translation (the automatic translation of text or speech from one language to another) began with the very earliest computers [Kay et al., 1994]. Natural language interfaces permit computers to interact with humans using natural language, e.g., to query databases. Coupled with speech recognition and speech synthesis, these capabilities will become more important with the growing popularity of portable computers that lack keyboards and large display screens. Other applications include spell and grammar checking and document summarization. Applications outside of natural language include compilers, which translate source code into lower-level machine code, and computer vision [Foo, 1974, Foo, 1982].

The notion of a grammar is central to most work in computational linguistics and natural language processing. A grammar is a description of a language; usually it identifies the sentences of the language and provides descriptions of them, e.g., by defining the phrases of a sentence, their inter-relationships, and perhaps also aspects of their meanings. Parsing is the process of recovering a sentence's description from its words, while generation is the process of translating a meaning or some other part of a sentence's description into a grammatical or well-formed sentence. Parsing and generation are major research topics in their own right. Evidently, human use of language involves some kind of parsing and generation process, as do many natural language processing applications. For example, a machine translation program may parse an input language sentence into a (partial) representation of its meaning, and then generate an output language sentence from that representation.

Although the intellectual roots of modern linguistics go back thousands of years, by the 1950s there was considerable interest in applying the then newly developing ideas about finite-state machines and other kinds of automata, both deterministic and stochastic, to natural language. Automata

Cite

Geman, S., & Johnson, M. (2004). Probability and Statistics in Computational Linguistics, a Brief Review (pp. 1–26). https://doi.org/10.1007/978-1-4419-9017-4_1
