What is the subject of Computational Linguistics?

Let’s begin with the practical side: broadly speaking, the field of Computational Linguistics encompasses everything to do with language-related data and its processing by computer programs, and that is a substantial amount. PCs can be found in every office and in most private households nowadays, and they are used mostly for text processing. Much of what is saved, processed and printed with computers today was entered in some natural language, such as German, English, or French.

The task at hand for Computational Linguistics is to develop computer programs that perform certain language-related tasks. They might, for example:

  • support the user in text processing tasks – correction of spelling errors, proof-reading sentences for grammatical correctness, hyphenation
  • search for passages of text, not only on the surface level, but also with respect to the meanings of words
  • support authors in the writing of texts, e. g. by finding the correct or terminologically apt expression
  • aid in the translation of texts into other languages or even translate texts automatically
  • offer interaction in the language of the user, in order to make computers available to a wider audience that isn’t computer literate
  • convert spoken into written language (talk-writer) and vice versa (as in automatic telephone support or braille readers)
  • enhance data, e. g. by indexing literature, making glossaries and content lists, generating abstracts
  • find all sorts of information, ranging from related literature to immediate responses to queries based on large databases
  • and finally, aid in language acquisition, for example by teaching vocabulary

Whether and how these problems can be solved by computer programs depends not only on the state of the art in computer technology, but also on the characteristics of language. Furthermore, the general aim is to find procedures suited to all languages, while it is at the same time necessary to specify at least some of the details for each language separately. Hyphenation software for English will hardly be of any use when applied to German texts, because the rules for dividing words differ between the two languages. This is where Computational Linguistics differs from traditional Computer Science, whose area of expertise is programming in general: Computational Linguistics is interested solely in the language-related parts of a computer program.
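To make the split between a general procedure and language-specific details a little more concrete, here is a minimal sketch in Python. It is only an illustration: the function name unknown_words and the tiny word lists are made up for this example, and a real spelling checker would need full dictionaries and a treatment of inflected forms. The procedure is the same for every language; only the lexicon changes.

```python
import re

# Tiny, purely illustrative word lists (an assumption for this sketch);
# a real system would load a full dictionary for each language.
LEXICONS = {
    "en": {"the", "house", "is", "small"},
    "de": {"das", "haus", "ist", "klein"},
}

def unknown_words(text, language):
    """Return the words of `text` that are not in the lexicon of `language`."""
    lexicon = LEXICONS[language]
    # Split the lower-cased text into runs of letters (umlauts etc. included).
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [word for word in words if word not in lexicon]

print(unknown_words("The house is smal", "en"))   # ['smal']
print(unknown_words("Das Haus ist klein", "de"))  # []
print(unknown_words("Das Haus ist klein", "en"))  # ['das', 'haus', 'ist', 'klein']
```

The last call shows the point made above: the very same program, fed with the data for the wrong language, flags every word of a perfectly correct German sentence.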

The definition of a science, however, is not exhausted by a single field of application: it also comprises a theoretical interest. Computers are automata that manipulate symbols (at the very lowest level, zeros and ones) according to certain rules. Like number systems, languages are systems of symbols, though of course extremely complex ones. It is therefore natural to design computer systems that simulate, at least in part, the way humans use language. That way we can test linguistic hypotheses with the computer. In that respect, Computational Linguistics is a branch of linguistics that employs computer simulation as a methodological tool to advance our knowledge of human language.

This approach opens up a wide range of psychological and philosophical questions, though: the computer is a machine, whereas language is inseparably intertwined with our mind. How far can we push this way of “computing language”? Will computers come to think one day, or does the human mind work like a machine manipulating symbols? What is so fascinating about the computer-aided simulation of language use is in fact finding its limitations. One motivation for studying computational linguistics can be to find out in which ways human communication differs from what computers can achieve. This knowledge is very important if we want to decide how much importance society should attribute to computers.
