csli.util.nlp.lemma
Class DictionaryLemmatiser

java.lang.Object
  extended by csli.util.nlp.Lemmatiser
      extended by csli.util.nlp.lemma.DictionaryLemmatiser
All Implemented Interfaces:
Serializable

public class DictionaryLemmatiser
extends Lemmatiser

Dictionary-based lemmatiser that loads morphological variant lists from a dictionary file into a lookup table. Rule files should be placed in /util/ext; one such file, oald_penn_morph_lists.txt, is derived from the Oxford Advanced Learner's Dictionary of English and uses the Penn Treebank PoS tagset.
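
The lookup-table idea described above can be sketched roughly as follows. This is an illustration only, not the code of this class: the class name MorphTableSketch, its methods, and the three-column line format ("wordForm root posTag") are all assumptions made for the example, and the actual layout of oald_penn_morph_lists.txt may differ.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; NOT the csli.util.nlp implementation.
final class MorphTableSketch {

    // word form -> {root form, PoS tag}
    private final Map<String, String[]> byWord = new HashMap<>();
    // "root/PoS" -> word form
    private final Map<String, String> byLemma = new HashMap<>();

    // ASSUMED format: one whitespace-separated "wordForm root posTag" entry per line.
    // For a real file the lines could come from java.nio.file.Files.readAllLines(path).
    void load(List<String> lines) {
        for (String line : lines) {
            String[] fields = line.trim().split("\\s+");
            if (fields.length != 3) {
                continue; // skip blank or malformed lines
            }
            // Keep the first entry seen for each key.
            byWord.putIfAbsent(fields[0], new String[] {fields[1], fields[2]});
            byLemma.putIfAbsent(fields[1] + "/" + fields[2], fields[0]);
        }
    }

    // Returns {root, PoS} for a word form, or null if the word is not in the table.
    String[] lemmaFor(String word) {
        return byWord.get(word);
    }

    // Returns the word form for a root and PoS tag, or null if there is no entry.
    String wordFor(String root, String pos) {
        return byLemma.get(root + "/" + pos);
    }

    public static void main(String[] args) {
        MorphTableSketch table = new MorphTableSketch();
        table.load(List.of("dogs dog NNS", "ran run VBD"));

        String[] lemma = table.lemmaFor("ran");
        System.out.println(lemma[0] + " " + lemma[1]);      // prints "run VBD"
        System.out.println(table.wordFor("dog", "NNS"));    // prints "dogs"
        System.out.println(table.lemmaFor("xyzzy") == null); // prints "true"
    }
}
```

Keeping two maps makes both directions (word to lemma, lemma to word) a single hash lookup, at the cost of storing each entry twice.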

Author:
mpurver
See Also:
Serialized Form

Constructor Summary
DictionaryLemmatiser()
           
 
Method Summary
 Lemma getLemma(String wordString)
          Produce a lemma (pair of root form & part-of-speech) from a word string.
 String getWord(Lemma lemma)
          Generate a morphological word string from a lemma (pair of root form & part-of-speech).
static void main(String[] args)
           
 
Methods inherited from class csli.util.nlp.Lemmatiser
getLemma, getWord
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DictionaryLemmatiser

public DictionaryLemmatiser()

Method Detail

getLemma

public Lemma getLemma(String wordString)
Description copied from class: Lemmatiser
Produce a lemma (pair of root form & part-of-speech) from a word string. Implementations should return null on error.

Specified by:
getLemma in class Lemmatiser
Parameters:
wordString - the word string to lemmatise
Returns:
the corresponding lemma

getWord

public String getWord(Lemma lemma)
Description copied from class: Lemmatiser
Generate a morphological word string from a lemma (pair of root form & part-of-speech). Implementations should return null on error.

Specified by:
getWord in class Lemmatiser
Parameters:
lemma - the lemma from which to generate
Returns:
the corresponding word string
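
Because getLemma and getWord are both documented as returning null on error, callers need to check the result before using it. One possible way to make that check explicit is to wrap the call in an Optional. The sketch below is a hedged illustration: NullSafeCalls and fakeGetLemma are hypothetical names, and a plain Function stands in for the real method so that the example runs without the csli library on the classpath.

```java
import java.util.Optional;
import java.util.function.Function;

// Illustrative helper; not part of csli.util.nlp.
final class NullSafeCalls {

    // Wraps any "returns null on error" method call in an Optional.
    static <A, B> Optional<B> call(Function<A, B> method, A input) {
        if (input == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(method.apply(input));
    }

    public static void main(String[] args) {
        // HYPOTHETICAL stand-in for a lemmatiser's getLemma method.
        Function<String, String> fakeGetLemma =
                word -> word.equals("dogs") ? "dog/NNS" : null;

        System.out.println(call(fakeGetLemma, "dogs").orElse("<no lemma>"));  // prints "dog/NNS"
        System.out.println(call(fakeGetLemma, "xyzzy").orElse("<no lemma>")); // prints "<no lemma>"
    }
}
```

With a DictionaryLemmatiser instance named lemmatiser available, the same helper could presumably be called as NullSafeCalls.call(lemmatiser::getLemma, word), given the getLemma(String) signature documented above.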

main

public static void main(String[] args)