csli.util.nlp
Class PoSTagger

java.lang.Object
  extended by csli.util.nlp.PoSTagger
Direct Known Subclasses:
MXPOST, QTag, StanfordTagger

public abstract class PoSTagger
extends Object

An abstract part-of-speech tagger.

Author:
mpurver

Constructor Summary
PoSTagger()
           
 
Method Summary
abstract  String getTagSeparator()
          Get the character this tagger uses to separate words from tags.
protected static String join(String[] words)
          Join an array of words together to make a whitespace-separated sentence.
 List<String> tag(List<String> sentences)
          Tag a list of sentences.
 ScoredObject<String> tag(ScoredObject<String> sentence)
          Tag a scored sentence.
abstract  String tag(String sentence)
          Tag a string (possibly a sentence containing multiple words separated by whitespace).
 String[] tag(String[] sentences)
          Tag an array of sentences.
 List<String> tagWords(List<String> words)
          Tag a sentence represented as a list of words in linear order.
 String[] tagWords(String[] words)
          Tag a sentence represented as an array of words in linear order.
protected static void test(PoSTagger tagger, String[] args)
          A convenience method for development testing.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PoSTagger

public PoSTagger()
Method Detail

tag

public abstract String tag(String sentence)
Tag a string (possibly a sentence containing multiple words separated by whitespace).

Parameters:
sentence - the string to tag. Whitespace is taken to separate words. No punctuation or case normalization is performed here.
Returns:
a corresponding string containing the tagged word(s), or null on error
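
A minimal sketch of what a concrete subclass might look like. Because the real csli.util.nlp.PoSTagger is not assumed to be on the classpath, the stand-in class SketchTagger below mirrors only the two abstract methods documented here. LexiconTagger, its three-word lexicon, the "UNK" fallback tag, and the "/" separator are all hypothetical and chosen purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for csli.util.nlp.PoSTagger (assumption): only the two abstract
// methods documented on this page are mirrored, so the sketch compiles alone.
abstract class SketchTagger {
    abstract String tag(String sentence);
    abstract String getTagSeparator();
}

// Hypothetical toy tagger: looks each word up in a tiny hand-written lexicon.
class LexiconTagger extends SketchTagger {
    private final Map<String, String> lexicon = new HashMap<>();

    LexiconTagger() {
        lexicon.put("the", "DT");
        lexicon.put("dog", "NN");
        lexicon.put("barks", "VBZ");
    }

    @Override
    String getTagSeparator() {
        return "/"; // illustrative choice, not taken from any real tagger
    }

    @Override
    String tag(String sentence) {
        if (sentence == null) {
            return null; // follows the documented "null on error" convention
        }
        String trimmed = sentence.trim();
        if (trimmed.isEmpty()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        // Whitespace separates words; no case or punctuation normalization.
        for (String word : trimmed.split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(word)
               .append(getTagSeparator())
               .append(lexicon.getOrDefault(word, "UNK"));
        }
        return out.toString();
    }
}
```

Since no case normalization happens, a capitalised "The" misses the lower-case lexicon entry in this sketch and receives the fallback tag.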

tag

public String[] tag(String[] sentences)
Tag an array of sentences. The default implementation calls the tag(String) method on each member.

Parameters:
sentences - the array of sentences to tag
Returns:
an array of tagged sentences, or null on error

tag

public List<String> tag(List<String> sentences)
Tag a list of sentences. The default implementation calls the tag(String) method on each member.

Parameters:
sentences - the list of strings to tag
Returns:
a list of tagged sentences, or null on error
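
The default behaviour described for the array and list variants can be pictured as a per-member loop. The sketch below is not the library's source: BatchTagSketch, tagArray and tagList are hypothetical names, the single-sentence tagger is passed in as a UnaryOperator so the block stays self-contained, and returning null for a null input is an assumption about what "null on error" covers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper illustrating "call tag(String) on each member".
class BatchTagSketch {

    // Array variant: one tagged string per input sentence, same order.
    static String[] tagArray(UnaryOperator<String> tagOne, String[] sentences) {
        if (sentences == null) {
            return null; // assumption: a null input counts as an error
        }
        String[] tagged = new String[sentences.length];
        for (int i = 0; i < sentences.length; i++) {
            tagged[i] = tagOne.apply(sentences[i]);
        }
        return tagged;
    }

    // List variant: same idea, collecting results into a new list.
    static List<String> tagList(UnaryOperator<String> tagOne, List<String> sentences) {
        if (sentences == null) {
            return null; // assumption: a null input counts as an error
        }
        List<String> tagged = new ArrayList<>(sentences.size());
        for (String sentence : sentences) {
            tagged.add(tagOne.apply(sentence));
        }
        return tagged;
    }
}
```

With a real subclass, the operator would simply be a method reference such as tagger::tag.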

tagWords

public String[] tagWords(String[] words)
Tag a sentence represented as an array of words in linear order. The default implementation joins the words into a single string, calls the tag(String) method, then splits the result.

Parameters:
words - the array of words to tag
Returns:
an array of tagged words, or null on error

tagWords

public List<String> tagWords(List<String> words)
Tag a sentence represented as a list of words in linear order. The default implementation joins the words into a single string, calls the tag(String) method, then splits the result.

Parameters:
words - the list of words to tag
Returns:
a list of tagged words, or null on error
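
The join, tag, split sequence described for both tagWords variants might look roughly like the following. This is a sketch under stated assumptions rather than the actual implementation: TagWordsSketch, tagWordArray and tagWordList are hypothetical names, the words are joined with a single space, and the tagged string is assumed to be split on runs of whitespace, which the documentation does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper illustrating "stringify, tag, then split".
class TagWordsSketch {

    static String[] tagWordArray(UnaryOperator<String> tagOne, String[] words) {
        if (words == null) {
            return null; // assumption: a null input counts as an error
        }
        // 1. Join the words into one whitespace-separated sentence.
        String sentence = String.join(" ", words);
        // 2. Tag the sentence as a single string.
        String tagged = tagOne.apply(sentence);
        if (tagged == null) {
            return null; // propagate the tagger's own error signal
        }
        // 3. Split back into tagged words (assumed: on whitespace).
        String trimmed = tagged.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    static List<String> tagWordList(UnaryOperator<String> tagOne, List<String> words) {
        if (words == null) {
            return null;
        }
        String[] tagged = tagWordArray(tagOne, words.toArray(new String[0]));
        return tagged == null ? null : new ArrayList<>(Arrays.asList(tagged));
    }
}
```

One consequence of this design is that a word containing internal whitespace would come back as more than one tagged token, so the output length is only guaranteed to match the input when each word is a single token.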

tag

public ScoredObject<String> tag(ScoredObject<String> sentence)
Tag a scored sentence.

Parameters:
sentence - the scored sentence to tag
Returns:
the tagged sentence as a scored string

getTagSeparator

public abstract String getTagSeparator()
Get the character this tagger uses to separate words from tags.

Returns:
the separator character, as a String
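
One typical use of the separator is to take a tagged token apart again. The sketch below assumes tokens of the form word + separator + tag; TaggedTokenSketch and splitToken are hypothetical names, and the "/" and "_" separators in the usage note are illustrative values, not ones documented for any particular subclass. Splitting at the last occurrence keeps words that themselves contain the separator intact.

```java
// Hypothetical helper: split one tagged token into { word, tag }.
class TaggedTokenSketch {

    static String[] splitToken(String token, String separator) {
        // Use the LAST occurrence so a word such as "1/2" survives intact.
        int idx = token.lastIndexOf(separator);
        if (idx < 0) {
            // No separator found: treat the whole token as an untagged word.
            return new String[] { token, null };
        }
        String word = token.substring(0, idx);
        String tag = token.substring(idx + separator.length());
        return new String[] { word, tag };
    }
}
```

For example, splitToken("1/2/CD", "/") yields the word "1/2" and the tag "CD", and splitToken("dog_NN", "_") yields "dog" and "NN". In real code the second argument would come from tagger.getTagSeparator().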

join

protected static String join(String[] words)
Join an array of words together to make a whitespace-separated sentence.

Parameters:
words - an array of words
Returns:
the sentence with a single space character between each word
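
The documented result (a single space between each word) can be reproduced with a short loop. This is an illustrative equivalent, not the library's code; JoinSketch is a hypothetical name, and the behaviour for an empty array (an empty string) is an assumption.

```java
// Hypothetical re-creation of the documented join behaviour.
class JoinSketch {

    static String join(String[] words) {
        StringBuilder sentence = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                sentence.append(' '); // exactly one space between words
            }
            sentence.append(words[i]);
        }
        return sentence.toString();
    }
}
```

For non-null words the standard String.join(" ", words) gives the same result.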

test

protected static void test(PoSTagger tagger,
                           String[] args)
A convenience method for development testing.

Parameters:
tagger - the tagger to test
args -