csli.util.stat
Class ClassificationAgreement

java.lang.Object
  extended by csli.util.stat.ClassificationAgreement

public class ClassificationAgreement
extends Object

Calculates agreement and evaluation statistics for classification annotations. The methods currently support only two-way comparison (two annotations at a time), but any number of possible classes. When a reference classification is compared with a hypothesized classification, the first dimension of the contingency table should index the reference classification and the second dimension the hypothesized classification.


Constructor Summary
ClassificationAgreement()
           
 
Method Summary
static boolean checkTable(int[][] cTable)
          Checks to see if the contingency table is properly formed, i.e. is square and at least 2x2.
static void compareFiles(File f1, File f2)
          Read in two files to compare, possibly with multiple columns of data, each of which will be compared separately.
static int correct(int[][] cTable)
          Return the total number of agreeing classifications.
static double error(int[][] cTable)
          Evaluates the error between two n-class classifications.
static double fscore(int[][] cTable)
          Evaluates the positive fscore of a binary classification ref/hyp contingency table.
static double fscore(int[][] cTable, int cls)
          Evaluates the fscore for a particular class in a ref/hyp contingency table.
static int hypClassSum(int[][] cTable, int c)
          Sum the number of classifications the hypothesized annotation made for a given class.
static double kappaCohen(int[][] cTable)
          Evaluates n-class kappa between two annotations, using the Cohen chance calculation.
static double kappaSiegel(int[][] cTable)
          Evaluates n-class kappa between two annotations, using the Siegel chance calculation.
static void main(String[] args)
          For testing purposes.
static int[][] makeTable(int[] ref, int[] hyp, int numClasses)
          Make a contingency table from two classification arrays, reference and hypothesized, with the specified number of possible target classes.
static double precision(int[][] cTable)
          Evaluates positive precision in a binary-classification ref/hyp contingency table.
static double precision(int[][] cTable, int val)
          Evaluates the precision of a specified class in a multi-class classification.
static void printTable(int[][] cTable)
           
static double recall(int[][] cTable)
          Evaluates positive recall in a binary classification ref/hyp contingency table.
static double recall(int[][] cTable, int val)
          Evaluates the recall of a particular class in a multi-class classification.
static int refClassSum(int[][] cTable, int c)
          Sum the number of classifications the reference annotation made for the given class.
static int sum(int[][] cTable)
          Calculate the total number of data-points in the table.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ClassificationAgreement

public ClassificationAgreement()
Method Detail

checkTable

public static boolean checkTable(int[][] cTable)
Checks to see if the contingency table is properly formed, i.e. is square and at least 2x2.
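
The well-formedness condition above (square, at least 2x2) can be illustrated with a minimal sketch. The class and method names here are hypothetical, and the treatment of null tables and null rows is an assumption; this page does not show how checkTable itself handles them.

```java
final class CheckTableSketch {

    /**
     * Returns true if the table is square and at least 2x2.
     * Treating null tables or null rows as malformed is an assumption.
     */
    static boolean isWellFormed(int[][] table) {
        if (table == null || table.length < 2) {
            return false;
        }
        for (int[] row : table) {
            // Every row must have as many columns as the table has rows.
            if (row == null || row.length != table.length) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed(new int[][] {{3, 1}, {0, 4}}));       // square 2x2
        System.out.println(isWellFormed(new int[][] {{3, 1, 2}, {0, 4, 1}})); // 2x3, not square
    }
}
```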


correct

public static int correct(int[][] cTable)
Return the total number of agreeing classifications. (n x n) matrix allowed.


error

public static double error(int[][] cTable)
Evaluates the error between two n-class classifications. (n x n) matrix allowed.
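
This page does not define "error" precisely. One plausible reading, sketched below, is the proportion of data points on which the two classifications disagree, i.e. the off-diagonal mass of the table divided by its total. The class and method names are hypothetical and the definition is an assumption, not a statement of what error() computes.

```java
final class ErrorRateSketch {

    /** Total number of data points in the table. */
    static int total(int[][] table) {
        int n = 0;
        for (int[] row : table) {
            for (int cell : row) {
                n += cell;
            }
        }
        return n;
    }

    /** Number of agreeing classifications: the sum of the diagonal. */
    static int agreeing(int[][] table) {
        int n = 0;
        for (int i = 0; i < table.length; i++) {
            n += table[i][i];
        }
        return n;
    }

    /** Assumed definition: fraction of data points off the diagonal. */
    static double errorRate(int[][] table) {
        int n = total(table);
        return (n - agreeing(table)) / (double) n;
    }

    public static void main(String[] args) {
        int[][] table = {{5, 1}, {2, 2}}; // rows: reference, columns: hypothesis
        System.out.println(errorRate(table));
    }
}
```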


fscore

public static double fscore(int[][] cTable)
Evaluates the positive fscore of a binary classification ref/hyp contingency table. (2 x 2) matrix only.


fscore

public static double fscore(int[][] cTable,
                            int cls)
Evaluates the fscore for a particular class in a ref/hyp contingency table. (n x n) matrix accepted.
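
As a rough illustration of a per-class F-score on a ref/hyp table, the sketch below assumes the balanced F1 measure (the harmonic mean of precision and recall), which reduces to 2*TP / (reference total + hypothesis total) for the class. Whether fscore() uses this weighting is not stated on this page; the names below are hypothetical.

```java
final class FScoreSketch {

    /**
     * Balanced F1 for class cls, assuming table[ref][hyp] orientation.
     * 2*TP / (2*TP + FP + FN) equals 2*TP / (rowSum + colSum).
     * Yields NaN if the class never occurs in either annotation.
     */
    static double f1(int[][] table, int cls) {
        int truePositives = table[cls][cls];
        int refTotal = 0; // row sum: items the reference labelled cls
        int hypTotal = 0; // column sum: items the hypothesis labelled cls
        for (int i = 0; i < table.length; i++) {
            refTotal += table[cls][i];
            hypTotal += table[i][cls];
        }
        return 2.0 * truePositives / (refTotal + hypTotal);
    }

    public static void main(String[] args) {
        int[][] table = {{8, 2}, {1, 9}};
        System.out.println(f1(table, 0));
        System.out.println(f1(table, 1));
    }
}
```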


hypClassSum

public static int hypClassSum(int[][] cTable,
                              int c)
Sum the number of classifications the hypothesized annotation made for a given class. (n x n) matrix allowed.


kappaCohen

public static double kappaCohen(int[][] cTable)
Evaluates n-class kappa between two annotations, using the Cohen chance calculation; see http://www-class.unl.edu/psycrs/handcomp/hckappa.PDF for details. (n x n) matrix allowed.
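
For orientation, here is a minimal sketch of the standard Cohen's kappa computation, in which chance agreement is estimated from each annotation's own class distribution. It is a textbook formulation under hypothetical names, not the code behind kappaCohen().

```java
final class CohenKappaSketch {

    /** kappa = (pObserved - pChance) / (1 - pChance), with per-annotator marginals. */
    static double kappa(int[][] table) {
        int k = table.length;
        double n = 0;
        double[] rowSum = new double[k]; // first annotation's class totals
        double[] colSum = new double[k]; // second annotation's class totals
        double agree = 0;
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < k; j++) {
                n += table[i][j];
                rowSum[i] += table[i][j];
                colSum[j] += table[i][j];
            }
            agree += table[i][i];
        }
        double pObserved = agree / n;
        double pChance = 0;
        for (int i = 0; i < k; i++) {
            // Chance that both annotations independently pick class i.
            pChance += (rowSum[i] / n) * (colSum[i] / n);
        }
        return (pObserved - pChance) / (1 - pChance);
    }

    public static void main(String[] args) {
        // Observed agreement 35/50 = 0.7, chance agreement 0.5.
        System.out.println(kappa(new int[][] {{20, 5}, {10, 15}}));
    }
}
```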


kappaSiegel

public static double kappaSiegel(int[][] cTable)
Evaluates n-class kappa between two annotations, using the Siegel chance calculation. (n x n) matrix allowed.
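
The "Siegel chance calculation" is not spelled out on this page. The sketch below assumes it refers to the Siegel and Castellan formulation, in which chance agreement is estimated from the class distribution pooled over both annotations rather than from each annotation's own marginals. The names are hypothetical and the interpretation is an assumption.

```java
final class SiegelKappaSketch {

    /** kappa with chance agreement computed from pooled marginals (assumed reading). */
    static double kappa(int[][] table) {
        int k = table.length;
        double n = 0;
        double[] pooled = new double[k]; // rowSum[i] + colSum[i] for each class
        double agree = 0;
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < k; j++) {
                n += table[i][j];
                pooled[i] += table[i][j]; // contributes to row i
                pooled[j] += table[i][j]; // contributes to column j
            }
            agree += table[i][i];
        }
        double pObserved = agree / n;
        double pChance = 0;
        for (int i = 0; i < k; i++) {
            double p = pooled[i] / (2 * n); // pooled proportion of class i
            pChance += p * p;
        }
        return (pObserved - pChance) / (1 - pChance);
    }

    public static void main(String[] args) {
        // Observed agreement 0.7, pooled chance agreement 0.505.
        System.out.println(kappa(new int[][] {{20, 5}, {10, 15}}));
    }
}
```

When the two annotations have identical class distributions, the pooled and per-annotator estimates of chance agreement coincide; the more the distributions differ, the lower the pooled variant's kappa falls relative to Cohen's.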


main

public static void main(String[] args)
For testing purposes.


makeTable

public static int[][] makeTable(int[] ref,
                                int[] hyp,
                                int numClasses)
Make a contingency table from two classification arrays, reference and hypothesized, with the specified number of possible target classes. Returns a (numClasses x numClasses) matrix.
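
A minimal sketch of how such a table could be built, following the class-level convention that the first dimension indexes the reference and the second the hypothesis. It assumes class labels are integers in the range 0 to numClasses - 1 and that both arrays have the same length; neither assumption is confirmed by this page, and the names are hypothetical.

```java
final class MakeTableSketch {

    /** Counts (ref, hyp) label pairs into a numClasses x numClasses table. */
    static int[][] build(int[] ref, int[] hyp, int numClasses) {
        if (ref.length != hyp.length) {
            // Rejecting mismatched lengths is a choice made for this sketch.
            throw new IllegalArgumentException("ref and hyp must have the same length");
        }
        int[][] table = new int[numClasses][numClasses];
        for (int i = 0; i < ref.length; i++) {
            table[ref[i]][hyp[i]]++; // row = reference label, column = hypothesized label
        }
        return table;
    }

    public static void main(String[] args) {
        int[] ref = {0, 0, 1, 1, 2};
        int[] hyp = {0, 1, 1, 1, 0};
        System.out.println(java.util.Arrays.deepToString(build(ref, hyp, 3)));
    }
}
```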


precision

public static double precision(int[][] cTable)
Evaluates positive precision in a binary-classification ref/hyp contingency table. (2 x 2) matrix only.


precision

public static double precision(int[][] cTable,
                               int val)
Evaluates the precision of a specified class in a multi-class classification. (n x n) matrix allowed.


recall

public static double recall(int[][] cTable)
Evaluates positive recall in a binary classification ref/hyp contingency table. (2 x 2) matrix only.


recall

public static double recall(int[][] cTable,
                            int val)
Evaluates the recall of a particular class in a multi-class classification. (n x n) matrix allowed.
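
Per-class precision and recall on a ref/hyp table can be sketched as follows, assuming the usual definitions: precision divides the agreeing count for the class by the hypothesis total for that class (column sum), and recall divides it by the reference total (row sum). The names are hypothetical, and how the library treats a class with a zero total is not documented here; this sketch lets the division produce NaN.

```java
final class PrecisionRecallSketch {

    /** Items the reference labelled cls: the sum of row cls. */
    static int refTotal(int[][] table, int cls) {
        int n = 0;
        for (int j = 0; j < table.length; j++) {
            n += table[cls][j];
        }
        return n;
    }

    /** Items the hypothesis labelled cls: the sum of column cls. */
    static int hypTotal(int[][] table, int cls) {
        int n = 0;
        for (int i = 0; i < table.length; i++) {
            n += table[i][cls];
        }
        return n;
    }

    /** Of the items hypothesized as cls, the fraction the reference agrees on. */
    static double precision(int[][] table, int cls) {
        return table[cls][cls] / (double) hypTotal(table, cls);
    }

    /** Of the items the reference labels cls, the fraction the hypothesis found. */
    static double recall(int[][] table, int cls) {
        return table[cls][cls] / (double) refTotal(table, cls);
    }

    public static void main(String[] args) {
        int[][] table = {{5, 1, 0}, {2, 6, 2}, {0, 1, 3}};
        System.out.println(precision(table, 1)); // 6 of 8 hypothesized
        System.out.println(recall(table, 1));    // 6 of 10 in the reference
    }
}
```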


refClassSum

public static int refClassSum(int[][] cTable,
                              int c)
Sum the number of classifications the reference annotation made for the given class. (n x n) matrix allowed.


sum

public static int sum(int[][] cTable)
Calculate the total number of data-points in the table. (n x n) matrix allowed.


printTable

public static void printTable(int[][] cTable)

compareFiles

public static void compareFiles(File f1,
                                File f2)
Read in two files to compare, possibly with multiple columns of data, each of which will be compared separately.