csli.dialog.app.calo.topic.classification.topicextraction
Class CsliOnlineTopicSegmenterService

java.lang.Object
  extended by csli.dialog.app.calo.topic.classification.topicextraction.CsliOnlineTopicSegmenterService
All Implemented Interfaces:
Service, OnlineTopicSegmenterService
Direct Known Subclasses:
CsliOnlineTopicSegmenterServiceImpl, CsliTopicSegmenterService

public abstract class CsliOnlineTopicSegmenterService
extends Object
implements OnlineTopicSegmenterService


Nested Class Summary
 class CsliOnlineTopicSegmenterService.TopicSegmenterParameters
           
 
Constructor Summary
CsliOnlineTopicSegmenterService()
           
 
Method Summary
abstract  void clear()
          Clears all the cache files that have been generated by the TopicSegmenterService.
 Topic createTopic(String query)
          Creates a topic from a query and synonyms of words in the query.
 void createTopicModel(String meetingUID)
          Creates an OPI model containing the TopicDiscussion segmentation derived using getTopicBreaks.
 List<Pair<Double,TopicArea>> csliSearchTopic(ArrayList<Pair<Topic,Double>> topics)
           
 List<Pair<Double,TopicArea>> csliSearchTopic(ArrayList<Pair<Topic,Double>> topics, long queryIdentifier)
           
abstract  List<Pair<Double,TopicArea>> csliSearchTopic(Topic t)
           
 Pair<XSDDateTime,XSDDateTime> extractTopicBoundaries(String meeting, XSDDateTime time)
          Computes the boundaries of the topic surrounding time.
protected  List<Topic> findRelevantTopics(Topic ref, Collection<Topic> topics, int max, boolean useThreshold)
          Orders a Collection of Topics by their relevance (= topic similarity) to a reference Topic.
 JPanel getGui()
          Get a Swing JPanel which provides a GUI to this service.
 String[] getQueries()
           
 List<Topic> getRelevantTopics(String query, int k, boolean useThreshold)
          Computes the saved topics that are the most relevant to a certain topic or string query.
abstract  Set<XSDDateTime> getTopicBreaks(String meeting)
          Computes the topic segmentation of the meeting.
 boolean isLearning()
           
 boolean isRelevant(List<String> words, List<Double> freqs, Topic topic)
          Decides whether a given Topic is relevant to a certain word distribution, expressed as a word list and corresponding frequency list.
 boolean isRelevant(String query, Topic topic)
          Decides whether a given Topic is relevant to a certain topic or string query.
 void prepareForNewMeeting(String meetingName)
          Performs all necessary precomputations when a new meeting is added to the database.
abstract  List<Pair<Double,Pair<String,Pair<XSDDateTime,XSDDateTime>>>> searchTopic(Topic t)
          Searches for occurrences of a topic in the whole corpus.
 void setTopicOptions(boolean learning)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface csli.dialog.app.calo.main.Service
getTitle, isServiceAlive
 

Constructor Detail

CsliOnlineTopicSegmenterService

public CsliOnlineTopicSegmenterService()
Method Detail

getTopicBreaks

public abstract Set<XSDDateTime> getTopicBreaks(String meeting)
Description copied from interface: OnlineTopicSegmenterService
Computes the topic segmentation of the meeting. For OnlineTopicSegmenterService, this will necessarily be based on topic break classification. TopicSegmenterService may base the segmentation on lexical similarity or the MIT generative model, depending on the config key topic.segmentation.method.

Specified by:
getTopicBreaks in interface OnlineTopicSegmenterService
Parameters:
meeting - the meeting we want to segment
Returns:
the set of topic breaks

getQueries

public String[] getQueries()

extractTopicBoundaries

public Pair<XSDDateTime,XSDDateTime> extractTopicBoundaries(String meeting,
                                                            XSDDateTime time)
Description copied from interface: OnlineTopicSegmenterService
Computes the boundaries of the topic surrounding time.

Specified by:
extractTopicBoundaries in interface OnlineTopicSegmenterService
Parameters:
meeting - the meeting we want to work on.
time - the point in time within the meeting that we are focusing on.
Returns:
a pair of XSDDateTime identifying respectively the beginning and the end of the part of the meeting discussing that topic.
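
One way to picture this lookup is as a search in the sorted set of topic breaks that getTopicBreaks returns. The sketch below is an assumption about how such a lookup could work, not the service's actual implementation. It uses java.time.Instant as a stand-in for XSDDateTime and a two-element array as a stand-in for Pair; the names TopicBoundarySketch, enclosingSegment, meetingStart and meetingEnd are hypothetical.

```java
import java.time.Instant;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of the documented API.
class TopicBoundarySketch {

    /**
     * Returns {start, end} of the segment that contains time.
     * Assumption: a segment runs from one break (inclusive) to the next
     * break (exclusive); meetingStart and meetingEnd close off the first
     * and last segments.
     */
    static Instant[] enclosingSegment(Set<Instant> breaks, Instant meetingStart,
                                      Instant meetingEnd, Instant time) {
        TreeSet<Instant> sorted = new TreeSet<>(breaks);
        Instant start = sorted.floor(time);  // latest break at or before time
        Instant end = sorted.higher(time);   // earliest break strictly after time
        return new Instant[] {
            start != null ? start : meetingStart,
            end != null ? end : meetingEnd
        };
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2007-01-01T10:00:00Z");
        Set<Instant> breaks = Set.of(t0.plusSeconds(600), t0.plusSeconds(1500));
        Instant[] seg = enclosingSegment(breaks, t0, t0.plusSeconds(3600), t0.plusSeconds(900));
        System.out.println(seg[0] + " .. " + seg[1]);
    }
}
```

Whether a break belongs to the segment it opens or the one it closes is a design choice; the sketch treats a break as the start of the following segment.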

searchTopic

public abstract List<Pair<Double,Pair<String,Pair<XSDDateTime,XSDDateTime>>>> searchTopic(Topic t)
Description copied from interface: OnlineTopicSegmenterService
Searches for occurrences of a topic in the whole corpus.

Specified by:
searchTopic in interface OnlineTopicSegmenterService
Returns:
a list of pairs, each composed of a double giving the score of a topic zone and the topic zone itself. The topic zone is given by a pair in which the topic boundaries are themselves given by a pair of XSDDateTime.
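
The nested return type can be hard to read, so here is a minimal sketch of how a caller might unpack it. Everything in it is illustrative: PairSketch is a hypothetical stand-in for the project's Pair class (whose real accessors may be named differently), java.time.Instant stands in for XSDDateTime, and the String component is assumed to identify the meeting, which the documentation above does not state.

```java
import java.time.Instant;
import java.util.List;

// Hypothetical stand-in for the project's Pair class.
class PairSketch<A, B> {
    final A first;
    final B second;

    PairSketch(A first, B second) {
        this.first = first;
        this.second = second;
    }
}

class SearchResultSketch {

    /** Formats the highest-scoring hit, or returns "no hits" for an empty list. */
    static String describeBest(
            List<PairSketch<Double, PairSketch<String, PairSketch<Instant, Instant>>>> hits) {
        if (hits.isEmpty()) {
            return "no hits";
        }
        PairSketch<Double, PairSketch<String, PairSketch<Instant, Instant>>> best = hits.get(0);
        for (PairSketch<Double, PairSketch<String, PairSketch<Instant, Instant>>> hit : hits) {
            if (hit.first > best.first) {
                best = hit;  // keep the hit with the largest score
            }
        }
        String label = best.second.first;          // assumed: meeting identifier
        Instant start = best.second.second.first;  // beginning of the topic zone
        Instant end = best.second.second.second;   // end of the topic zone
        return label + " [" + start + ", " + end + "] score=" + best.first;
    }
}
```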

getGui

public JPanel getGui()
Description copied from interface: Service
Get a Swing JPanel which provides a GUI to this service.

Specified by:
getGui in interface Service

createTopic

public Topic createTopic(String query)
Description copied from interface: OnlineTopicSegmenterService
Creates a topic from a query and synonyms of words in the query.

Specified by:
createTopic in interface OnlineTopicSegmenterService
Parameters:
query - The query we want to search. Stopwords are removed, and the order of the words does not matter.
Returns:
the generated topic.
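
The query handling described above (stopwords removed, word order ignored, synonyms added) can be illustrated with a small bag-of-words sketch. It is not the service's implementation: the class QueryTopicSketch, the stopword list and the synonym table are all hypothetical, and a plain set of words stands in for the Topic class.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class QueryTopicSketch {

    // Illustrative data only; the real stopword list and synonym source are not documented here.
    static final Set<String> STOPWORDS = Set.of("the", "a", "of", "for", "and");
    static final Map<String, List<String>> SYNONYMS = Map.of("budget", List.of("funding"));

    /** Builds an order-independent word set from the query, dropping stopwords and adding synonyms. */
    static Set<String> topicWords(String query) {
        Set<String> words = new TreeSet<>();
        for (String token : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (token.isEmpty() || STOPWORDS.contains(token)) {
                continue;  // skip empty splits and stopwords
            }
            words.add(token);
            words.addAll(SYNONYMS.getOrDefault(token, List.of()));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(topicWords("the budget for hiring"));
        System.out.println(topicWords("hiring budget"));
    }
}
```

Because the result is a set, "the budget for hiring" and "hiring budget" yield the same words, which is the order independence the parameter description mentions.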

findRelevantTopics

protected List<Topic> findRelevantTopics(Topic ref,
                                         Collection<Topic> topics,
                                         int max,
                                         boolean useThreshold)
Orders a Collection of Topics by their relevance (= topic similarity) to a reference Topic.

Parameters:
ref - the reference Topic to find similarity to
topics - the Topics to order by similarity
max - the maximum number to return
useThreshold - if true, only return those with similarity >= topic.recognition.relevanceThreshold; if false, return all
Returns:
a List of Topics sorted most relevant first
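
The filter, sort and truncate behaviour described by these parameters can be sketched as follows. The sketch makes several assumptions that the documentation does not confirm: topics are modelled as word-to-weight maps rather than Topic objects, similarity is cosine similarity, the threshold is passed in as an argument instead of being read from topic.recognition.relevanceThreshold, and max is applied whether or not the threshold is used. The names RelevanceRankingSketch, cosine and rank are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class RelevanceRankingSketch {

    /** Cosine similarity between two sparse word-weight vectors (0 if either is empty). */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
            normA += e.getValue() * e.getValue();
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    /** Returns at most max topics, most similar to ref first. */
    static List<Map<String, Double>> rank(Map<String, Double> ref,
                                          Collection<Map<String, Double>> topics,
                                          int max, boolean useThreshold, double threshold) {
        List<Map<String, Double>> kept = new ArrayList<>();
        for (Map<String, Double> t : topics) {
            if (!useThreshold || cosine(ref, t) >= threshold) {
                kept.add(t);  // keep everything unless the threshold is enabled
            }
        }
        kept.sort(Comparator.comparingDouble((Map<String, Double> t) -> cosine(ref, t)).reversed());
        int limit = Math.min(Math.max(max, 0), kept.size());
        return new ArrayList<>(kept.subList(0, limit));
    }
}
```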

getRelevantTopics

public List<Topic> getRelevantTopics(String query,
                                     int k,
                                     boolean useThreshold)
Description copied from interface: OnlineTopicSegmenterService
Computes the saved topics that are the most relevant to a certain topic or string query. Optionally excludes those whose relevance falls below a given threshold.

Specified by:
getRelevantTopics in interface OnlineTopicSegmenterService
Parameters:
query - The query we are searching: if this is the name of a Topic in the pool, that is used; otherwise treated as plain text.
k - Number of topics we want to return.
useThreshold - if false, return all topics; if true, only those whose relevance equals or exceeds topic.recognition.relevanceThreshold.
Returns:
the list of at most k topics that are the most relevant to the query, most relevant first.
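
The query parameter has two interpretations: the name of a saved Topic, or plain text. A minimal sketch of that resolution step is shown below, under the assumption that the topic pool can be viewed as a map from topic name to the topic's words. The class QueryResolutionSketch, the resolve method and the pool representation are hypothetical and only illustrate the name-first, text-second rule.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class QueryResolutionSketch {

    /**
     * Returns the saved topic's words when query names a topic in the pool;
     * otherwise treats query as plain text and splits it on whitespace.
     */
    static Set<String> resolve(String query, Map<String, Set<String>> pool) {
        Set<String> saved = pool.get(query);
        if (saved != null) {
            return saved;  // exact name match wins
        }
        String[] tokens = query.toLowerCase(Locale.ROOT).trim().split("\\s+");
        return new LinkedHashSet<>(Arrays.asList(tokens));
    }
}
```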

isRelevant

public boolean isRelevant(String query,
                          Topic topic)
Description copied from interface: OnlineTopicSegmenterService
Decides whether a given Topic is relevant to a certain topic or string query.

Specified by:
isRelevant in interface OnlineTopicSegmenterService
Parameters:
query - The query we are searching: if this is the name of a Topic in the pool, that is used; otherwise treated as plain text.
topic - The topic we are checking for relevance.
Returns:
true if the relevance equals or exceeds topic.recognition.relevance.relevanceThreshold, false otherwise.

isRelevant

public boolean isRelevant(List<String> words,
                          List<Double> freqs,
                          Topic topic)
Description copied from interface: OnlineTopicSegmenterService
Decides whether a given Topic is relevant to a certain word distribution, expressed as a word list and corresponding frequency list.

Specified by:
isRelevant in interface OnlineTopicSegmenterService
Parameters:
words - The vocabulary for the word distribution
freqs - The (possibly weighted) frequency counts for the word distribution
topic - The topic we are checking for relevance.
Returns:
true if the relevance equals or exceeds topic.recognition.relevance.relevanceThreshold, false otherwise.
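
The words and freqs arguments are parallel lists: freqs.get(i) is the count for words.get(i). The sketch below shows one way such a distribution could be compared against a topic. It is an illustration under stated assumptions, not the service's code: the topic is modelled as a word-to-weight map, relevance is taken to be cosine similarity, and the threshold is passed in rather than read from configuration. The names WordDistributionSketch and toDistribution are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class WordDistributionSketch {

    /** Pairs words.get(i) with freqs.get(i); repeated words have their counts summed. */
    static Map<String, Double> toDistribution(List<String> words, List<Double> freqs) {
        if (words.size() != freqs.size()) {
            throw new IllegalArgumentException("words and freqs must have the same length");
        }
        Map<String, Double> dist = new HashMap<>();
        for (int i = 0; i < words.size(); i++) {
            dist.merge(words.get(i), freqs.get(i), Double::sum);
        }
        return dist;
    }

    /** Assumed relevance test: cosine similarity against the topic's word weights. */
    static boolean isRelevant(List<String> words, List<Double> freqs,
                              Map<String, Double> topicWeights, double threshold) {
        Map<String, Double> dist = toDistribution(words, freqs);
        double dot = 0, normDist = 0, normTopic = 0;
        for (Map.Entry<String, Double> e : dist.entrySet()) {
            dot += e.getValue() * topicWeights.getOrDefault(e.getKey(), 0.0);
            normDist += e.getValue() * e.getValue();
        }
        for (double w : topicWeights.values()) {
            normTopic += w * w;
        }
        double similarity = (normDist == 0 || normTopic == 0)
                ? 0
                : dot / Math.sqrt(normDist * normTopic);
        return similarity >= threshold;  // "equals or exceeds", as in the Returns clause
    }
}
```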

csliSearchTopic

public List<Pair<Double,TopicArea>> csliSearchTopic(ArrayList<Pair<Topic,Double>> topics,
                                                    long queryIdentifier)

csliSearchTopic

public List<Pair<Double,TopicArea>> csliSearchTopic(ArrayList<Pair<Topic,Double>> topics)

csliSearchTopic

public abstract List<Pair<Double,TopicArea>> csliSearchTopic(Topic t)

prepareForNewMeeting

public void prepareForNewMeeting(String meetingName)
Description copied from interface: OnlineTopicSegmenterService
Performs all necessary precomputations when a new meeting is added to the database.

Specified by:
prepareForNewMeeting in interface OnlineTopicSegmenterService
Parameters:
meetingName - The name of the meeting being added to the corpus.

setTopicOptions

public void setTopicOptions(boolean learning)
Specified by:
setTopicOptions in interface OnlineTopicSegmenterService

isLearning

public boolean isLearning()
Specified by:
isLearning in interface OnlineTopicSegmenterService

clear

public abstract void clear()
Description copied from interface: OnlineTopicSegmenterService
Clears all the cache files that have been generated by the TopicSegmenterService. Mostly for development purposes.

Specified by:
clear in interface OnlineTopicSegmenterService

createTopicModel

public void createTopicModel(String meetingUID)
Description copied from interface: OnlineTopicSegmenterService
Creates an OPI model containing the TopicDiscussion segmentation derived using getTopicBreaks. TopicSegmenterService overrides this method to (a) use its own lexical cohesion getTopicBreaks method; (b) add topic keywords to the TopicDiscussions.

Specified by:
createTopicModel in interface OnlineTopicSegmenterService
Parameters:
meetingUID - the meeting we want to segment