csli.dialog.app.calo.topic.classification.topicextraction
Class CsliTopicSegmenterService

java.lang.Object
  extended by csli.dialog.app.calo.topic.classification.topicextraction.CsliOnlineTopicSegmenterService
      extended by csli.dialog.app.calo.topic.classification.topicextraction.CsliTopicSegmenterService
All Implemented Interfaces:
Service, OnlineTopicSegmenterService, TopicSegmenterService
Direct Known Subclasses:
CsliTopicSegmenterServiceImpl

public abstract class CsliTopicSegmenterService
extends CsliOnlineTopicSegmenterService
implements TopicSegmenterService


Nested Class Summary
 
Nested classes/interfaces inherited from class csli.dialog.app.calo.topic.classification.topicextraction.CsliOnlineTopicSegmenterService
CsliOnlineTopicSegmenterService.TopicSegmenterParameters
 
Constructor Summary
CsliTopicSegmenterService()
           
 
Method Summary
abstract  void clear()
          Clears all the cache files that have been generated by the TopicSegmenterService.
abstract  void createXML(String meetingName)
           
abstract  Pair<Topic,TopicArea> csliExtractTopic(String meeting, Pair<XSDDateTime,XSDDateTime> boundaries)
           
abstract  Pair<Topic,TopicArea> csliExtractTopic(String meeting, XSDDateTime time)
           
abstract  ArrayList<Pair<Topic,TopicArea>> csliGetTopics(String meeting)
          Get the Topics discussed in a meeting (using the default method), before adding to/merging with the pool
abstract  ArrayList<Pair<Topic,TopicArea>> csliGetTopicsByMITSegmentation(String meeting, Integer numShifts)
          Get the Topics discussed in a meeting (forcing segmentation via the MIT generative model), before adding to/merging with the pool
abstract  ArrayList<Pair<Topic,TopicArea>> csliGetTopicsByWordDistribution(String meeting)
          Get the Topics discussed in a meeting (forcing Stephane's lexical similarity method), before adding to/merging with the pool
 Double csliGetTopicSimilarity(Topic a, Topic b)
           
abstract  TopicArea csliLocateTopic(String meeting, Topic t)
           
abstract  List<Pair<Double,TopicArea>> csliSearchTopic(Topic t)
           
 Topic extractTopic(String meeting, Pair<XSDDateTime,XSDDateTime> boundaries)
          Extracts the topic discussed in meeting in a certain segment
 Topic extractTopic(String meeting, XSDDateTime time)
          Extracts the topic discussed in meeting around a specified time.
 String[] getQueries()
           
 List<Topic> getRelevantTopics(String query, String meeting, int k, boolean useThreshold)
          Computes the saved topics for a particular meeting that are the most relevant to a certain query.
abstract  Set<XSDDateTime> getTopicBreaks(String meeting)
          Computes the topic segmentation of the meeting.
 Set<XSDDateTime> getTopicBreaksByMITSegmentation(String meeting)
          Like getTopicBreaks(), but force segmentation via the MIT generative model.
abstract  Set<XSDDateTime> getTopicBreaksByMITSegmentation(String meeting, Integer numShifts)
          Like getTopicBreaks(), but force segmentation via the MIT generative model with an explicit number of shifts.
abstract  Set<XSDDateTime> getTopicBreaksByWordDistribution(String meeting)
          Like getTopicBreaks(), but force Stephane's lexical similarity method.
 List<Topic> getTopics(String meeting)
          Finds the named (i.e. existing in the pool) topics discussed in a meeting.
 List<Topic> getTopicsByMITSegmentation(String meeting)
          Like getTopics(), but force MIT's generative model.
 List<Topic> getTopicsByMITSegmentation(String meeting, Integer numShifts)
          Like getTopics(), but force MIT's generative model.
 List<Topic> getTopicsByWordDistribution(String meeting)
          Like getTopics(), but force Stephane's lexical similarity method.
 boolean isLearning()
           
abstract  ArrayList<Pair<Topic,TopicArea>> locateDoc(String meetingName)
           
 Pair<XSDDateTime,XSDDateTime> locateTopic(String meeting, Topic t)
          Locates the best matching area of a topic in the meeting
 void prepareForNewMeeting(String meetingName)
          Performs all necessary precomputations when a new meeting is added to the database.
 void setTopicOptions(boolean learning)
           
abstract  boolean wasDiscussed(Topic t, String meeting)
          Evaluates whether a certain topic was discussed or not during a past meeting.
 
Methods inherited from class csli.dialog.app.calo.topic.classification.topicextraction.CsliOnlineTopicSegmenterService
createTopic, createTopicModel, csliSearchTopic, csliSearchTopic, extractTopicBoundaries, findRelevantTopics, getGui, getRelevantTopics, isRelevant, isRelevant, searchTopic
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface csli.dialog.app.calo.main.services.TopicSegmenterService
clear, extractTopicBoundaries, searchTopic
 
Methods inherited from interface csli.dialog.app.calo.main.services.OnlineTopicSegmenterService
createTopic, createTopicModel, getRelevantTopics, isRelevant, isRelevant
 
Methods inherited from interface csli.dialog.app.calo.main.Service
getGui, getTitle, isServiceAlive
 

Constructor Detail

CsliTopicSegmenterService

public CsliTopicSegmenterService()
Method Detail

getQueries

public String[] getQueries()
Overrides:
getQueries in class CsliOnlineTopicSegmenterService

getTopicBreaks

public abstract Set<XSDDateTime> getTopicBreaks(String meeting)
Description copied from interface: OnlineTopicSegmenterService
Computes the topic segmentation of the meeting. For OnlineTopicSegmenterService, this is necessarily based on topic break classification. TopicSegmenterService may base the segmentation on lexical similarity or on the MIT generative model, depending on the config key topic.segmentation.method.

Specified by:
getTopicBreaks in interface OnlineTopicSegmenterService
Specified by:
getTopicBreaks in interface TopicSegmenterService
Specified by:
getTopicBreaks in class CsliOnlineTopicSegmenterService
Parameters:
meeting - the meeting we want to segment
Returns:
the set of topic breaks
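
Callers often need segments rather than break points, for example to pass boundaries to extractTopic(String, Pair). The following is a minimal sketch of that conversion, not part of this API. It assumes java.time.Instant as a stand-in for XSDDateTime and a two-element array as a stand-in for Pair; the class name BreaksToSegments and the meeting start/end arguments are hypothetical.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class BreaksToSegments {

    /**
     * Turns an unordered set of break times into consecutive [start, end] segments.
     * Breaks that do not fall strictly between start and end are ignored.
     */
    static List<Instant[]> segments(Instant start, Instant end, Set<Instant> breaks) {
        List<Instant[]> result = new ArrayList<>();
        Instant previous = start;
        // TreeSet iterates in chronological order, whatever Set the caller supplied.
        for (Instant cut : new TreeSet<>(breaks)) {
            if (!cut.isAfter(previous) || !cut.isBefore(end)) {
                continue;
            }
            result.add(new Instant[] {previous, cut});
            previous = cut;
        }
        // The last segment runs from the final break to the end of the meeting.
        result.add(new Instant[] {previous, end});
        return result;
    }

    public static void main(String[] args) {
        Set<Instant> breaks = new HashSet<>();
        breaks.add(Instant.parse("2007-01-01T10:40:00Z"));
        breaks.add(Instant.parse("2007-01-01T10:15:00Z"));
        List<Instant[]> segs = segments(
                Instant.parse("2007-01-01T10:00:00Z"),
                Instant.parse("2007-01-01T11:00:00Z"),
                breaks);
        for (Instant[] s : segs) {
            System.out.println(s[0] + " -> " + s[1]);
        }
    }
}
```

With real data, the break set would come from getTopicBreaks(meeting) and each pair would be rebuilt as a Pair<XSDDateTime,XSDDateTime>.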

getTopicBreaksByWordDistribution

public abstract Set<XSDDateTime> getTopicBreaksByWordDistribution(String meeting)
Like getTopicBreaks(), but force Stephane's lexical similarity method.

Parameters:
meeting - the meeting we want to segment
Returns:
the set of topic breaks
See Also:
getTopicBreaks()

getTopicBreaksByMITSegmentation

public Set<XSDDateTime> getTopicBreaksByMITSegmentation(String meeting)
Like getTopicBreaks(), but force segmentation via the MIT generative model. The config key topic.segmentation.numShifts determines the number of shifts (0 for the average number, < 0 for automatic selection).

Parameters:
meeting - the meeting we want to segment
Returns:
the set of topic breaks
See Also:
getTopicBreaks(), getTopicBreaksByMITSegmentation(String,Integer)

getTopicBreaksByMITSegmentation

public abstract Set<XSDDateTime> getTopicBreaksByMITSegmentation(String meeting,
                                                                 Integer numShifts)
Like getTopicBreaks(), but force segmentation via the MIT generative model, using the given number of shifts.

Parameters:
meeting - the meeting we want to segment
numShifts - if > 0, the fixed number of shifts required; if = 0, fix the number of shifts at the average; if < 0, use the average probability threshold
Returns:
the set of topic breaks
See Also:
getTopicBreaks()
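
The three cases of the numShifts parameter can be read as a small decision rule. The sketch below only illustrates that rule as documented above; it is not the implementation. The class NumShiftsSketch, the Mode enum and the averageShifts argument are hypothetical names introduced for illustration, and how the average is actually computed is not specified here.

```java
import java.util.OptionalInt;

final class NumShiftsSketch {

    /** How the number of topic shifts is chosen (hypothetical names). */
    enum Mode { FIXED_COUNT, AVERAGE_COUNT, PROBABILITY_THRESHOLD }

    /** Maps a numShifts value to the documented behaviour. */
    static Mode modeFor(int numShifts) {
        if (numShifts > 0) {
            return Mode.FIXED_COUNT;           // exactly numShifts shifts are required
        }
        if (numShifts == 0) {
            return Mode.AVERAGE_COUNT;         // number of shifts fixed at the average
        }
        return Mode.PROBABILITY_THRESHOLD;     // average probability threshold decides
    }

    /**
     * Returns the fixed shift count, or empty when no count is fixed in advance.
     * averageShifts is an assumed, externally supplied average.
     */
    static OptionalInt resolveCount(int numShifts, int averageShifts) {
        switch (modeFor(numShifts)) {
            case FIXED_COUNT:
                return OptionalInt.of(numShifts);
            case AVERAGE_COUNT:
                return OptionalInt.of(averageShifts);
            default:
                return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        for (int n : new int[] {5, 0, -1}) {
            System.out.println(n + " -> " + modeFor(n) + " " + resolveCount(n, 7));
        }
    }
}
```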

getTopics

public List<Topic> getTopics(String meeting)
Description copied from interface: TopicSegmenterService
Finds the named (i.e. existing in the pool) topics discussed in a meeting. The implementation may base the segmentation used on direct shift classification, lexical similarity or the MIT generative model depending on the config key topic.segmentation.method

Specified by:
getTopics in interface TopicSegmenterService
Parameters:
meeting - The meeting to process.
Returns:
a List of Topics

getTopicsByWordDistribution

public List<Topic> getTopicsByWordDistribution(String meeting)
Like getTopics(), but force Stephane's lexical similarity method.

Parameters:
meeting - The meeting to process.
Returns:
a List of Topics
See Also:
getTopics(String)

getTopicsByMITSegmentation

public List<Topic> getTopicsByMITSegmentation(String meeting)
Like getTopics(), but force MIT's generative model. The config key topic.segmentation.numShifts determines the number of shifts (0 for the average number, < 0 for automatic selection).

Parameters:
meeting - The meeting to process.
Returns:
a List of Topics
See Also:
getTopics(String)

getTopicsByMITSegmentation

public List<Topic> getTopicsByMITSegmentation(String meeting,
                                              Integer numShifts)
Like getTopics(), but force MIT's generative model.

Parameters:
meeting - The meeting to process.
numShifts - if > 0, the fixed number of shifts required; if = 0, fix the number of shifts at the average; if < 0, use the average probability threshold
Returns:
a List of Topics
See Also:
getTopics(String)

getRelevantTopics

public List<Topic> getRelevantTopics(String query,
                                     String meeting,
                                     int k,
                                     boolean useThreshold)
Description copied from interface: TopicSegmenterService
Computes the saved topics for a particular meeting that are the most relevant to a certain query. Optionally exclude those for which relevance falls beneath a given threshold.

Specified by:
getRelevantTopics in interface TopicSegmenterService
Parameters:
query - The query we are searching.
meeting - The meeting to search.
k - Number of topics we want to return.
useThreshold - if false, apply no relevance threshold; if true, return only topics whose relevance exceeds the config value topic.extraction.relevanceThreshold.
Returns:
the list of at most k topics that are the most relevant to the query, most relevant first.
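
The selection described above (optional threshold filter, ordering by relevance, at most k results) can be sketched as follows. This is an illustration under stated assumptions, not the service's code: topics are represented by plain String names with precomputed relevance scores, the threshold is passed in directly rather than read from topic.extraction.relevanceThreshold, and useThreshold = false is read as "no filtering, still at most k results". The class RelevantTopicsSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class RelevantTopicsSketch {

    /** Returns at most k topic names, most relevant first. */
    static List<String> topK(Map<String, Double> scores, int k,
                             boolean useThreshold, double threshold) {
        return scores.entrySet().stream()
                // Drop low-relevance topics only when the caller asks for it.
                .filter(e -> !useThreshold || e.getValue() > threshold)
                // Highest relevance first.
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(Math.max(0, k))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("travel", 0.1);
        scores.put("budget", 0.9);
        scores.put("hiring", 0.4);
        System.out.println(topK(scores, 2, false, 0.3));
        System.out.println(topK(scores, 3, true, 0.3));
    }
}
```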

csliGetTopics

public abstract ArrayList<Pair<Topic,TopicArea>> csliGetTopics(String meeting)
Get the Topics discussed in a meeting (using the default method), before adding to/merging with the pool

Parameters:
meeting - the meeting to process
Returns:
a list of Pairs of Topic and TopicArea

csliGetTopicsByWordDistribution

public abstract ArrayList<Pair<Topic,TopicArea>> csliGetTopicsByWordDistribution(String meeting)
Get the Topics discussed in a meeting (forcing Stephane's lexical similarity method), before adding to/merging with the pool

Parameters:
meeting - the meeting to process
Returns:
a list of Pairs of Topic and TopicArea
See Also:
csliGetTopics(String)

csliGetTopicsByMITSegmentation

public abstract ArrayList<Pair<Topic,TopicArea>> csliGetTopicsByMITSegmentation(String meeting,
                                                                                Integer numShifts)
Get the Topics discussed in a meeting (forcing segmentation via the MIT generative model), before adding to/merging with the pool

Parameters:
meeting - the meeting to process
numShifts - if > 0, the fixed number of shifts required; if = 0, fix the number of shifts at the average; if < 0, use the average probability threshold
Returns:
a list of Pairs of Topic and TopicArea
See Also:
csliGetTopics(String)

csliGetTopicSimilarity

public Double csliGetTopicSimilarity(Topic a,
                                     Topic b)

locateTopic

public Pair<XSDDateTime,XSDDateTime> locateTopic(String meeting,
                                                 Topic t)
Description copied from interface: TopicSegmenterService
Locates the best matching area of a topic in the meeting

Specified by:
locateTopic in interface TopicSegmenterService
Parameters:
meeting - the meeting to search in
t - the topic we want to locate
Returns:
a pair of XSDDateTime giving the beginning and the end of the corresponding segment.

csliLocateTopic

public abstract TopicArea csliLocateTopic(String meeting,
                                          Topic t)

extractTopic

public Topic extractTopic(String meeting,
                          XSDDateTime time)
Description copied from interface: TopicSegmenterService
Extracts the topic discussed in meeting around a specified time. It extends the segment around that time to have a bigger coherent zone.

Specified by:
extractTopic in interface TopicSegmenterService
Parameters:
meeting - the meeting we are interested in.
time - the timestamp we focus on
Returns:
a Topic object giving the word distribution over the segment.

extractTopic

public Topic extractTopic(String meeting,
                          Pair<XSDDateTime,XSDDateTime> boundaries)
Description copied from interface: TopicSegmenterService
Extracts the topic discussed in meeting in a certain segment

Specified by:
extractTopic in interface TopicSegmenterService
Parameters:
meeting - the meeting we are interested in.
boundaries - the beginning and end of the segment of which we want to extract the topic.
Returns:
a Topic object giving the word distribution over that segment.

csliExtractTopic

public abstract Pair<Topic,TopicArea> csliExtractTopic(String meeting,
                                                       XSDDateTime time)

csliExtractTopic

public abstract Pair<Topic,TopicArea> csliExtractTopic(String meeting,
                                                       Pair<XSDDateTime,XSDDateTime> boundaries)

wasDiscussed

public abstract boolean wasDiscussed(Topic t,
                                     String meeting)
Description copied from interface: TopicSegmenterService
Evaluates whether a certain topic was discussed or not during a past meeting.

Specified by:
wasDiscussed in interface TopicSegmenterService
Parameters:
t - the topic
meeting - the meeting
Returns:
true if the topic was discussed during the meeting, false otherwise.

csliSearchTopic

public abstract List<Pair<Double,TopicArea>> csliSearchTopic(Topic t)
Specified by:
csliSearchTopic in class CsliOnlineTopicSegmenterService

prepareForNewMeeting

public void prepareForNewMeeting(String meetingName)
Description copied from interface: OnlineTopicSegmenterService
Performs all necessary precomputations when a new meeting is added to the database.

Specified by:
prepareForNewMeeting in interface OnlineTopicSegmenterService
Specified by:
prepareForNewMeeting in interface TopicSegmenterService
Overrides:
prepareForNewMeeting in class CsliOnlineTopicSegmenterService
Parameters:
meetingName - The name of the meeting being added to the corpus.

setTopicOptions

public void setTopicOptions(boolean learning)
Specified by:
setTopicOptions in interface OnlineTopicSegmenterService
Overrides:
setTopicOptions in class CsliOnlineTopicSegmenterService

isLearning

public boolean isLearning()
Specified by:
isLearning in interface OnlineTopicSegmenterService
Overrides:
isLearning in class CsliOnlineTopicSegmenterService

clear

public abstract void clear()
Description copied from interface: OnlineTopicSegmenterService
Clears all the cache files that have been generated by the TopicSegmenterService. Mostly for development purposes.

Specified by:
clear in interface OnlineTopicSegmenterService
Specified by:
clear in interface TopicSegmenterService
Specified by:
clear in class CsliOnlineTopicSegmenterService

locateDoc

public abstract ArrayList<Pair<Topic,TopicArea>> locateDoc(String meetingName)

createXML

public abstract void createXML(String meetingName)