edu.columbia.cs.cg.prdualrank.pattern.extractor
Class SearchPatternExtractor<T extends Document>

java.lang.Object
  extended by edu.columbia.cs.cg.prdualrank.pattern.extractor.SearchPatternExtractor<T>
All Implemented Interfaces:
PatternExtractor<Document>
Direct Known Subclasses:
DocumentSearchPatternExtractor, WindowedSearchPatternExtractor

public abstract class SearchPatternExtractor<T extends Document>
extends java.lang.Object
implements PatternExtractor<Document>

This class is part of our implementation of "Searching Patterns for Relation Extraction over the Web: Rediscovering the Pattern-Relation Duality", Y. Fang and K. C.-C. Chang, in WSDM, pages 825-834, 2011. For further information, see the WSDM 2011 Conference Website.

Description

Defines the common interface shared by every search pattern extractor in PRDualRank, whatever kind of search pattern it produces.
Two such kinds of search pattern are implemented in DocumentSearchPatternExtractor and WindowedSearchPatternExtractor.

Since:
2011-10-07
Version:
0.1
Author:
Pablo Barrio, Goncalo Simoes
See Also:
WSDM 2011 Conference Website, DocumentSearchPatternExtractor, WindowedSearchPatternExtractor

Constructor Summary
SearchPatternExtractor(int ngram, int numberOfPhrases)
           
 
Method Summary
 java.util.Map<Pattern<Document,TokenizedDocument>,java.lang.Integer> extractPatterns(TokenizedDocument document, Relationship relationship, java.util.List<Relationship> matchingRelationships)
          Extract specific patterns from the document in the parameter list for the specified relationship and other matching relationships in the same document.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

SearchPatternExtractor

public SearchPatternExtractor(int ngram,
                              int numberOfPhrases)
Method Detail

extractPatterns

public java.util.Map<Pattern<Document,TokenizedDocument>,java.lang.Integer> extractPatterns(TokenizedDocument document,
                                                                                            Relationship relationship,
                                                                                            java.util.List<Relationship> matchingRelationships)
Description copied from interface: PatternExtractor
Extract specific patterns from the document in the parameter list for the specified relationship and other matching relationships in the same document. The definition of matching used in this project is based on the EntityMatchers contained in the specified relationship.

Specified by:
extractPatterns in interface PatternExtractor<Document>
Parameters:
document - the document to be processed.
relationship - the relationship that the extractor is trying to generate patterns for.
matchingRelationships - the relationships contained in 'document' that match the specified relationship.
Returns:
a map from each extracted pattern to its associated Integer value
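
Because extractPatterns returns one map per call, a caller that processes several documents will probably want to combine the results. The following is a minimal sketch of one way to do that, not part of the library. It assumes the Integer values are counts that can be summed, and that the pattern keys implement equals and hashCode consistently; neither is stated above. The class PatternCountAggregator and its methods are hypothetical, and plain strings stand in for Pattern<Document,TokenizedDocument> keys so that the example runs on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of PRDualRank.
public class PatternCountAggregator {

    /**
     * Adds every (pattern, value) entry of perDocument into totals,
     * summing the values of patterns that are already present.
     * Assumption: the values are counts, so summing is meaningful.
     */
    public static <P> void accumulate(Map<P, Integer> totals,
                                      Map<P, Integer> perDocument) {
        for (Map.Entry<P, Integer> entry : perDocument.entrySet()) {
            totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
        }
    }

    /** Combines several per-document maps into a single map of totals. */
    public static <P> Map<P, Integer> aggregate(List<Map<P, Integer>> perDocumentMaps) {
        Map<P, Integer> totals = new HashMap<>();
        for (Map<P, Integer> perDocument : perDocumentMaps) {
            accumulate(totals, perDocument);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Strings stand in for the Pattern keys that extractPatterns would return.
        Map<String, Integer> firstDocument = new HashMap<>();
        firstDocument.put("born in", 2);
        firstDocument.put("native of", 1);

        Map<String, Integer> secondDocument = new HashMap<>();
        secondDocument.put("born in", 3);

        List<Map<String, Integer>> results = new ArrayList<>();
        results.add(firstDocument);
        results.add(secondDocument);

        Map<String, Integer> totals = aggregate(results);
        System.out.println(totals.get("born in"));   // prints 5
        System.out.println(totals.get("native of")); // prints 1
    }
}
```

In real use, each per-document map would come from a call such as extractor.extractPatterns(document, relationship, matchingRelationships) on a concrete subclass, and the type parameter P would be Pattern<Document,TokenizedDocument>.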