edu.columbia.cs.cg.prdualrank.pattern.extractor
Class SearchPatternExtractor<T extends Document>
java.lang.Object
edu.columbia.cs.cg.prdualrank.pattern.extractor.SearchPatternExtractor<T>
- All Implemented Interfaces:
- PatternExtractor<Document>
- Direct Known Subclasses:
- DocumentSearchPatternExtractor, WindowedSearchPatternExtractor
public abstract class SearchPatternExtractor<T extends Document>
- extends java.lang.Object
- implements PatternExtractor<Document>
This class is used for our implementation of:
"Searching Patterns for Relation Extraction over the Web: Rediscovering the Pattern-Relation Duality". Y. Fang and K. C.-C. Chang. In WSDM, pages 825-834, 2011.
For further information, see the WSDM 2011 Conference Website.
Description
Defines the common interface for every search pattern extractor in PRDualRank. Each concrete subclass implements one kind of search pattern.
Two such implementations are DocumentSearchPatternExtractor and WindowedSearchPatternExtractor.
- Since:
- 2011-10-07
- Version:
- 0.1
- Author:
- Pablo Barrio, Goncalo Simoes
- See Also:
- WSDM 2011 Conference Website, DocumentSearchPatternExtractor, WindowedSearchPatternExtractor
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
SearchPatternExtractor
public SearchPatternExtractor(int ngram,
int numberOfPhrases)
extractPatterns
public java.util.Map<Pattern<Document,TokenizedDocument>,java.lang.Integer> extractPatterns(TokenizedDocument document,
Relationship relationship,
java.util.List<Relationship> matchingRelationships)
- Description copied from interface: PatternExtractor
- Extracts patterns from the given document for the specified relationship and for the other relationships in the same document that match it. In this project, matching is defined by the EntityMatchers contained in the specified relationship.
- Specified by:
- extractPatterns in interface PatternExtractor<Document>
- Parameters:
- document - the document to be processed.
- relationship - the relationship that the extractor is trying to generate patterns for.
- matchingRelationships - the relationships contained in 'document' that match the specified relationship.
- Returns:
- a map from each extracted pattern to its associated integer value
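The return type of extractPatterns pairs each pattern with an Integer. The following is a minimal, self-contained sketch of how such a map could be built, assuming that the ngram constructor argument bounds the length of word n-grams and that the integer is an occurrence count. Neither assumption is confirmed by this page. The class NGramPatternSketch and the method countNGrams are hypothetical stand-ins that use plain strings instead of the library's Pattern and TokenizedDocument types, and the numberOfPhrases argument is not modeled.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: not part of the PRDualRank API.
class NGramPatternSketch {

    // Counts every word n-gram of length 1..maxN in the token list.
    // Assumption: a "pattern" is represented here as a space-joined n-gram.
    static Map<String, Integer> countNGrams(List<String> tokens, int maxN) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int n = 1; n <= maxN; n++) {
            for (int start = 0; start + n <= tokens.size(); start++) {
                String gram = String.join(" ", tokens.subList(start, start + n));
                // Insert 1 on first sight, otherwise add 1 to the current count.
                counts.merge(gram, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("born", "in", "the", "city", "born", "in");
        Map<String, Integer> counts = countNGrams(tokens, 2);
        // Print each candidate pattern with its count.
        counts.forEach((gram, count) -> System.out.println(gram + " -> " + count));
    }
}
```

A LinkedHashMap is used only so that the printed order follows first occurrence; the documented signature promises nothing about ordering.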