java.lang.Object
  org.apache.lucene.util.AttributeSource
    org.apache.lucene.analysis.TokenStream
      org.apache.lucene.analysis.Tokenizer
        edu.columbia.cs.cg.prdualrank.index.tokenizer.SpanBasedTokenizer
public class SpanBasedTokenizer extends org.apache.lucene.analysis.Tokenizer
This class requires the Apache Lucene library.
This class is used for our implementation of:
"Searching Patterns for Relation Extraction over the Web: Rediscovering the Pattern-Relation Duality". Y. Fang and K. C.-C. Chang. In WSDM, pages 825-834, 2011.
For further information, see the WSDM 2011 conference website.
Description
A Tokenizer for the Apache Lucene search engine that is driven by the precomputed tokens of the element to be indexed or searched for.
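The idea of walking precomputed tokens can be illustrated without Lucene. The sketch below is not the implementation of this class: `SpanCursorSketch` and its nested `Span` type are hypothetical stand-ins, the half-open `[start, end)` offsets are an assumption, and "must match the spans" is read as "same array length". It only mirrors the `incrementToken()` contract of returning `true` for each token and `false` once the input is exhausted.

```java
import java.util.Objects;

/** Lucene-free sketch: walks precomputed spans the way incrementToken() walks a stream. */
final class SpanCursorSketch {

    /** Hypothetical stand-in for the Span type; offsets are assumed to be [start, end). */
    static final class Span {
        final int start;
        final int end;

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    private final Span[] spans;
    private final String[] content;
    private int next = 0; // index of the next token to emit

    // State exposed after each successful incrementToken() call.
    private String term;
    private int startOffset;
    private int endOffset;

    SpanCursorSketch(Span[] spans, String[] content) {
        Objects.requireNonNull(spans, "spans");
        Objects.requireNonNull(content, "content");
        // Assumption: "must match the spans" means one content entry per span.
        if (spans.length != content.length) {
            throw new IllegalArgumentException("spans and content must have the same length");
        }
        this.spans = spans;
        this.content = content;
    }

    /** Advances to the next precomputed token; returns false once all are consumed. */
    boolean incrementToken() {
        if (next >= spans.length) {
            return false;
        }
        term = content[next];
        startOffset = spans[next].start;
        endOffset = spans[next].end;
        next++;
        return true;
    }

    String term() { return term; }
    int startOffset() { return startOffset; }
    int endOffset() { return endOffset; }

    public static void main(String[] args) {
        Span[] spans = { new Span(0, 5), new Span(6, 11) };
        String[] content = { "hello", "world" };
        SpanCursorSketch cursor = new SpanCursorSketch(spans, content);
        // Consume tokens until the cursor reports that none are left.
        while (cursor.incrementToken()) {
            System.out.println(cursor.term()
                    + " [" + cursor.startOffset() + "," + cursor.endOffset() + ")");
        }
    }
}
```

In a real Lucene tokenizer the term text and offsets would be published through attributes rather than accessor methods; the accessors here only keep the sketch free of dependencies.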
Nested Class Summary |
---|
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource |
---|
org.apache.lucene.util.AttributeSource.AttributeFactory, org.apache.lucene.util.AttributeSource.State |
Constructor Summary | |
---|---|
SpanBasedTokenizer(Span[] spans, java.lang.String[] content) | Instantiates a new span-based tokenizer. |
Method Summary | |
---|---|
boolean | incrementToken() |
Methods inherited from class org.apache.lucene.analysis.Tokenizer |
---|
close, reset |
Methods inherited from class org.apache.lucene.analysis.TokenStream |
---|
end, reset |
Methods inherited from class org.apache.lucene.util.AttributeSource |
---|
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, restoreState, toString |
Methods inherited from class java.lang.Object |
---|
getClass, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public SpanBasedTokenizer(Span[] spans, java.lang.String[] content)
Parameters:
spans - the spans of the element to be tokenized
content - the split content of the element to be tokenized. Must match the spans.
Method Detail |
---|
public boolean incrementToken() throws java.io.IOException
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: java.io.IOException