java.lang.Object
  edu.columbia.cs.cg.prdualrank.index.Index
public class Index
This class requires the Apache Lucene engine.
This class is used for our implementation of:
"Searching Patterns for Relation Extraction over the Web: Rediscovering the Pattern-Relation Duality". Y. Fang and K. C.-C. Chang. In WSDM, pages 825-834, 2011.
For further information, see the WSDM 2011 Conference Website.
Description
Indexer and searcher built on Apache Lucene. Used for optimal matching of the search patterns.
For details, see Algorithm PatternSearch(To,S,E) in Figure 9, Section 5 of the paper cited above.
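The index-then-search lifecycle this class exposes (add documents, close the index, retrieve at most n matching documents) can be illustrated with a small in-memory inverted index. This is only a sketch under stated assumptions: `TinyIndex` is a hypothetical stand-in that uses no Lucene code, it matches a single term instead of a Lucene `Query`, and its `close()` merely stops further additions, which may differ from what the real class does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration only; not the project's Index class and not Lucene.
final class TinyIndex {
    // term -> ids of the documents that contain it, in insertion order
    private final Map<String, Set<Integer>> postings = new LinkedHashMap<>();
    private final List<List<String>> documents = new ArrayList<>();
    private boolean closed = false;

    // Adds an already tokenized document and returns its id.
    int addDocument(List<String> tokens) {
        if (closed) {
            throw new IllegalStateException("index is closed");
        }
        int id = documents.size();
        documents.add(List.copyOf(tokens));
        for (String token : tokens) {
            postings.computeIfAbsent(token, k -> new LinkedHashSet<>()).add(id);
        }
        return id;
    }

    // Assumption of this sketch: closing only stops further additions.
    void close() {
        closed = true;
    }

    // Returns at most n documents that contain the term, in insertion order.
    List<List<String>> search(String term, int n) {
        List<List<String>> hits = new ArrayList<>();
        for (int id : postings.getOrDefault(term, Set.of())) {
            if (hits.size() >= n) {
                break;
            }
            hits.add(documents.get(id));
        }
        return hits;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.addDocument(List.of("Microsoft", "was", "founded", "by", "Bill", "Gates"));
        index.addDocument(List.of("Apple", "was", "founded", "by", "Steve", "Jobs"));
        index.close();
        // Ask for a single hit even though two documents contain the term.
        System.out.println(index.search("founded", 1));
    }
}
```

The `n` argument caps the result list in the same spirit as the `n` parameter of `search(Query, int)`; ranking, phrase matching, and analysis are left out of the sketch.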
Field Summary | |
---|---|
static java.lang.String | CONTENT |
Constructor Summary | |
---|---|
Index(TokenBasedAnalyzer myAnalyzer, boolean lowercase, java.util.Set<java.lang.String> stopWords) | Instantiates a new index. |
Method Summary | |
---|---|
void | addDocument(TokenizedDocument document): Adds the document to the index. |
void | close(): Closes and optimizes the index. |
java.util.List<TokenizedDocument> | search(org.apache.lucene.search.Query query, int n): Searches the index for the query. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final java.lang.String CONTENT
Constructor Detail |
---|
public Index(TokenBasedAnalyzer myAnalyzer, boolean lowercase, java.util.Set<java.lang.String> stopWords)
myAnalyzer - the analyzer used to index the content.
lowercase - specifies whether the content is stored in lowercase. If true, case-sensitive matching is not allowed.
stopWords - the set of stop words. Pass an empty set if no stop words are considered.
Method Detail |
---|
public void addDocument(TokenizedDocument document)
document - the document to be indexed.
public void close()
public java.util.List<TokenizedDocument> search(org.apache.lucene.search.Query query, int n)
query - the query to be issued.
n - the number of documents to be retrieved.
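The effect of the constructor parameters `lowercase` and `stopWords` on what gets indexed can be sketched as a token-normalization step. This is an assumption-laden illustration: `TokenNormalizer` is a hypothetical helper, and this page does not show how the real class or its `TokenBasedAnalyzer` applies these options. In particular, checking stop words after lowercasing is a choice made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the documented API.
final class TokenNormalizer {
    // Lowercases tokens if requested, then drops any token found in stopWords.
    // Assumption: the stop-word check runs on the (possibly lowercased) term.
    static List<String> normalize(List<String> tokens, boolean lowercase, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String term = lowercase ? token.toLowerCase(Locale.ROOT) : token;
            if (!stopWords.contains(term)) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("The", "Apple", "the", "apple");
        // With lowercase = true, "Apple" and "apple" collapse into one term,
        // so a case-sensitive match is no longer possible.
        System.out.println(normalize(tokens, true, Set.of("the")));
        // With lowercase = false, case is preserved and only the exact stop word is dropped.
        System.out.println(normalize(tokens, false, Set.of("the")));
        // An empty set means no stop words are considered.
        System.out.println(normalize(tokens, false, Set.of()));
    }
}
```

Once terms are stored in lowercase the original casing is gone from the index, which is why the `lowercase` flag rules out case-sensitive matching.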