Index

edu.columbia.cs.cg.prdualrank.index
Class Index

java.lang.Object
  edu.columbia.cs.cg.prdualrank.index.Index

public class Index
extends java.lang.Object

This class requires the Apache Lucene Engine.
It is part of our implementation of "Searching Patterns for Relation Extraction over the Web: Rediscovering the Pattern-Relation Duality", Y. Fang and K. C.-C. Chang, in WSDM, pages 825-834, 2011. For further information, see the WSDM 2011 Conference Website.

Description

Apache Lucene indexer and searcher, used for optimal matching of the search patterns.
For details, see Algorithm PatternSearch(To,S,E) in Figure 9, Section 5 of the paper cited above.

Since: 2011-10-07
Version: 0.1
Author: Pablo Barrio, Goncalo Simoes
See Also: Apache Lucene Engine, WSDM 2011 Conference Website

Field Summary
`static java.lang.String`	`CONTENT`

Constructor Summary
`Index(TokenBasedAnalyzer myAnalyzer, boolean lowercase, java.util.Set<java.lang.String> stopWords)` Instantiates a new index.

Method Summary
`void`	`addDocument(TokenizedDocument document)` Adds the document to the index.
`void`	`close()` Closes and optimizes the index.
`java.util.List<TokenizedDocument>`	`search(org.apache.lucene.search.Query query, int n)` Searches the index for the query.

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

CONTENT

public static final java.lang.String CONTENT

See Also: Constant Field Values

Constructor Detail

Index

public Index(TokenBasedAnalyzer myAnalyzer,
             boolean lowercase,
             java.util.Set<java.lang.String> stopWords)

Instantiates a new index.

Parameters:
myAnalyzer - the analyzer used to index the content.
lowercase - whether the content is stored in lowercase. If true, case-sensitive matching is not possible.
stopWords - the set of stop words. Pass an empty set if no stop words should be considered.
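The combined effect of the lowercase and stopWords parameters can be pictured with a small stand-in. This is only a sketch, not the class's implementation: it assumes whitespace tokenization and that stop words are compared after lowercasing, neither of which is confirmed by this page. `NormalizationSketch` and `normalize` are hypothetical names, and no Lucene or `TokenBasedAnalyzer` code is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in: illustrates the documented parameters only,
// not how Index or TokenBasedAnalyzer actually process content.
final class NormalizationSketch {

    // Splits on whitespace (assumption), optionally lowercases each token,
    // then drops any token found in stopWords.
    static List<String> normalize(String text, boolean lowercase, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields a single empty token
            }
            String t = lowercase ? token.toLowerCase(Locale.ROOT) : token;
            if (!stopWords.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "of"));
        // With lowercase = true, "The" is folded to "the" and removed as a stop word.
        System.out.println(normalize("The capital of France", true, stopWords));
        // With lowercase = false, "The" keeps its case and is not matched by "the".
        System.out.println(normalize("The capital of France", false, stopWords));
    }
}
```

Under these assumptions, storing content in lowercase means a query for "France" and one for "france" can no longer be told apart, which matches the note that case-sensitive matching is not possible when lowercase is true.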

Method Detail

addDocument

public void addDocument(TokenizedDocument document)

Adds the document to the index.

Parameters:
document - the document to be indexed.

close

public void close()

Closes and optimizes the index.

search

public java.util.List<TokenizedDocument> search(org.apache.lucene.search.Query query,
                                                int n)

Searches the index for the query.

Parameters:
query - the query to be issued.
n - the number of documents to be retrieved.
Returns:
the list of documents that match the query.
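The role of the n parameter can be sketched with an in-memory stand-in. Everything here is an assumption made for illustration: `SearchLimitSketch` is a hypothetical class, documents are plain strings rather than `TokenizedDocument` instances, the query is a single term rather than an `org.apache.lucene.search.Query`, and hits are returned in insertion order, whereas the real class delegates matching and ordering to Lucene.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the add-then-search flow; not the Index class.
final class SearchLimitSketch {

    private final List<String> documents = new ArrayList<>();

    // Stores the document text as-is.
    void addDocument(String text) {
        documents.add(text);
    }

    // Returns up to n documents containing the term as a whitespace-separated token.
    List<String> search(String term, int n) {
        List<String> hits = new ArrayList<>();
        for (String doc : documents) {
            if (hits.size() >= n) {
                break; // stop once n documents have been collected
            }
            if (Arrays.asList(doc.split("\\s+")).contains(term)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        SearchLimitSketch index = new SearchLimitSketch();
        index.addDocument("paris is the capital of france");
        index.addDocument("berlin is the capital of germany");
        index.addDocument("paris hosts many museums");
        index.addDocument("paris in spring");
        // Three documents contain "paris", but only two are requested.
        System.out.println(index.search("paris", 2));
    }
}
```

In this sketch the result list is never longer than n and may be shorter when fewer documents match.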
