edu.columbia.cs.cg.prdualrank.searchengine
Class WebBasedSearchEngine
java.lang.Object
edu.columbia.cs.cg.prdualrank.searchengine.WebBasedSearchEngine
- All Implemented Interfaces:
- SearchEngine
- Direct Known Subclasses:
- BingSearchEngine
public abstract class WebBasedSearchEngine
- extends java.lang.Object
- implements SearchEngine
This class is used for our implementation of:
"Searching Patterns for Relation Extraction over the Web: Rediscovering the Pattern-Relation Duality". Y. Fang and K. C.-C. Chang. In WSDM, pages 825-834, 2011.
For further information, see the WSDM 2011 Conference Website.
Description
Abstract class that provides the behavior of any Web search engine used in PRdualRank.
This class is abstract because several Web search engine implementations are available:
1. Yahoo Boss API. Please read usage terms.
2. Bing API. Please read usage terms.
The search engine is used in the algorithm PatternSearch(To,S,E), shown in Figure 9 in Section 5 of the paper cited above.
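The split between an abstract Web-based engine and its concrete subclasses can be illustrated with a minimal sketch. Every name below is a hypothetical stand-in: SketchSearchEngine plays the role of SearchEngine, a Function<String, String> plays the role of RawDocumentLoader, documents are plain strings, and the fetchResultUrls hook is an assumption, since this page does not show how the real class delegates to BingSearchEngine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for SearchEngine; only the search(query, k_seed) shape comes from this page.
interface SketchSearchEngine {
    List<String> search(String query, int kSeed);
}

// Hypothetical abstract base: keeps the loader and leaves the engine-specific lookup to subclasses.
abstract class SketchWebEngine implements SketchSearchEngine {
    // Stand-in for RawDocumentLoader: maps a URL to the loaded document content.
    private final Function<String, String> loader;

    SketchWebEngine(Function<String, String> loader) {
        this.loader = loader;
    }

    // Assumed hook, not part of the documented API: returns at most kSeed result URLs.
    protected abstract List<String> fetchResultUrls(String query, int kSeed);

    @Override
    public List<String> search(String query, int kSeed) {
        List<String> documents = new ArrayList<>();
        for (String url : fetchResultUrls(query, kSeed)) {
            documents.add(loader.apply(url)); // load each hit through the shared loader
        }
        return documents;
    }
}

// Toy subclass with a fixed result set; a real subclass would call a Web search API here.
class CannedEngine extends SketchWebEngine {
    private static final List<String> HITS = Arrays.asList(
            "http://example.org/a", "http://example.org/b", "http://example.org/c");

    CannedEngine(Function<String, String> loader) {
        super(loader);
    }

    @Override
    protected List<String> fetchResultUrls(String query, int kSeed) {
        int count = Math.min(Math.max(kSeed, 0), HITS.size());
        return HITS.subList(0, count);
    }
}
```

In this sketch a caller depends only on the interface, so replacing one engine with another would not change the calling code.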
- Since:
- 2011-10-07
- Version:
- 0.1
- Author:
- Pablo Barrio, Goncalo Simoes
- See Also:
- WSDM 2011 Conference Website
Constructor Summary
WebBasedSearchEngine(RawDocumentLoader loader)
Instantiates a new Web-based search engine that uses the given loader to process the documents retrieved from the Web.
Method Summary
java.util.List<Document>
search(java.lang.String query, int k_seed)
Issues the query to the search engine and retrieves either k_seed documents or all the documents in the result set, whichever is lower.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
WebBasedSearchEngine
public WebBasedSearchEngine(RawDocumentLoader loader)
- Instantiates a new Web-based search engine that uses the given loader to process the documents retrieved from the Web.
- Parameters:
loader
- the loader used to process the documents retrieved from the Web
search
public java.util.List<Document> search(java.lang.String query,
int k_seed)
- Description copied from interface:
SearchEngine
- Issues the query to the search engine and retrieves either k_seed documents or all the documents in the result set, whichever is lower.
- Specified by:
search
in interface SearchEngine
- Parameters:
query
- the query to be issued
k_seed
- the number of documents to retrieve if it is lower than the total number of hits; otherwise all the results are retrieved.
- Returns:
- the list of documents contained in the result set.
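
The "whichever is lower" rule amounts to returning min(k_seed, total hits) documents. The helper below is a minimal sketch of that rule only. ResultLimiter and limit are hypothetical names that do not belong to this API, and treating a negative k_seed as zero is an assumption, since the documentation does not cover that case.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating the result-count rule; not part of PRdualRank.
final class ResultLimiter {
    private ResultLimiter() {
    }

    // Returns a new list with the first min(kSeed, hits.size()) elements of hits.
    // Assumption: a negative kSeed is treated as zero.
    static <T> List<T> limit(List<T> hits, int kSeed) {
        int count = Math.min(Math.max(kSeed, 0), hits.size());
        return new ArrayList<>(hits.subList(0, count));
    }
}
```

With three hits, limit(hits, 2) keeps the first two, while limit(hits, 5) returns all three.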