java.lang.Object
  edu.columbia.cs.cg.prdualrank.pattern.extractor.resource.TupleContext
public class TupleContext
This class is used for our implementation of:
"Searching Patterns for Relation Extraction over the Web: Rediscovering the Pattern-Relation Duality". Y. Fang and K. C.-C. Chang. In WSDM, pages 825-834, 2011.
For further information, see the WSDM 2011 Conference Website.
Description
This class represents the text surrounding one or more tuples. It is used to generate Search Patterns through Window Generation or Document Generation, respectively.
For Document Search Pattern Generation, no limit is placed on the size of the word arrays surrounding the tuples, so all the text except the tuple attribute values is considered in Search Pattern generation.
For instance, given a span between entities larger than 9, a window size of 10, and the following text (from Google.com): "When the acquisition is complete, YouTube will retain its distinct brand identity,
strengthening and complementing Google’s own fast-growing video business.
YouTube will continue to be based in San Bruno, CA, and all YouTube employees will remain with the company. With Google’s technology, advertiser relationships and global reach,
YouTube will continue to build on its success as one of the world’s most popular services for video entertainment."
One of the resulting tuple contexts (considering the first occurrence of YouTube, shown as COMPANY, and the first occurrence of Google, shown as BUYER) is:
1. ["When","the","acquisition","is","complete"] COMPANY ["will","retain","its","distinct","brand","identity","strengthening","and","complementing"] BUYER ["'s","own","fast-growing","video","business","YouTube","will","continue","to","be"]
For more information, read Definition 1 in Section 3.1 of the mentioned paper.
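The window example above can be reproduced with a minimal sketch. This is not the class's actual implementation: the class name `WindowContextSketch`, the method `contextAround`, and the assumptions that the text is already tokenized (punctuation removed) and that each entity is a single token are all hypothetical. The check on the maximum span between entities is omitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the documented API.
final class WindowContextSketch {

    /**
     * Returns three word sequences around two single-token entities:
     * up to `window` tokens before the first entity, every token between
     * the two entities, and up to `window` tokens after the second entity.
     */
    static List<String[]> contextAround(String[] tokens, int firstIdx, int secondIdx, int window) {
        if (window < 0 || firstIdx < 0 || secondIdx <= firstIdx || secondIdx >= tokens.length) {
            throw new IllegalArgumentException("invalid entity positions or window size");
        }
        int beforeStart = Math.max(0, firstIdx - window);
        int afterEnd = Math.min(tokens.length, secondIdx + 1 + window);

        List<String[]> parts = new ArrayList<>();
        parts.add(Arrays.copyOfRange(tokens, beforeStart, firstIdx));   // left window
        parts.add(Arrays.copyOfRange(tokens, firstIdx + 1, secondIdx)); // words between entities
        parts.add(Arrays.copyOfRange(tokens, secondIdx + 1, afterEnd)); // right window
        return parts;
    }

    public static void main(String[] args) {
        String[] tokens = {
            "When", "the", "acquisition", "is", "complete", "YouTube",
            "will", "retain", "its", "distinct", "brand", "identity",
            "strengthening", "and", "complementing", "Google",
            "'s", "own", "fast-growing", "video", "business",
            "YouTube", "will", "continue", "to", "be", "based", "in"
        };
        // YouTube is token 5 and Google is token 15; the window size is 10.
        for (String[] part : contextAround(tokens, 5, 15, 10)) {
            System.out.println(Arrays.toString(part));
        }
    }
}
```

With these inputs the three printed arrays hold 5, 9, and 10 words, matching the context listed above. The left window is shorter than 10 because only five words precede the first entity.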
Constructor Summary | |
---|---|
TupleContext(java.util.List<Span> realSpans)
Instantiates a new tuple context. |
Method Summary | |
---|---|
void |
addWords(java.lang.String[] newWords)
Adds a new sequence of words to the context. |
java.util.Set<java.lang.String[]> |
generateNgrams(int ngram)
Generates n-grams of size less than or equal to 'ngram' from the sequences of words. |
java.lang.String |
toString()
|
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public TupleContext(java.util.List<Span> realSpans)
realSpans
- the non-overlapping text segments of a tuple. Overlapping segments were combined in a previous step.
Method Detail |
---|
public void addWords(java.lang.String[] newWords)
newWords
- the newly detected sequence of words between attributes.
public java.util.Set<java.lang.String[]> generateNgrams(int ngram)
ngram
- the maximum size of ngrams.
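As an illustration of what "n-grams of size lower or equal to 'ngram'" can mean, the following is a minimal sketch and not the library's code. The class `NgramSketch` and method `ngramsUpTo` are hypothetical names. The sketch assumes that n-grams never cross the boundary between two word sequences, and it returns `Set<List<String>>` rather than `Set<String[]>`, because Java arrays compare by identity and would not collapse duplicate n-grams in a hash-based set.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the documented API.
final class NgramSketch {

    /**
     * Collects every contiguous n-gram with 1 <= n <= maxSize from each
     * word sequence. N-grams are built within a sequence only (assumption).
     */
    static Set<List<String>> ngramsUpTo(List<String[]> sequences, int maxSize) {
        Set<List<String>> result = new LinkedHashSet<>();
        for (String[] words : sequences) {
            for (int size = 1; size <= maxSize; size++) {
                for (int start = 0; start + size <= words.length; start++) {
                    // Lists compare by content, so repeated n-grams are stored once.
                    result.add(Arrays.asList(Arrays.copyOfRange(words, start, start + size)));
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String[]> sequences = Arrays.asList(
            new String[] {"will", "retain", "its"},
            new String[] {"will", "retain"});
        for (List<String> gram : ngramsUpTo(sequences, 2)) {
            System.out.println(gram);
        }
    }
}
```

A sequence shorter than the requested size simply contributes no n-grams of that size, so no bounds check beyond the loop condition is needed.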
public java.lang.String toString()
Overrides: toString in class java.lang.Object