edu.columbia.cs.ref.model
Class Document

java.lang.Object
  extended by edu.columbia.cs.ref.model.Document
All Implemented Interfaces:
Matchable, Writable, java.io.Serializable
Direct Known Subclasses:
TokenizedDocument

public class Document
extends java.lang.Object
implements java.io.Serializable, Writable, Matchable

Representation of a Document.

A Document is defined by its path, its file name, and a list of Segments that represent its content. A document may also be annotated with information such as entities and relationships.

Since:
2011-09-27
Version:
0.1
Author:
Pablo Barrio, Goncalo Simoes
See Also:
Serialized Form

Constructor Summary
Document(java.util.List<Segment> text)
          Constructs a document directly from its text, for documents that are not stored on disk
Document(java.lang.String path, java.lang.String fileName, java.util.List<Segment> text)
          Constructs a document from its path, file name, and content segments
 
Method Summary
 void addEntity(Entity entity)
          Adds a new annotation of an entity present in the document
 void addRelationship(Relationship relationship)
          Adds a new annotation of a relationship present in the document
 boolean equals(java.lang.Object obj)
           
 java.util.Collection<Entity> getEntities()
          Obtains the annotations of entities present in the document
 Entity getEntity(java.lang.String id)
          Obtains the entity annotation with a given id
 java.lang.String getFilename()
          Obtains the name of the file containing the document
 java.lang.String getPath()
          Obtains the path to the file containing the document
 java.util.List<Segment> getPlainText()
          Obtains the segments that represent the content of the document
 Relationship getRelationship(java.lang.String id)
          Obtains the relationship annotation with a given id
 java.util.Collection<Relationship> getRelationships()
          Obtains the annotations of relationships present in the document
 java.lang.String getSubstring(int offset, int length)
          Obtains the substring of the document starting at a given offset and with a given length
 java.lang.String getWritableValue()
          Obtains the name to be used when writing the file
 void setFilename(java.lang.String f)
          Sets the name of the file containing the document
 void setPath(java.lang.String path)
          Sets the path to the file containing the document
 void setPlainText(java.util.List<Segment> text)
          Sets the segments that represent the content of the document
 java.lang.String toString()
          Returns a string representation of the object.
 
Methods inherited from class java.lang.Object
getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Document

public Document(java.util.List<Segment> text)
Constructs a document directly from its text, for documents that are not stored on disk

Parameters:
text - list of segments in the document

Document

public Document(java.lang.String path,
                java.lang.String fileName,
                java.util.List<Segment> text)
Constructs a document from its path, file name, and content segments

Parameters:
path - Path to the file containing the document
fileName - Name of the file containing the document
text - Segments that represent the content of the document

Method Detail

getPlainText

public java.util.List<Segment> getPlainText()
Obtains the segments that represent the content of the document

Returns:
the segments that represent the content of the document

setPlainText

public void setPlainText(java.util.List<Segment> text)
Sets the segments that represent the content of the document

Parameters:
text - the segments to set

getEntities

public java.util.Collection<Entity> getEntities()
Obtains the annotations of entities present in the document

Returns:
The collection of entities that are present in the document

addEntity

public void addEntity(Entity entity)
Adds a new annotation of an entity present in the document

Parameters:
entity - new annotation of an entity

getEntity

public Entity getEntity(java.lang.String id)
Obtains the entity annotation with a given id

Parameters:
id - id of the annotation to be retrieved
Returns:
retrieved annotation
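
The entity methods above (addEntity, getEntity, getEntities) suggest an id-keyed store of annotations. The following is a minimal sketch of that pattern, not the library's implementation. EntitySketch and AnnotationIndexSketch are hypothetical stand-ins, and returning null for an unknown id is an assumption that this page does not confirm.

```java
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for Entity; the real class is not described on this page.
final class EntitySketch {
    final String id;
    final String value;

    EntitySketch(String id, String value) {
        this.id = id;
        this.value = value;
    }
}

// Hypothetical holder showing one way to keep annotations retrievable by id.
final class AnnotationIndexSketch {
    // LinkedHashMap keeps insertion order, so getEntities() iterates predictably.
    private final Map<String, EntitySketch> entities = new LinkedHashMap<>();

    void addEntity(EntitySketch entity) {
        entities.put(entity.id, entity);
    }

    // Assumption: an unknown id yields null rather than an exception.
    EntitySketch getEntity(String id) {
        return entities.get(id);
    }

    Collection<EntitySketch> getEntities() {
        return entities.values();
    }

    public static void main(String[] args) {
        AnnotationIndexSketch index = new AnnotationIndexSketch();
        index.addEntity(new EntitySketch("e1", "Columbia University"));
        index.addEntity(new EntitySketch("e2", "New York"));
        System.out.println(index.getEntity("e2").value);
        System.out.println(index.getEntities().size());
    }
}
```

With a map keyed by id, adding a second annotation under an existing id would replace the first; whether Document behaves the same way is not stated here.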

getRelationships

public java.util.Collection<Relationship> getRelationships()
Obtains the annotations of relationships present in the document

Returns:
The collection of relationships that are present in the document

addRelationship

public void addRelationship(Relationship relationship)
Adds a new annotation of a relationship present in the document

Parameters:
relationship - new annotation of a relationship

getRelationship

public Relationship getRelationship(java.lang.String id)
Obtains the relationship annotation with a given id

Parameters:
id - id of the annotation to be retrieved
Returns:
retrieved annotation

toString

public java.lang.String toString()
Returns a string representation of the object.

Overrides:
toString in class java.lang.Object
Returns:
string representation of the object.

getSubstring

public java.lang.String getSubstring(int offset,
                                     int length)
Obtains the substring of the document starting at a given offset and with a given length

Parameters:
offset - start of the substring to be retrieved
length - size of the substring to be retrieved
Returns:
substring retrieved
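
The offset and length parameters can be illustrated with a small sketch. It assumes that offsets count characters across the concatenated text of the document's segments, which this page does not state. SegmentSketch and SubstringSketch are hypothetical stand-ins, not the library's classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for Segment: here a segment simply holds a piece of text.
final class SegmentSketch {
    final String value;

    SegmentSketch(String value) {
        this.value = value;
    }
}

final class SubstringSketch {
    // Assumption: the document text is the segment values joined in order,
    // and offset/length are measured in characters of that joined text.
    static String getSubstring(List<SegmentSketch> segments, int offset, int length) {
        StringBuilder all = new StringBuilder();
        for (SegmentSketch segment : segments) {
            all.append(segment.value);
        }
        // String-style substring takes an end index, so convert length to offset + length.
        return all.substring(offset, offset + length);
    }

    public static void main(String[] args) {
        List<SegmentSketch> text = Arrays.asList(
                new SegmentSketch("Hello, "),
                new SegmentSketch("world"));
        System.out.println(getSubstring(text, 7, 5)); // prints "world"
    }
}
```

Note the difference from java.lang.String.substring(int, int), whose second argument is an end index rather than a length.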

setPath

public void setPath(java.lang.String path)
Sets the path to the file containing the document

Parameters:
path - the path to set

setFilename

public void setFilename(java.lang.String f)
Sets the name of the file containing the document

Parameters:
f - the filename to set

getFilename

public java.lang.String getFilename()
Obtains the name of the file containing the document

Returns:
The name of the file containing the document

getPath

public java.lang.String getPath()
Obtains the path to the file containing the document

Returns:
The path to the file containing the document

getWritableValue

public java.lang.String getWritableValue()
Obtains the name to be used when writing the file

Specified by:
getWritableValue in interface Writable
Returns:
The name to be used when writing the file

equals

public boolean equals(java.lang.Object obj)
Overrides:
equals in class java.lang.Object