HTMLContentKeeper

edu.columbia.cs.ref.tool.preprocessor.impl
Class HTMLContentKeeper

java.lang.Object
  edu.columbia.cs.ref.tool.preprocessor.impl.HTMLContentKeeper

All Implemented Interfaces: Preprocessor

public class HTMLContentKeeper
extends java.lang.Object
implements Preprocessor

An implementation of the Preprocessor interface that extracts the content of HTML files.

Since: 2011-09-27
Version: 0.1
Author: Pablo Barrio, Goncalo Simoes

Constructor Summary
`HTMLContentKeeper()`

Method Summary
`java.lang.String`	`process(java.lang.String content)` Processes the content of a document and returns a transformed version of that content as a String.

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

HTMLContentKeeper

public HTMLContentKeeper()

Method Detail

process

public java.lang.String process(java.lang.String content)

Processes the content of a document and returns a transformed version of that content as a String.

This processor extracts the content of HTML files by calling HtmlToText.htmlToPlainText(content) from the Google Data Java Client Library.

Specified by: process in interface Preprocessor

Parameters: content - the content of the document, represented as a String
Returns: the transformed version of the input content
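
The contract described above (a String of HTML goes in, a String of extracted content comes out) can be illustrated with a minimal, self-contained sketch. It makes several assumptions: the Preprocessor interface is redeclared locally with only the documented process(String) method, the names TagStrippingPreprocessor and PreprocessorDemo are hypothetical, and a crude regular expression stands in for HtmlToText.htmlToPlainText so that the example runs without the Google Data Java Client Library. The regular expression does not decode entities or handle scripts and comments, so its output will likely differ from what HTMLContentKeeper returns.

```java
import java.util.regex.Pattern;

// Local stand-in for the library's Preprocessor interface (assumption:
// only the documented process(String) method is reproduced here).
interface Preprocessor {
    String process(String content);
}

// Hypothetical illustration, NOT the real HTMLContentKeeper: it strips
// tags with a regular expression instead of calling
// HtmlToText.htmlToPlainText(content).
class TagStrippingPreprocessor implements Preprocessor {
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    @Override
    public String process(String content) {
        // Replace each tag with a space so adjacent words stay separated.
        String withoutTags = TAG.matcher(content).replaceAll(" ");
        // Collapse runs of whitespace and trim both ends.
        return WHITESPACE.matcher(withoutTags).replaceAll(" ").trim();
    }
}

public class PreprocessorDemo {
    public static void main(String[] args) {
        Preprocessor preprocessor = new TagStrippingPreprocessor();
        String html = "<p>Hello <b>world</b></p>";
        System.out.println(preprocessor.process(html)); // prints "Hello world"
    }
}
```

With the library on the classpath, the same call pattern would presumably apply to the documented class: construct it with new HTMLContentKeeper() and pass the HTML string to process(content).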
