Class NewsJobReuters

java.lang.Object
  extended by CollectJob
      extended by NewsJob
          extended by NewsJobReuters
All Implemented Interfaces:
org.quartz.Job

public class NewsJobReuters
extends NewsJob

News collection job for Reuters news feeds, implementing an HTML parsing routine.

Author:
fhogenboom

Field Summary
private static org.slf4j.Logger _log
           
 
Fields inherited from class NewsJob
file_ext, file_path, max_hash, max_retry, max_title, mhm, mhm_size, num_msgs, rss_name, rss_src, txt_cls, zip_buffer, zip_files
 
Fields inherited from class CollectJob
date, form_date, form_hday, form_time, hday, hour_stop, hour_strt, mnte_stop, mnte_strt, scnd_stop, scnd_strt, time, time_zone
 
Constructor Summary
NewsJobReuters()
          Creates a new Reuters news collection job.
 
Method Summary
protected  org.slf4j.Logger getLogger()
          Retrieves the Logger of the component.
protected  String parseHTML(String url)
          Parses the HTML page of a specified link into plain text.
 
Methods inherited from class NewsJob
collect, configureLocal, loadLocal, storeLocal
 
Methods inherited from class CollectJob
execute
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_log

private static org.slf4j.Logger _log

Constructor Detail

NewsJobReuters

public NewsJobReuters()
Creates a new Reuters news collection job.

Method Detail

getLogger

protected org.slf4j.Logger getLogger()
Retrieves the Logger of the component.

Specified by:
getLogger in class NewsJob
Returns:
Logger used for logging system output.
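
The "Specified by" entry suggests that NewsJob declares getLogger() and each feed-specific subclass supplies its own logger. A minimal sketch of that pattern follows. It is not the project's actual code: JobSketch and ReutersJobSketch are hypothetical names, and java.util.logging stands in for SLF4J so the example compiles without external libraries.

```java
import java.util.logging.Logger;

// Hypothetical stand-in for NewsJob: shared code obtains its logger
// through an accessor that each subclass must implement.
abstract class JobSketch {
    protected abstract Logger getLogger();

    // Shared code can log (or here, report the logger name) without
    // knowing which concrete job it is running in.
    String loggerName() {
        return getLogger().getName();
    }
}

// Hypothetical stand-in for NewsJobReuters: owns one static logger
// (like _log) and returns it from getLogger().
final class ReutersJobSketch extends JobSketch {
    private static final Logger LOG =
            Logger.getLogger(ReutersJobSketch.class.getName());

    @Override
    protected Logger getLogger() {
        return LOG;
    }
}
```

With this arrangement, log output from inherited methods such as collect would be attributed to the concrete job class rather than to the shared base class.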

parseHTML

protected String parseHTML(String url)
Parses the HTML page of a specified link into plain text.

Specified by:
parseHTML in class NewsJob
Parameters:
url - URL of the HTML page to be parsed.
Returns:
Plain-text content of the parsed page.
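
The page does not show how parseHTML extracts text, so the following is only an illustrative sketch of one way to reduce HTML markup to plain text. It works on an HTML string that has already been downloaded, not on a URL. HtmlTextSketch and toPlainText are hypothetical names, and the regex-based stripping is an assumption, not the routine NewsJobReuters actually uses.

```java
import java.util.regex.Pattern;

// Hypothetical helper: reduces an HTML document to plain text.
final class HtmlTextSketch {
    // Whole <script>...</script> and <style>...</style> elements, content included.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b[^>]*>.*?</\\1\\s*>");
    // Any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]+>");
    // Runs of whitespace, including line breaks.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String toPlainText(String html) {
        // Drop non-visible content first, then the remaining markup.
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Decode a few common entities; &amp; goes last to avoid double decoding.
        text = text.replace("&nbsp;", " ")
                   .replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&amp;", "&");
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{color:red}</style></head>"
                + "<body><h1>Markets rally</h1>"
                + "<p>Stocks rose &amp; bonds fell.</p></body></html>";
        // prints "Markets rally Stocks rose & bonds fell."
        System.out.println(toPlainText(html));
    }
}
```

Regular expressions are fragile on malformed markup. A production parser would more likely use a real HTML parsing library and select only the article body elements of the feed's pages.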