Package org.apache.nutch.parse
Class HtmlParseFilters
java.lang.Object
    org.apache.nutch.parse.HtmlParseFilters
public class HtmlParseFilters extends Object
Creates and caches HtmlParseFilter implementing plugins.
Field Summary
Fields
Modifier and Type    Field                    Description
static String        HTMLPARSEFILTER_ORDER
Constructor Summary
Constructors
Constructor                             Description
HtmlParseFilters(Configuration conf)
Method Summary
Modifier and Type    Method    Description
ParseResult    filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)    Run all defined filters.
Field Detail
HTMLPARSEFILTER_ORDER
public static final String HTMLPARSEFILTER_ORDER
See Also:
Constant Field Values
Constructor Detail
HtmlParseFilters
public HtmlParseFilters(Configuration conf)
Method Detail
filter
public ParseResult filter(Content content, ParseResult parseResult, HTMLMetaTags metaTags, DocumentFragment doc)
Run all defined filters.
Parameters:
content - the Content for a given response
parseResult - the result of running one or more Parsers on the content
metaTags - a populated HTMLMetaTags object
doc - a DocumentFragment (DOM) which can be processed in the filtering process
Returns:
a filtered ParseResult
See Also:
Parser.getParse(Content)
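The signature of filter suggests a chain in which each filter receives the result produced so far and returns a possibly modified one. The following is a minimal, self-contained sketch of that pattern, not the Nutch implementation. FilterChainSketch, Result, Filter and runAll are hypothetical stand-ins invented for illustration, and the assumption that each filter's output feeds the next is inferred from the signature rather than stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterChainSketch {

    /** Hypothetical stand-in for a parse result; NOT the Nutch ParseResult class. */
    public static final class Result {
        public final String text;

        public Result(String text) {
            this.text = text;
        }
    }

    /** Hypothetical stand-in for one filter plugin; NOT the Nutch HtmlParseFilter interface. */
    public interface Filter {
        Result filter(String content, Result result);
    }

    /**
     * Runs every filter in list order. Assumption: the result returned by one
     * filter is passed as input to the next, and the last result is returned.
     */
    public static Result runAll(List<Filter> filters, String content, Result initial) {
        Result current = initial;
        for (Filter f : filters) {
            current = f.filter(content, current);
        }
        return current;
    }

    public static void main(String[] args) {
        List<Filter> filters = new ArrayList<>();
        // First filter trims surrounding whitespace from the parsed text.
        filters.add((content, r) -> new Result(r.text.trim()));
        // Second filter upper-cases whatever the first filter produced.
        filters.add((content, r) -> new Result(r.text.toUpperCase()));

        Result out = runAll(filters, "<html>raw page</html>", new Result("  hello  "));
        System.out.println(out.text); // prints "HELLO"
    }
}
```

In this sketch the order of the list determines the order of execution. In Nutch the set and order of filters come from the configuration passed to the constructor, which this sketch does not model.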