Package org.apache.nutch.indexer
Class IndexingFilters
- java.lang.Object
-
- org.apache.nutch.indexer.IndexingFilters
-
public class IndexingFilters extends Object
Creates and caches IndexingFilter implementing plugins.
-
Field Summary
Fields
Modifier and Type: static String
Field: INDEXINGFILTER_ORDER
-
Constructor Summary
Constructors
Constructor: IndexingFilters(Configuration conf)
-
Method Summary
Methods
Modifier and Type: NutchDocument
Method: filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks)
Description: Run all defined filters.
-
Field Detail
-
INDEXINGFILTER_ORDER
public static final String INDEXINGFILTER_ORDER
- See Also:
- Constant Field Values
-
Constructor Detail
-
IndexingFilters
public IndexingFilters(Configuration conf)
-
Method Detail
-
filter
public NutchDocument filter(NutchDocument doc, Parse parse, Text url, CrawlDatum datum, Inlinks inlinks) throws IndexingException
Run all defined filters. Note: may return null if the document was filtered.
- Parameters:
doc - the NutchDocument to process with filters
parse - corresponding Parse object for the document
url - corresponding Text url for the document
datum - corresponding CrawlDatum for the document
inlinks - corresponding Inlinks for the document
- Returns:
the NutchDocument, or null if the document was filtered
- Throws:
IndexingException - if an error occurs within a filter
- See Also:
IndexingFilter.filter(NutchDocument, Parse, Text, CrawlDatum, Inlinks)
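Because filter may return null, callers need to check the result before indexing it. The sketch below illustrates that contract with stand-in types only: Doc, DocFilter and runFilters are hypothetical names invented for this example and are not part of the Nutch API. It also assumes the chain stops at the first filter that returns null, which is one plausible reading of "Run all defined filters" and is not confirmed by this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FilterChainSketch {

    /** Hypothetical stand-in for NutchDocument: a bag of named fields. */
    static final class Doc {
        final Map<String, String> fields = new HashMap<>();
    }

    /** Hypothetical stand-in for IndexingFilter; returning null drops the document. */
    interface DocFilter {
        Doc filter(Doc doc, String url) throws Exception;
    }

    /**
     * Runs each filter in order. Assumption: the chain stops as soon as
     * one filter returns null, and null is passed back to the caller.
     */
    static Doc runFilters(List<DocFilter> filters, Doc doc, String url) throws Exception {
        for (DocFilter f : filters) {
            doc = f.filter(doc, url);
            if (doc == null) {
                return null; // document was filtered out
            }
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        List<DocFilter> filters = new ArrayList<>();
        // First filter adds a field; second rejects URLs ending in ".tmp".
        filters.add((doc, url) -> { doc.fields.put("url", url); return doc; });
        filters.add((doc, url) -> url.endsWith(".tmp") ? null : doc);

        Doc kept = runFilters(filters, new Doc(), "http://example.com/page.html");
        Doc dropped = runFilters(filters, new Doc(), "http://example.com/scratch.tmp");

        // Callers must null-check before handing the document to an indexer.
        System.out.println(kept != null ? "indexed: " + kept.fields.get("url") : "skipped");
        System.out.println(dropped != null ? "indexed" : "skipped");
    }
}
```

With the real class, the same null check would apply to the NutchDocument returned by filter(doc, parse, url, datum, inlinks), with IndexingException handled or propagated by the caller.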