public class SubcollectionIndexingFilter extends Configured implements IndexingFilter

| Modifier and Type | Field and Description |
|---|---|
| static String | FIELD_NAME: Doc field name |
| static org.slf4j.Logger | LOG: Logger |

Inherited fields: X_POINT_ID

| Constructor and Description |
|---|
| SubcollectionIndexingFilter() |
| SubcollectionIndexingFilter(Configuration conf) |

| Modifier and Type | Method and Description |
|---|---|
| NutchDocument | filter(NutchDocument doc, String url, WebPage page): Adds fields or otherwise modifies the document that will be indexed for a parse. |
| Collection<WebPage.Field> | getFields() |
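
The contract of filter(doc, url, page) summarized above (take a document, add or modify fields, return it) can be illustrated with a minimal, self-contained sketch. Everything in it is a hypothetical stand-in and not the Nutch implementation: the Doc class replaces NutchDocument, the page argument is omitted, the URL-prefix rule is an assumed matching strategy, and the value "subcollection" for FIELD_NAME is an assumption, since this page does not state the constant's value.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are the real Nutch classes.
class SubcollectionFilterSketch {

    // Assumed value; the real FIELD_NAME constant's value is not given on this page.
    static final String FIELD_NAME = "subcollection";

    // Minimal stand-in for NutchDocument: field name -> list of values.
    static class Doc {
        private final Map<String, List<String>> fields = new LinkedHashMap<>();

        void add(String name, String value) {
            fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        }

        List<String> get(String name) {
            return fields.getOrDefault(name, new ArrayList<>());
        }
    }

    // Hypothetical rule set: subcollection name -> URL prefix it covers.
    private final Map<String, String> prefixes = new LinkedHashMap<>();

    SubcollectionFilterSketch define(String name, String urlPrefix) {
        prefixes.put(name, urlPrefix);
        return this;
    }

    // Same shape as filter(doc, url, page) without the page argument:
    // adds one FIELD_NAME value per matching subcollection, then returns the document.
    Doc filter(Doc doc, String url) {
        for (Map.Entry<String, String> rule : prefixes.entrySet()) {
            if (url.startsWith(rule.getValue())) {
                doc.add(FIELD_NAME, rule.getKey());
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        SubcollectionFilterSketch filter = new SubcollectionFilterSketch()
                .define("docs", "https://example.org/docs/")
                .define("site", "https://example.org/");
        Doc doc = filter.filter(new Doc(), "https://example.org/docs/intro.html");
        // Prints the subcollection names whose prefix matched the URL.
        System.out.println(doc.get(FIELD_NAME));
    }
}
```

Returning the same document instance mirrors the NutchDocument return type in the method summary; a URL that matches no rule simply leaves the document unchanged.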
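
Methods inherited from class Configured: getConf, setConf
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Configurable: getConf, setConf

public static final String FIELD_NAME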
public static final org.slf4j.Logger LOG
public SubcollectionIndexingFilter()
public SubcollectionIndexingFilter(Configuration conf)
public Collection<WebPage.Field> getFields()
Specified by: getFields in interface FieldPluggable
public NutchDocument filter(NutchDocument doc, String url, WebPage page) throws IndexingException
Description copied from interface: IndexingFilter
Specified by: filter in interface IndexingFilter
Parameters:
doc - document instance for collecting fields
url - page url
Throws:
IndexingException

Copyright © 2015 The Apache Software Foundation