public class ScoringFilters extends Configured implements ScoringFilter
Creates and caches ScoringFilter implementing plugins.

Fields inherited from interface ScoringFilter: X_POINT_ID

| Constructor and Description |
|---|
| ScoringFilters(Configuration conf) |

| Modifier and Type | Method | Description |
|---|---|---|
| void | distributeScoreToOutlinks(String fromUrl, WebPage row, Collection<ScoreDatum> scoreData, int allCount) | Distribute score value from the current page to all its outlinked pages. |
| float | generatorSortValue(String url, WebPage row, float initSort) | Calculate a sort value for Generate. |
| Collection<WebPage.Field> | getFields() | |
| float | indexerScore(String url, NutchDocument doc, WebPage row, float initScore) | This method calculates a Lucene document boost. |
| void | initialScore(String url, WebPage row) | Calculate a new initial score, used when adding newly discovered pages. |
| void | injectedScore(String url, WebPage row) | Calculate a new initial score, used when injecting new pages. |
| void | updateScore(String url, WebPage row, List<ScoreDatum> inlinkedScoreData) | This method calculates a new score during table update, based on the values contributed by inlinked pages. |
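
The sketch below illustrates one way a composite like this class can itself implement the filter interface and delegate to a chain of plugins. Value-returning methods such as generatorSortValue hand each filter's result to the next filter, while void methods such as injectedScore let every filter modify the row in place. This is not the Nutch implementation: Page, SketchFilter, SketchFilters, and exampleChain are hypothetical stand-ins invented for illustration, and the arithmetic in the two example filters is arbitrary.

```java
import java.util.Arrays;
import java.util.List;

public class ScoringChainSketch {

    /** Hypothetical stand-in for WebPage; it only holds a score. */
    static final class Page {
        float score;
    }

    /** Hypothetical, reduced stand-in for the ScoringFilter interface. */
    interface SketchFilter {
        float generatorSortValue(String url, Page row, float initSort);

        void injectedScore(String url, Page row);
    }

    /** Composite that is itself a filter and delegates to each member in order. */
    static final class SketchFilters implements SketchFilter {
        private final List<SketchFilter> filters;

        SketchFilters(List<SketchFilter> filters) {
            this.filters = filters;
        }

        @Override
        public float generatorSortValue(String url, Page row, float initSort) {
            float value = initSort;
            for (SketchFilter filter : filters) {
                // Each filter receives the value produced by the filters before it.
                value = filter.generatorSortValue(url, row, value);
            }
            return value;
        }

        @Override
        public void injectedScore(String url, Page row) {
            for (SketchFilter filter : filters) {
                // Filters modify the row in place; nothing is returned.
                filter.injectedScore(url, row);
            }
        }
    }

    /** Builds a two-filter chain with arbitrary example arithmetic. */
    static SketchFilters exampleChain() {
        SketchFilter doubling = new SketchFilter() {
            @Override
            public float generatorSortValue(String url, Page row, float initSort) {
                return initSort * 2.0f;
            }

            @Override
            public void injectedScore(String url, Page row) {
                row.score = 1.0f;
            }
        };
        SketchFilter offset = new SketchFilter() {
            @Override
            public float generatorSortValue(String url, Page row, float initSort) {
                return initSort + 0.5f;
            }

            @Override
            public void injectedScore(String url, Page row) {
                row.score += 0.25f;
            }
        };
        return new SketchFilters(Arrays.asList(doubling, offset));
    }

    public static void main(String[] args) {
        SketchFilters chain = exampleChain();
        Page page = new Page();
        chain.injectedScore("http://example.com/", page);
        System.out.println(page.score);
        System.out.println(chain.generatorSortValue("http://example.com/", page, 1.0f));
    }
}
```

Running main prints 1.25 and then 2.5. The first filter sets the injected score to 1.0 and the second adds 0.25. For the sort value, 1.0 is doubled to 2.0 and then offset by 0.5. Filter order therefore matters in a chain of this shape.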
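
Methods inherited from class Configured: getConf, setConf

Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface Configurable: getConf, setConf

public ScoringFilters(Configuration conf)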

public float generatorSortValue(String url, WebPage row, float initSort) throws ScoringFilterException

Specified by: generatorSortValue in interface ScoringFilter

Parameters:
- url - URL of the page
- initSort - initial sort value, or a value from previous filters in chain

Throws: ScoringFilterException

public void initialScore(String url, WebPage row) throws ScoringFilterException

Specified by: initialScore in interface ScoringFilter

Parameters:
- url - URL of the page

Throws: ScoringFilterException

public void injectedScore(String url, WebPage row) throws ScoringFilterException

Specified by: injectedScore in interface ScoringFilter

Parameters:
- url - URL of the page
- row - new page. Filters will modify it in-place.

Throws: ScoringFilterException

public void distributeScoreToOutlinks(String fromUrl, WebPage row, Collection<ScoreDatum> scoreData, int allCount) throws ScoringFilterException

Specified by: distributeScoreToOutlinks in interface ScoringFilter

Parameters:
- fromUrl - URL of the source page
- scoreData - a collection of ScoreDatum objects, one for every outlink. These ScoreDatum objects will be passed to updateScore(String, WebPage, List) for every outlinked URL.
- allCount - number of all collected outlinks from the source page

Throws: ScoringFilterException

public void updateScore(String url, WebPage row, List<ScoreDatum> inlinkedScoreData) throws ScoringFilterException

Specified by: updateScore in interface ScoringFilter

Parameters:
- url - URL of the page

Throws: ScoringFilterException

public float indexerScore(String url, NutchDocument doc, WebPage row, float initScore) throws ScoringFilterException

Specified by: indexerScore in interface ScoringFilter

Parameters:
- url - URL of the page
- doc - document. NOTE: this already contains all information collected by indexing filters. Implementations may modify this instance in order to store or remove some information.
- initScore - initial boost value for the Lucene document

Throws: ScoringFilterException

public Collection<WebPage.Field> getFields()

Specified by: getFields in interface FieldPluggable

Copyright © 2015 The Apache Software Foundation