nutch-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] Updated: (NUTCH-240) Scoring API: extension point, scoring filters and an OPIC plugin
Date Mon, 03 Apr 2006 12:21:00 GMT
     [ http://issues.apache.org/jira/browse/NUTCH-240?page=all ]

Andrzej Bialecki  updated NUTCH-240:
------------------------------------

    Attachment: Generator.patch.txt

This patch is an intermediate step towards the simplification of the scoring API. It changes
Generator to use an arbitrary FloatWritable for selecting topN records.
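
As a rough illustration of selecting topN records by an arbitrary float value, here is a
minimal sketch in plain Java. It is not the code in Generator.patch.txt: the class and method
names (TopNSketch, Entry, selectTopN) are hypothetical, and a plain float stands in for
FloatWritable so the example has no Hadoop dependency.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not Nutch code: keep the topN entries with the highest sort value.
class TopNSketch {

    /** Hypothetical record: a URL paired with the float value used for sorting. */
    static final class Entry {
        final String url;
        final float sortValue; // stands in for a FloatWritable sort key (assumption)

        Entry(String url, float sortValue) {
            this.url = url;
            this.sortValue = sortValue;
        }
    }

    /** Returns at most topN entries, ordered from highest to lowest sort value. */
    static List<Entry> selectTopN(List<Entry> entries, int topN) {
        if (topN <= 0) {
            return new ArrayList<>();
        }
        Comparator<Entry> bySortValue = Comparator.comparingDouble((Entry e) -> e.sortValue);
        // Min-heap: the smallest retained value sits at the head and is evicted first.
        PriorityQueue<Entry> heap = new PriorityQueue<>(bySortValue);
        for (Entry e : entries) {
            heap.add(e);
            if (heap.size() > topN) {
                heap.poll(); // drop the current lowest value
            }
        }
        List<Entry> result = new ArrayList<>(heap);
        result.sort(bySortValue.reversed());
        return result;
    }
}
```

The point of the sketch is only that the selection step needs nothing but a float per record,
so whatever computes that float can be swapped without touching the selection logic.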

If there are no objections, I'd like to commit this patch first, and then refactor the scoring
API to use this new Generator.

> Scoring API: extension point, scoring filters and an OPIC plugin
> ----------------------------------------------------------------
>
>          Key: NUTCH-240
>          URL: http://issues.apache.org/jira/browse/NUTCH-240
>      Project: Nutch
>         Type: Improvement
>     Versions: 0.8-dev
>     Reporter: Andrzej Bialecki 
>  Attachments: Generator.patch.txt, patch.txt
>
> This patch refactors all places where Nutch manipulates page scores into a plugin-based
> API. Using this API, it's possible to implement different scoring algorithms. It is also
> much easier to understand how scoring works.
> Multiple scoring plugins can be run in sequence, in a manner similar to URLFilters.
> An OPICScoringFilter plugin is also included, which contains the current implementation
> of the scoring algorithm. Together with the scoring API, it provides fully
> backward-compatible scoring.
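
The idea of running multiple scoring plugins in sequence, as described in the quoted issue,
could be sketched roughly as below. This is an illustration only, not the interface from
patch.txt: ScoringFilterSketch, ScoringChainSketch, adjust and apply are hypothetical names,
and the single-float signature is an assumption made to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a scoring plugin: takes a score and returns an adjusted one.
interface ScoringFilterSketch {
    float adjust(String url, float score);
}

// Hypothetical chain: applies each filter in order, feeding each result to the next filter.
class ScoringChainSketch {
    private final List<ScoringFilterSketch> filters;

    ScoringChainSketch(List<ScoringFilterSketch> filters) {
        // Defensive copy so later changes to the caller's list do not alter the chain.
        this.filters = new ArrayList<>(filters);
    }

    float apply(String url, float initialScore) {
        float score = initialScore;
        for (ScoringFilterSketch filter : filters) {
            score = filter.adjust(url, score);
        }
        return score;
    }
}
```

Because each filter sees the output of the previous one, the order of the filters matters in
this sketch; how the real API orders its plugins is not stated in this message.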

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

