lucene-java-user mailing list archives

From Doron Cohen <cdor...@gmail.com>
Subject Re: exponential boosts
Date Fri, 24 Apr 2009 10:16:33 GMT
On Fri, Apr 24, 2009 at 12:28 AM, Steven Bethard <bethard@stanford.edu> wrote:

> On 4/23/2009 2:08 PM, Marcus Herou wrote:
> > But perhaps one could use a FieldCache somehow ?
>
> Some code snippets that may help. I add the PageRank value as a field of
> the documents I index with Lucene like this:
>
>    Document document = new Document();
>    double pageRank = this.pageRanks.getCount(article.getId());
>    document.add(new Field(
>        PAGE_RANK_FIELD_NAME, Float.toString((float)pageRank),
>        Field.Store.YES, Field.Index.NOT_ANALYZED));


Note that there's no need to store this field (Field.Store.NO is enough):
it is the indexed value that is being used.

Also, note an additional approach: page ranks could be maintained externally,
conceptually as an array float[] pageRank, where pageRank[docid] is the PR of
that doc. The challenge is keeping the array aligned with index docids, so this
will not work well in a dynamic environment where docs are deleted and docids
change as a result. But if your setting is static in terms of docids, this
would allow you to update the PRs without re-indexing the entire collection.
To take this path, extend ValueSource over this array and construct a
ValueSourceQuery over that value source. This ValueSourceQuery will then be
your pageRankQuery, passed to CustomScoreQuery.
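
To make the array idea concrete, here is a minimal sketch in plain Java. It
deliberately avoids the Lucene classes so that it runs on its own; the class
name ExternalPageRanks and its methods are hypothetical, not Lucene API. The
floatVal(docid) lookup is roughly what a ValueSource over the array would
expose, and combining the two scores by multiplication is an assumption made
here for illustration.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a page-rank table kept outside the index. */
final class ExternalPageRanks {
    // pageRank[docid] holds the PR of the document with that index docid.
    private final float[] pageRank;

    ExternalPageRanks(int maxDoc, float defaultRank) {
        pageRank = new float[maxDoc];
        Arrays.fill(pageRank, defaultRank);
    }

    /** Updates one document's PR in place; nothing is re-indexed. */
    void set(int docid, float rank) {
        pageRank[docid] = rank;
    }

    /** Per-document lookup, the role a ValueSource over the array would play. */
    float floatVal(int docid) {
        return pageRank[docid];
    }

    /** Combines the text score with the PR (multiplication is an assumption). */
    float customScore(int docid, float textScore) {
        return textScore * floatVal(docid);
    }

    public static void main(String[] args) {
        ExternalPageRanks ranks = new ExternalPageRanks(3, 1.0f);
        ranks.set(2, 4.0f); // new PR for docid 2, no re-indexing involved
        System.out.println(ranks.customScore(2, 0.5f));
    }
}
```

The sketch only stays valid while docids stay put: once docids shift, the
array has to be rebuilt against the new numbering.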

Doron
