lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marcus Herou <marcus.he...@tailsweep.com>
Subject Change boost of documents / single fields / external scoring ?
Date Thu, 23 Apr 2009 20:01:58 GMT
Hi.

Confusing subject eh ? Trying to become a little clearer in a few sentences.

We have a Solr/Lucene index where each document is a Blog Entry. We have
just implemented the PageRank algorithm for Blogs and are about to add a
column to the index called score and perhaps adjust the document boost.

We have as well decided that it is the blog itself and not the individual
pages that are to be ranked so all entries belonging to one blog will
receive the same score.

I have not found a way to apply a document score without actually
re-indexing all fields in the affected entries (could very well be 100% at
every PageRank recalculation) and this will of course take hell of a long
time to reindex which effectively will render the process useless since it
would take a week or of reindexing as of current and will take more and more
time. (100M blog entries as of current and rapidly increasing).

Guess we have run into the issue where we have some "static" data which we
do not want to touch at all but we want to update certain "dynamic" fields.

Lucene is not a database I know but is there a way to implement external
search-time scoring or update individual fields ? Would there be a
possibilty to do some kind of join (parallell searches separate index types)
? or send the result to a separate sorting algorithm ? Hmmm.... Perhaps a
subclass of Sort ? Grasping at straws here folks...

Hope anyone of the core experts can help us.

Cheers

//Marcus Herou



-- 
Marcus Herou CTO and co-founder Tailsweep AB
+46702561312
marcus.herou@tailsweep.com
http://www.tailsweep.com/
http://blogg.tailsweep.com/

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message