lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "过佳" <ntts...@gmail.com>
Subject Re: How to add PageRank score with lucene's relevant score in sorting
Date Wed, 28 May 2008 13:45:44 GMT
I think this is not suitable for my system since the num of pages is very
large that will cost much time for reindex

2008/5/28, Ian Lea <ian.lea@gmail.com>:
>
> Yes.  But you'd have to do that anyway if you are storing pagerank in the
> index.
>
> One point on your 20s response time for sorting - is that for the
> first sort or subsequent ones?
> I believe that the first one will usually be substantially slower.
> But sorting is always likely to be slower than not sorting.
>
>
> --
> Ian.
>
>
> On Wed, May 28, 2008 at 12:47 PM, 过佳 <nttstar@gmail.com> wrote:
> > thanks lan, but this means that i must reindex these pages while the
> > pagerank score changed?
> >
> > 在08-5-28,Ian Lea <ian.lea@gmail.com> 写道:
> >>
> >> Hi
> >>
> >>
> >> Maybe you could use the pagerank score, possibly modified, as document
> >> boost at indexing time.  From the javadocs for
> >> Document.setBoost(boost)
> >>
> >> "Sets a boost factor for hits on any field of this document. This
> >> value will be multiplied into the score of all hits on this document"
> >>
> >> so will give you P * R rather than P + R.  Should be quick, though.
> >>
> >>
> >> --
> >> Ian.
> >>
> >>
> >> On Wed, May 28, 2008 at 11:02 AM, 过佳 <nttstar@gmail.com> wrote:
> >> > hi all ,
> >> >     I have a problem that how to "combine" two score to sort the
> search
> >> > result documents.
> >> >     for example I  have 10 million pages in lucene index , and i know
> >> their
> >> > pagerank scores. i give a query to it , every docs returned have a
> >> > lucene-score, mark it as R (relevant score), and  i  also  have its
> >> > pagerank score, mark it as P,  what i need is i want to sort the
> search
> >> > result base on the value "P+R".  You know if i store the pagerank
> score
> >> in
> >> > index and get it every search time , then compute P+R , then sort it ,
> >> this
> >> > way is too slow. in my system , when the search hits 500000 result ,
> the
> >> > sort may cost about 20s.
> >> >   Sorry for my poor english.  Anyone has a good idea?
> >> >
> >> > Best
> >> > Jarvis
> >> >
> >>
> >
>
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message