lucene-java-user mailing list archives

From "Zhang, Lisheng" <Lisheng.Zh...@BroadVision.com>
Subject RE: Boost more recent document
Date Thu, 01 Dec 2011 06:36:50 GMT
Hi Simon,

Sorry, I found that I cannot use payloads for this purpose: a payload can
be accessed only through term positions, and we do not query on the
timestamp. Ideally we would have some doc-level "payload" accessible
through the docId.

Then your initial suggestion to use CustomScoreQuery would be our solution.
From the source code I see that sorting is implemented with FieldCache, and
its performance seems OK even though we did not cache the reader. So we will
use CustomScoreQuery without a cache for now (truncating the timestamp to the
hour or day may help); if that is too slow we may consider caching selectively.
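
A minimal sketch of the "cut the timestamp to the hour or day" idea, in plain
Java with no Lucene classes. It assumes timestamps are stored as epoch
milliseconds and truncated in UTC; the class and method names are hypothetical
and chosen only for illustration. Fewer distinct timestamp values means fewer
unique terms for the field.

```java
// Sketch only: assumes epoch-millisecond timestamps, truncated in UTC.
final class TimestampBuckets {
    static final long MILLIS_PER_HOUR = 60L * 60L * 1000L;
    static final long MILLIS_PER_DAY = 24L * MILLIS_PER_HOUR;

    /** Rounds a timestamp down to the start of its UTC hour. */
    static long truncateToHour(long epochMillis) {
        // floorDiv rounds toward negative infinity, so pre-1970 values work too.
        return Math.floorDiv(epochMillis, MILLIS_PER_HOUR) * MILLIS_PER_HOUR;
    }

    /** Rounds a timestamp down to the start of its UTC day. */
    static long truncateToDay(long epochMillis) {
        return Math.floorDiv(epochMillis, MILLIS_PER_DAY) * MILLIS_PER_DAY;
    }

    public static void main(String[] args) {
        long t = 1322721410000L; // 2011-12-01T06:36:50Z
        System.out.println(truncateToHour(t)); // 1322719200000 (06:00:00Z)
        System.out.println(truncateToDay(t));  // 1322697600000 (00:00:00Z)
    }
}
```

With day resolution, one year of documents yields at most 366 distinct values
for the timestamp field.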

Thanks very much for all your great help. Please point out anything wrong
in the statements above.

Best regards, Lisheng

-----Original Message-----
From: Zhang, Lisheng [mailto:Lisheng.Zhang@BroadVision.com]
Sent: Wednesday, November 30, 2011 1:40 PM
To: java-user@lucene.apache.org; simon.willnauer@gmail.com
Subject: RE: Boost more recent document


Hi,

Thanks for the very interesting idea!

Currently we use Lucene 2.3.2 with the default merge policy (at any time
we have a few segments, and after some accumulation small segments are
merged into big ones). I need to double check whether docId can reflect
doc age.

But I have one concern: docId may not reflect the true age interval; a docId
difference of 2 may correspond to 2 minutes or 1 hour. If there is no better
choice, I may just use payloads and adapt a few query classes?

Thanks very much for the help, Lisheng

-----Original Message-----
From: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
Sent: Wednesday, November 30, 2011 1:02 PM
To: java-user@lucene.apache.org
Subject: Re: Boost more recent document


If you use LogMergePolicy, i.e. do merges in order, you could use the
absolute docID as a relative age value. Smaller docIDs mean older
documents. Maybe this works for you?

simon
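
A rough sketch of the docID-as-age idea, in plain Java with no Lucene classes.
It assumes docIDs grow in insertion order (which is what in-order merging
preserves), so a larger docID means a document added later. The linear mapping
and the names docIdBoost and maxExtra are hypothetical. As noted elsewhere in
the thread, this captures ordering only, not the elapsed time between documents.

```java
// Sketch only: assumes docIDs increase with insertion order.
final class DocIdAgeSketch {
    /**
     * Maps docId in [0, maxDoc) linearly onto [1.0, 1.0 + maxExtra),
     * so later-added documents receive a larger multiplier.
     */
    static double docIdBoost(int docId, int maxDoc, double maxExtra) {
        if (maxDoc <= 0 || docId < 0 || docId >= maxDoc) {
            throw new IllegalArgumentException("docId must be in [0, maxDoc)");
        }
        return 1.0 + maxExtra * ((double) docId / maxDoc);
    }

    public static void main(String[] args) {
        System.out.println(docIdBoost(0, 100, 1.0));  // oldest document
        System.out.println(docIdBoost(99, 100, 1.0)); // newest document
    }
}
```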

On Wed, Nov 30, 2011 at 9:08 PM, Zhang, Lisheng
<Lisheng.Zhang@broadvision.com> wrote:
> Thanks very much for your help! I got the point; the only problem is that
> I cannot afford to use FieldCache because our app has many Lucene index
> data folders. Is there another simple way?
>
> Thanks again, Lisheng
>
> -----Original Message-----
> From: Simon Willnauer [mailto:simon.willnauer@googlemail.com]
> Sent: Wednesday, November 30, 2011 11:40 AM
> To: java-user@lucene.apache.org
> Subject: Re: Boost more recent document
>
>
> On Wed, Nov 30, 2011 at 6:59 PM, Zhang, Lisheng
> <Lisheng.Zhang@broadvision.com> wrote:
>> Hi,
>>
>> We need to boost documents that are more recent (each doc has a time stamp
>> attribute). It seems that we cannot use doc boost at index time because it
>> will be condensed into one byte (cannot differentiate 365 days), so we may
>> use payloads (save the time stamp as a payload) to boost at search time.
>>
>> In our app we let the user enter a query in the browser and use QueryParser
>> to generate the query. The query can be of different types (TermQuery,
>> BooleanQuery, WildcardQuery, ...), so it seems we need to create a
>> customized query class for each, similar to PayloadTermQuery. Is there
>> another, simpler way?
>
> You can simply index your timestamp (untokenized) and wrap your query
> in a CustomScoreQuery. This query accepts your user query and a
> ValueSource. During search CustomScoreQuery calls your ValueSource for
> each document that the user query scores and multiplies the result of
> the ValueSource into the score. Inside your ValueSource you can simply
> get the timestamps from the FieldCache and calculate your custom
> boost...
>
> hope that helps
>
> simon
>>
>> Thanks very much for the help, Lisheng
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
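
The scoring arithmetic described in the quoted reply (the query score
multiplied by a value computed from the document's timestamp) can be sketched
in plain Java. This is not Lucene's API: in a real setup the calculation would
sit inside the ValueSource that CustomScoreQuery consults. The exponential
half-life decay, the class name, and the parameter names are assumptions made
for illustration; the thread does not specify a boost formula.

```java
// Sketch only: plain Java arithmetic, not Lucene classes.
final class RecencyBoostSketch {
    static final double MILLIS_PER_DAY = 24 * 60 * 60 * 1000.0;

    /**
     * Hypothetical decay: 1.0 for a brand-new document, halving every
     * halfLifeDays. Timestamps in the future are treated as age zero.
     */
    static double recencyBoost(long docMillis, long nowMillis, double halfLifeDays) {
        double ageDays = Math.max(0L, nowMillis - docMillis) / MILLIS_PER_DAY;
        return Math.pow(0.5, ageDays / halfLifeDays);
    }

    /** Multiplies the boost into the score produced by the user query. */
    static double combinedScore(double queryScore, long docMillis,
                                long nowMillis, double halfLifeDays) {
        return queryScore * recencyBoost(docMillis, nowMillis, halfLifeDays);
    }

    public static void main(String[] args) {
        long now = 1322721410000L;                 // 2011-12-01T06:36:50Z
        long thirtyDaysAgo = now - 30L * 86400000L;
        // A 30-day-old document with a 30-day half-life keeps half its score.
        System.out.println(combinedScore(2.0, thirtyDaysAgo, now, 30.0));
        System.out.println(combinedScore(2.0, now, now, 30.0));
    }
}
```

Because the boost depends only on the timestamp, it applies the same way to
any query type that QueryParser produces.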

