lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Can I do boosting based on term postions?
Date Thu, 02 Aug 2007 07:22:55 GMT
Cedric,

SpanFirstQuery could be a solution without payloads.
You may want to give it your own Similarity.sloppyFreq() .

Regards,
Paul Elschot

On Thursday 02 August 2007 04:07, Cedric Ho wrote:
> Thanks for the quick response =)
> 
> On 8/1/07, Shailendra Sharma <shailendra.sharma@gmail.com> wrote:
> > Yes, it is easily doable through "Payload" facility. During indexing 
process
> > (mainly tokenization), you need to push this extra information in each
> > token. And then you can use BoostingTermQuery for using Payload value to
> > include Payload in the score. You also need to implement Similarity for 
this
> > (mainly scorePayload method).
> 
> If I store, say a custom boost factor as Payload, does it means that I
> will store one more byte per term per document in the index file? So
> the index file would be much larger?
> 
> >
> > Other way can be to extend SpanTermQuery, this already calculates the
> > position of match. You just need to do something to use this position 
value
> > in the score calculation.
> 
> I see that SpanTermQuery takes a TermPositions from the indexReader
> and I can get the term position from there. However I am not sure how
> to incorporate it into the score calculation. Would you mind give a
> little more detail on this?
> 
> >
> > One possible advantage of SpanTermQuery approach is that you can play
> > around, without re-creating indices everytime.
> >
> > Thanks,
> > Shailendra Sharma,
> > CTO, Ver se' Innovation Pvt. Ltd.
> > Bangalore, India
> >
> > On 8/1/07, Cedric Ho <cedric.ho@gmail.com> wrote:
> > >
> > > Hi all,
> > >
> > > I was wondering if it is possible to do boosting by search terms'
> > > position in the document.
> > >
> > > for example:
> > > search terms appear in the first 100 words, or first 10% words, or in
> > > first two paragraphs would be given higher score.
> > >
> > > Is it achievable through using the new Payload function in lucene 2.2?
> > > Or are there any easier ways to achieve these ?
> > >
> > >
> > > Regards,
> > > Cedric
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> >
> 
> Thanks,
> Cedric
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message