lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Paul Elschot <paul.elsc...@xs4all.nl>
Subject Re: Can I do boosting based on term postions?
Date Fri, 03 Aug 2007 14:49:07 GMT
Cedric,

You can choose the end limit for SpanFirstQuery yourself.

Regards,
Paul Elschot


On Friday 03 August 2007 05:38, Cedric Ho wrote:
> Hi Paul,
> 
> Isn't SpanFirstQuery only match those with position less than a
> certain end position?
> 
> I am rather looking for a query that would score a document higher for
> terms appear near the start but not totally discard those with terms
> appear near the end.
> 
> Regards,
> Cedric
> 
> On 8/2/07, Paul Elschot <paul.elschot@xs4all.nl> wrote:
> > Cedric,
> >
> > SpanFirstQuery could be a solution without payloads.
> > You may want to give it your own Similarity.sloppyFreq() .
> >
> > Regards,
> > Paul Elschot
> >
> > On Thursday 02 August 2007 04:07, Cedric Ho wrote:
> > > Thanks for the quick response =)
> > >
> > > On 8/1/07, Shailendra Sharma <shailendra.sharma@gmail.com> wrote:
> > > > Yes, it is easily doable through "Payload" facility. During indexing
> > process
> > > > (mainly tokenization), you need to push this extra information in each
> > > > token. And then you can use BoostingTermQuery for using Payload value

to
> > > > include Payload in the score. You also need to implement Similarity 
for
> > this
> > > > (mainly scorePayload method).
> > >
> > > If I store, say a custom boost factor as Payload, does it means that I
> > > will store one more byte per term per document in the index file? So
> > > the index file would be much larger?
> > >
> > > >
> > > > Other way can be to extend SpanTermQuery, this already calculates the
> > > > position of match. You just need to do something to use this position
> > value
> > > > in the score calculation.
> > >
> > > I see that SpanTermQuery takes a TermPositions from the indexReader
> > > and I can get the term position from there. However I am not sure how
> > > to incorporate it into the score calculation. Would you mind give a
> > > little more detail on this?
> > >
> > > >
> > > > One possible advantage of SpanTermQuery approach is that you can play
> > > > around, without re-creating indices everytime.
> > > >
> > > > Thanks,
> > > > Shailendra Sharma,
> > > > CTO, Ver se' Innovation Pvt. Ltd.
> > > > Bangalore, India
> > > >
> > > > On 8/1/07, Cedric Ho <cedric.ho@gmail.com> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > I was wondering if it is possible to do boosting by search terms'
> > > > > position in the document.
> > > > >
> > > > > for example:
> > > > > search terms appear in the first 100 words, or first 10% words, or

in
> > > > > first two paragraphs would be given higher score.
> > > > >
> > > > > Is it achievable through using the new Payload function in lucene

2.2?
> > > > > Or are there any easier ways to achieve these ?
> > > > >
> > > > >
> > > > > Regards,
> > > > > Cedric
> > > > >
> > > > > 
---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > > >
> > > > >
> > > >
> > >
> > > Thanks,
> > > Cedric
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > >
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 
> 
> -- 
> 愛@上.Keyboard
> 
> 
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message