lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From scott w <scottbl...@gmail.com>
Subject Re: Question about how to speed up custom scoring
Date Sun, 11 Oct 2009 02:24:42 GMT
Haven't tried it yet but looking at it closer it looks like it's not
something I can plug in on top of my original query. I am definitely happy
using an approximation for the sake of performance but I do need to be able
to have the original results stay the same.

On Fri, Oct 9, 2009 at 5:32 PM, Jake Mannix <jake.mannix@gmail.com> wrote:

> Great Scott (hah!) - please do report back, even if it just works fine and
> you have no more questions, I'd like to know whether this really is
> what you were after and actually works for you.
>

> Note that the FieldCache is kinda "magic" - it's lazy (so the first query
> will
> be slow and you should fire one off just to warm it up after every time
> you reload an IndexReader), and kinda leaky: the entries in the FieldCache
> stick around until all references to the IndexReader they're keyed on
> get garbage collected (there's a WeakHashMap in the background), so
> don't accidentally hang onto references to those IndexReaders past
> when needed.
>

Good to know.

>
>  -jake
>
> On Fri, Oct 9, 2009 at 3:52 PM, scott w <scottblanc@gmail.com> wrote:
>
> > Thanks Jake! I will test this out and report back soon in case it's
> helpful
> > to others. Definitely appreciate the help.
> >
> > Scott
> >
> > On Fri, Oct 9, 2009 at 3:33 PM, Jake Mannix <jake.mannix@gmail.com>
> wrote:
> >
> > > On Fri, Oct 9, 2009 at 3:07 PM, scott w <scottblanc@gmail.com> wrote:
> > >
> > > > Example Document:
> > > > model_1_score = 0.9
> > > > model_2_score = 0.3
> > > > model_3_score = 0.7
> > > >
> > > > I want to be able to pass in the following map at query time:
> > > > {model_1_score=0.4, model_2_score=0.7} and have that map get used as
> > > input
> > > > to a custom score function that would look like: 0.9*0.4 + 0.3*0.7,
> so
> > > that
> > > > it is summing over the specified fields and multipling the indexed
> > weight
> > > > by
> > > > the query time weight.
> > > >
> > >
> > > Ok, now I think I get it.  You do have some index-time floats, and
> > > query-time
> > > boosts.
> > >
> > > You should be able to do a normal BooleanQuery with three OR'ed
> together
> > > ValueSourceQueries boosted by input weights, it seems, as that is the
> way
> > > they work - they take the values in different fields and use them as
> the
> > > score,
> > > and the normal boosting technique will use query time weigting.
> > >
> > > To get good performance with a ValueSourceQuery, I'd suggest using a
> > > FieldCacheSource for speedier processing:
> > >
> > > -----------
> > >  String field = "model_1_score";
> > >  float runTimeBoost = 0.5;
> > >
> > >  ValueSource model1Source = new FloatFieldSource(field,
> > >    new FieldCache.FloatParser() {
> > >      public float parseFloat(String s) { return Float.parseFloat(s); }
> > >    });
> > >
> > >  Query model1Q = new ValueSourceQuery(model1Source);
> > >
> > >  model1Q.setBoost(runTimeBoost);
> > >
> > > // do this for model2 and model3 as well...
> > >
> > >  BooleanQuery bq = new BooleanQuery();
> > >  bq.add(model1Q, Occur.SHOULD);
> > >  bq.add(model2Q, Occur.SHOULD);
> > >  bq.add(model3Q, Occur.SHOULD);
> > > ---------------
> > >
> > >  I haven't tried this code, but it seems like this is what you are
> trying
> > > to do...
> > >
> > >  -jake
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message