lucene-java-user mailing list archives

From Jake Mannix <jake.man...@gmail.com>
Subject Re: Question about how to speed up custom scoring
Date Sat, 10 Oct 2009 00:32:59 GMT
Great Scott (hah!) - please do report back, even if it just works fine and
you have no more questions. I'd like to know whether this really is
what you were after and actually works for you.

Note that the FieldCache is kinda "magic" - it's lazy (so the first query
against a field will be slow, and you should fire one off to warm it up
each time you reopen an IndexReader), and kinda leaky: the entries in the
FieldCache stick around until all references to the IndexReader they're
keyed on get garbage collected (there's a WeakHashMap in the background),
so don't accidentally hang onto references to those IndexReaders past
when needed.
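
A toy analogue of that lazy, reader-keyed behaviour in plain Java, as a
sketch only. This is not Lucene's FieldCache: the class name
LazyPerReaderCache, its loader function and the stand-in "reader" object
are all hypothetical, chosen to show why the first lookup pays the load
cost and why entries live as long as the key is strongly referenced.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a per-reader cache; NOT Lucene's FieldCache.
final class LazyPerReaderCache<R, V> {
    // Weak keys: an entry becomes collectable once nothing else
    // strongly references its reader.
    private final Map<R, V> cache = new WeakHashMap<>();
    private final Function<R, V> loader;
    private int loads = 0;

    LazyPerReaderCache(Function<R, V> loader) {
        this.loader = loader;
    }

    // Lazy: the (expensive) loader runs only on the first lookup per reader.
    synchronized V get(R reader) {
        V values = cache.get(reader);
        if (values == null) {
            values = loader.apply(reader);
            loads++;
            cache.put(reader, values);
        }
        return values;
    }

    synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        LazyPerReaderCache<Object, float[]> cache =
            new LazyPerReaderCache<>(r -> new float[] {0.9f, 0.3f, 0.7f});
        Object reader = new Object();   // stands in for an IndexReader
        cache.get(reader);              // slow path: values are loaded
        cache.get(reader);              // fast path: served from the cache
        System.out.println("loads after two lookups: " + cache.loadCount());
    }
}
```

A new reader object is a new key, so every reopen starts cold again, which
is why a warm-up query after each reopen helps.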

  -jake

On Fri, Oct 9, 2009 at 3:52 PM, scott w <scottblanc@gmail.com> wrote:

> Thanks Jake! I will test this out and report back soon in case it's helpful
> to others. Definitely appreciate the help.
>
> Scott
>
> On Fri, Oct 9, 2009 at 3:33 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
>
> > On Fri, Oct 9, 2009 at 3:07 PM, scott w <scottblanc@gmail.com> wrote:
> >
> > > Example Document:
> > > model_1_score = 0.9
> > > model_2_score = 0.3
> > > model_3_score = 0.7
> > >
> > > I want to be able to pass in the following map at query time:
> > > {model_1_score=0.4, model_2_score=0.7} and have that map get used as
> > > input to a custom score function that would look like:
> > > 0.9*0.4 + 0.3*0.7, so that it is summing over the specified fields and
> > > multiplying the indexed weight by the query time weight.
> > >
> >
> > Ok, now I think I get it.  You do have some index-time floats, and
> > query-time boosts.
> >
> > You should be able to do a normal BooleanQuery with three OR'ed together
> > ValueSourceQueries boosted by input weights, it seems, as that is the way
> > they work - they take the values in different fields and use them as the
> > score, and the normal boosting technique will apply the query-time
> > weighting.
> >
> > To get good performance with a ValueSourceQuery, I'd suggest using a
> > FieldCacheSource for speedier processing:
> >
> > -----------
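> >  import org.apache.lucene.search.BooleanClause.Occur;
> >  import org.apache.lucene.search.BooleanQuery;
> >  import org.apache.lucene.search.FieldCache;
> >  import org.apache.lucene.search.Query;
> >  import org.apache.lucene.search.function.FloatFieldSource;
> >  import org.apache.lucene.search.function.ValueSource;
> >  import org.apache.lucene.search.function.ValueSourceQuery;
> >
> >  String field = "model_1_score";
> >  float runTimeBoost = 0.5f;  // float literal needs the 'f' suffix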
> >
> >  ValueSource model1Source = new FloatFieldSource(field,
> >    new FieldCache.FloatParser() {
> >      public float parseFloat(String s) { return Float.parseFloat(s); }
> >    });
> >
> >  Query model1Q = new ValueSourceQuery(model1Source);
> >
> >  model1Q.setBoost(runTimeBoost);
> >
> > // do this for model2 and model3 as well...
> >
> >  BooleanQuery bq = new BooleanQuery();
> >  bq.add(model1Q, Occur.SHOULD);
> >  bq.add(model2Q, Occur.SHOULD);
> >  bq.add(model3Q, Occur.SHOULD);
> > ---------------
> >
> >  I haven't tried this code, but it seems like this is what you are trying
> > to do...
> >
> >  -jake
> >
>
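
For reference, the arithmetic described in the quoted question (sum, over
the fields named at query time, of indexed value times query-time weight)
can be sketched in plain Java. This is only the formula, not a Lucene
query, and it makes no claim about the exact score Lucene would report;
the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the formula only; not part of Lucene.
final class WeightedScoreSketch {
    // Sums indexedValue * queryWeight over the fields present in the
    // query-time map; fields missing from the document contribute nothing.
    static double score(Map<String, Double> indexed,
                        Map<String, Double> queryWeights) {
        double total = 0.0;
        for (Map.Entry<String, Double> e : queryWeights.entrySet()) {
            Double stored = indexed.get(e.getKey());
            if (stored != null) {
                total += stored * e.getValue();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> doc = new LinkedHashMap<>();
        doc.put("model_1_score", 0.9);
        doc.put("model_2_score", 0.3);
        doc.put("model_3_score", 0.7);

        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("model_1_score", 0.4);
        weights.put("model_2_score", 0.7);

        // 0.9*0.4 + 0.3*0.7, i.e. roughly 0.57 (subject to floating point)
        System.out.println(score(doc, weights));
    }
}
```

model_3_score is ignored here because it has no query-time weight, which
matches the example in the question.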
