lucene-java-user mailing list archives

From James Clarke <jcla...@basistech.com>
Subject Re: questions on PerFieldSimilarityWrapper
Date Fri, 09 Nov 2012 20:24:05 GMT
I'm still confused by the semantics of PerFieldSimilarityWrapper. How can the
queryNorm be independent of the Similarity? (At least in our case it isn't.)

From my understanding, the current PerFieldSimilarityWrapper implementation
limits us to using Similarities that share the same queryNorm implementation.
In our case we want to use two different similarities for different field
types, and our similarities perform normalization differently.

In fact, the current PerFieldSimilarityWrapper implementation means that a
user cannot combine DefaultSimilarity and BM25Similarity and expect correct
results.
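
To make the mismatch concrete: DefaultSimilarity documents its queryNorm as
1/sqrt(sumOfSquaredWeights), while a similarity that does not override
queryNorm (which, as far as I can tell, is the case for BM25Similarity and for
PerFieldSimilarityWrapper itself) falls back to a constant 1. The sketch below
is a standalone illustration only; QueryNormalizer and QueryNormDemo are
made-up names, not Lucene classes.

```java
// Hypothetical stand-in for the queryNorm hook; this is NOT a Lucene type.
interface QueryNormalizer {
    float queryNorm(float sumOfSquaredWeights);
}

final class QueryNormDemo {
    // TF-IDF style normalization: 1 / sqrt(sumOfSquaredWeights).
    static final QueryNormalizer TFIDF_STYLE =
        sum -> (float) (1.0 / Math.sqrt(sum));

    // No-op normalization: always 1, regardless of the query weights.
    static final QueryNormalizer NO_OP_STYLE = sum -> 1.0f;

    public static void main(String[] args) {
        float sumOfSquaredWeights = 4.0f;
        // The same query weights yield different norms under the two styles.
        System.out.println(TFIDF_STYLE.queryNorm(sumOfSquaredWeights));
        System.out.println(NO_OP_STYLE.queryNorm(sumOfSquaredWeights));
    }
}
```

If the wrapper always answers with the no-op value, a per-field similarity
that expects the TF-IDF style norm never gets it applied.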

Our current workaround involves cloning the IndexSearcher and setting the
similarity directly (via our PerFieldSimilarityWrapper#get). We do this before
every query because our application is multi-threaded.

  // Build a per-query searcher so concurrent queries do not share a similarity.
  PerFieldSimilarityWrapper similarity =
      (PerFieldSimilarityWrapper) searcher.getSimilarity();
  IndexSearcher clonedSearcher = new IndexSearcher(searcher.getIndexReader());
  clonedSearcher.setSimilarity(similarity.get(queryField));

This looks like a hack and involves touching our codebase in multiple
areas. Ideally we'd like to avoid this approach. Should we file a JIRA issue
for this? If so, as a Bug, an Improvement, or a New Feature?

Thanks,

James Clarke
Basis Technology Corp.


On Fri, Nov 9, 2012 at 6:08 AM, Ian Lea <ian.lea@gmail.com> wrote:

> Feels a bit of a hack, but you might be able to make it work by
> storing the field name when MyPerFieldxxx.get(name) is called and
> using that in MyPerFieldxxx.queryNorm() and coord() calls to do the
> right thing, either inline or via the relevant Similarity subclass,
> identified by the name.
>
>
> --
> Ian.
>
> On Thu, Nov 8, 2012 at 5:46 PM, Joel Barry <jmb236@gmail.com> wrote:
> >> coord() and queryNorm() work on the query as a whole, which may span
> >> multiple fields.
> >
> > Thanks for the response, but I'm still confused.  In our use case, our
> > documents have two distinct types of fields, e.g.
> >
> > Document:
> >   A-field1
> >   A-field2
> >   A-field3
> >   B-field1
> >   B-field2
> >   B-field3
> >
> > In our application, we know that queries will hit only A-fields or
> > B-fields, never both in the same query.  But we *do* want to have
> > different behavior for queryNorm() and coord() for the A and B
> > queries.
> >
> > Could you suggest a way to do this?
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
>
>
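
Ian's suggestion quoted above (remember the field name passed to get() and
consult it in queryNorm() and coord()) might be sketched roughly as follows.
This is a standalone model, not code against Lucene: QueryLevelSimilarity,
FieldTrackingWrapper, and FieldTrackingDemo are hypothetical names, and the
ThreadLocal is an assumption added here because the application is
multi-threaded. Whether get() is always called before queryNorm() and coord()
on the same thread would need checking against the Lucene version in use.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two query-level hooks; NOT a Lucene type.
interface QueryLevelSimilarity {
    float queryNorm(float sumOfSquaredWeights);
    float coord(int overlap, int maxOverlap);
}

// Hypothetical wrapper that remembers the last field requested on each thread.
final class FieldTrackingWrapper implements QueryLevelSimilarity {
    private final Map<String, QueryLevelSimilarity> byField = new HashMap<>();
    private final QueryLevelSimilarity fallback;
    // Per-thread, so concurrent queries do not overwrite each other's field.
    private final ThreadLocal<String> lastField = new ThreadLocal<>();

    FieldTrackingWrapper(QueryLevelSimilarity fallback) {
        this.fallback = fallback;
    }

    // Call only during setup, before the wrapper is shared between threads.
    void register(String field, QueryLevelSimilarity sim) {
        byField.put(field, sim);
    }

    // Plays the role of MyPerFieldxxx.get(name): look up and remember the field.
    QueryLevelSimilarity get(String field) {
        lastField.set(field);
        return byField.getOrDefault(field, fallback);
    }

    // Resolves the similarity for the field last seen on this thread.
    private QueryLevelSimilarity current() {
        String field = lastField.get();
        return field == null ? fallback : byField.getOrDefault(field, fallback);
    }

    @Override
    public float queryNorm(float sumOfSquaredWeights) {
        return current().queryNorm(sumOfSquaredWeights);
    }

    @Override
    public float coord(int overlap, int maxOverlap) {
        return current().coord(overlap, maxOverlap);
    }
}

final class FieldTrackingDemo {
    // A-fields: TF-IDF style normalization and overlap-based coord.
    static final QueryLevelSimilarity A_STYLE = new QueryLevelSimilarity() {
        public float queryNorm(float s) { return (float) (1.0 / Math.sqrt(s)); }
        public float coord(int overlap, int maxOverlap) { return overlap / (float) maxOverlap; }
    };
    // B-fields: no query-level normalization at all.
    static final QueryLevelSimilarity B_STYLE = new QueryLevelSimilarity() {
        public float queryNorm(float s) { return 1.0f; }
        public float coord(int overlap, int maxOverlap) { return 1.0f; }
    };

    static FieldTrackingWrapper build() {
        FieldTrackingWrapper wrapper = new FieldTrackingWrapper(B_STYLE);
        wrapper.register("A-field1", A_STYLE);
        wrapper.register("B-field1", B_STYLE);
        return wrapper;
    }

    public static void main(String[] args) {
        FieldTrackingWrapper wrapper = build();
        wrapper.get("A-field1");
        System.out.println(wrapper.queryNorm(4.0f)); // uses A_STYLE
        wrapper.get("B-field1");
        System.out.println(wrapper.queryNorm(4.0f)); // uses B_STYLE
    }
}
```

This only makes sense under the constraint Joel describes, where a query hits
either A-fields or B-fields but never both; a query spanning both field types
would simply get whichever field was looked up last.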
