lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Payloads and TrieRangeQuery
Date Wed, 10 Jun 2009 21:03:55 GMT
On Wed, Jun 10, 2009 at 4:04 PM, Earwin Burrfoot<earwin@gmail.com> wrote:

> And then, when you merge segments indexed with different Trie*
> settings, you need to convert them to some common form.
> Sounds like something too complex and with minimum returns.

Oh yeah... tricky.  So... there are various situations to handle with
trie:

  * Was the field even indexed w/ Trie, or indexed as "simple text"?
    It's useful to know this "automatically" at search time, so eg a
    RangeQuery can do the right thing by default.  FieldInfos seems
    like the natural place to store this.  It's basically Lucene's
    per-segment write-once schema.  Eg we use this to record "did any
    token in this field have a Payload?", which is analogous.

  * How did you tune your payload-vs-trie-range setting?  OK, I agree:
    this is most similar to "you changed your analyzer in an
    incompatible way, so, you have to reindex".  Plus, during merging
    we can't [easily] translate this.  So we shouldn't try to keep
    track of this.

  * We have a bug (or an important improvement) in how Trie encodes
    terms that we need to fix.  This one is not easy to handle, since
    such a change could alter the term order, and merging segments
    then becomes problematic.  Not sure how to handle that.  Yonik,
    has Solr ever had to make a change to NumberUtils?
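
To make the term-order concern in the last point concrete, here is a minimal sketch in Java. It is not Lucene's Trie encoding or Solr's NumberUtils; the class `TermOrderSketch` and both encoders are hypothetical, chosen only to show that two encodings of the same numbers can sort differently as terms, which is what would make merging segments written with different encodings problematic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.LongFunction;

// Illustration only: neither encoder is the real Trie or NumberUtils encoding.
class TermOrderSketch {

    // Hypothetical encoding A: flip the sign bit, then emit 16 fixed-width
    // hex digits, so that string order matches numeric order.
    static String encodeSortable(long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    // Hypothetical encoding B: plain decimal text. String order no longer
    // matches numeric order (for example "10" sorts before "2").
    static String encodeDecimal(long value) {
        return Long.toString(value);
    }

    // Returns the values in the order their encoded terms would sort,
    // using String.compareTo as a stand-in for term-dictionary order.
    static List<Long> termOrder(List<Long> values, LongFunction<String> encoder) {
        List<Long> sorted = new ArrayList<>(values);
        sorted.sort(Comparator.comparing((Long v) -> encoder.apply(v)));
        return sorted;
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(2L, 10L, -5L, 100L);
        // Order under encoding A: [-5, 2, 10, 100]
        System.out.println(termOrder(values, TermOrderSketch::encodeSortable));
        // Order under encoding B: [-5, 10, 100, 2]
        System.out.println(termOrder(values, TermOrderSketch::encodeDecimal));
    }
}
```

A merge that assumes every segment's terms arrive in one shared sorted order would interleave these two segments incorrectly unless it re-encoded one of them first.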

Mike
