lucene-java-user mailing list archives

From Sheng <sheng...@gmail.com>
Subject Re: dv field is too large
Date Wed, 06 Jul 2016 22:53:49 GMT
You misunderstand. I have many fields, and unfortunately a few of them are
quite big, i.e. they exceed the 32K limit. To make these "big" fields
sortable, they have to be indexed as a SortedDocValuesField. Or is that
wrong, and one can actually sort search results by a "big" field without
indexing it as a SortedDocValuesField? Any suggestions?
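
One possible workaround, sketched here under assumptions that nobody in this
thread has confirmed: if ordering by a leading prefix of the big value is
acceptable, the sort key could be cut down to fit under the limit before it
goes into the SortedDocValuesField, while the full text stays in the
tokenized field. The helper below is hypothetical (the names SortKeys,
truncateUtf8 and MAX_SORT_KEY_BYTES are mine, not Lucene's), and the
32766-byte figure is my understanding of the limit behind the "32k" error.
It truncates a string's UTF-8 encoding at a character boundary:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Lucene.
final class SortKeys {
    // Assumption: the per-value limit for sorted doc values is 32766 bytes.
    static final int MAX_SORT_KEY_BYTES = 32766;

    // Returns the UTF-8 bytes of value, cut to at most maxBytes bytes
    // without splitting a multi-byte character.
    static byte[] truncateUtf8(String value, int maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        if (utf8.length <= maxBytes) {
            return utf8;
        }
        int end = maxBytes;
        // If the first dropped byte is a continuation byte (10xxxxxx), the
        // cut falls inside a character, so back up to that character's start.
        while (end > 0 && (utf8[end] & 0xC0) == 0x80) {
            end--;
        }
        return Arrays.copyOf(utf8, end);
    }

    public static void main(String[] args) {
        char[] big = new char[40000];
        Arrays.fill(big, 'a');
        byte[] key = truncateUtf8(new String(big), MAX_SORT_KEY_BYTES);
        System.out.println(key.length); // prints 32766
    }
}
```

The resulting bytes could then be wrapped in a BytesRef and passed to
SortedDocValuesField. Documents whose values share the same first 32766
bytes would tie in the sort, so this only helps if that is tolerable.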

On Wednesday, July 6, 2016, Erick Erickson <erickerickson@gmail.com> wrote:

> bq: In this case, we
> have to index a particular data structure which has a bunch of fields and
> each of them is promised to be searchable and search-sortable to the user
>
> If I'm reading this right, you have some structure. You say
> "each of them is promised to be searchable and search-sortable"
>
> It _sounds_ like what you want to do is break these fields out
> into separate fields each of which is searchable and sortable
> independently. But from what you've described, putting the entire
> thing into a single DV field isn't useful.
>
> Best,
> Erick
>
>
>
> On Wed, Jul 6, 2016 at 3:10 PM, Sheng <shengcer@gmail.com> wrote:
> > To be clear, the "field" is indeed tokenized, and it is accompanied by a
> > SortedDocValuesField so that it is sortable too. Am I making the wrong
> > assumption here?
> >
> > On Wednesday, July 6, 2016, Sheng <shengcer@gmail.com> wrote:
> >
> >> Hi Erick,
> >>
> >> I am refactoring a legacy system. One of the most annoying things is that
> >> I have to keep old features even when they make little sense. In this
> >> case, we have to index a particular data structure which has a bunch of
> >> fields, and each of them is promised to be searchable and search-sortable
> >> to the user. It turns out one field is notoriously large. I think the old
> >> implementation used some quite clumsy way to make this happen. But since
> >> we decided to refactor the system with all the goodies from Lucene, we
> >> want to do the sorting right, and here we are at this issue... :-(
> >>
> >> On Wednesday, July 6, 2016, Erick Erickson <erickerickson@gmail.com>
> >> wrote:
> >>
> >>> Is this an "XY" problem? Meaning, why do you need DV fields larger than
> >>> 32K?
> >>>
> >>> You can't search it as text as it's not tokenized. Faceting and sorting
> >>> by a 32K field doesn't seem very useful. You may have a perfectly valid
> >>> reason, but it's not obvious what use-case you're serving from this
> >>> thread so far....
> >>>
> >>> Nobody has yet put forth a compelling use-case for such large fields,
> >>> perhaps this would be one.
> >>>
> >>> Best,
> >>> Erick
> >>>
> >>> On Wed, Jul 6, 2016 at 2:24 PM, Sheng <shengcer@gmail.com> wrote:
> >>> > Mike - Thanks for the prompt response. Is there a way to bypass this
> >>> > constraint for SortedDocValuesField? Or do we have to live with it,
> >>> > meaning no fix even in a future release?
> >>> >
> >>> > On Wednesday, July 6, 2016, Michael McCandless
> >>> > <lucene@mikemccandless.com> wrote:
> >>> >
> >>> >> I believe only binary DVs can be larger than 32K bytes.
> >>> >>
> >>> >> Mike McCandless
> >>> >>
> >>> >> http://blog.mikemccandless.com
> >>> >>
> >>> >> On Wed, Jul 6, 2016 at 10:31 AM, Sheng <shengcer@gmail.com> wrote:
> >>> >>
> >>> >> > Hi,
> >>> >> >
> >>> >> > I am getting an IllegalArgumentException indicating that one of the
> >>> >> > SortedDocValuesField values is too large (> 32k).
> >>> >> >
> >>> >> > I googled a bit, and it seems that LUCENE-4583 addressed this issue
> >>> >> > in 4.5 and 6.0, while I am currently using Lucene 6.1. Am I missing
> >>> >> > or misunderstanding anything?
> >>> >> >
> >>> >> > Thanks,
> >>> >> >
> >>> >>
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>
> >>>
