lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: dv field is too large
Date Wed, 06 Jul 2016 22:24:07 GMT
Yes, or you could get the utf8 bytes yourself client side and check that
length.
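
A minimal sketch of that client-side check, in plain Java with no Lucene dependency. The class name `Utf8Limit`, its helper names, and the 32766-byte constant are illustrative assumptions (the thread only says "32k"); substitute whatever limit your Lucene version actually enforces.

```java
import java.nio.charset.StandardCharsets;

class Utf8Limit {
    // Assumed limit for illustration only; the thread just says "32k".
    static final int MAX_BYTES = 32766;

    /** Number of bytes the string occupies once encoded as UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** True if the UTF-8 encoding of s is at most maxBytes long. */
    static boolean fits(String s, int maxBytes) {
        return utf8Length(s) <= maxBytes;
    }

    /**
     * Longest prefix of s whose UTF-8 encoding fits in maxBytes.
     * Walks code points so a surrogate pair is never cut in half.
     */
    static String truncateUtf8(String s, int maxBytes) {
        int bytes = 0;
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i);
            // UTF-8 width of this code point: 1, 2, 3 or 4 bytes.
            int width = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
            if (bytes + width > maxBytes) {
                break;
            }
            bytes += width;
            i += Character.charCount(cp);
        }
        return s.substring(0, i);
    }
}
```

Calling `Utf8Limit.truncateUtf8(value, Utf8Limit.MAX_BYTES)` before building the doc values field would keep each value under the assumed limit. Counting real encoded bytes lets mostly-ASCII values keep far more characters than a worst-case per-character bound would allow.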

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jul 6, 2016 at 6:16 PM, Sheng <shengcer@gmail.com> wrote:

> Is 32k / MAX_UTF8_BYTES_PER_CHAR an accurate limit for the number of
> characters a payload string can carry?
>
> On Wednesday, July 6, 2016, Michael McCandless <lucene@mikemccandless.com>
> wrote:
>
> > Maybe you could simply truncate the user-supplied values at 32 KB?
> >
> > Mike McCandless
> >
> > http://blog.mikemccandless.com
> >
> > On Wed, Jul 6, 2016 at 5:55 PM, Sheng <shengcer@gmail.com> wrote:
> >
> > > Hi Erick,
> > >
> > > I am refactoring a legacy system. One of the most annoying things is that
> > > I have to keep the old feature even though it makes little sense. In this
> > > case, we have to index a particular data structure which has a bunch of
> > > fields, and each of them is promised to be searchable and search-sortable
> > > to the user. It turns out one field is notoriously large. I think the old
> > > implementation uses some quite clumsy way to make it happen. But since we
> > > decided to refactor the system with all the goodies from Lucene, we want
> > > to do the sorting right, and here we are at this issue... :-(
> > >
> > > On Wednesday, July 6, 2016, Erick Erickson <erickerickson@gmail.com>
> > > wrote:
> > >
> > > > Is this an "XY" problem? Meaning, why do you need DV fields larger than
> > > > 32K?
> > > >
> > > > You can't search it as text as it's not tokenized. Faceting and sorting
> > > > by a 32K field doesn't seem very useful. You may have a perfectly valid
> > > > reason, but it's not obvious what use-case you're serving from this
> > > > thread so far....
> > > >
> > > > Nobody has yet put forth a compelling use-case for such large fields,
> > > > perhaps this would be one.
> > > >
> > > > Best,
> > > > Erick
> > > >
> > > > On Wed, Jul 6, 2016 at 2:24 PM, Sheng <shengcer@gmail.com> wrote:
> > > > > Mike - Thanks for the prompt response. Is there a way to bypass this
> > > > > constraint for SortedDocValuesField? Or do we have to live with it,
> > > > > meaning no fix even in a future release?
> > > > >
> > > > > On Wednesday, July 6, 2016, Michael McCandless
> > > > > <lucene@mikemccandless.com> wrote:
> > > > >
> > > > >> I believe only binary DVs can be larger than 32K bytes.
> > > > >>
> > > > >> Mike McCandless
> > > > >>
> > > > >> http://blog.mikemccandless.com
> > > > >>
> > > > >> On Wed, Jul 6, 2016 at 10:31 AM, Sheng <shengcer@gmail.com> wrote:
> > > > >>
> > > > >> > Hi,
> > > > >> >
> > > > >> > I am getting an IllegalArgumentException (IAE) indicating that one
> > > > >> > of the SortedDocValuesField values is too large (> 32k).
> > > > >> >
> > > > >> > I googled a bit, and it seems like LUCENE-4583 has addressed this
> > > > >> > issue in 4.5 and 6.0, while I am currently using Lucene 6.1. Am I
> > > > >> > missing or misunderstanding anything?
> > > > >> >
> > > > >> > Thanks,
> > > > >> >
> > > > >>
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > For additional commands, e-mail: java-user-help@lucene.apache.org
> > > >
> > > >
> > >
> >
>
