lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: dv field is too large
Date Wed, 06 Jul 2016 22:23:04 GMT
bq: In this case, we
have to index a particular data structure which has a bunch of fields and
each of them is promised to be searchable and search-sortable to the user

If I'm reading this right, you have some structure. You say
"each of them is promised to be searchable and search-sortable"

It _sounds_ like what you want to do is break these fields out
into separate fields each of which is searchable and sortable
independently. But from what you've described, putting the entire
thing into a single DV field isn't useful.
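
If the oversized field really must stay sortable, one possible workaround (not something proposed elsewhere in this thread) is to sort on a bounded prefix of the value. The sketch below is plain Java with no Lucene dependency. It assumes the limit for a sorted doc value is 32766 bytes, and `SortKeys` and `boundedSortKey` are hypothetical names used only for illustration. Values that share their entire first 32766 bytes would compare as equal under this scheme.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Lucene.
final class SortKeys {
    // Assumption: sorted doc values longer than this many bytes are rejected.
    static final int MAX_SORT_KEY_BYTES = 32766;

    // Returns the UTF-8 bytes of value, cut to at most maxBytes without
    // splitting a multi-byte character.
    static byte[] boundedSortKey(String value, int maxBytes) {
        if (maxBytes < 0) {
            throw new IllegalArgumentException("maxBytes must be >= 0");
        }
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        if (utf8.length <= maxBytes) {
            return utf8; // already small enough, use as-is
        }
        int end = maxBytes;
        // If the first dropped byte is a continuation byte (10xxxxxx), the cut
        // would land inside a character, so back up to that character's start.
        while (end > 0 && (utf8[end] & 0xC0) == 0x80) {
            end--;
        }
        return Arrays.copyOf(utf8, end);
    }
}
```

The resulting bytes would go into the sortable field only (for example wrapped in a `BytesRef`), while the full text stays in the tokenized field used for searching.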

Best,
Erick



On Wed, Jul 6, 2016 at 3:10 PM, Sheng <shengcer@gmail.com> wrote:
> To be clear, the "field" is indeed tokenized, and it is accompanied by a
> SortedDocValuesField so that it is sortable too. Am I making the wrong
> assumption here ?
>
> On Wednesday, July 6, 2016, Sheng <shengcer@gmail.com> wrote:
>
>> Hi Erick,
>>
>> I am refactoring a legacy system. One of the most annoying things is that I
>> have to keep the old features even when they make little sense. In this
>> case, we have to index a particular data structure which has a bunch of
>> fields and each of them is promised to be searchable and search-sortable to
>> the user. Turns out one field is notoriously large. I think the old
>> implementation uses some quite clumsy way to make it happen. But since we
>> decided to refactor the system with all the goodies from Lucene, we want to
>> do the sorting right, and here we are at this issue... :-(
>>
>> On Wednesday, July 6, 2016, Erick Erickson <erickerickson@gmail.com> wrote:
>>
>>> Is this an "XY" problem? Meaning, why do you need DV fields larger than
>>> 32K?
>>>
>>> You can't search it as text as it's not tokenized. Faceting and sorting
>>> by a 32K field doesn't seem very useful. You may have a perfectly valid
>>> reason, but it's not obvious what use-case you're serving from this
>>> thread so far....
>>>
>>> Nobody has yet put forth a compelling use-case for such large fields,
>>> perhaps this would be one.
>>>
>>> Best,
>>> Erick
>>>
>>> On Wed, Jul 6, 2016 at 2:24 PM, Sheng <shengcer@gmail.com> wrote:
>>> > Mike - Thanks for the prompt response. Is there a way to bypass this
>>> > constraint for SortedDocValuesField ? Or do we have to live with it,
>>> > meaning no fix even in a future release?
>>> >
>>> > On Wednesday, July 6, 2016, Michael McCandless <lucene@mikemccandless.com> wrote:
>>> >
>>> >> I believe only binary DVs can be larger than 32K bytes.
>>> >>
>>> >> Mike McCandless
>>> >>
>>> >> http://blog.mikemccandless.com
>>> >>
>>> >> On Wed, Jul 6, 2016 at 10:31 AM, Sheng <shengcer@gmail.com> wrote:
>>> >>
>>> >> > Hi,
>>> >> >
>>> >> > I am getting an IAE indicating that one of the SortedDocValuesField
>>> >> > values is too large, > 32k
>>> >> >
>>> >> > I googled a bit, and it seems like #Lucene-4583 has addressed this
>>> >> > issue in 4.5 and 6.0, while I am currently using Lucene 6.1. Am I
>>> >> > missing or misunderstanding anything ?
>>> >> >
>>> >> > Thanks,
>>> >> >
>>> >>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>