lucene-java-user mailing list archives

From Wei Wang <welshw...@gmail.com>
Subject Re: DocValues questions
Date Thu, 04 Apr 2013 21:03:23 GMT
With the new Lucene 4.2 DocValues API, it seems that byte, short, int, and
long values are all stored through NumericDocValuesField. Does this mean
values are always stored as "long" regardless of their original type?
If so, do we still save space when the value range is small? Do we need to
give NumericDocValuesField some hint to save space?
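
The thread does not answer the space question. As a rough illustration only, here is a sketch of why a small value range can still be cheap even when the API accepts a long. It assumes a storage layer that writes each value as an offset from the smallest value, using a fixed number of bits per document. That layout is an assumption made for this sketch, not something the messages here confirm, and PackedBitsSketch and bitsPerValue are hypothetical names.

```java
public class PackedBitsSketch {

    /**
     * Bits needed to store any value in [min, max] as an unsigned
     * offset from min. Hypothetical helper, for illustration only.
     */
    static int bitsPerValue(long min, long max) {
        // If max - min overflows, the result is negative, has no leading
        // zeros, and the method falls back to the full 64 bits.
        long range = max - min;
        // A range of 0 would need 0 bits; this sketch keeps at least 1.
        return Math.max(1, 64 - Long.numberOfLeadingZeros(range));
    }

    public static void main(String[] args) {
        System.out.println(bitsPerValue(0L, 255L));
        System.out.println(bitsPerValue(1000000L, 1000015L));
        System.out.println(bitsPerValue(Long.MIN_VALUE, Long.MAX_VALUE));
    }
}
```

Under that assumption, the cost per document depends on the spread of the values actually indexed, not on whether the caller started from a byte, short, int, or long.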

On Thu, Apr 4, 2013 at 11:53 AM, Wei Wang <welshwang@gmail.com> wrote:

> Hi Adrien,
>
> Thanks for the clarification. It is very helpful. Will try Lucene 4.2 and
> AtomicReader API.
>
> Wei
>
>
> On Thu, Apr 4, 2013 at 11:22 AM, Adrien Grand <jpountz@gmail.com> wrote:
>
>> Hi,
>>
>> On Thu, Apr 4, 2013 at 10:30 AM, Wei Wang <welshwang@gmail.com> wrote:
>> > A few quick questions about DocValues:
>> >
>> > 1. If only a small number of documents have a ShortDocValueField defined,
>> > should every document in the index have this field filled with some value?
>> > The add() function of Document does not seem to enforce that a DocValues
>> > field is always added to each document.
>>
>> Given the name of the field you are referring to, I assume that you are
>> using Lucene 4.0 or 4.1. I would highly recommend upgrading to Lucene
>> 4.2, since the API has been completely refactored (but the disk format
>> is compatible) and should hopefully be a little clearer.
>>
>> You are right that nothing enforces that every document has a value.
>> Lucene gives a default value to documents that lack one: 0 for
>> numeric doc values and an empty byte array for binary doc values.
>>
>> > 2. Are there any examples that show how DocValues are stored and
>> > retrieved? It seems the JavaDoc only shows how to add them, and no
>> > complete examples are out there.
>>
>> This should be transparent if you use doc values for, e.g., sorting.
>> Otherwise, just call getNumericDocValues(field), getBinaryDocValues(field)
>> or getSortedDocValues(field) on an AtomicReader.
>>
>> I hope this helps.
>>
>> --
>> Adrien
>>
>
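
Adrien's pointers on retrieval, together with the default value of 0 he describes for documents without a value, can be sketched as follows. This is a minimal sketch, assuming Lucene 4.2 with the lucene-core and lucene-analyzers-common jars on the classpath. The class name DocValuesSketch and the field names "id" and "popularity" are made up for illustration and do not come from the thread.

```java
import java.io.IOException;

import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.AtomicReader;
import org.apache.lucene.index.AtomicReaderContext;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.apache.lucene.util.Version;

public class DocValuesSketch {

    // Hypothetical field name, chosen for this sketch only.
    static final String FIELD = "popularity";

    /** Builds a document; a null value means "no doc values field on this document". */
    static Document doc(String id, Long value) {
        Document d = new Document();
        d.add(new StringField("id", id, Field.Store.NO));
        if (value != null) {
            d.add(new NumericDocValuesField(FIELD, value.longValue()));
        }
        return d;
    }

    /** Indexes three documents (the second without a value) and reads one value per document. */
    static long[] indexAndRead() throws IOException {
        Directory dir = new RAMDirectory();
        IndexWriterConfig cfg =
            new IndexWriterConfig(Version.LUCENE_42, new KeywordAnalyzer());
        IndexWriter writer = new IndexWriter(dir, cfg);
        writer.addDocument(doc("a", 7L));
        writer.addDocument(doc("b", null));
        writer.addDocument(doc("c", 42L));
        writer.close();

        DirectoryReader reader = DirectoryReader.open(dir);
        long[] values = new long[reader.maxDoc()];
        // Doc values are exposed per segment, through each leaf's AtomicReader.
        for (AtomicReaderContext ctx : reader.leaves()) {
            AtomicReader leaf = ctx.reader();
            NumericDocValues dv = leaf.getNumericDocValues(FIELD);
            if (dv == null) {
                continue; // no document in this segment has the field
            }
            for (int i = 0; i < leaf.maxDoc(); i++) {
                // i is segment-local; docBase maps it to the top-level doc ID.
                values[ctx.docBase + i] = dv.get(i);
            }
        }
        reader.close();
        return values;
    }

    public static void main(String[] args) throws IOException {
        for (long v : indexAndRead()) {
            System.out.println(v);
        }
    }
}
```

Because the API returns a plain long for every document, a stored 0 and a missing value look the same in this version, so an application that needs to tell them apart has to track that itself.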
