lucene-java-user mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: DocValues memory usage
Date Tue, 26 Mar 2013 16:30:19 GMT
DiskDocValuesFormat is the right thing to use: it loads certain things
into RAM, e.g. the compressed bits that tell it the addresses of the
bytes on disk, but then leaves the actual bytes on disk.

I believe the old DirectSource was more extreme as it left the
addresses on disk too, so there were 2 seeks to load a value.
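
To make the trade-off concrete, here is a toy model in plain Java. It is
not Lucene code: the class OnDiskBinaryValues and its methods are made up
for this illustration, and the real format keeps its addresses in a
compressed form rather than a plain long[]. The sketch only assumes the
behaviour described above: offsets are held in RAM, value bytes stay in a
file, and each lookup costs one seek.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Path;

/** Toy model only (hypothetical class, not part of Lucene). */
final class OnDiskBinaryValues implements AutoCloseable {
    // offsets[i] .. offsets[i + 1] bound value i; this array is the only thing held in RAM
    private final long[] offsets;
    // the value bytes themselves stay in this file and are read on demand
    private final RandomAccessFile data;

    private OnDiskBinaryValues(long[] offsets, RandomAccessFile data) {
        this.offsets = offsets;
        this.data = data;
    }

    /** Writes all values back to back into one file and records where each one starts. */
    static OnDiskBinaryValues write(Path file, byte[][] values) throws IOException {
        long[] offsets = new long[values.length + 1];
        try (RandomAccessFile out = new RandomAccessFile(file.toFile(), "rw")) {
            out.setLength(0); // start from an empty file
            for (int i = 0; i < values.length; i++) {
                offsets[i] = out.getFilePointer();
                out.write(values[i]);
            }
            offsets[values.length] = out.getFilePointer();
        }
        return new OnDiskBinaryValues(offsets, new RandomAccessFile(file.toFile(), "r"));
    }

    /** One seek plus one read per lookup; no other value is loaded. */
    byte[] get(int docId) throws IOException {
        int length = (int) (offsets[docId + 1] - offsets[docId]);
        byte[] result = new byte[length];
        data.seek(offsets[docId]);
        data.readFully(result);
        return result;
    }

    @Override
    public void close() throws IOException {
        data.close();
    }
}
```

With 1M documents the in-RAM part of this model is about 8 MB of offsets,
however large the values are. If the offsets were kept in a second file
instead, get() would first have to seek and read there, which is the
2-seek behaviour mentioned above.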

Mike McCandless

http://blog.mikemccandless.com

On Tue, Mar 26, 2013 at 11:55 AM, Duke <duke.dai.007@gmail.com> wrote:
> I ran the same experiment and got the same result. Then I used a per-field codec with DiskDocValuesFormat;
> it works like DirectSource in 4.0.0, but I'm not confident about this usage. Can anyone
> say more about the removal of the DirectSource API?
>
>
>
> On 2013-3-26, at 22:59, Peter Keegan <peterlkeegan@gmail.com> wrote:
>
>> Inspired by this presentation of DocValues:
>> http://www.slideshare.net/lucenerevolution/willnauer-simon-doc-values-column-stride-fields-in-lucene
>> I decided to try them out in 4.2. I created a 1M document index with one
>> DocValues field:
>>
>> BinaryDocValuesField conceptsDV =
>>     new BinaryDocValuesField("concepts", new BytesRef(byteArray(4000)));
>> d.add(conceptsDV);
>> writer.addDocument(d);
>>
>> I searched the index and fetched the DocValues field:
>>
>> TopDocs docs = searcher.search(new TermQuery(new Term("guid", val)), 1);
>> int docId = docs.scoreDocs[0].doc;
>> BinaryDocValues conceptValues = MultiDocValues.getBinaryValues(r, "concepts");
>> BytesRef result = new BytesRef();
>> conceptValues.get(docId,result);
>>
>> However, the first call to MultiDocValues.getBinaryValues reads in the
>> values for the entire index:
>>
>> Lucene42DocValuesProducer.loadBinary // loads DocValues for entire index
>>
>> My hope was to take advantage of faster disk access than stored fields and
>> less RAM than FieldCache, but this is using too much memory. Are my
>> assumptions and my usage correct?
>>
>> Thanks,
>> Peter
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
