lucene-java-user mailing list archives

From Toke Eskildsen ...@statsbiblioteket.dk>
Subject Re: heap memory issues when sorting by a string field
Date Mon, 14 Dec 2009 14:09:41 GMT
On Fri, 2009-12-11 at 14:53 +0100, Michael McCandless wrote:
> How long does Lucene take to build the ords for the toplevel reader?
> 
> You should be able to just time FieldCache.getStringIndex(topLevelReader).
>
> I think your 8.5 seconds for first Lucene search was with the
> StringIndex computed per segment?

Cold disk-cache (directly after reboot):
[2009-12-14 14:44:10,914] Requesting StringIndex for field sort_title
[2009-12-14 14:44:20,326] Got StringIndex of length 2916008 in 9 seconds, 412 ms

Warm disk-cache (3 minutes after first test):
[2009-12-14 14:44:10,914] Requesting StringIndex for field sort_title
[2009-12-14 14:44:20,326] Got StringIndex of length 2916008 in 8 seconds, 414 ms

The response time for the first sorted search was about 8.5 seconds, but
that was after 6 non-sorted searches that made no explicit use of the
field cache, so some warm-up had already taken place.

Caveat: I must stress that this is very much ad hoc testing.
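
For readers unfamiliar with the "ords" mentioned in the quoted question, the following is a minimal sketch of the idea in plain Java. It is not Lucene's implementation: the class name OrdSketch is hypothetical, and the field names lookup and order are only borrowed from FieldCache.StringIndex. The sketch also assumes every document has a value, which a real index cannot assume.

```java
import java.util.Arrays;
import java.util.TreeSet;

// Hypothetical stand-in for the data a string sort index holds.
// Assumption: every document has exactly one non-null value.
public class OrdSketch {
    public final String[] lookup; // sorted unique terms
    public final int[] order;     // order[docId] = position of that doc's term in lookup

    public OrdSketch(String[] valuePerDoc) {
        // Collect the unique terms in sorted order.
        lookup = new TreeSet<>(Arrays.asList(valuePerDoc)).toArray(new String[0]);
        order = new int[valuePerDoc.length];
        for (int doc = 0; doc < valuePerDoc.length; doc++) {
            // Every value is present in lookup, so the search always succeeds.
            order[doc] = Arrays.binarySearch(lookup, valuePerDoc[doc]);
        }
    }

    // Sorting two documents compares two ints instead of two strings.
    public int compare(int docA, int docB) {
        return Integer.compare(order[docA], order[docB]);
    }

    public static void main(String[] args) {
        OrdSketch idx = new OrdSketch(
                new String[] {"delta", "alpha", "charlie", "alpha"});
        System.out.println(Arrays.toString(idx.lookup)); // [alpha, charlie, delta]
        System.out.println(Arrays.toString(idx.order));  // [2, 0, 1, 0]
    }
}
```

With the 2916008 entries reported in the log lines above, an int array of that length takes about 11.7 MB on its own; the unique strings in the lookup array come on top of that and are not measured here.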


----------------- FieldCache test code

    // Meant for testing
    private FieldCache.StringIndex getStringIndex(
            IndexReader reader, String field) {
        log.info("Requesting StringIndex for field " + field);
        // Profiler is a timing helper from outside Lucene (not shown here)
        Profiler profiler = new Profiler();
        FieldCache.StringIndex stringIndex;
        try {
            stringIndex = FieldCache.DEFAULT.getStringIndex(reader, field);
        } catch (IOException e) {
            log.error("Could not retrieve StringIndex", e);
            return null;
        }
        log.info("Got StringIndex of length " + stringIndex.order.length
                 + " in " + profiler.getSpendTime());
        return stringIndex;
    }
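
The Profiler helper used above is not shown in this message. For anyone who wants to repeat the measurement without it, a rough equivalent can be sketched with System.nanoTime. The names TimedCall, Timed, time and format are hypothetical, and the output format merely imitates the log lines earlier in this mail.

```java
import java.util.function.Supplier;

// Hypothetical replacement for the timing helper; not the original Profiler.
public class TimedCall {
    /** The value returned by a call plus its elapsed wall-clock milliseconds. */
    public static final class Timed<T> {
        public final T value;
        public final long millis;

        Timed(T value, long millis) {
            this.value = value;
            this.millis = millis;
        }
    }

    /** Runs the call once and records how long it took. */
    public static <T> Timed<T> time(Supplier<T> call) {
        long start = System.nanoTime();
        T value = call.get();
        long millis = (System.nanoTime() - start) / 1_000_000L;
        return new Timed<>(value, millis);
    }

    /** Formats milliseconds as "S seconds, M ms". */
    public static String format(long millis) {
        return (millis / 1000) + " seconds, " + (millis % 1000) + " ms";
    }

    public static void main(String[] args) {
        // Stand-in workload: allocate an int array as long as the reported index.
        Timed<int[]> t = time(() -> new int[2916008]);
        System.out.println("Got array of length " + t.value.length
                + " in " + format(t.millis));
    }
}
```

A call that throws a checked exception, such as the IOException caught in the test code above, would have to be wrapped before being passed in as a Supplier.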

----------------- Lucene 2.4 index

ls -l index/sb/20091201-115941/lucene/

-rw-rw-r-- 1 summatst summatst 12840211452 Dec  2 11:21 _0.cfx
-rw-rw-r-- 1 summatst summatst   361027455 Dec  2 11:19 _32.cfs
-rw-rw-r-- 1 summatst summatst   373374178 Dec  2 11:19 _65.cfs
-rw-rw-r-- 1 summatst summatst   438076782 Dec  2 11:21 _98.cfs
-rw-rw-r-- 1 summatst summatst   463141239 Dec  2 11:19 _cb.cfs
-rw-rw-r-- 1 summatst summatst  1862427706 Dec  2 11:19 _rm.cfs
-rw-rw-r-- 1 summatst summatst         203 Dec  2 11:21 segments_3
-rw-rw-r-- 1 summatst summatst          20 Dec  2 11:18 segments.gen

-----------------

Regards,
Toke Eskildsen

