From: Stuart Monteith
Date: Tue, 12 Jan 2010 18:17:21 +0000
To: kato-dev@incubator.apache.org
Subject: Re: Processing huge heap dump.

Hi,

You are quite right, 4 bytes for the record length. If we had the file
format wrong for 64-bit hprof dumps we'd not be able to read any dump at
all, so it must be an overflow of the record length. In that case, the
only solution I can think of would be for us to change the code to search
for the HEAP DUMP END record (tagged 0x2c) to detect the end of the HEAP
DUMP. Of course, there was some puzzling over the difference between
HEAP DUMP SEGMENT (0x1c) and HEAP DUMP (0x0c) before. I'd expect we'd get:

HEAP DUMP
    heap dump records...
HEAP DUMP END

or

HEAP DUMP SEGMENT
    heap dump records
HEAP DUMP SEGMENT
    heap dump records
    :
    :
HEAP DUMP END

If we ignore the record size for the HEAP DUMP and calculate it
ourselves, we might have more luck.
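As a rough illustration of what that could look like, here is an untested
sketch. The class and method names are invented for illustration, not
anything in the current code, and skipSubRecord() merely stands in for the
per-sub-record parsing HProfFile already does. It only covers the plain
HEAP DUMP ... HEAP DUMP END case; the segmented form would also need to
stop at the next HEAP DUMP SEGMENT tag.

import java.io.DataInputStream;
import java.io.IOException;

public class HeapDumpEndScanner {

    // Top-level HPROF record tags.
    static final int HEAP_DUMP         = 0x0c;
    static final int HEAP_DUMP_SEGMENT = 0x1c;
    static final int HEAP_DUMP_END     = 0x2c;

    // Assumes the HPROF file header (format string, identifier size,
    // timestamp) has already been consumed from the stream.
    void scanTopLevelRecords(DataInputStream in) throws IOException {
        int tag;
        while ((tag = in.read()) != -1) {              // u1 record tag
            in.readInt();                              // u4 timestamp delta (unused here)
            long length = in.readInt() & 0xFFFFFFFFL;  // u4 body length, may have wrapped

            if (tag == HEAP_DUMP) {
                // Ignore 'length': walk the sub-records ourselves, counting
                // bytes until the HEAP DUMP END tag turns up.
                long realLength = 0;
                int subTag;
                while ((subTag = in.read()) != HEAP_DUMP_END && subTag != -1) {
                    realLength += 1 + skipSubRecord(subTag, in);
                }
                in.skipBytes(8);   // HEAP DUMP END's own u4 time + u4 length (zero)
                // 'realLength' is the size the overflowed u4 field failed to express.
            } else {
                // Everything else, including HEAP DUMP SEGMENT, should have a
                // length that genuinely fits in the u4, so skipping by it is fine.
                long toSkip = length;
                while (toSkip > 0) {
                    toSkip -= in.skip(toSkip);
                }
            }
        }
    }

    // Stand-in for the per-sub-record parsing HProfFile already does
    // (class dumps, instance dumps, primitive array dumps, ...); it would
    // return the number of bytes consumed after the sub-record tag.
    long skipSubRecord(int subTag, DataInputStream in) throws IOException {
        throw new UnsupportedOperationException("delegate to the existing sub-record parser");
    }
}

That would make the reader independent of the overflowed length field, at
the cost of always parsing the heap dump body even when we only want to
skip over it.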
Regards,
    Stuart

Lukasz wrote:
> Hello,
>
> Since performance did not look bad after changing the array size, today
> I tried to process my 60GB heap dump. Unfortunately, I very quickly
> ended up with an exception:
>
> Exception in thread "main" java.lang.IllegalArgumentException: range
> -1293366370 is less than 0
>     at org.apache.kato.common.SubsetDataProvider.<init>(SubsetDataProvider.java:29)
>     at org.apache.kato.hprof.datalayer.HProfFile$HPROFRecordProvider.getCurrentElement(HProfFile.java:487)
>     at org.apache.kato.hprof.datalayer.HProfFile$HPROFRecordProvider.getCurrentElement(HProfFile.java:437)
>     at org.apache.kato.common.BitMaskMappingArray.get(BitMaskMappingArray.java:56)
>     at org.apache.kato.hprof.datalayer.HProfFile.getRecord(HProfFile.java:784)
>     ... (my code) ...
>
> It looks like the range of the HeapDumpHProfRecord had the value -1293366370.
>
> The following is the part of the dump which I believe is responsible for
> that (B2 E8 CB 9E):
>
> 0055DD84  25 00 00 2A │ B7 AE 22 06 │ 80 00 00 00 │ 01 00 00 2A │ B7 AE 20 0B │ 88 02 00 00 │ 00 00 00 00 │ 00 18 00 00 │ 23 26 00 00
> 0055DDA8  2A B7 B0 A1 │ 64 28 00 00 │ 00 01 00 00 │ 2A B7 B0 33 │ EA F8 0C 00 │ 00 00 00 B2 │ E8 CB 9E 20 │ 00 00 2A B7 │ B0 DE AF C8
> 0055DDCC  00 00 00 01 │ 00 00 2A B7 │ AE 22 0F 20 │ 00 00 2A AA │ BE 21 3D 28 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 2A AA │ BE 25 B7 D8
>
> What is interesting is that the hprof file format reserves only 4 bytes
> for the length of a record, which allows it to express a record whose
> size is at most 4GB (assuming an unsigned 4 bytes).
>
> My dump's version is: JAVA PROFILE 1.0.1
>
> Regards
> Lukasz
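As a side note on those numbers: in the middle row of the hex dump you can
even see what looks like the record header, the 0C tag for HEAP DUMP, a u4
timestamp of 00 00 00 00, and then B2 E8 CB 9E as the u4 length. Those four
bytes read as a signed 32-bit int are exactly the -1293366370 in the
exception, and even read as an unsigned u4 they can describe at most 4GB,
so for a 60GB dump the stored length must have wrapped. A small stand-alone
illustration (not Kato code, just the arithmetic):

public class RecordLengthOverflow {
    public static void main(String[] args) {
        // The four length bytes from the hex dump above: B2 E8 CB 9E.
        int[] raw = { 0xB2, 0xE8, 0xCB, 0x9E };

        // Read them the way DataInputStream.readInt() does, as a signed 32-bit int.
        int signed = (raw[0] << 24) | (raw[1] << 16) | (raw[2] << 8) | raw[3];
        System.out.println(signed);         // -1293366370, the value in the exception

        // The same bits as the unsigned u4 the HPROF format actually means.
        long unsigned = signed & 0xFFFFFFFFL;
        System.out.println(unsigned);       // 3001600926 bytes, about 2.8GiB

        // A u4 cannot go beyond 4GB, so for a 60GB heap dump the field has
        // wrapped and is meaningless either way.
        System.out.println((1L << 32) - 1); // 4294967295
    }
}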
>
>
> Lukasz wrote:
>> Hi Stuart, Steve,
>>
>> I've taken a deeper look into the code. I still haven't carefully traced
>> the index calculation in the BitMaskMappingArray and
>> ArrayBitMaskMappingStrategy classes, but I managed to improve performance
>> by increasing the array size in those classes (which is set in the
>> HProfFile class).
>>
>> If I understand the code correctly, when the capacity of a
>> BitMaskMappingArray is exhausted its bucketSize is doubled, which in turn
>> means that more reads (even cached ones) are required to set the position
>> of the IDataProvider/IArrayEntryProvider.
>>
>> Below are loading time results for the default array size (1000) and an
>> increased one (1000000). The test ran against a generated dump file
>> (5000000 instances of Data).
>>
>> Default (1000):
>> HeapSubRecord: 100000 (866ms, 4215kB)
>> HeapSubRecord: 200000 (1716ms, 7879kB)
>> HeapSubRecord: 300000 (2833ms, 11263kB)
>> HeapSubRecord: 400000 (3889ms, 14283kB)
>> HeapSubRecord: 500000 (3893ms, 17319kB)
>> HeapSubRecord: 600000 (7248ms, 20479kB) (here the bucketSize was probably doubled)
>> HeapSubRecord: 700000 (7721ms, 23531kB)
>> HeapSubRecord: 800000 (7729ms, 26567kB)
>> HeapSubRecord: 900000 (7731ms, 29671kB)
>> HeapSubRecord: 1000000 (7704ms, 32731kB)
>> ...
>> (I didn't wait until the end)
>>
>> Increased (1000000):
>> HeapSubRecord: 100000 (622ms, 17809kB)
>> HeapSubRecord: 200000 (309ms, 20345kB)
>> HeapSubRecord: 300000 (283ms, 23861kB)
>> HeapSubRecord: 400000 (274ms, 27921kB)
>> HeapSubRecord: 500000 (269ms, 29957kB)
>> HeapSubRecord: 600000 (264ms, 31993kB)
>> HeapSubRecord: 700000 (272ms, 36097kB)
>> HeapSubRecord: 800000 (288ms, 37739kB)
>> HeapSubRecord: 900000 (263ms, 39835kB)
>> HeapSubRecord: 1000000 (259ms, 41931kB)
>> HeapSubRecord: 1100000 (300ms, 44773kB)
>> HeapSubRecord: 1200000 (283ms, 46901kB)
>> HeapSubRecord: 1300000 (291ms, 49029kB)
>> HeapSubRecord: 1400000 (328ms, 53801kB)
>> HeapSubRecord: 1500000 (259ms, 53801kB)
>> HeapSubRecord: 1600000 (272ms, 58125kB)
>> HeapSubRecord: 1700000 (264ms, 60293kB)
>> HeapSubRecord: 1800000 (264ms, 62473kB)
>> HeapSubRecord: 1900000 (361ms, 61373kB)
>> HeapSubRecord: 2000000 (274ms, 63105kB)
>> ...
>> HeapSubRecord: 9000000 (284ms, 231969kB)
>> HeapSubRecord: 9100000 (272ms, 233597kB)
>> HeapSubRecord: 9200000 (281ms, 236357kB)
>> HeapSubRecord: 9300000 (274ms, 240469kB)
>> HeapSubRecord: 9400000 (279ms, 244541kB)
>> HeapSubRecord: 9500000 (269ms, 246549kB)
>> HeapSubRecord: 9600000 (279ms, 250565kB)
>> HeapSubRecord: 9700000 (265ms, 252573kB)
>> HeapSubRecord: 9800000 (279ms, 256629kB)
>> HeapSubRecord: 9900000 (265ms, 258669kB)
>> HeapSubRecord: 10000000 (463ms, 263997kB)
>> (end)
>>
>> i.e. my 60GB dump file contains more than 1,100,000,000 objects
>> (if I remember correctly).
>>
>> Regards
>> Lukasz
>>
>>
>> Stuart Monteith wrote:
>>> The hprof dump reader spends a lot of time reading the whole file, for
>>> various reasons. The indices it holds in memory are constructed through
>>> an initial read, and this is also the source of the memory usage. In
>>> addition, there is some correlation to be done, which also takes time
>>> and induces yet more reading.
>>>
>>> I'm sure some work could be done to improve the performance further,
>>> but we'll have to look at the tradeoff between disk space and memory
>>> usage. The hprof file format itself is what it is, however, and we have
>>> no influence over that. The CJVMTI agent has lots of room for
>>> improvement, but I suspect it is unlikely to do much better than
>>> existing hprof implementations. The built-in JVM hprof dumper will
>>> probably be a hard act to follow.
>>>
>>> The HProf implementation is not thread-safe. Realistically, I think
>>> that is something that ought to be considered once things are more
>>> mature. There will be algorithms that can deal with the JVM structure
>>> sensibly.
>>>
>>> And thanks, Lukasz, it's great to have your input.
>>>
>>> Regards,
>>> Stuart
>>
>

-- 
Stuart Monteith
http://blog.stoo.me.uk/
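To make the BitMaskMappingArray behaviour quoted above a bit more concrete,
here is a deliberately simplified model of a fixed-capacity offset index
that doubles its bucket size whenever it fills up. It is my own sketch, not
the actual Kato classes; the class name SparseOffsetIndex and its methods
are invented for illustration. It shows why a capacity of 1000 forces very
large buckets on a dump with millions of records, so every random access
has to read many records sequentially, while a capacity of 1,000,000 keeps
the buckets, and hence the per-lookup reads, small.

import java.util.ArrayList;
import java.util.List;

public class SparseOffsetIndex {

    private final int capacity;
    private final List<Long> offsets = new ArrayList<>();
    private int bucketSize = 1;
    private long recordsSeen = 0;

    public SparseOffsetIndex(int capacity) {
        this.capacity = capacity;
    }

    /** Called once per record, in file order, with the record's file offset. */
    public void add(long offset) {
        if (recordsSeen % bucketSize == 0) {
            if (offsets.size() == capacity) {
                // Capacity exhausted: double the bucket size and keep every
                // other entry, so the remaining entries are still the offsets
                // of records at multiples of the (new) bucket size.
                List<Long> kept = new ArrayList<>(capacity);
                for (int i = 0; i < offsets.size(); i += 2) {
                    kept.add(offsets.get(i));
                }
                offsets.clear();
                offsets.addAll(kept);
                bucketSize *= 2;
            }
            if (recordsSeen % bucketSize == 0) {
                offsets.add(offset);
            }
        }
        recordsSeen++;
    }

    public static void main(String[] args) {
        // To read record i you seek to offsets.get(i / bucketSize) and then
        // read i % bucketSize records sequentially, so a large bucket size
        // makes every random access expensive.
        long records = 10000000L;
        for (int capacity : new int[] { 1000, 1000000 }) {
            SparseOffsetIndex index = new SparseOffsetIndex(capacity);
            for (long i = 0; i < records; i++) {
                index.add(i * 100L);   // made-up file offsets
            }
            System.out.println("capacity " + capacity
                    + ": bucket size grows to " + index.bucketSize
                    + ", worst case " + (index.bucketSize - 1)
                    + " sequential reads per lookup");
        }
    }
}

In this toy model, indexing 10,000,000 records with a capacity of 1000 ends
up with a bucket size of 16384, while a capacity of 1,000,000 stays at 16,
which at least matches the general shape of the timings quoted above.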