incubator-kato-dev mailing list archives

From Steve Poole <spoole...@googlemail.com>
Subject Re: Processing huge heap dump.
Date Fri, 15 Jan 2010 08:14:53 GMT
Hi Lukasz - good news that you can get through the whole dump now :-) I'd
like to get a list of the changes you made to the reader, if that's
possible?



On Thu, Jan 14, 2010 at 5:21 PM, Lukasz <flower@intercel.com.pl> wrote:

> Hi Steve, Stuart,
>
> Over the last two days I have been trying to move further.
>
> I tried to prepare a workaround for the incorrect length.
> The first idea was to exploit the fact that an integer overflow had probably
> taken place, so I searched for the 0x2c tag just at the overflow points
> (probably quite a brittle approach). But it turned out that in my heap dumps
> (both the real one and the one prepared for development purposes) there is no
> 0x2c byte at or after the end of the dump.
> I noticed, however, that for both dumps the heap record ends at the end of
> the file. That is probably not a general rule, but for my purposes it should
> be enough, so I changed the Kato sources slightly to take advantage of this
> and decided to move further.
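>
> Roughly, the fallback I used looks like this (a sketch only, using a plain
> RandomAccessFile rather than the actual Kato reader, and assuming raf is
> positioned at the start of a top-level record; the file name is just an
> example):
>
> // When the declared 4-byte length of a HEAP DUMP (0x0c) record has wrapped,
> // fall back to "everything up to the end of the file".
> java.io.RandomAccessFile raf = new java.io.RandomAccessFile("heap.hprof", "r");
> int tag = raf.readUnsignedByte();             // 0x0c = HEAP DUMP
> int micros = raf.readInt();                   // time offset field
> long declared = raf.readInt() & 0xFFFFFFFFL;  // wrapped modulo 2^32 for >4GB records
> long length = (tag == 0x0c) ? raf.length() - raf.getFilePointer() : declared;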
>
> So far I have some code which extracts static data from the dump (strings,
> classes) and searches for instances of a given class.
> I have run it against the real dump. It took around 40 minutes (to get the
> static data) and 50 minutes (to find the instances), which is not bad, I
> think.
> I haven't done a careful performance analysis, but the CPU (not I/O) appeared
> to be the bottleneck. Performance also looked quite stable across the whole
> dump.
>
> The 60GB dump was extracted from the core file using jmap (I don't remember
> exactly which version of Java this jmap came from, but it was certainly Java
> 1.6).
> The following JVM version was dumped to the core file:
> java version "1.6.0_10"
> Java(TM) SE Runtime Environment (build 1.6.0_10-b33)
> Java HotSpot(TM) 64-Bit Server VM (build 11.0-b15, mixed mode)
>
> I also suspected that heap dump segments were introduced to handle heaps
> larger than 4GB, but it looks like that is not always the case.
>
> Anyway, it is nice that I was able to go through the whole dump without any
> error.
>
> Regards
> Lukasz
>
>
> Steve Poole wrote:
>
>>  Hi Lukasz,
>>
>> You say your hprof format is 1.0.1, so, according to the information I
>> have about the hprof format, that means you won't have any heap dump
>> segments in your file. Apparently they are in the 1.0.2 format, which is a
>> Java 6 addition and is only used when the heap is too big for the old
>> format to cope with.
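>>
>> If it helps, a minimal sketch of checking that (standard HPROF header
>> layout; the file name and variables are just illustrative): the format
>> version is the NUL-terminated string at the very start of the dump.
>>
>> // Sketch only -- read the HPROF version string from the file header.
>> java.io.RandomAccessFile raf = new java.io.RandomAccessFile("heap.hprof", "r");
>> StringBuilder version = new StringBuilder();
>> for (int b = raf.readUnsignedByte(); b != 0; b = raf.readUnsignedByte()) {
>>     version.append((char) b);
>> }
>> // "JAVA PROFILE 1.0.1" -> no heap dump segments expected;
>> // "JAVA PROFILE 1.0.2" -> the file may contain HEAP DUMP SEGMENT records.
>> boolean mayHaveSegments = version.toString().endsWith("1.0.2");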
>>
>> What JVM and OS are you using to produce this dump?
>>
>> On Tue, Jan 12, 2010 at 6:17 PM, Stuart Monteith <stukato@stoo.me.uk>
>> wrote:
>>
>>
>>
>>> Hi,
>>>
>>>
>>> You are quite right, 4 bytes for the record length. If we had the file
>>> format wrong for 64-bit hprof dumps we'd not be able to read any dump, so
>>> it must be an overflow of the record length.
>>> In that case, the only possible solution I can think of would be for us to
>>> change the code to search for the HEAP DUMP END record (tagged 0x2c) to
>>> detect the end of the HEAP DUMP.
>>>
>>> Of course, there was some puzzling over the difference between HEAP DUMP
>>> SEGMENT (0x1c) and HEAP DUMP
>>> (0x0c) before.
>>>
>>> I'd expect we'd get:
>>>
>>>   HEAP DUMP
>>>       heap dump records....
>>>   HEAP DUMP END
>>>
>>> or
>>>
>>>   HEAP DUMP SEGMENT
>>>       heap dump records
>>>   HEAP DUMP SEGMENT
>>>       heap dump records
>>>      :      :
>>>   HEAP DUMP END
>>>
>>>
>>> If we ignore the record size for the HEAP DUMP and calculate it ourselves,
>>> we might have more luck.
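>>>
>>> Something along these lines, perhaps (a sketch only, not the actual Kato
>>> code; skipSubRecord() is a hypothetical helper, and raf is assumed to be
>>> positioned just inside the suspect HEAP DUMP record):
>>>
>>> final int HEAP_DUMP_END = 0x2c;
>>> while (raf.getFilePointer() < raf.length()) {
>>>     int tag = raf.readUnsignedByte();
>>>     if (tag == HEAP_DUMP_END) {
>>>         break;                // real end of the heap dump, whatever the length field said
>>>     }
>>>     skipSubRecord(tag, raf);  // hypothetical: consumes the body of one sub-record
>>> }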
>>>
>>> Regards,
>>>   Stuart
>>>
>>>
>>>
>>> Lukasz wrote:
>>>
>>>
>>>
>>>> Hello,
>>>>
>>>> Since performance looks reasonable after changing the array size, today I
>>>> tried to process my 60GB heap dump. Unfortunately, I very quickly ended
>>>> with an exception:
>>>> Exception in thread "main" java.lang.IllegalArgumentException: range
>>>> -1293366370 is less than 0
>>>> at
>>>>
>>>> org.apache.kato.common.SubsetDataProvider.<init>(SubsetDataProvider.java:29)
>>>>
>>>> at
>>>>
>>>> org.apache.kato.hprof.datalayer.HProfFile$HPROFRecordProvider.getCurrentElement(HProfFile.java:487)
>>>>
>>>> at
>>>>
>>>> org.apache.kato.hprof.datalayer.HProfFile$HPROFRecordProvider.getCurrentElement(HProfFile.java:437)
>>>>
>>>> at
>>>>
>>>> org.apache.kato.common.BitMaskMappingArray.get(BitMaskMappingArray.java:56)
>>>> at
>>>> org.apache.kato.hprof.datalayer.HProfFile.getRecord(HProfFile.java:784)
>>>> ... (my code) ...
>>>>
>>>> It looks like the range of the HeapDumpHProfRecord had the value
>>>> -1293366370.
>>>>
>>>> The following is the part of the dump which I believe is responsible for
>>>> that (B2 E8 CB 9E):
>>>> 0055DD84 25 00 00 2A │ B7 AE 22 06 │ 80 00 00 00 │ 01 00 00 2A │ B7 AE 20 0B │ 88 02 00 00 │ 00 00 00 00 │ 00 18 00 00 │ 23 26 00 00
>>>> 0055DDA8 2A B7 B0 A1 │ 64 28 00 00 │ 00 01 00 00 │ 2A B7 B0 33 │ EA F8 0C 00 │ 00 00 00 B2 │ E8 CB 9E 20 │ 00 00 2A B7 │ B0 DE AF C8
>>>> 0055DDCC 00 00 00 01 │ 00 00 2A B7 │ AE 22 0F 20 │ 00 00 2A AA │ BE 21 3D 28 │ 00 00 00 00 │ 00 00 00 00 │ 00 00 2A AA │ BE 25 B7 D8
>>>>
>>>> Interestingly, the hprof file format reserves only 4 bytes for the length
>>>> of a record, which allows expressing a record whose size is at most 4GB
>>>> (assuming an unsigned 4-byte value).
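>>>>
>>>> A minimal, runnable illustration of how those four length bytes (B2 E8 CB
>>>> 9E) turn into the negative range reported in the exception when read as a
>>>> signed Java int:
>>>>
>>>> public class LengthOverflow {
>>>>     public static void main(String[] args) {
>>>>         int declared = 0xB2E8CB9E;        // the 4 length bytes seen above
>>>>         System.out.println(declared);     // -1293366370, the value in the exception
>>>>         long unsigned = declared & 0xFFFFFFFFL;
>>>>         System.out.println(unsigned);     // 3001600926 -- only the true length modulo 2^32
>>>>     }
>>>> }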
>>>>
>>>> My dump's version is: JAVA PROFILE 1.0.1
>>>>
>>>> Regards
>>>> Lukasz
>>>>
>>>>
>>>> Lukasz wrote:
>>>>
>>>>
>>>>
>>>>> Hi Stuart, Steve,
>>>>>
>>>>> I've taken a deeper look into the code. I still haven't carefully traced
>>>>> the index calculation in the classes BitMaskMappingArray and
>>>>> ArrayBitMaskMappingStrategy, but I managed to improve performance by
>>>>> increasing the array size in those classes (which is set in the HProfFile
>>>>> class).
>>>>>
>>>>> If I understand the code correctly, when the capacity of
>>>>> BitMaskMappingArray is exhausted the bucketSize is doubled, which in turn
>>>>> means that more reads (even cached ones) are required to set the position
>>>>> of the IDataProvider/IArrayEntryProvider.
>>>>>
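>>>>> Roughly, I think the cost looks like this (the names below are
>>>>> illustrative, not the actual Kato fields): only every bucketSize-th
>>>>> record offset is kept in the index, so positioning on a record re-reads
>>>>> up to bucketSize - 1 records, and that walk doubles every time bucketSize
>>>>> doubles.
>>>>>
>>>>> // Sketch of the sparse-index lookup, with hypothetical names only:
>>>>> long start = cachedOffsets[index / bucketSize];  // nearest indexed offset
>>>>> reader.seek(start);
>>>>> for (int k = 0; k < index % bucketSize; k++) {
>>>>>     reader.skipRecord();  // sequential (even if cached) reads; grows as bucketSize doubles
>>>>> }
>>>>>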
>>>>> The following are the loading time results for the default array size
>>>>> (1000) and an increased one (1000000). The test ran against a generated
>>>>> dump file (5000000 instances of Data).
>>>>> Default (1000):
>>>>> HeapSubRecord: 100000 (866ms, 4215kB)
>>>>> HeapSubRecord: 200000 (1716ms, 7879kB)
>>>>> HeapSubRecord: 300000 (2833ms, 11263kB)
>>>>> HeapSubRecord: 400000 (3889ms, 14283kB)
>>>>> HeapSubRecord: 500000 (3893ms, 17319kB)
>>>>> HeapSubRecord: 600000 (7248ms, 20479kB) (here the bucketSize was probably
>>>>> doubled)
>>>>> HeapSubRecord: 700000 (7721ms, 23531kB)
>>>>> HeapSubRecord: 800000 (7729ms, 26567kB)
>>>>> HeapSubRecord: 900000 (7731ms, 29671kB)
>>>>> HeapSubRecord: 1000000 (7704ms, 32731kB)
>>>>> ... (I didn't wait until end)
>>>>>
>>>>> Increased(1000000):
>>>>> HeapSubRecord: 100000 (622ms, 17809kB)
>>>>> HeapSubRecord: 200000 (309ms, 20345kB)
>>>>> HeapSubRecord: 300000 (283ms, 23861kB)
>>>>> HeapSubRecord: 400000 (274ms, 27921kB)
>>>>> HeapSubRecord: 500000 (269ms, 29957kB)
>>>>> HeapSubRecord: 600000 (264ms, 31993kB)
>>>>> HeapSubRecord: 700000 (272ms, 36097kB)
>>>>> HeapSubRecord: 800000 (288ms, 37739kB)
>>>>> HeapSubRecord: 900000 (263ms, 39835kB)
>>>>> HeapSubRecord: 1000000 (259ms, 41931kB)
>>>>> HeapSubRecord: 1100000 (300ms, 44773kB)
>>>>> HeapSubRecord: 1200000 (283ms, 46901kB)
>>>>> HeapSubRecord: 1300000 (291ms, 49029kB)
>>>>> HeapSubRecord: 1400000 (328ms, 53801kB)
>>>>> HeapSubRecord: 1500000 (259ms, 53801kB)
>>>>> HeapSubRecord: 1600000 (272ms, 58125kB)
>>>>> HeapSubRecord: 1700000 (264ms, 60293kB)
>>>>> HeapSubRecord: 1800000 (264ms, 62473kB)
>>>>> HeapSubRecord: 1900000 (361ms, 61373kB)
>>>>> HeapSubRecord: 2000000 (274ms, 63105kB)
>>>>> ...
>>>>> HeapSubRecord: 9000000 (284ms, 231969kB)
>>>>> HeapSubRecord: 9100000 (272ms, 233597kB)
>>>>> HeapSubRecord: 9200000 (281ms, 236357kB)
>>>>> HeapSubRecord: 9300000 (274ms, 240469kB)
>>>>> HeapSubRecord: 9400000 (279ms, 244541kB)
>>>>> HeapSubRecord: 9500000 (269ms, 246549kB)
>>>>> HeapSubRecord: 9600000 (279ms, 250565kB)
>>>>> HeapSubRecord: 9700000 (265ms, 252573kB)
>>>>> HeapSubRecord: 9800000 (279ms, 256629kB)
>>>>> HeapSubRecord: 9900000 (265ms, 258669kB)
>>>>> HeapSubRecord: 10000000 (463ms, 263997kB)
>>>>> (end)
>>>>>
>>>>> i.e. my 60GB dump file contains more than 1,100,000,000 objects (if I
>>>>> remember correctly).
>>>>>
>>>>> Regards
>>>>> Lukasz
>>>>>
>>>>>
>>>>> Stuart Monteith wrote:
>>>>>
>>>>>
>>>>>
>>>>>> The hprof dump reader spends a lot of time reading the whole file, for
>>>>>> various reasons.
>>>>>> The indices it has in memory are constructed through an initial read,
>>>>>> and this is also the source of the memory usage. In addition, there is
>>>>>> some correlation to be done, which also takes up time and induces yet
>>>>>> more reading.
>>>>>>
>>>>>> I'm sure some work could be done to improve the performance further,
>>>>>> but we'll have to look at the tradeoff between disk space and memory
>>>>>> usage. The hprof file format itself is what it is, however, and we have
>>>>>> no influence over that. The CJVMTI agent has lots of room for
>>>>>> improvement, but I suspect it is unlikely to do much better than
>>>>>> existing hprof implementations. The built-in JVM hprof dumper will
>>>>>> probably be a hard act to follow.
>>>>>>
>>>>>> The HProf implementation is not thread-safe. Realistically, I think it
>>>>>> is something that ought to be considered once things are more mature.
>>>>>> There will be algorithms that can deal with the JVM structure sensibly.
>>>>>>
>>>>>> And thanks Lukasz, it's great to have your input.
>>>>>>
>>>>>> Regards,
>>>>>> Stuart
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>> --
>>> Stuart Monteith
>>> http://blog.stoo.me.uk/
>>>
>>>
>>>
>>>
>>
>>
>>
>>
>
>


-- 
Steve
