harmony-dev mailing list archives

From Alex Blewitt <alex.blew...@gmail.com>
Subject Re: [classlib][pack200] Decoupling I/O and processing for unpacking scenario
Date Fri, 18 Jul 2008 20:36:08 GMT
The order is vital: the output of a decompressor must be identical
for every run, regardless of implementation. The property pack -
unpack - pack must hold, because signatures are calculated on the
unpacked file (which is then repacked). If you allowed arbitrary
sorting of files between segments, the output may differ in subsequent
runs and thus produce different signatures.
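
As a rough illustration of the ordering requirement, the sketch below compares the entry order of two JAR streams using the standard java.util.jar classes. The class name EntryOrder and its helper methods are hypothetical and not part of Harmony. Matching entry order is only a necessary condition for byte-identical output, not a sufficient one.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarInputStream;
import java.util.jar.JarOutputStream;

// Hypothetical helper for illustration; not part of Harmony.
class EntryOrder {

    // Lists entry names in the order they are stored in the JAR stream.
    // A leading manifest, if present, is consumed by JarInputStream and
    // is not returned here.
    static List<String> entryNames(InputStream jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarInputStream jis = new JarInputStream(jar)) {
            JarEntry entry;
            while ((entry = jis.getNextJarEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    // True when both JARs store the same entries in the same order.
    static boolean sameOrder(InputStream a, InputStream b) throws IOException {
        return entryNames(a).equals(entryNames(b));
    }

    // Builds a small in-memory JAR with empty entries, for demonstration only.
    static byte[] jarOf(String... names) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jos = new JarOutputStream(bytes)) {
            for (String name : names) {
                jos.putNextEntry(new JarEntry(name));
                jos.closeEntry();
            }
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] first = jarOf("a/A.class", "b/B.class");
        byte[] second = jarOf("b/B.class", "a/A.class");
        System.out.println(sameOrder(new ByteArrayInputStream(first),
                                     new ByteArrayInputStream(first)));
        System.out.println(sameOrder(new ByteArrayInputStream(first),
                                     new ByteArrayInputStream(second)));
    }
}
```

A check like this could be run on the output of two unpack runs of the same archive; any difference in order would already rule out identical signatures.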

Alex

Sent from my (new) iPhone

On 18 Jul 2008, at 21:21, "Andrew Cornwall" <andrew.pack200@gmail.com> wrote:

> 1. I'm not 100% sure about JAR file order, but I can see circumstances when
> it will matter (for instance, when there's a JAR file with more than one
> class of the same name in it). I don't know if some JAR files are ordered
> explicitly for performance reasons. I guess I'd hate to give up identical
> ordering if we don't have to.
>
> 2. My current VM doesn't have verbose GC handling in it. Hacking it to
> provide some info shows that I do about 5200 global GCs in handling all the
> files.
>
> 3. I'm using a set of 466 Eclipse / Sametime plugins that are packed with
> default packing. The average size of the pack.gz files is around 42517, with
> a standard deviation of 146240. The median size is 7309.
>
>
> On Fri, Jul 18, 2008 at 12:30 PM, Aleksey Shipilev <
> aleksey.shipilev@gmail.com> wrote:
>
>> Andrew,
>>
>> I forgot to discuss three things:
>>
>> 1. The order of segments is not preserved in the MT (multithreaded)
>> version. Should we care about that?
>>
>> 2. The MT version exposes GC problems since there is no more room for a
>> separate GC thread like there was in the ST (single-threaded) version. Can
>> you run your tests with -verbose:gc and see how much time is spent in GC?
>> Parallel GC should help too. A larger heap should also help.
>>
>> 3. I had tested the MT version on a single 50 Mb .pack file, and I don't
>> know the performance profile for smaller files. What are the sizes in
>> your case?
>>
>> Thanks,
>> Aleksey.
>>
>> On Fri, Jul 18, 2008 at 9:20 PM, Andrew Cornwall
>> <andrew.pack200@gmail.com> wrote:
>>> I've been playing with HEAD + HARMONY-5916 + HARMONY-5918 on a dual-core
>>> machine (which is probably what the majority will have at least for now).
>>> On my 467-file test case, it takes 57 seconds (vs 38 for the nonthreaded
>>> version).
>>>
>>> It also looks as if it's doing something funny with resources (and possibly
>>> even some .class files). I see many more differences in output classes than
>>> I see with the nonthreaded version. (This may be a difference in output
>>> order of the JAR file: my diff tool is pretty limited).
>>>
>>> On Fri, Jul 18, 2008 at 8:46 AM, Sian January <sianjanuary@googlemail.com>
>>> wrote:
>>>
>>>> According to the spec, "The value #archive_size is either zero or
>>>> declares the number of bytes in the archive segment, starting
>>>> immediately after #archive_size_lo and before #archive_next_count and
>>>> ending with the last band, the *file_bits band. (That is, a non-zero
>>>> size includes the size of #archive_next_count, *file_bits, and
>>>> everything in between.)"
>>>>
>>>> So you'll need to subtract a few bytes for the values you've already
>>>> read from the second half of the header.
>>>>
>>>>
>>>> On 18/07/2008, Aleksey Shipilev <aleksey.shipilev@gmail.com> wrote:
>>>>>
>>>>> Sian,
>>>>>
>>>>> On Fri, Jul 18, 2008 at 5:29 PM, Sian January
>>>>> <sianjanuary@googlemail.com> wrote:
>>>>>>> Awesome! Am I understanding correctly: this value determines the size
>>>>>>> of the segment? If yes, can you point me to how to access this value? Is
>>>>>>> there an API in the current implementation?
>>>>>> Yes - use SegmentHeader.getArchiveSize()
>>>>>
>>>>> Does the spec cover any alignment/padding constraints for segments?
>>>>> What exactly does the archive size specify?
>>>>>
>>>>> I'm doing this one [1]:
>>>>> 1. Reading the header of segment (moved from readSegment).
>>>>> 2. Check the field value, then either
>>>>> 3a. Read the segment into byte array and wrap it with BAIS, then
>>>>> read from BAIS
>>>>> 3b. Read the segment from global input stream
>>>>>
>>>>> I can only read the first segment; the second fails to read with a
>>>>> "bad header" exception.
>>>>>
>>>>> Thanks,
>>>>> Aleksey.
>>>>>
>>>>> [1]
>>>>>   void unpackRead(InputStream in) throws IOException, Pack200Exception {
>>>>>       if (!in.markSupported())
>>>>>           in = new BufferedInputStream(in);
>>>>>
>>>>>       header = new SegmentHeader(this);
>>>>>       header.read(in);
>>>>>
>>>>>       int size = (int)header.getArchiveSize();
>>>>>
>>>>>       if (size != 0) {
>>>>>           byte[] data = new byte[size];
>>>>>           in.read(data);
>>>>>           bin = new ByteArrayInputStream(data);
>>>>>
>>>>>           readSegment(bin);
>>>>>       } else {
>>>>>           readSegment(in);
>>>>>       }
>>>>>   }
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Unless stated otherwise above:
>>>> IBM United Kingdom Limited - Registered in England and Wales with number
>>>> 741598.
>>>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>>>
>>>
>>
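
Sian January's point about #archive_size in the quoted thread can be sketched as follows. This is a minimal illustration under stated assumptions, not Harmony's implementation: the class SegmentSlice, the method sliceRemainder, and the headerBytesAfterSizeLo count are hypothetical, and the caller is assumed to track how many header bytes were consumed after #archive_size_lo. It applies only when #archive_size is non-zero. The sketch also uses DataInputStream.readFully, because a single InputStream.read(byte[]) call may return fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of Harmony.
class SegmentSlice {

    // Returns a stream over the rest of the current segment.
    // archiveSize counts bytes starting right after #archive_size_lo, so the
    // header bytes already consumed past that point must be subtracted.
    static InputStream sliceRemainder(InputStream in, long archiveSize,
                                      long headerBytesAfterSizeLo) throws IOException {
        long remaining = archiveSize - headerBytesAfterSizeLo;
        if (remaining < 0 || remaining > Integer.MAX_VALUE) {
            throw new IOException("Unexpected segment size: " + remaining);
        }
        byte[] data = new byte[(int) remaining];
        // readFully loops until the array is filled or throws EOFException.
        new DataInputStream(in).readFully(data);
        return new ByteArrayInputStream(data);
    }

    public static void main(String[] args) throws IOException {
        // Six bytes belong to the counted part of this segment; 99 stands in
        // for the first byte of the next segment.
        byte[] stream = {10, 11, 12, 13, 14, 15, 99};
        InputStream in = new ByteArrayInputStream(stream);
        in.read();
        in.read(); // pretend two header bytes after #archive_size_lo were read

        InputStream segment = sliceRemainder(in, 6, 2);
        System.out.println(segment.available()); // prints 4
        System.out.println(in.read());           // prints 99
    }
}
```

With the subtraction in place, the underlying stream is left positioned at the start of the next segment, which is what the next header read expects.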
