Yes, the order of items in the jar must be exactly the order of the segments.

Rather than read the entire stream into memory, could you not have some kind of random access file that knows specifically about the archive size, sets up a buffered stream from a specific point in the stream, and triggers a load in a background thread? After all, you want the data to start loading asynchronously. What about passing in an InputStream that wraps a memory-mapped file to do the data caching instead? The public pack APIs are Java 5; it's only the internal one that is 1.4 compliant.

Memory is a problem with the pack200 spec generally: you effectively have multiple lookup tables and have to recreate the known set of all files in a segment before writing them. In fact, one of the points of having segments was to avoid memory blowouts for large jars, although a compressor could opt to write out all classes in one segment, all text files in another, all GIFs in another, etc., so that locality of similar data would enable better post-pass compression.

If there is an MT option, there probably needs to be a way of turning it off (or at least of avoiding in-memory caching of files), because e.g. Eclipse will be calling the unpack repeatedly in the same process from other threads, and memory exhaustion is a possibility. The same goes for launching from a Java Web Start app that has a handful of pack files that might be decoded in the same process.

Alex

Just a few thoughts ...

Alex

Sent from my (new) iPhone

On 18 Jul 2008, at 20:30, "Aleksey Shipilev" wrote:

> Andrew,
>
> I forgot to discuss three things:
>
> 1. The order of segments is not preserved in the MT (multithreaded)
> version. Should we care about that?
>
> 2. The MT version exposes GC problems, since there is no more room for a
> separate GC thread like there was in ST (single-threaded). Can you print
> -verbose:gc for your tests and see how much time is spent in GC? Parallel
> GC should help too. A larger heap should also help.
>
> 3.
> I had tested the MT version on a single 50 MB .pack file, and I don't
> know the performance profile for smaller files. What are the sizes in
> your case?
>
> Thanks,
> Aleksey.
>
> On Fri, Jul 18, 2008 at 9:20 PM, Andrew Cornwall wrote:
>> I've been playing with HEAD + HARMONY-5916 + HARMONY-5918 on a dual-core
>> machine (which is probably what the majority will have, at least for now).
>> On my 467-file test case, it takes 57 seconds (vs 38 for the nonthreaded
>> version).
>>
>> It also looks as if it's doing something funny with resources (and possibly
>> even some .class files). I see many more differences in output classes than
>> I see with the nonthreaded version. (This may be a difference in the output
>> order of the JAR file: my diff tool is pretty limited.)
>>
>> On Fri, Jul 18, 2008 at 8:46 AM, Sian January wrote:
>>
>>> According to the spec, "The value #archive_size is either zero or declares
>>> the number of bytes in the archive segment, starting immediately after
>>> #archive_size_lo and before #archive_next_count and ending with the last
>>> band, the *file_bits band. (That is, a non-zero size includes the size of
>>> #archive_next_count, *file_bits, and everything in between.)"
>>>
>>> So you'll need to subtract a few bytes for the values you've already read
>>> from the second half of the header.
>>>
>>>
>>> On 18/07/2008, Aleksey Shipilev wrote:
>>>>
>>>> Sian,
>>>>
>>>> On Fri, Jul 18, 2008 at 5:29 PM, Sian January wrote:
>>>>>> Awesome! Am I understanding correctly: this value determines the size
>>>>>> of the segment? If yes, can you point me to how to access this value?
>>>>>> Is there an API in the current implementation?
>>>>> Yes - use SegmentHeader.getArchiveSize()
>>>>
>>>> Does the spec cover any alignment/padding constraints for segments?
>>>> What exactly does the archive size specify?
>>>>
>>>> I'm doing this one [1]:
>>>> 1. Reading the header of the segment (moved from readSegment).
>>>> 2.
>>>> Check the field value, then either
>>>> 3a. Read the segment into a byte array and wrap it with BAIS, then
>>>> read from BAIS
>>>> 3b. Read the segment from the global input stream
>>>>
>>>> I can only read the first segment; the second fails to read with the
>>>> "bad header" exception.
>>>>
>>>> Thanks,
>>>> Aleksey.
>>>>
>>>> [1]
>>>> void unpackRead(InputStream in) throws IOException, Pack200Exception {
>>>>     if (!in.markSupported())
>>>>         in = new BufferedInputStream(in);
>>>>
>>>>     header = new SegmentHeader(this);
>>>>     header.read(in);
>>>>
>>>>     int size = (int)header.getArchiveSize();
>>>>
>>>>     if (size != 0) {
>>>>         byte[] data = new byte[size];
>>>>         in.read(data);
>>>>         bin = new ByteArrayInputStream(data);
>>>>
>>>>         readSegment(bin);
>>>>     } else {
>>>>         readSegment(in);
>>>>     }
>>>> }
>>>>
>>>
>>>
>>>
>>> --
>>> Unless stated otherwise above:
>>> IBM United Kingdom Limited - Registered in England and Wales with number
>>> 741598.
>>> Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
>>>
>>
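
For reference, here is a minimal sketch of the adjustment Sian describes for the snippet at [1], not the actual Harmony code. It assumes the caller knows how many bytes of the segment it has already consumed after #archive_size_lo while reading the header; the class name SegmentBuffer, the method bufferSegment, and the alreadyRead parameter are all hypothetical. It also reads with DataInputStream.readFully, because a single InputStream.read(byte[]) call is allowed to return fewer bytes than requested.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SegmentBuffer {

    /**
     * Buffers the rest of the current segment in memory.
     *
     * @param in          stream positioned just after the header fields already read
     * @param archiveSize non-zero value of #archive_size
     * @param alreadyRead hypothetical count of bytes, counted from just after
     *                    #archive_size_lo, that the header reader has consumed
     */
    static ByteArrayInputStream bufferSegment(InputStream in, long archiveSize,
                                              int alreadyRead) throws IOException {
        // #archive_size counts from just after #archive_size_lo, so bytes the
        // header reader has taken since then must be subtracted.
        long remaining = archiveSize - alreadyRead;
        if (remaining < 0 || remaining > Integer.MAX_VALUE) {
            throw new IOException("Cannot buffer segment of " + remaining + " bytes");
        }
        byte[] data = new byte[(int) remaining];
        // readFully loops until the array is full, or throws EOFException.
        // DataInputStream does no buffering of its own, so it does not read
        // past the end of the segment.
        new DataInputStream(in).readFully(data);
        return new ByteArrayInputStream(data);
    }
}
```

With something like this, the stream is left positioned at the first byte after the segment, which is where the next segment header would be expected to start.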