harmony-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alexei Fedotov" <alexei.fedo...@gmail.com>
Subject Re: [classlib][luni] optimized InputStream reader (HARMONY-5522)
Date Fri, 14 Mar 2008 17:30:57 GMT
I just want to have the following written somewhere until I forget: in
one of hot spots Eclipse guys parse the whole manifest to get two main
attributes, and then drop the object. There is no need to parse 100k
of ICU manifst to populate entry signatures for that use case. In
other words one may think of delaying parsing of other sections then
the main one.

On Fri, Mar 14, 2008 at 8:22 PM, Alexei Fedotov
<alexei.fedotov@gmail.com> wrote:
> Tim, thank you for reading and understanding. I like when my code
>  passes a review and I am eager to try arguments I've created for
>  myself. I thought of your variant and liked it because it would be
>  possible to JIT to inline your version of ByteStream.readFully()
>  method.
>
>  There is some complication caused with a public API exposed by
>  Manifest class: the constructor and read methods accept a general
>  InputStream as a source for Manifest. Other applications, such as
>  Eclipse, read a file themselves and then call Manifest API directly to
>  get few main attributes. There is a better chance that external
>  developers would use use documented ByteArrayInputStream to wrap their
>  bytes than "our possibly optimized internal child" of that class.
>
>  I also believe that I decrease overall project astonishment by
>  encapsulating more complex technique in one place instead of spreading
>  less conventional things around. Reflection is hidden in a class with
>  conventional API, and no one would question when reading
>  ZipFile#getInputStream(ZipEntry) code why he should wrap a resulted
>  byte buffer in "our possibly optimized internal child" instead of
>  ByteArrayInputStream.
>
>  Finally two these methods may be combined: in addition to
>  ByteArrayInputStream one may check in readFully(InputStream) if "our
>  possibly optimized internal child" is supplied an use a quick path
>  without reflection. I wonder what you and other java gurus think about
>  this.
>
>
>
>
>  On Fri, Mar 14, 2008 at 7:47 PM, Tim Ellison <t.p.ellison@gmail.com> wrote:
>  > Alexei Fedotov wrote:
>  >  > Let me write something more understandable. Breathe, Alexei, breathe slowly.
>  >
>  >  Code is good ;-)
>  >
>  >
>  >  > 1. Our native code returns us Zip entries in a form of a byte array
>  >  > which is later wrapped in ByteArrayInputSteam object.
>  >
>  >  So this is my point.  While I see that you are writing the ByteStream
>  >  util to reach into a regular ByteArrayInputStream to get the backing
>  >  array -- my question is why not change the
>  >  ZipFile#getInputStream(ZipEntry) implementation to return a ByteStream
>  >  (which will be a subclass of ByteArrayInputStream that will simply
>  >  return the backing array when asked without reflection).
>  >
>  >  I agree with the rest of the arguments below.
>  >
>  >  Regards,
>  >  Tim
>  >
>  >
>  >
>  >  > 2. Instead of copying portions of this ByteArrayInputSteam via
>  >  > read(buffer) call it is quicker to get a reference to the underlying
>  >  > buffer using reflection and work with this array.
>  >  > 3. This big buffer is useful for further manifest verification. This
>  >  > is a subject of separate [1]. Instead of copying manifest sections
>  >  > into smaller byte buffers, we maintain a list of relative positions in
>  >  > the big buffer, and update the digests using these positions.
>  >  > Generally, storing two relative positions is much cheaper than
>  >  > allocating memory and populating it with bytes. That is why I refer to
>  >  > this change as to implementation cleanup, and not an optimization.
>  >  > 4.  I wrote ideas of possible further cleanup in TODO comments. They
>  >  > could be described as follows: I removed two of six copies (acts of
>  >  > copying) of a manifest byte before it becomes accessible as a key or
>  >  > value in a java hash table. Another two copies are easy to be removed,
>  >  > so there is enough place for improvement here.
>  >  >
>  >  > If we would only know the algorithm which will be used for digesting,
>  >  > we may replace a manifest buffer with few signatures which might be
>  >  > updated while manifest is read. After algorithm is known the manifest
>  >  > may be verified in one pass.
>  >  >
>  >  >
>  >  >
>  >  > On Thu, Mar 13, 2008 at 6:03 PM, Tim Ellison <t.p.ellison@gmail.com>
wrote:
>  >  >> Alexei Fedotov wrote:
>  >  >>  > I see the question. The general InoouttStream cannot be optimized,
but
>  >  >>  > our implementation returns ByteArrayInputStream. We can choose
a fast
>  >  >>  > path without using arraycopy  if we are first to read from this
>  >  >>  > ByteArrayInputStream object. We just get get a reference to the
>  >  >>  > underlying buffer and return it without copying the contents. This
>  >  >>  > works only for unmodified ByteArrayInputStream object, but this
is
>  >  >>  > exactly our case.
>  >  >>
>  >  >>  Yep, I see that.  Now how will you use it (i.e. can you show the code
>  >  >>  that references ByteStream)?  I can see where you are going but want
to
>  >  >>  see the whole picture.
>  >  >>
>  >  >>  Regards,
>  >  >>  Tim
>  >  >>
>  >  >>
>  >  >
>  >  >
>  >  >
>  >
>
>
>
>  --
>  With best regards,
>  Alexei
>



-- 
With best regards,
Alexei

Mime
View raw message