db-derby-dev mailing list archives

From TomohitoNakayama <tomon...@basil.ocn.ne.jp>
Subject Re: [jira] Updated: (DERBY-326) Improve streaming of large objects for network server and client
Date Mon, 06 Mar 2006 13:49:18 GMT
Hello Bryan.
Thank you for reading the patch.


I will answer your questions...

> 1) In DDMWriter.writeScalarStream(), at about line 702, there is the

<snip>

>    Do you know how to make that code be triggered?

> 2) In DRDAConnThread.writeEXTDTA(), at about line 7487, there is the

<snip>

>    Do you know how to make writeEXTDTA() be called with an object other
>    than a EXTDTAInputStream? 


I don't know the answer to either of those.
I just read the callee modules and preserved the original structure.

Removing code that may not be used any more seems risky ....

At the least, I think we should run derbyall with some additional 
sanity-checking code before removing them.
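For example, something like this could be put into the suspected branch 
(only an illustrative sketch, not the actual DDMWriter/DRDAConnThread code; 
the class and method names are made up), so that a derbyall run with 
assertions enabled would fail loudly if the branch is ever reached:

import java.io.InputStream;

// Illustrative sketch only, not actual Derby code: keep the suspected-dead
// branch in place, but make a test run (java -ea) fail loudly if it is hit.
final class DeadBranchProbe {
    static void writeValue(Object value) {
        if (!(value instanceof InputStream)) {
            // Suspected-dead path: surface it during derbyall instead of
            // removing it blindly.
            assert false : "unexpected non-stream value: " + value;
            // ... the original fallback handling would stay here
            return;
        }
        // ... normal streaming path for InputStream values
    }
}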


> 3) I had two thoughts about the code which wraps the various stream 
> objects
>    in BufferedInputStreams in order to use the mark() and reset() 
> methods: 

>    a) It seems that there are two different places where we perform the
>       wrapping of the streams: once in DDMWriter.writeScalarStream, and
>       once in EXTDTAInputStream.openInputStreamAgain(). That seemed a bit
>       unfortunate, and I was wondering if you thought we could arrange
>       things so that there was exactly one place where we did this stream
>       wrapping. 

I see. I will reconsider how to accomplish that.


>    b) It seemed that the streams that were being wrapped were all streams
>       of our own implementation, such as the ReEncodedInputStream, and 
> the
>       streams which are returned by Blob.getBinaryStream and by
>       Clob.getAsciiStream(). Rather than wrapping all of these streams in
>       BufferedInputStreams, could we not just implement the mark() and
>       reset() methods on our streams, and avoid the extra overhead of the
>       stream wrapping? 

Hmm ....
I would rather not write a mark/reset implementation of our own when 
java.io.BufferedInputStream already provides one ....
Wrapping our streams in a BufferedInputStream at the point where they are 
created would be worth considering ...
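Something like the following helper, defined in exactly one place, is what I 
have in mind (a rough sketch only; StreamUtil and ensureMarkSupported are 
names made up for illustration, not actual Derby code):

import java.io.BufferedInputStream;
import java.io.InputStream;

// Hypothetical helper, not actual Derby code: one single place that
// guarantees mark()/reset() support, wrapping in a BufferedInputStream only
// when the underlying stream does not already provide it.
final class StreamUtil {
    static InputStream ensureMarkSupported(InputStream in) {
        if (in.markSupported()) {
            return in;                          // already supports mark/reset
        }
        return new BufferedInputStream(in);     // adds mark/reset on top of the raw stream
    }
}

This would also help with 3)-a), because both DDMWriter.writeScalarStream and 
EXTDTAInputStream.openInputStreamAgain could call the same helper.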


> 4) I didn't really understand the method openInputStreamAgain() in
>    EXTDTAInputStream. It seems that, for a Blob or Clob, we open the
>    stream to the Blob/Clob, then read a single byte from the stream just
>    to see if we will get end-of-file or not, and then if we didn't get
>    end-of-file, we close and re-open the stream.
>
>    Am I understanding this code properly? If so, it seems rather awkward.
>    Three ideas presented themselves to me:
>
>    a) Perhaps there is a method that we can call on the Blob/Clob object
>       to figure out if it is null or not, other than reading the first
>       byte from the stream?
>    b) Or, if we have to read the first byte, maybe we could just hang on
>       to that first byte in a local buffer, rather than having to close
>       and re-open the stream in order to be able to re-read that byte.
>    c) Or, since we are going to need to have a markSupported() stream
>       anyway, perhaps we could wait until we have constructed the 
> markSupported
>       stream, and then use the mark/reset support to peek at the first 
> byte
>       to see if the Blob/Clob is empty or not. 

About a): I found that the length() method of Blob/Clob has a performance 
problem, because it reads through the whole stream just to return the length.
About b): that sounds like exactly what markSupported() is for ....
So c) would be the best answer.
// Taken together with 3)-b), wrapping the streams in a BufferedInputStream when they are made seems to be the right way ...
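Just to make c) concrete, the peek could look roughly like this (a sketch 
only, assuming the stream already supports mark/reset, for example after the 
BufferedInputStream wrapping above; PeekUtil and isEmpty are made-up names):

import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch, not the actual EXTDTAInputStream code: use mark/reset
// to peek at the first byte and decide whether the Blob/Clob stream is
// empty, without closing and re-opening the stream.
final class PeekUtil {
    static boolean isEmpty(InputStream markSupportedIn) throws IOException {
        markSupportedIn.mark(1);                        // remember current position
        boolean empty = (markSupportedIn.read() == -1); // EOF means an empty value
        markSupportedIn.reset();                        // rewind so the peeked byte is not lost
        return empty;
    }
}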


> 5) It seems that the streaming implementation in DDMWriter always does an
>    actual I/O (that is, calls sendBytes()) for each 32K segment. While 
> that
>    is certainly correct, I found myself wondering how much of a 
> performance
>    impact it was having for *very* large data transfers. If users are 
> using
>    Derby to manage blobs which are 10's or 100's of megabytes in size, 
> then
>    I wonder if there are still more performance benefits that could be
>    realized by batching up multiple 32K DSS Layer B segments and then 
> sending
>    them with a single call to sendBytes(). For example, maybe we could
>    only do a hard I/O call on each 4th or 8th segment, and give the 
> network
>    a chance to work with 128K or 256K worth of bytes at a time.
>
>    I don't know if this would make a real difference or not, and it would
>    obviously remove some of the memory reduction benefit of the rest 
> of your
>    changes, but it occurred to me and I wanted to mention it. 

I think it would be nice to place a buffer, with a configurable flush 
threshold, in front of the actual send, so that several segments can go out 
in a single sendBytes() call.
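Roughly like this (only a sketch of the idea, not the DDMWriter code; the 
class name and the threshold values are just assumptions): the buffer would 
accumulate the 32K segments and only perform a real write once a configurable 
number of bytes has been collected.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch, not actual Derby code: collect 32K DSS segments and
// only do a hard I/O once a configurable threshold is reached, so the
// network sees e.g. 128K or 256K of bytes per call.
final class SegmentBuffer {
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private final OutputStream out;
    private final int flushThreshold;       // e.g. 4 * 32 * 1024 or 8 * 32 * 1024

    SegmentBuffer(OutputStream out, int flushThreshold) {
        this.out = out;
        this.flushThreshold = flushThreshold;
    }

    void addSegment(byte[] segment, int off, int len) throws IOException {
        pending.write(segment, off, len);
        if (pending.size() >= flushThreshold) {
            flush();                        // one hard write for several segments
        }
    }

    void flush() throws IOException {
        if (pending.size() > 0) {
            pending.writeTo(out);           // single call to the underlying stream
            pending.reset();
            out.flush();
        }
    }
}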


> 6) Lastly, it would be nice to add a comment to the method
>    EXTDTAInputStream.initInputStream(), specifically to explain the
>    meaning of the return value from that method, since it's a bit subtle. 

I see.
I will add a comment for that.


Right now, I am running the performance measurements.
They will take some more time.

After that, I will start on 1)-6).


Best regards.

-- 
/*

        Tomohito Nakayama
        tomonaka@basil.ocn.ne.jp
        tomohito@rose.zero.ad.jp
        tmnk@apache.org

        Naka
        http://www5.ocn.ne.jp/~tomohito/TopPage.html

*/ 

