db-derby-dev mailing list archives

From TomohitoNakayama <tomon...@basil.ocn.ne.jp>
Subject Re: [jira] Updated: (DERBY-721) State of InputStream retrieved from resultset is not clean , if there exists previous InputStream .
Date Wed, 07 Dec 2005 11:12:38 GMT
Hello.

Mike Matrigali wrote:

>For embedded I was worried about your description of changes, that made
>it sound like you somehow were going to buffer the blob in memory.  I
>see from your changes you basically added reset calls if the underlying
>stream was resettable.  What I don't know is what happens in the 2 gig
>blob/clob cases, either you will have to investigate or maybe someone
>on the list knows?
>
As you saw, the patch does not add a new cache; it just resets the stream.
I tested only the case of a 1 MB LOB and confirmed that the second
InputStream streamed the LOB from the beginning.
I don't think there is a qualitative difference in behavior between 1M
and 1G, though I have not fully confirmed it.
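
To illustrate what "just resets the stream" means, here is a minimal
sketch. It is not the actual patch. It assumes the underlying stream
supports mark/reset; ByteArrayInputStream only stands in for the stream
coming from the store, and the names ResetSketch, reopen and demo are
hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ResetSketch {

    // Hypothetical helper: hand out the same underlying stream again,
    // rewound to its mark, without copying any bytes into a cache.
    static InputStream reopen(InputStream underlying) throws IOException {
        if (!underlying.markSupported()) {
            throw new IOException("underlying stream cannot be reset");
        }
        underlying.reset();
        return underlying;
    }

    // Reads part of the value through the "first" stream, then asks for
    // a "second" stream and returns the first byte it delivers.
    static int demo() {
        byte[] lob = {10, 20, 30, 40};
        InputStream underlying = new ByteArrayInputStream(lob);
        try {
            underlying.read();                        // first stream consumes 10
            underlying.read();                        // first stream consumes 20
            InputStream second = reopen(underlying);  // rewind, no new buffer
            return second.read();                     // starts from the beginning again
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the second stream starts again from the first byte. Whether
a reset stays this cheap for a very large LOB depends on how the real
underlying stream implements reset, which the sketch does not model.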


However, I understand your concern that this patch would implicitly
constrain the implementation of the network client:
all the data streamed from the server would have to be stored, because
streaming between server and client is performed only once.

Now I think it is preferable to throw an exception
when a second Reader/InputStream is retrieved for the same value in the
result, or when Reader/InputStream objects are retrieved in a different
order from the columns in the SQL.
// I would like to hear others' opinions on the restriction of not
allowing the user to retrieve Reader/InputStream objects for result
columns in a different order from the SQL.
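
The proposed check could look roughly like the following sketch. It is
only an illustration of the rule, not Derby code: the class
StreamOrderGuard and its methods are hypothetical, and column indexes are
assumed to be 1-based as in JDBC.

```java
import java.sql.SQLException;

class StreamOrderGuard {

    // Highest column for which a stream was handed out on the current row.
    // 0 means no stream has been retrieved yet.
    private int lastStreamColumn = 0;

    // Called before a Reader/InputStream is returned for a column.
    void checkStreamAccess(int columnIndex) throws SQLException {
        if (columnIndex == lastStreamColumn) {
            throw new SQLException(
                "Stream for column " + columnIndex + " was already retrieved for this row");
        }
        if (columnIndex < lastStreamColumn) {
            throw new SQLException(
                "Stream for column " + columnIndex + " requested after column " + lastStreamColumn);
        }
        lastStreamColumn = columnIndex;
    }

    // Called when the result set moves to another row.
    void nextRow() {
        lastStreamColumn = 0;
    }

    // Convenience for trying the guard out: true if the access is rejected.
    static boolean rejects(StreamOrderGuard guard, int columnIndex) {
        try {
            guard.checkStreamAccess(columnIndex);
            return false;
        } catch (SQLException e) {
            return true;
        }
    }
}
```

With this rule, a first request for column 1 passes, a second request for
column 1 on the same row is rejected, and a request for column 2 after
column 3 is rejected as well.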

At first I thought the restriction might be too hard on users;
however, I conclude that it is reasonable, because a ResultSet is
not a cache for the set of results and
has the characteristics of a stream (especially when LOBs are used).
If a user needs a cache, it should be developed separately from
ResultSet.
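
As a sketch of such a cache kept outside ResultSet, an application could
copy the value once and then open as many streams as it likes on the copy.
The class LobCopy below is hypothetical, and it deliberately holds the
whole value in memory, so it only suits values the application knows to be
small enough.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class LobCopy {

    private final byte[] bytes;

    // Drains the source stream exactly once, e.g. the stream obtained
    // from a single getBinaryStream call.
    LobCopy(InputStream source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        try {
            int n;
            while ((n = source.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        bytes = out.toByteArray();
    }

    // Every call returns an independent stream positioned at the start.
    ByteArrayInputStream newStream() {
        return new ByteArrayInputStream(bytes);
    }

    int length() {
        return bytes.length;
    }
}
```

The application pays the memory cost explicitly here, and the ResultSet
itself is still read only once.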

Thank you for your suggestion.
I had not realized this stream-like characteristic of ResultSet.

Best regards.


Mike Matrigali wrote:

>I don't have enough information to completely answer, but will
>try to state my opinion on the issue.
>
>I think the goals should be:
>1) provide same behavior in embedded and network server mode.
>2) provide same behavior whether the blob is "small" or "large".
>3) optimize the standard case of getting the column once in jdbc,
>   as the spec allows.
>4) If at all possible when selecting a blob/clob as a stream it should
>   not be necessary to materialize the entire stream in memory.
>
>For embedded I was worried about your description of changes, that made
>it sound like you somehow were going to buffer the blob in memory.  I
>see from your changes you basically added reset calls if the underlying
>stream was resettable.  What I don't know is what happens in the 2 gig
>blob/clob cases, either you will have to investigate or maybe someone
>on the list knows?
>
>For embedded it is theoretically possible for the reset of the stream
>to go all the way back to store and read it again from the beginning.
>For network client this seems even more complicated to do in an
>optimized way (I believe you are looking at improving the streaming
>behavior of large objects to network client so I defer to you in
>how hard this may be).
>
>My opinion would be to make the second reference throw an error, to
>make that behavior consistent in network server, embedded, long and
>short blob/clob streams.  And to document that behavior.
>
>Having said that, I am not against the code working as you are moving
>toward as long as it does not cause a memory/runtime performance issue
>for the normal single get stream case.
>
>TomohitoNakayama wrote:
>
>>Hello Daniel and Mike .
>>
>>Do you think it is preferable not to allow the user to call getXXXXStream
>>twice on one row,
>>in order to make room for releasing the memory for the cache in ResultSet as
>>soon as possible?
>>
>>Best regards.
>>
>>
>>Daniel John Debrunner wrote:
>>
>>>Mike Matrigali wrote:
>>>
>>>>Is there anything in the standard that says what the second call to
>>>>get the stream has to do?  Imagine the case where the first
>>>>stream reads 1 gig of a 2 gig blob, does the second call to
>>>>getBinaryStream() have to return the 1st gig again?
>>>
>>>Yes & no.
>>>
>>>Nothing in the JDBC spec doc, but the javadoc for java.sql.ResultSet has
>>>always had:
>>>
>>>" For maximum portability, result set columns within each row should be
>>>read in left-to-right order, and each column should be read only once."
>>>
>>>Thus, Derby could throw an exception if there was a second getXXXStream
>>>call on the same column.
>>>
>>>>Any change that tries to cache the bytes returned by the first
>>>>getBinaryStream either in local client or network client code is
>>>>going to be a performance/memory drain.
>>>
>>>Agreed, we need to be careful here, we need to optimise the frequent
>>>case, getting the column's value once as-per JDBC.
>>>
>>>Dan.
>>>

-- 
/*

        Tomohito Nakayama
        tomonaka@basil.ocn.ne.jp
        tomohito@rose.zero.ad.jp
        tmnk@apache.org

        Naka
        http://www5.ocn.ne.jp/~tomohito/TopPage.html

*/ 

