harmony-dev mailing list archives

From Andrew Zhang <zhanghuang...@gmail.com>
Subject Re: [classlib] charset decoding
Date Wed, 26 Apr 2006 11:56:26 GMT
Vladimir Strigun wrote:
> On 4/26/06, Andrew Zhang <zhanghuangzhu@gmail.com> wrote:

> I'm trying to find the best solution for Harmony-166, and while
> preparing the fix I found the current issue.
> The test_read method in tests.api.java.io.InputStreamReaderTest fails.
> My first fix used in.available(), but I agree with Mikhail that it is
> possibly not the best solution.

Yes, the spec says "The available method for class InputStream always 
returns 0.". But it also says "This method should be overridden by 
subclasses.". :)

Here's the available description:
"Returns the number of bytes that can be read (or skipped over) from 
this input stream without blocking by the next caller of a method for 
this input stream."

If someone writes a subclass of InputStream whose available() returns 0
even when some data is available, then they have to accept the
consequences of cheating on, or contradicting, the spec.
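
To illustrate, here is a minimal sketch of a stream that hands out only
one byte per read but still reports available() honestly. The class name
OneByteAtATimeInputStream is made up for this example; it is only a
stand-in for the LimitedByteArrayInputStream used by the test, whose
actual source may differ.

```java
import java.io.ByteArrayInputStream;

/** Hypothetical stand-in for the test's stream: one byte per read call. */
class OneByteAtATimeInputStream extends ByteArrayInputStream {
    OneByteAtATimeInputStream(byte[] data) {
        super(data);
    }

    @Override
    public synchronized int read(byte[] b, int off, int len) {
        // Never hand out more than one byte per call, whatever was asked for.
        return super.read(b, off, Math.min(len, 1));
    }

    // available() is inherited from ByteArrayInputStream and keeps reporting
    // the real number of unread bytes, so a caller that sees a short read
    // can still tell that more input is pending.
}
```

With such a stream, a short read does not mean end of input, and
available() is what tells the two cases apart.
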
> In the current implementation of InputStreamReader, the endOfInput
> variable is set to true if the reader reads fewer than 8192 bytes. When
> InputStreamReader tries to read one character from
> LimitedByteArrayInputStream (a ByteArrayInputStream that only returns
> a single byte per read), true is passed to the charset decoder as
> endOfInput, even though LimitedByteArrayInputStream still has further
> input.
> Two methods from tests.api.java.io.OutputStreamWriterTest also fail
> because of the read() implementation in InputStreamReader.
> IMO, we have 2 ways to fix Harmony-166:
> 1. Don't change the behavior of CharsetDecoder, and use the
> in.available() method to compute endOfInput. If in.available() > 0,
> pass false to the decoder, read one additional byte from the input
> stream and try to decode again. If the decoder can turn the remaining
> portion plus the additional byte into a character, stop there;
> otherwise read another byte and try again. fillBuf will be invoked on
> the next call to read(), and the next portion of 8192 bytes will be
> read from the input stream.
That's reasonable.
After all, streaming decode is useful to users.
IMO, streaming decode is a highlight compared with some RI :)
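
A rough sketch of option 1, as I read it, could look like the code
below. The class AvailableBasedDecoding and its decode method are
invented for illustration and are not Harmony's InputStreamReader; the
sketch also decodes a whole stream into a String instead of one
character per read(), to keep it short. The key assumption, marked in
the code, is that in.available() == 0 after a read is treated as end of
input.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

// Hypothetical helper for illustration only.
final class AvailableBasedDecoding {
    static String decode(InputStream in, Charset charset) throws IOException {
        CharsetDecoder decoder = charset.newDecoder();
        ByteBuffer bytes = ByteBuffer.allocate(8192);
        CharBuffer chars = CharBuffer.allocate(8192);
        StringBuilder out = new StringBuilder();
        boolean endOfInput = false;
        while (!endOfInput) {
            // Append whatever the stream hands over to the undecoded leftovers.
            int n = in.read(bytes.array(), bytes.position(), bytes.remaining());
            if (n > 0) {
                bytes.position(bytes.position() + n);
            }
            // ASSUMPTION (option 1): available() == 0 means no more input.
            endOfInput = n < 0 || in.available() == 0;
            bytes.flip();
            CoderResult result;
            do {
                result = decoder.decode(bytes, chars, endOfInput);
                if (result.isError()) {
                    result.throwException();
                }
                drain(chars, out);
            } while (result.isOverflow());
            // Keep an incomplete trailing sequence for the next round.
            bytes.compact();
        }
        // The final decode call passed true, so the decoder may be flushed.
        CoderResult flushed;
        do {
            flushed = decoder.flush(chars);
            drain(chars, out);
        } while (flushed.isOverflow());
        return out.toString();
    }

    private static void drain(CharBuffer chars, StringBuilder out) {
        chars.flip();
        out.append(chars);
        chars.clear();
    }
}
```

The weak point is the marked assumption: a stream whose available()
returns 0 while data is still pending would be cut short, which is the
spec-contradicting case mentioned above.
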

> 2. Change the behavior of CharsetDecoder and use bytes.remaining() to
> compute endOfInput. In this case, we always pass false on the first
> invocation of the decode method. The rest of the algorithm is almost
> the same as above, but we stop the loop once all bytes have been
> decoded successfully (i.e. if bytes.hasRemaining() is false).
> What do you think about it?
> Thanks.
> Vladimir.
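
Option 2 depends on the decoder leaving undecoded bytes in the input
buffer, so that bytes.hasRemaining() says something useful. A small
sketch of how that can be checked is below. IncompleteInputDemo and
leftoverAfterDecode are hypothetical names, and the behavior described
in the comments is what I would expect from a decoder that follows the
position contract, not necessarily what the current Harmony decoder
does.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class IncompleteInputDemo {
    /** Decodes once with endOfInput = false and returns the unconsumed byte count. */
    static int leftoverAfterDecode(byte[] input) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer bytes = ByteBuffer.wrap(input);
        // UTF-8 never yields more chars than it consumes bytes.
        CharBuffer chars = CharBuffer.allocate(input.length);
        CoderResult result = decoder.decode(bytes, chars, false);
        if (result.isError()) {
            throw new IllegalArgumentException("malformed input: " + result);
        }
        // A decoder that keeps the buffer position accurate leaves the bytes
        // of an incomplete trailing sequence unconsumed here.
        return bytes.remaining();
    }
}
```

For example, feeding the bytes 0x61 0xC3 ('a' followed by the first
byte of a two-byte sequence) should leave one byte in the buffer, and
the reader would then fetch one more byte and decode again.
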


Andrew Zhang
China Software Development Lab, IBM

>> Any comments?
>> Thanks!
>> Vladimir Strigun commented on HARMONY-410:
>> ------------------------------------------
>> Hi Richard,
>> Thanks for the clarification. I agree that streaming decode is a good
>> thing, but I'd like to explain my understanding of the specification :)
>> According to the specification of CharsetDecoder, a decoding operation
>> should proceed by the following steps:
>> "
>> 2. Invoke the decode method zero or more times, as long as additional input
>> may be available, passing false for the endOfInput argument and filling the
>> input buffer and flushing the output buffer between invocations;
>> 3. Invoke the decode method one final time, passing true for the endOfInput
>> argument; and then
>> "
>> The spec also says:
>> "The buffers' positions will be advanced to reflect the bytes read and the
>> characters written, but their marks and limits will not be modified"
>> I understand these sentences as follows:
>> invoke decode with endOfInput = false if you have additional input, then
>> fill the buffer (i.e. add some additional input to the buffer), invoke
>> decode again and pass the correct endOfInput parameter depending on the
>> availability of input.
>> The example you provided is very useful and, of course, the 1st option
>> looks better, but what I'm suggesting here is to reflect the bytes
>> actually processed in the input buffer's position.
>> After the first invocation of decode, not all bytes have actually been
>> processed, i.e. almost all bytes are processed, but some are stored in
>> the decoder for a further operation.
>> Is it possible to store bytes in the decoder, support streaming decoding
>> and, at the same time, set the correct position in the input buffer after
>> each operation?
>> Thanks.
>> Vladimir.
>>> method decode(ByteBuffer, CharBuffer, boolean) should set correct position in ByteBuffer
>>> ----------------------------------------------------------------------------------------
>>>          Key: HARMONY-410
>>>          URL: http://issues.apache.org/jira/browse/HARMONY-410
>>>      Project: Harmony
>>>         Type: Bug
>>>   Components: Classlib
>>>     Reporter: Vladimir Strigun
>>>     Assignee: Mikhail Loenko
>>>     Priority: Minor
>>>  Attachments: Harmony-410_patch.txt, harmony-410_test.txt
>>> When the ByteBuffer contains an incomplete byte sequence that cannot yet
>>> be decoded, the position in the ByteBuffer should be set to the last
>>> successfully decoded byte. I will attach a test case and patch soon.
>> --
>> Andrew Zhang
>> China Software Development Lab, IBM
> ---------------------------------------------------------------------
> Terms of use : http://incubator.apache.org/harmony/mailing.html
> To unsubscribe, e-mail: harmony-dev-unsubscribe@incubator.apache.org
> For additional commands, e-mail: harmony-dev-help@incubator.apache.org