harmony-dev mailing list archives

From "Vladimir Strigun" <vstri...@gmail.com>
Subject [classlib] charset decoding
Date Wed, 26 Apr 2006 10:20:44 GMT
On 4/26/06, Andrew Zhang <zhanghuangzhu@gmail.com> wrote:
> Hi Vladimir,
> I think we both understand the spec :)
> Here I propose two solutions:
> 1. Set the ByteBuffer to the limit, and store the remaining ByteBuffer in
> the decoder. Invokers don't care about the position of in. If the result is
> UNDERFLOW and there is further input, just pass the new input to the
> decoder. It's a typical streaming decoder.  That's what Harmony does
> currently.
> 2. Decoder doesn't store remaining ByteBuffer. Position of "in" is set to
> indicate the remaining ByteBuffer. Invoker should take care to generate new
> input ByteBuffer for next invocation.  RI acts in this way.
> Both are acceptable.
> However, I think solution 1 is more reasonable.
> "Is it possible to store bytes in decoder, support streaming decoding, and,
> at the same time sets correct position in input buffer after each operation?
> "
> Yes.  In fact, your patch will make Harmony act as described above.
> However, I don't think it solves the problem. The invoker may be more
> confused and may think:
> "Is the remaining ByteBuffer maintained in the decoder or in 'in'?  Shall I
> get the remaining buffer from in and pass it for the next invocation?"
> Anyway, we need a decision on this compatibility issue.
> By the way, Vladimir, does solution one cause any problem on other classlib
> implementation?

I'm trying to find the best solution for HARMONY-166, and I found the
current issue while preparing the fix.

The test_read method from tests.api.java.io.InputStreamReaderTest failed,
and the first fix I created used in.available(), but I agree with
Mikhail that it is possibly not the best solution.

In the current implementation of InputStreamReader, the endOfInput
variable is set to true if the reader reads fewer than 8192 bytes. When
InputStreamReader tries to read one character from
LimitedByteArrayInputStream (a ByteArrayInputStream that only returns
a single byte per read), true is passed to the charset decoder as the
endOfInput parameter, even though further input is still available.
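
To illustrate the endOfInput problem described above, here is a minimal
sketch against the standard java.nio.charset API. It is not Harmony's
InputStreamReader code, and the class and method names are made up for
illustration. It hands the decoder only the first byte of a two-byte
UTF-8 sequence, once with endOfInput = false and once with true.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Harmony or its tests.
final class EndOfInputDemo {
    // Decodes only the first byte of a two-byte UTF-8 sequence,
    // passing the given value as the endOfInput argument.
    static CoderResult decodeFirstByte(boolean endOfInput) {
        byte[] eAcute = "\u00e9".getBytes(StandardCharsets.UTF_8); // 0xC3 0xA9
        ByteBuffer in = ByteBuffer.wrap(eAcute, 0, 1);             // first byte only
        CharBuffer out = CharBuffer.allocate(4);
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        return decoder.decode(in, out, endOfInput);
    }

    public static void main(String[] args) {
        System.out.println("endOfInput=false: " + decodeFirstByte(false));
        System.out.println("endOfInput=true:  " + decodeFirstByte(true));
    }
}
```

With false, the decoder should report underflow and wait for more bytes.
With true, the dangling byte is reported as malformed input, which is the
situation the reader ends up in when it passes true while the stream
still has data.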

Two methods from tests.api.java.io.OutputStreamWriterTest also failed
because of the read() implementation in InputStreamReader.

IMO, we have two ways to fix HARMONY-166:

1. Don't change the behavior of CharsetDecoder, and use in.available()
to compute the endOfInput variable. If in.available() > 0, pass false to
the decoder, read one additional byte from the input stream, and try to
decode again. If the decoder can turn the remaining portion plus the
additional byte into a character, stop there; otherwise read another
byte and retry. fillBuf will be invoked on the next call to read(), and
the next portion of 8192 bytes will be read from the input.

2. Change the behavior of CharsetDecoder and use bytes.remaining() to
compute endOfInput. In this case, we always pass false on the first
invocation of the decode method. The rest of the algorithm is almost the
same as above, but we stop the loop once all bytes have been decoded
successfully (i.e. when bytes.hasRemaining() is false).
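
The "read one more byte and retry" loop that both options share can be
sketched roughly as follows. This is only an illustration under stated
assumptions, not the proposed patch: OneCharReader, readOneChar,
decodeAll and the pending buffer are hypothetical names, the decoder is
always called with endOfInput = false, and a truncated sequence at end
of stream is simply reported as malformed.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.MalformedInputException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Harmony's InputStreamReader.
final class OneCharReader {
    // Returns the next code point, or -1 at end of stream.
    // 'pending' holds bytes read from the stream but not yet consumed.
    static int readOneChar(InputStream in, CharsetDecoder decoder, ByteBuffer pending)
            throws IOException {
        CharBuffer out = CharBuffer.allocate(2); // room for a surrogate pair
        while (true) {
            pending.flip();                      // switch to reading mode
            CoderResult result = decoder.decode(pending, out, false);
            pending.compact();                   // keep whatever was not consumed
            if (out.position() > 0) {
                out.flip();
                return Character.codePointAt(out, 0);
            }
            if (result.isError()) {
                result.throwException();         // malformed or unmappable input
            }
            int b = in.read();                   // fetch one additional byte
            if (b < 0) {
                if (pending.position() > 0) {
                    // Simplification: truncated sequence at end of stream.
                    throw new MalformedInputException(pending.position());
                }
                return -1;
            }
            pending.put((byte) b);
        }
    }

    // Convenience wrapper: decodes a whole UTF-8 byte array one character at a time.
    static String decodeAll(byte[] utf8) {
        InputStream in = new ByteArrayInputStream(utf8);
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer pending = ByteBuffer.allocate(8);
        StringBuilder sb = new StringBuilder();
        try {
            int cp;
            while ((cp = readOneChar(in, decoder, pending)) != -1) {
                sb.appendCodePoint(cp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = "a\u00e9\u20ac".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeAll(bytes).length());
    }
}
```

Because compact() keeps only the bytes the decoder left unconsumed, the
loop should behave the same whether the decoder leaves an incomplete
sequence in the input buffer or stores it internally, as long as it does
so consistently.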

What do you think about it?


> Any comment?
> Thanks !
> Vladimir Strigun commented on HARMONY-410:
> ------------------------------------------
> Hi Richard,
> Thanks for the clarification, I agree that streaming decode is a good thing,
> but I'd like to explain my understanding of the specification :)
> According to the specification of CharsetDecoder, a decoding operation should
> proceed by the following steps:
> "
> 2. Invoke the decode method zero or more times, as long as additional input
> may be available, passing false for the endOfInput argument and filling the
> input buffer and flushing the output buffer between invocations;
> 3. Invoke the decode method one final time, passing true for the endOfInput
> argument; and then
> "
> spec also says:
> "The buffers' positions will be advanced to reflect the bytes read and the
> characters written, but their marks and limits will not be modified"
> I understand these sentences in the following way:
> invoke decode with endOfInput = false if you have additional input, then
> fill the buffer (i.e. add some additional input to the buffer), invoke decode
> again and pass the correct endOfInput value depending on the availability of
> input.
> The example you provided is very useful and, of course, the 1st option looks
> better, but what I'm suggesting here is to reflect the actually processed
> bytes in the input. After the first invocation of decode, not all bytes are
> actually processed, i.e. almost all bytes are processed, but some are stored
> in the decoder for a further operation.
> Is it possible to store bytes in decoder, support streaming decoding, and,
> at the same time sets correct position in input buffer after each operation?
> Thanks.
> Vladimir.
> > method decode(ByteBuffer, CharBuffer, boolean) should set correct position
> in ByteBuffer
> >
> ----------------------------------------------------------------------------------------
> >
> >          Key: HARMONY-410
> >          URL: http://issues.apache.org/jira/browse/HARMONY-410
> >      Project: Harmony
> >         Type: Bug
> >   Components: Classlib
> >     Reporter: Vladimir Strigun
> >     Assignee: Mikhail Loenko
> >     Priority: Minor
> >  Attachments: Harmony-410_patch.txt, harmony-410_test.txt
> >
> > When ByteBuffer contains an incomplete sequence of bytes for successful
> decoding, the position in ByteBuffer should be set to the last successfully
> decoded byte. I will attach a testcase and patch soon.
> --
> This message is automatically generated by JIRA.
> -
> If you think it was sent incorrectly contact one of the administrators:
>  http://issues.apache.org/jira/secure/Administrators.jspa
> -
> For more information on JIRA, see:
>  http://www.atlassian.com/software/jira
> --
> Andrew Zhang
> China Software Development Lab, IBM

