X-Original-To: apmail-commons-issues-archive@minotaur.apache.org
Delivered-To: apmail-commons-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id A6721F664;
    Mon, 22 Apr 2013 13:57:16 +0000 (UTC)
Received: (qmail 19733 invoked by uid 500); 22 Apr 2013 13:57:16 -0000
Delivered-To: apmail-commons-issues-archive@commons.apache.org
Received: (qmail 19615 invoked by uid 500); 22 Apr 2013 13:57:16 -0000
Mailing-List: contact issues-help@commons.apache.org; run by ezmlm
Precedence: bulk
Reply-To: issues@commons.apache.org
Delivered-To: mailing list issues@commons.apache.org
Received: (qmail 19510 invoked by uid 99); 22 Apr 2013 13:57:16 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 22 Apr 2013 13:57:16 +0000
Date: Mon, 22 Apr 2013 13:57:16 +0000 (UTC)
From: "Sebb (JIRA)"
To: issues@commons.apache.org
Subject: [jira] [Comment Edited] (IO-356) CharSequenceInputStream#reset() behaves incorrectly in case when buffer size is not dividable by data size
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/IO-356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637676#comment-13637676 ]

Sebb edited comment on IO-356 at 4/22/13 1:56 PM:
--------------------------------------------------

testIO_356 is also broken if readFirst > 0.
That's because the initial read fills the byte buffer. The mark therefore saves the position after the first n chars have been read from the input.
data1 gets the initial buffer load; data2 gets the next n chars.

[later]
I now think the test does make sense.
Even though the individual bytes may be part of a multi-byte character, if the class is to support mark, it ought to do so as if it held plain bytes.
If the mark is placed in the middle of a character's encoding, the returned bytes might not make much sense, but that's a problem for the application.

For some cases, it would be possible to support mark/reset purely by adjusting the byte buffer pointers.
However, if the byte buffer has been refilled, that won't work, and it becomes necessary to regenerate the byte buffer contents afresh.

One way to do this would be to keep track of where the char buffer was just before the byte buffer was filled, as well as keeping track of the position in the byte buffer.
In theory, reset can then just re-encode the char buffer and update the byte buffer pointer. There may need to be some special processing at the start of the encoding.

      was (Author: sebb@apache.org):
    testIO_356 is also broken if readFirst > 0.
That's because the initial read fills the byte buffer. The mark therefore saves the position after the first n chars have been read from the input.
data1 gets the initial buffer load; data2 gets the next n chars.

I'm not sure what the purpose of readFirst is.
Anyway, it makes little sense to read single bytes from an encoding that generates multiple bytes per char.

> CharSequenceInputStream#reset() behaves incorrectly in case when buffer size is not dividable by data size
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: IO-356
>                 URL: https://issues.apache.org/jira/browse/IO-356
>             Project: Commons IO
>          Issue Type: Bug
>          Components: Streams/Writers
>    Affects Versions: 2.4
>            Reporter: Dmitry Katsubo
>         Attachments: CharSequenceInputStreamTest.java
>
>
> The side effect happens when the buffer size of the input stream is not divisible by the requested data size.
> The bug is hidden in the {{CharSequenceInputStream#reset()}} method, which should also call (I think) {{bbuf.limit(0)}}; otherwise the next call to {{CharSequenceInputStream#read()}} will return the remaining tail that {{bbuf}} has accumulated.
> In the attached test case, the test fails if {{dataSize = 13}} (not divisible by 10) and passes if {{dataSize = 20}} (divisible by 10).
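
The re-encoding idea from the comment above (remember where the char buffer stood just before the byte buffer was filled, plus the position in the byte buffer, and have reset re-encode from there) might look roughly like the sketch below. It is not the Commons IO code: the class ReencodingStream, its fields and the readN helper are hypothetical names made up for illustration. It also assumes a stateless encoding such as UTF-8 and a buffer large enough to hold any single encoded character; an encoding that writes a BOM would need the "special processing at the start of the encoding" mentioned in the comment.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.util.Arrays;

/** Hypothetical sketch of mark/reset by re-encoding; not the Commons IO class. */
class ReencodingStream extends InputStream {
    private final CharsetEncoder encoder;
    private final CharBuffer cbuf; // input chars; position = next char to encode
    private final ByteBuffer bbuf; // encoded bytes not yet handed to the caller
    private int fillStart;         // cbuf position just before the last fill
    private int markChar;          // value of fillStart when mark() was called
    private int markByte;          // bbuf position when mark() was called

    ReencodingStream(CharSequence text, Charset charset, int bufferSize) {
        encoder = charset.newEncoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);
        cbuf = CharBuffer.wrap(text);
        bbuf = ByteBuffer.allocate(bufferSize);
        bbuf.flip(); // start empty: position == limit == 0
    }

    /** Discards the byte buffer and encodes the next chunk of chars into it. */
    private void fill() {
        fillStart = cbuf.position();
        bbuf.clear();
        encoder.encode(cbuf, bbuf, true);
        bbuf.flip();
    }

    @Override
    public int read() {
        if (!bbuf.hasRemaining()) {
            if (!cbuf.hasRemaining()) {
                return -1;
            }
            fill();
            if (!bbuf.hasRemaining()) {
                return -1; // assumption violated: buffer too small for the next char
            }
        }
        return bbuf.get() & 0xFF;
    }

    @Override
    public boolean markSupported() {
        return true;
    }

    @Override
    public void mark(int readLimit) { // readLimit is ignored in this sketch
        markChar = fillStart;
        markByte = bbuf.position();
    }

    @Override
    public void reset() {
        // Regenerate the byte buffer exactly as it was when mark() was called,
        // then move the byte pointer back to the marked offset.
        cbuf.position(markChar);
        encoder.reset();
        fill();
        bbuf.position(markByte);
    }

    /** Helper for the example: reads up to n bytes, fewer at end of stream. */
    static byte[] readN(ReencodingStream in, int n) {
        byte[] out = new byte[n];
        int count = 0;
        for (int b; count < n && (b = in.read()) != -1; ) {
            out[count++] = (byte) b;
        }
        return Arrays.copyOf(out, count);
    }
}
```

With a 10-byte buffer, reading 2 bytes, calling mark, reading 13 bytes, calling reset and reading 13 bytes again should give the same 13 bytes both times, which is the situation the attached test exercises with dataSize = 13. Because reset refills the buffer from scratch, no stale tail is left behind in the byte buffer.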