hadoop-common-issues mailing list archives

From "Jim Donofrio (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8323) Revert HADOOP-7940
Date Fri, 27 Apr 2012 12:07:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13263592#comment-13263592 ]

Jim Donofrio commented on HADOOP-8323:

> Clear call ought to clear memory and that's what this change actually intends to do (though the test case may lead you astray - filed HADOOP-8324).
> The current javadocs indeed cover the getBytes usage behavior as you've pointed out. So if you'd like to keep the size in its increased state, why clear() it?

    /**
     * Resets the <code>count</code> field of this byte array output
     * stream to zero, so that all currently accumulated output in the
     * output stream is discarded. The output stream can be used again,
     * reusing the already allocated buffer space.
     *
     * @see     java.io.ByteArrayInputStream#count
     */
    public synchronized void reset() {
        count = 0;
    }

I would like to clear it so that the next time I call append, it will start at the beginning
of the internal array without having to scale up the size of the array again. The ByteArrayOutputStream
class in java.io, which is very similar to Text, has a reset method similar to clear that just
sets the internal length to 0 instead of freeing the allocated internal buffer.
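
The ByteArrayOutputStream behavior described above can be checked with a minimal sketch. The ResetDemo class and its Probe subclass are hypothetical names, introduced only to expose the protected buf field so the allocated capacity can be inspected before and after reset().

```java
import java.io.ByteArrayOutputStream;

public class ResetDemo {
    // Hypothetical helper: subclass only to read the protected buf field.
    static class Probe extends ByteArrayOutputStream {
        int capacity() {
            return buf.length;
        }
    }

    public static void main(String[] args) {
        Probe out = new Probe();
        out.write(new byte[1000], 0, 1000); // forces the internal array to grow
        int capacityBefore = out.capacity();

        out.reset(); // sets count to 0, keeps the allocated array

        System.out.println(out.size());                       // 0
        System.out.println(out.capacity() == capacityBefore); // true
    }
}
```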

However, I understand the need to free memory, so why not leave clear() as it is and add a clearBytes()
method that sets the length to 0 and sets the bytes to EMPTY_BYTES?
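
A rough sketch of the two-method idea, assuming clear() only resets the length. ReusableBuffer is a hypothetical stand-in written for illustration, not Hadoop's Text implementation; clearBytes() is the proposed method and capacity() is an assumed helper added so the effect is visible.

```java
import java.util.Arrays;

public class ReusableBuffer {
    private static final byte[] EMPTY_BYTES = new byte[0];

    private byte[] bytes = EMPTY_BYTES;
    private int length = 0;

    // Appends len bytes from src, growing the backing array only when needed.
    public void append(byte[] src, int start, int len) {
        if (length + len > bytes.length) {
            int newSize = Math.max(length + len, bytes.length * 2);
            bytes = Arrays.copyOf(bytes, newSize);
        }
        System.arraycopy(src, start, bytes, length, len);
        length += len;
    }

    // Discards the content but keeps the allocated array for reuse.
    public void clear() {
        length = 0;
    }

    // Proposed addition: discards the content and releases the array.
    public void clearBytes() {
        length = 0;
        bytes = EMPTY_BYTES;
    }

    public int getLength() {
        return length;
    }

    // Assumed helper, for inspection only.
    public int capacity() {
        return bytes.length;
    }

    public static void main(String[] args) {
        ReusableBuffer b = new ReusableBuffer();
        b.append(new byte[64], 0, 64);

        b.clear();
        System.out.println(b.getLength() + " " + b.capacity()); // 0 64

        b.clearBytes();
        System.out.println(b.getLength() + " " + b.capacity()); // 0 0
    }
}
```

With this split, callers that reuse one object across many records pay for array growth once, while callers that want the memory back can ask for it explicitly.
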
> Revert HADOOP-7940
> ------------------
>                 Key: HADOOP-8323
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8323
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 2.0.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>            Priority: Critical
> Per [~jdonofrio]'s comments on HADOOP-7940, we should revert it as it has caused a performance regression (for scenarios where Text is reused, popular in MR).
> The clear() works as intended, as the API also offers a current length API.


