hadoop-common-issues mailing list archives

From "David Rosenstrauch (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6298) BytesWritable#getBytes is a bad name that leads to programming mistakes
Date Mon, 13 Dec 2010 21:04:02 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12971034#action_12971034 ]

David Rosenstrauch commented on HADOOP-6298:
--------------------------------------------

Yeesh.  I just got bitten by this same bug, but from a different direction.

Calling BytesWritable.getBytes() returns a reference to the BytesWritable's internal byte
array.  I was calling that, and then using that byte array in subsequent processing.  Problem
is that the BytesWritable was still holding a reference to that same array, and later modified
it, thus modifying the data I was working with as well.  This was a really subtle bug that was
hard to find, and I wasted a lot of time on it.
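
To make the failure mode concrete, here is a minimal sketch of the aliasing problem.
ReusableBuffer is a hypothetical stand-in written for this illustration, not Hadoop's
actual BytesWritable; it only assumes the behavior described above, namely that
getBytes() hands out the internal array and that the object is reused for later values.

```java
import java.util.Arrays;

public class AliasingSketch {
    // Hypothetical stand-in for a reusable byte container (NOT the real BytesWritable).
    static final class ReusableBuffer {
        private byte[] bytes = new byte[0];
        private int size = 0;

        void set(byte[] src) {
            if (src.length > bytes.length) {
                // Grow with some slack, so the array can be longer than the data.
                bytes = new byte[src.length * 3 / 2];
            }
            System.arraycopy(src, 0, bytes, 0, src.length);
            size = src.length;
        }

        byte[] getBytes() { return bytes; }  // returns the internal array, no copy
        int getLength() { return size; }
    }

    // Returns true if an array obtained earlier changes when the buffer is reused.
    static boolean earlierValueWasClobbered() {
        ReusableBuffer buf = new ReusableBuffer();
        buf.set(new byte[] {1, 2, 3, 4});
        byte[] held = buf.getBytes();   // caller keeps this for later processing
        byte before = held[0];
        buf.set(new byte[] {9, 9});     // fits in the existing array, so it is overwritten in place
        return held[0] != before;
    }

    public static void main(String[] args) {
        System.out.println("earlier value clobbered: " + earlierValueWasClobbered());
    }
}
```

In this sketch the caller never writes to the array it holds, yet its contents change
as soon as the buffer receives a new value.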

I realize there's a need to get access to a BytesWritable's internal byte storage without
performing an array copy.  But again, I think there needs to be some additional *safe* method
to retrieve a byte array that's a *copy* of a BytesWritable's contents.  There are just too
many potential pitfalls for developers if the situation is just left as is.
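
As a rough sketch of what such a safe accessor could do, the helper below copies only
the valid bytes into a fresh array.  The name copyOfValidBytes is hypothetical and is
not part of any Hadoop API; it assumes the caller passes in the array from getBytes()
and the count from getLength().

```java
import java.util.Arrays;

public class SafeCopySketch {
    // Hypothetical helper: 'raw' stands for the result of getBytes(),
    // 'validLength' for the result of getLength().
    static byte[] copyOfValidBytes(byte[] raw, int validLength) {
        // Arrays.copyOf allocates a new array, so the result is independent
        // of 'raw' and carries no trailing padding.
        return Arrays.copyOf(raw, validLength);
    }

    public static void main(String[] args) {
        byte[] internal = {1, 2, 3, 0, 0, 0};          // 3 valid bytes plus 3 bytes of padding
        byte[] snapshot = copyOfValidBytes(internal, 3);
        internal[0] = 9;                                // the owner reuses its buffer
        System.out.println(Arrays.toString(snapshot));  // prints [1, 2, 3]
    }
}
```

The copy costs an allocation per call, which is why it makes sense as an addition
alongside the zero-copy accessor rather than a replacement for it.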

> BytesWritable#getBytes is a bad name that leads to programming mistakes
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-6298
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6298
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.20.1
>            Reporter: Nathan Marz
>
> Pretty much everyone at Rapleaf who has worked with Hadoop has misused BytesWritable#getBytes
> at some point, not expecting the byte array to be padded. I think we can completely alleviate
> these programming mistakes by deprecating and renaming this method (again) to be more descriptive.
> I propose "getPaddedBytes()" or "getPaddedValue()". It would also be helpful to have a helper
> method "getNonPaddedValue()" that makes a copy into a non-padded byte array.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

