hadoop-common-dev mailing list archives

From "Jothi Padmanabhan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5779) KeyFieldBasedPartitioner would lost data if specifed field not exist, and it should encode free not only support utf8
Date Sun, 17 May 2009 11:03:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12710191#action_12710191 ]

Jothi Padmanabhan commented on HADOOP-5779:
-------------------------------------------

Some minor comments:

* I think it would be better to do the instanceof checks on the key in a single if/else-if chain:
{code}
if (key instanceof BytesWritable) {
  // Handle BytesWritable
} else if (key instanceof Text) {
  // Handle Text
} else {
  // error
}
{code}

* A test case covering both Text and BytesWritable keys would be good to have for this patch.
It could either be a new test case or a modification of TestStreamDataProtocol. Also, it would
be really nice if the test case demonstrated the fix for the ArrayIndexOutOfBoundsException:
it should fail without this patch and pass with it.
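
As a rough illustration of the single-chain dispatch suggested above, here is a minimal self-contained sketch. It is not the patch's code: to stay runnable without Hadoop on the classpath, byte[] stands in for BytesWritable and String stands in for Text, and the class and method names (KeyDispatchSketch, rawKeyBytes) are made up for this example.

```java
import java.nio.charset.StandardCharsets;

public class KeyDispatchSketch {
    /** Returns the raw bytes of a key, or throws for unsupported key types. */
    static byte[] rawKeyBytes(Object key) {
        if (key instanceof byte[]) {
            // Stand-in for the BytesWritable branch: the bytes are used as-is.
            return (byte[]) key;
        } else if (key instanceof String) {
            // Stand-in for the Text branch: Text holds its contents as UTF-8.
            return ((String) key).getBytes(StandardCharsets.UTF_8);
        } else {
            // Unsupported key type: fail loudly rather than silently drop the record.
            throw new IllegalArgumentException(
                "Unsupported key type: " + (key == null ? "null" : key.getClass().getName()));
        }
    }
}
```

A test along the lines requested could feed one key of each supported type through such a method and assert that any other type is rejected.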

> KeyFieldBasedPartitioner would lost data if specifed field not exist, and it should encode
free not only support utf8
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5779
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5779
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: ZhuGuanyin
>             Fix For: 0.21.0
>
>         Attachments: encode-free-KeyFieldBasedPartitioner-v1.patch, encode-free-KeyFieldBasedPartitioner.patch
>
>
> 1) Currently, KeyFieldBasedPartitioner only supports UTF-8 encoded records; we should
support the Text or BytesWritable data types.
> 2) When using KeyFieldBasedPartitioner, if a record doesn't contain the specified field,
the endChar would equal array.length, which throws an ArrayIndexOutOfBoundsException and loses
that record!
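
The two points above can be sketched together: field lookup on raw bytes (so no particular encoding is assumed) with every index checked against the array length, so that a record lacking the requested field yields an empty key instead of an exception. This is only an illustration under assumptions, not the attached patch; the names KeyFieldSketch, extractField and partition, and the choice to treat a missing field as an empty key, are hypothetical.

```java
import java.util.Arrays;

public class KeyFieldSketch {
    /**
     * Returns the bytes of the 1-based field fieldNo in record, where fields
     * are split on separator. Returns an empty array when the record has
     * fewer fields, instead of indexing past the end of the array.
     */
    static byte[] extractField(byte[] record, byte separator, int fieldNo) {
        int start = 0;
        int current = 1;
        // Walk forward to the first byte of the requested field.
        while (current < fieldNo && start < record.length) {
            if (record[start] == separator) {
                current++;
            }
            start++;
        }
        if (current < fieldNo) {
            return new byte[0]; // the record does not contain the field
        }
        // Scan to the end of the field, never reading record[record.length].
        int end = start;
        while (end < record.length && record[end] != separator) {
            end++;
        }
        return Arrays.copyOfRange(record, start, end);
    }

    /** Hashes the selected field into a partition number in [0, numPartitions). */
    static int partition(byte[] record, byte separator, int fieldNo, int numPartitions) {
        int hash = Arrays.hashCode(extractField(record, separator, fieldNo));
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }
}
```

With this shape, a record such as "a<TAB>b" asked for field 3 still lands in some partition rather than being lost; whether an empty key is the right fallback is a design choice the patch itself may make differently.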

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

