hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1549) All remaining UTF8 data structures in HDFS code should be removed
Date Mon, 28 Apr 2008 18:47:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12592872#action_12592872 ]

Chris Douglas commented on HADOOP-1549:

There doesn't appear to be an easy way to "update" the Writables in this patch. Creating new
Writables and deprecating the old ones (VectorWritable instead of ArrayWritable, for example)
might be one way to effect this, but the UTF8 and Text binary formats are incompatible, and at
first glance their differences are undetectable (i.e. an unsigned short length prefix cannot
be distinguished from a vint one). Similarly, the change to WritableName is incompatible with
SequenceFile::Reader, and again, SequenceFiles written prior to this patch must be distinguished
by a version bump and appropriate handling code. With a version bump, the changes to RPC,
FileSplit, JobProfile, JobStatus, Task, TaskStatus, and TaskTrackerStatus are probably fine,
since they're meant to be transient.
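
To make the ambiguity concrete, here is a minimal, self-contained sketch. It does not use the
Hadoop classes. It assumes the legacy format frames a string as a 2-byte unsigned length plus
bytes and the newer format as a vint length plus bytes, and it only handles the single-byte
vint case (lengths 0..127). The class name, the helper names, and the version constants are
all hypothetical, chosen for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class LengthPrefixDemo {
    // Hypothetical version tags; real formats would define their own values.
    static final byte VERSION_SHORT_PREFIX = 1;
    static final byte VERSION_VINT_PREFIX = 2;

    // Legacy-style framing (assumed): unsigned 16-bit big-endian length, then the bytes.
    static byte[] writeShortPrefixed(String s) throws IOException {
        byte[] body = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeShort(body.length);
        out.write(body);
        return buf.toByteArray();
    }

    // Newer-style framing (assumed): vint length, then the bytes.
    // Simplification: only lengths 0..127, which fit in a single byte.
    static byte[] writeVIntPrefixed(String s) throws IOException {
        byte[] body = s.getBytes(StandardCharsets.UTF_8);
        if (body.length > 127) {
            throw new IllegalArgumentException("sketch handles lengths 0..127 only");
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(body.length);
        out.write(body);
        return buf.toByteArray();
    }

    static String readShortPrefixed(DataInputStream in) throws IOException {
        byte[] body = new byte[in.readUnsignedShort()];
        in.readFully(body);
        return new String(body, StandardCharsets.UTF_8);
    }

    static String readVIntPrefixed(DataInputStream in) throws IOException {
        int len = in.readByte();
        if (len < 0) {
            throw new IOException("first byte not a single-byte length; not handled in this sketch");
        }
        byte[] body = new byte[len];
        in.readFully(body);
        return new String(body, StandardCharsets.UTF_8);
    }

    static DataInputStream stream(byte[] data) {
        return new DataInputStream(new ByteArrayInputStream(data));
    }

    // Prepend a one-byte version tag so a reader can pick the right decoder.
    static byte[] withVersion(byte version, byte[] payload) {
        byte[] out = new byte[payload.length + 1];
        out[0] = version;
        System.arraycopy(payload, 0, out, 1, payload.length);
        return out;
    }

    // Dispatch on the version tag instead of guessing from the payload bytes.
    static String readVersioned(byte[] data) throws IOException {
        DataInputStream in = stream(data);
        byte version = in.readByte();
        switch (version) {
            case VERSION_SHORT_PREFIX: return readShortPrefixed(in);
            case VERSION_VINT_PREFIX:  return readVIntPrefixed(in);
            default: throw new IOException("unknown version " + version);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] legacy = writeShortPrefixed("hi");   // bytes: 00 02 'h' 'i'
        byte[] current = writeVIntPrefixed("hi");   // bytes: 02 'h' 'i'

        // Reading legacy bytes with the newer decoder raises no error:
        // the leading 0x00 is taken as a valid length of zero.
        String misread = readVIntPrefixed(stream(legacy));
        System.out.println("misread length: " + misread.length());

        // With a version tag in front, both payloads decode correctly.
        System.out.println(readVersioned(withVersion(VERSION_SHORT_PREFIX, legacy)));
        System.out.println(readVersioned(withVersion(VERSION_VINT_PREFIX, current)));
    }
}
```

The misread case is the worrying one: the reader gets an empty string and leftover bytes
rather than an exception, which is why an explicit version marker, not inspection of the
payload, has to carry the distinction.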

The tests are clearly OK, as replacements for ArrayWritable and ObjectWritable would be, but
the transient types will require some extra testing and attention. This seems like a fairly
risky incompatible change. It might be a good idea to spread some of this across several
patches/JIRAs.

> All remaining UTF8 data structures in HDFS code should be removed
> -----------------------------------------------------------------
>                 Key: HADOOP-1549
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1549
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: Edward J. Yoon
>             Fix For: 0.18.0
>         Attachments: 1549.patch, 1549_v02.patch
> The UTF8 data structure is deprecated. HADOOP-1283 addressed part of the problem. All
> remaining UTF8 data structures in HDFS should be removed.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
