hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1053) Make Record I/O functionally modular from the rest of Hadoop
Date Mon, 05 Mar 2007 18:18:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478116 ]

Doug Cutting commented on HADOOP-1053:
--------------------------------------

Milind, you're right: if we implement some things in the io package as records then we'll
have a circular package structure: these packages would no longer be well layered.  I don't
see how that would cost us much, but it is true.

If we want to (1) define some things that are currently in the io package as records, (2)
not duplicate code, and (3) keep things well layered, then we'd need to restructure things.
The io runtime (e.g., readVInt, compareBytes, Writable) would need to be split into a
separate package from classes that we might define using the record package, like IntWritable
and BytesWritable, so that the package layering might be ioruntime > record > iostructs,
with each package depending only on those to its left.
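
To make that layering concrete, here is a minimal Java sketch. It is not Hadoop code: the three
layers are modeled as nested types inside one hypothetical class, LayeringSketch, rather than as
real packages. Writable's two methods match the existing interface, but IoRuntime, the simplified
Record base class, toBytes, fromBytes, and roundTrip are assumptions made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class LayeringSketch {

    // Layer 1, "ioruntime": depends on nothing else in this sketch.
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    static final class IoRuntime {
        // Lexicographic comparison, treating each byte as unsigned.
        static int compareBytes(byte[] a, byte[] b) {
            int n = Math.min(a.length, b.length);
            for (int i = 0; i < n; i++) {
                int x = a[i] & 0xff;
                int y = b[i] & 0xff;
                if (x != y) {
                    return x - y;
                }
            }
            return a.length - b.length;
        }
    }

    // Layer 2, "record": depends only on layer 1.
    // Hypothetical stand-in, much simpler than the real record package.
    abstract static class Record implements Writable {
        byte[] toBytes() throws IOException {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            write(new DataOutputStream(buf));
            return buf.toByteArray();
        }

        void fromBytes(byte[] bytes) throws IOException {
            readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        }
    }

    // Layer 3, "iostructs": concrete types defined as records (layers 1 and 2).
    static final class IntWritable extends Record {
        int value;

        IntWritable() {}

        IntWritable(int value) {
            this.value = value;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeInt(value);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            value = in.readInt();
        }
    }

    // Serializes a value through all three layers and reads it back.
    static int roundTrip(int v) {
        try {
            IntWritable copy = new IntWritable();
            copy.fromBytes(new IntWritable(v).toBytes());
            return copy.value;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42));
        System.out.println(IoRuntime.compareBytes(new byte[] {0, 1}, new byte[] {0, 2}) < 0);
    }
}
```

Note that nothing in layer 1 refers to Record or IntWritable, and Record does not refer to
IntWritable. That one-way dependency is the whole point of the split, and it is also why the
split would force existing classes to move between packages.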

But what would be the point?  We could probably decompose nearly every package into well-layered
sub-packages, but that is disruptive, since it is not back-compatible. Occasionally it is
warranted, when packages get too big and poorly defined, and we have other reasons to change
public APIs.  For example, I would like to someday re-organize mapred into several sub-packages
(e.g., client, protocol, tasktracker, jobtracker), to rename org.apache.hadoop.dfs to
org.apache.hadoop.fs.hdfs, to make the util package smaller, etc., but we don't want to make
such changes lightly.

In summary, I still fail to see an overwhelming argument for making org.apache.hadoop.record
independent of org.apache.hadoop.io.  What am I missing?

> Make Record I/O functionally modular from the rest of Hadoop
> ------------------------------------------------------------
>
>                 Key: HADOOP-1053
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1053
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: record
>    Affects Versions: 0.11.2
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.13.0
>
>         Attachments: jute-patch.txt
>
>
> This issue has been created to separate one proposal originally included in HADOOP-941,
> for which no consensus could be reached. For earlier discussion about the issue, please see
> HADOOP-941.
> I will summarize the proposal here.  We need to provide a way for some users who want
> to use the record I/O framework outside of Hadoop.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

