hadoop-common-dev mailing list archives

From "David Bowen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-941) Enhancements to Hadoop record I/O - Part 1
Date Thu, 01 Mar 2007 05:52:50 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12476834 ]

David Bowen commented on HADOOP-941:
------------------------------------


One more comment: the variable-length integer serialization unnecessarily uses 8 bytes for
any number less than -112, because clearing the sign bit of a negative two's-complement value
still leaves a very large magnitude.  This would penalize an app that uses a lot of negative
shorts or ints.

I think there is a simple fix.  Instead of clearing the sign bit in the writer and setting it
again in the reader, flip all the bits of a negative number.  That is, in the writer change

  i &= 0x7FFFFFFFFFFFFFFFL;

to:

  i ^= 0xFFFFFFFFFFFFFFFFL;

and do the same operation in the reader to get the negative number back.
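
To make the effect concrete, here is a minimal, self-contained sketch of a variable-length
long codec that uses the bit flip.  It is only an illustration under assumptions: the class
name VLongSketch, the helpers encode/decode, and the marker-byte layout (values in
[-112, 127] in one byte, otherwise a marker byte followed by 1 to 8 magnitude bytes) are
hypothetical and not taken from the attached patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch; names and byte layout are assumptions, not the patch's code.
final class VLongSketch {

    // Writes i in 1 to 9 bytes. Values in [-112, 127] fit in a single byte.
    static void writeVLong(DataOutputStream out, long i) throws IOException {
        if (i >= -112 && i <= 127) {
            out.writeByte((byte) i);
            return;
        }
        int len = -112;                    // marker base for positive values
        if (i < 0) {
            i ^= 0xFFFFFFFFFFFFFFFFL;      // flip all bits: small negatives become small positives
            len = -120;                    // marker base for negative values
        }
        for (long tmp = i; tmp != 0; tmp >>= 8) {
            len--;                         // one step per magnitude byte needed
        }
        out.writeByte((byte) len);
        int count = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = count; idx != 0; idx--) {
            out.writeByte((byte) ((i >> ((idx - 1) * 8)) & 0xFF)); // most significant byte first
        }
    }

    // Reads a value written by writeVLong, applying the same flip for negatives.
    static long readVLong(DataInputStream in) throws IOException {
        byte first = in.readByte();
        if (first >= -112) {
            return first;                  // single-byte value
        }
        boolean negative = first < -120;
        int count = negative ? -(first + 120) : -(first + 112);
        long i = 0;
        for (int idx = 0; idx < count; idx++) {
            i = (i << 8) | (in.readByte() & 0xFF);
        }
        return negative ? (i ^ 0xFFFFFFFFFFFFFFFFL) : i;
    }

    // Convenience wrapper: encode a value to a byte array.
    static byte[] encode(long value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeVLong(new DataOutputStream(bytes), value);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper: decode a value from a byte array.
    static long decode(byte[] data) {
        try {
            return readVLong(new DataInputStream(new ByteArrayInputStream(data)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch, VLongSketch.encode(-113L) produces a marker byte plus one magnitude byte.
Masking with 0x7FFFFFFFFFFFFFFFL instead would leave a magnitude that needs all 8 bytes.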





> Enhancements to Hadoop record I/O - Part 1
> ------------------------------------------
>
>                 Key: HADOOP-941
>                 URL: https://issues.apache.org/jira/browse/HADOOP-941
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: record
>    Affects Versions: 0.10.1
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
>             Fix For: 0.12.0
>
>         Attachments: jute-patch.txt
>
>
> Hadoop record I/O can be used effectively outside of Hadoop. It would increase its utility
> if developers could use it without having to import Hadoop classes or depend on Hadoop
> jars. The following changes to the current translator and runtime are proposed.
> Proposed Changes:
> 1. Use java.lang.String as a native type for ustring (instead of Text).
> 2. Provide a Buffer class as a native Java type for buffer (instead of BytesWritable),
> so that later BytesWritable could be implemented as the following DDL:
> module org.apache.hadoop.io {
>   record BytesWritable {
>     buffer value;
>   }
> }
> 3. Member names in generated classes should not have the prefix 'm' before their names.
> In the above example, the private member name would be 'value', not 'mvalue' as it is
> now.
> 4. Convert getters and setters to CamelCase, e.g. in the above example the getter
> will be:
>   public Buffer getValue();
> 5. Generate clone() methods for records in Java, i.e. the generated classes should implement
> Cloneable.
> 6. Make generated Java code for maps and vectors use Java generics.
> These are the proposed user-visible changes. Internally, the translator will be restructured
> so that it is easier to plug in translators for different targets.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

