hadoop-common-dev mailing list archives

From "Milind Bhandarkar (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1031) Enhancements to Hadoop record I/O - Part 2
Date Tue, 06 Mar 2007 19:59:24 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478546 ]

Milind Bhandarkar commented on HADOOP-1031:

I would like to add another proposed change to record I/O here. Currently, hadoop.record.RecordReader
and RecordWriter act as factories for the various InputArchive and OutputArchive implementations,
respectively. In the original design, this was done in order to keep tight control over the
supported serialization formats. This has proven to be counterproductive. For wider usage of
record I/O, users should be able to plug in their own serialization formats. The proposed
changes make this possible. They are as follows:

1. Eliminate the current record.RecordReader and record.RecordWriter.

2. Rename InputArchive to RecordReader, and OutputArchive to RecordWriter.

3. Rename the various archives accordingly, e.g. BinaryInputArchive -> BinaryRecordReader.
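
To illustrate what the three steps above would allow, here is a minimal sketch in Java. It is
not the actual Hadoop code: the interfaces are cut down to two field types, and Employee,
KeyValueTextRecordWriter and RecordIoSketch are hypothetical names made up for this example.
The assumption is that, once RecordReader and RecordWriter are plain interfaces rather than
factories, a record can be serialized through any implementation, including one written by
the user.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Simplified stand-ins for the renamed interfaces (OutputArchive / InputArchive today).
interface RecordWriter {
    void writeInt(int value, String tag) throws IOException;
    void writeString(String value, String tag) throws IOException;
}

interface RecordReader {
    int readInt(String tag) throws IOException;
    String readString(String tag) throws IOException;
}

// What BinaryOutputArchive might look like under the proposed naming.
class BinaryRecordWriter implements RecordWriter {
    private final DataOutputStream out;
    BinaryRecordWriter(DataOutputStream out) { this.out = out; }
    public void writeInt(int value, String tag) throws IOException { out.writeInt(value); }
    public void writeString(String value, String tag) throws IOException { out.writeUTF(value); }
}

// What BinaryInputArchive might look like under the proposed naming.
class BinaryRecordReader implements RecordReader {
    private final DataInputStream in;
    BinaryRecordReader(DataInputStream in) { this.in = in; }
    public int readInt(String tag) throws IOException { return in.readInt(); }
    public String readString(String tag) throws IOException { return in.readUTF(); }
}

// Hypothetical user-supplied format: one "tag=value" line per field.
class KeyValueTextRecordWriter implements RecordWriter {
    private final StringBuilder sink;
    KeyValueTextRecordWriter(StringBuilder sink) { this.sink = sink; }
    public void writeInt(int value, String tag) { sink.append(tag).append('=').append(value).append('\n'); }
    public void writeString(String value, String tag) { sink.append(tag).append('=').append(value).append('\n'); }
}

// Hypothetical stand-in for a generated record class.
class Employee {
    int id;
    String name;

    void serialize(RecordWriter writer) throws IOException {
        writer.writeInt(id, "id");
        writer.writeString(name, "name");
    }

    void deserialize(RecordReader reader) throws IOException {
        id = reader.readInt("id");
        name = reader.readString("name");
    }
}

// Helpers showing the same record going through two different formats.
class RecordIoSketch {
    // Serialize to bytes with the binary writer, then read the bytes back.
    static Employee roundTrip(Employee source) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            source.serialize(new BinaryRecordWriter(new DataOutputStream(bytes)));
            Employee copy = new Employee();
            copy.deserialize(new BinaryRecordReader(
                    new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Serialize with the user-supplied text format; the record code is unchanged.
    static String toText(Employee source) {
        StringBuilder sink = new StringBuilder();
        try {
            source.serialize(new KeyValueTextRecordWriter(sink));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }
}
```

In this sketch the record only depends on the two interfaces, so adding a new format means
writing one more implementation and does not require touching a central factory.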

> Enhancements to Hadoop record I/O - Part 2
> ------------------------------------------
>                 Key: HADOOP-1031
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1031
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: record
>    Affects Versions: 0.11.2
>         Environment: All
>            Reporter: Milind Bhandarkar
>         Assigned To: Milind Bhandarkar
> Remaining planned enhancements to Hadoop record I/O:
> 5. Provide a 'swiggable' C binding, so that processing the generated C code with swig
> allows it to be used in scripting languages such as Python and Perl.
> 7. Optimize generated write() and readFields() methods, so that they do not have to create
> BinaryOutputArchive or BinaryInputArchive every time these methods are called on a record.
> 8. Implement ByteInStream and ByteOutStream for C++ runtime, as they will be needed for
> using Hadoop Record I/O with forthcoming C++ MapReduce framework (currently, only FileStreams
> are provided.)

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
