hadoop-mapreduce-dev mailing list archives

From Tom White <...@cloudera.com>
Subject Re: Why not making InputSplit implements interface Writable ?
Date Sat, 06 Feb 2010 17:01:54 GMT
Hi Jeff,

InputSplit in the new MapReduce API (in the o.a.h.mapreduce package)
does not implement Writable since splits can be serialized using any
serialization framework - e.g. Java object serialization. You can see
where splits are serialized at JobSplitWriter.writeNewSplits() and
deserialized on the task node at MapTask.getSplitDetails(). This is in
contrast to the old API, which mandated that InputSplits had to be Writable.
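
To make the difference concrete, here is a rough sketch that round-trips a
split-like object in two ways: through Java object serialization, and through
write()/readFields()-style methods of the kind the old API required. It does
not depend on Hadoop and is not the code path in JobSplitWriter or MapTask.
The class and method names (SplitSerializationSketch, SerializableRangeSplit,
WritableStyleRangeSplit, roundTripJava, roundTripWritableStyle) are
hypothetical stand-ins made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SplitSerializationSketch {

    // Hypothetical stand-in for a new-API split: it has no write()/readFields()
    // and relies only on Java object serialization.
    static class SerializableRangeSplit implements Serializable {
        private static final long serialVersionUID = 1L;
        final long start;
        final long length;

        SerializableRangeSplit(long start, long length) {
            this.start = start;
            this.length = length;
        }
    }

    // Hypothetical stand-in for an old-API-style split: it serializes itself
    // through write()/readFields(), mirroring the shape of the Writable contract.
    static class WritableStyleRangeSplit {
        long start;
        long length;

        // A no-arg constructor is needed so an empty instance can be
        // created first and then populated by readFields().
        WritableStyleRangeSplit() { }

        WritableStyleRangeSplit(long start, long length) {
            this.start = start;
            this.length = length;
        }

        void write(DataOutput out) throws IOException {
            out.writeLong(start);
            out.writeLong(length);
        }

        void readFields(DataInput in) throws IOException {
            start = in.readLong();
            length = in.readLong();
        }
    }

    // Serialize and deserialize using Java object serialization.
    static SerializableRangeSplit roundTripJava(SerializableRangeSplit split)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(split);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (SerializableRangeSplit) in.readObject();
        }
    }

    // Serialize and deserialize using the split's own write()/readFields().
    static WritableStyleRangeSplit roundTripWritableStyle(WritableStyleRangeSplit split)
            throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            split.write(out);
        }
        WritableStyleRangeSplit copy = new WritableStyleRangeSplit();
        try (DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            copy.readFields(in);
        }
        return copy;
    }

    public static void main(String[] args) throws Exception {
        SerializableRangeSplit a = roundTripJava(new SerializableRangeSplit(128L, 64L));
        WritableStyleRangeSplit b = roundTripWritableStyle(new WritableStyleRangeSplit(128L, 64L));
        System.out.println("Java serialization: start=" + a.start + " length=" + a.length);
        System.out.println("write/readFields:   start=" + b.start + " length=" + b.length);
    }
}
```

Both round trips recover the same fields; the only difference is who defines
the wire format. In a real job the framework would choose a serializer from
those configured for the job, which this sketch does not model.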


On Fri, Feb 5, 2010 at 11:09 PM, Jeff Zhang <zjffdu@gmail.com> wrote:
> Hi all,
> I looked at the Hadoop source code and found that InputSplit does not
> implement Writable. As I understand it, an InputSplit will be transferred to
> each TT and then deserialized, so it should implement the Writable
> interface. I checked each implementation of InputSplit, and all the
> subclasses actually implement the Writable interface. So I think it would be
> better to let the abstract class InputFormat implement Writable, so that users
> won't forget to implement the methods write(DataOutput out) and
> readFields(DataInput in) when they write a customized InputSplit.
> --
> Best Regards
> Jeff Zhang
