hadoop-mapreduce-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Implement Writable which de-serializes to disk
Date Thu, 25 Nov 2010 10:47:27 GMT

I need to implement a Writable that contains a lot of data, and
unfortunately I cannot break it down into smaller pieces. The output of a
Mapper is potentially a large record, anywhere from a few tens of MB to a
few hundreds of MB.

Is there a way for me to de-serialize the Writable to a location on the
file system? Writable.readFields receives a DataInput only, which suggests I
should de-serialize it into RAM. If I could get a handle to the job/task's
output or temp directory, or just any temp directory, that would be great - I
could de-serialize it there and have my Mapper/Reducer read it directly from
the file system.
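To make the idea concrete, here is a minimal, Hadoop-free sketch of such a record: readFields() streams the payload to a local temp file (so only a small buffer is ever in RAM), and write() streams it back out with a length prefix. The class name and methods here are hypothetical; a real version would implement org.apache.hadoop.io.Writable with these same two methods, and would want to choose the temp directory more carefully than File.createTempFile does.

```java
import java.io.*;

// Hypothetical sketch: a record whose readFields() spills the payload to a
// local temp file instead of buffering it in memory. A real Hadoop version
// would implement org.apache.hadoop.io.Writable with these same signatures.
public class FileBackedBlob {
    private File backingFile;   // where the payload lives after readFields()

    public void setBackingFile(File f) { backingFile = f; }
    public File getBackingFile()       { return backingFile; }

    // Serialize: a length prefix, then the raw bytes streamed from the file.
    public void write(DataOutput out) throws IOException {
        out.writeLong(backingFile.length());
        try (InputStream in = new FileInputStream(backingFile)) {
            byte[] buf = new byte[64 * 1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    // Deserialize: stream the payload straight to a temp file so that at
    // most one 64 KB buffer is held in RAM, regardless of record size.
    public void readFields(DataInput in) throws IOException {
        long remaining = in.readLong();
        backingFile = File.createTempFile("blob-", ".bin"); // under java.io.tmpdir
        try (OutputStream out = new FileOutputStream(backingFile)) {
            byte[] buf = new byte[64 * 1024];
            while (remaining > 0) {
                int chunk = (int) Math.min(buf.length, remaining);
                in.readFully(buf, 0, chunk);
                out.write(buf, 0, chunk);
                remaining -= chunk;
            }
        }
    }
}
```

The length prefix is what makes this work inside a larger stream: readFields knows exactly how many bytes belong to this record and leaves the DataInput positioned at the next one.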

I'm not sure I can use System.getProperty("java.io.tmpdir") - will that
work? Or is there a FileSystem API I should use instead?

