hadoop-hdfs-user mailing list archives

From Jerry Lam <chiling...@gmail.com>
Subject Re: Streaming value of (200MB) from a SequenceFile
Date Sun, 31 Mar 2013 18:51:49 GMT
Hi Sandy:

Thank you for the advice. It sounds like a logical way to resolve this issue. I will look into
the Writable interface and see how I can stream the value from HDFS in a MapFileInputFormat.
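
A minimal sketch of the idea discussed here: a value type whose readFields copies the payload to an output stream in fixed-size chunks instead of materializing it as one byte array. Everything named below (StreamingValue, setSource, setSink, roundTrip, the 64 KB chunk size, the length-prefixed layout) is hypothetical and not taken from the thread. The class only mirrors the method signatures of org.apache.hadoop.io.Writable so that it compiles without Hadoop on the classpath; a real job would declare "implements Writable".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

/**
 * Hypothetical value type. write/readFields match the signatures of
 * org.apache.hadoop.io.Writable, but this sketch has no Hadoop dependency.
 */
public class StreamingValue {
    // Assumed chunk size; only this many payload bytes are held at once.
    private static final int CHUNK_SIZE = 64 * 1024;

    private InputStream source;  // where write() pulls the payload from
    private long length;         // payload length, written as a prefix
    private OutputStream sink;   // where readFields() pushes the payload

    public void setSource(InputStream source, long length) {
        this.source = source;
        this.length = length;
    }

    public void setSink(OutputStream sink) {
        this.sink = sink;
    }

    /** Serializes as an 8-byte length followed by the raw payload bytes. */
    public void write(DataOutput out) throws IOException {
        out.writeLong(length);
        byte[] buf = new byte[CHUNK_SIZE];
        long remaining = length;
        while (remaining > 0) {
            int n = source.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) {
                throw new EOFException("source ended before " + length + " bytes");
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
    }

    /** Copies the payload to the sink chunk by chunk while deserializing. */
    public void readFields(DataInput in) throws IOException {
        if (sink == null) {
            throw new IllegalStateException("setSink must be called before readFields");
        }
        long remaining = in.readLong();
        byte[] buf = new byte[CHUNK_SIZE];
        while (remaining > 0) {
            int n = (int) Math.min(buf.length, remaining);
            in.readFully(buf, 0, n);
            sink.write(buf, 0, n);
            remaining -= n;
        }
        sink.flush();
    }

    /** Demo helper: serialize a payload, then stream it back out again. */
    static byte[] roundTrip(byte[] payload) {
        try {
            StreamingValue writer = new StreamingValue();
            writer.setSource(new ByteArrayInputStream(payload), payload.length);
            ByteArrayOutputStream serialized = new ByteArrayOutputStream();
            writer.write(new DataOutputStream(serialized));

            StreamingValue reader = new StreamingValue();
            ByteArrayOutputStream received = new ByteArrayOutputStream();
            reader.setSink(received);
            reader.readFields(new DataInputStream(
                    new ByteArrayInputStream(serialized.toByteArray())));
            return received.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] payload = new byte[200_000];
        Arrays.fill(payload, (byte) 7);
        System.out.println(Arrays.equals(payload, roundTrip(payload)));
    }
}
```

The sink is supplied through a setter rather than a constructor argument because Hadoop readers commonly create or reuse Writable instances themselves. The sketch covers only the Writable side. Whether the reader hands readFields a stream backed by the file or by a record that has already been buffered in memory depends on the reader implementation, so the actual memory footprint would need to be checked against the Hadoop version in use.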

I'm a bit concerned that no one has discussed this issue, because it might mean that I'm not
using HDFS the right way.



On 2013-03-31, at 14:10, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> Hi Jerry,
> I assume you're providing your own Writable implementation? The Writable readFields method
> is given a stream. Are you able to perform your processing while reading it there?
> -Sandy
> On Sat, Mar 30, 2013 at 10:52 AM, Jerry Lam <chilinglam@gmail.com> wrote:
> Hi everyone,
> I'm having trouble streaming individual key-value pairs of 200MB to 1GB from a MapFile.
> I need to stream the large value to an OutputStream instead of reading the entire value
> before processing, because reading it whole potentially uses too much memory.
> I read the API for MapFile; next(WritableComparable key, Writable val) does not return
> an input stream.
> How can I accomplish this?
> Thanks,
> Jerry
