hadoop-common-user mailing list archives

From Nick Cen <cenyo...@gmail.com>
Subject Re: Splitting a big file into pieces with Hadoop Streaming?
Date Fri, 20 Mar 2009 14:10:10 GMT
I had a similar problem earlier, and I just used split and awk to split
the file.
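
For reference, here is a rough sketch of the same idea in Python rather than
split and awk. It is only an illustration under some assumptions that are not
from the original thread: the input is a text file with one record per line,
the key is the text before the first tab, and the names bucket_for,
split_by_key and the part-NNNNN file names are made up for this example.

```python
import os
import zlib


def bucket_for(key, num_parts):
    """Map a key to a piece number using a stable hash (CRC32)."""
    # zlib.crc32 gives the same value on every run, unlike the built-in
    # hash() for strings, which is randomized per process.
    return zlib.crc32(key.encode("utf-8")) % num_parts


def split_by_key(in_path, out_dir, num_parts, sep="\t"):
    """Copy each line of in_path into one of num_parts files chosen by its key.

    Assumes one record per line with the key before the first `sep`.
    Returns the list of output file paths.
    """
    os.makedirs(out_dir, exist_ok=True)
    paths = [os.path.join(out_dir, "part-%05d" % i) for i in range(num_parts)]
    outputs = [open(p, "w", encoding="utf-8") for p in paths]
    try:
        with open(in_path, encoding="utf-8") as src:
            for line in src:
                # The key is everything before the first separator.
                key = line.rstrip("\n").split(sep, 1)[0]
                outputs[bucket_for(key, num_parts)].write(line)
    finally:
        for out in outputs:
            out.close()
    return paths
```

Calling split_by_key("index.txt", "pieces", 16) would write 16 files under
pieces/, and every line with the same key lands in the same file, so a lookup
only has to open the one piece that bucket_for points to.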

2009/3/20 Akira Kitada <akitada@gmail.com>

> Hi,
> Can I split an input file into pieces based on the key? (Probably the
> hash value of the key.)
> Considering that Hadoop Streaming is a kind of shell pipeline,
> it seems to be impossible to do this, but I wanted to double-check
> this to be sure.
> Background: The output (an index file) is so large (more than 10 GB) that
> it slows down my applications that use it unless I split it into
> pieces.
> Thanks in advance.

