hadoop-mapreduce-user mailing list archives

From 丛林 <congli...@gmail.com>
Subject Re: How to create a SequenceFile more faster?
Date Thu, 12 May 2011 11:05:58 GMT
Dear Harsh,

Could you please explain how to create a SequenceFile with MapReduce?

Suppose all of the small files (32 GB in total) are stored on one PC.

Thanks for your suggestion.

BTW: I notice that you have answered most of the SequenceFile questions
on this mailing list :-)

Best Wishes,

-Lin


2011/5/12 Harsh J <harsh@cloudera.com>:
> Are you doing this as a MapReduce job or is it a simple linear
> program? MapReduce could be much faster (Combined-files input format,
> with a few Reducers for merging if you need that as well).
>
> On Thu, May 12, 2011 at 5:18 AM, 丛林 <conglin02@gmail.com> wrote:
>> Hi, all.
>>
>> I want to write lots of small files (32 GB in total) to HDFS as an
>> org.apache.hadoop.io.SequenceFile.
>>
>> But it is currently too slow: it takes about 8 hours to create this
>> SequenceFile (6.7 GB).
>>
>> So I wonder how I can create this SequenceFile faster?
>>
>> Thanks for your suggestion.
>>
>> -Best Wishes,
>>
>> -Lin
>>
>
>
>
> --
> Harsh J
>
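
Harsh's suggestion depends on grouping many small files into a few larger map inputs, so that several tasks can each pack one batch in parallel instead of one program handling every file in turn. The following is a minimal plain-Java sketch of that grouping idea only. The class name SplitPacker, the method packIntoSplits, and the byte limit are hypothetical and chosen for illustration; they are not part of Hadoop. Hadoop's CombineFileInputFormat performs this kind of grouping itself and additionally takes block locality into account, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: SplitPacker and packIntoSplits are hypothetical names,
// not Hadoop classes or methods.
class SplitPacker {

    // Greedily groups file sizes (in bytes) into batches whose totals stay
    // at or below maxSplitBytes. A single file larger than the limit ends up
    // in a batch of its own.
    static List<List<Long>> packIntoSplits(List<Long> fileSizes, long maxSplitBytes) {
        List<List<Long>> splits = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long currentBytes = 0;
        for (long size : fileSizes) {
            // Close the current batch if adding this file would exceed the limit.
            if (!current.isEmpty() && currentBytes + size > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(size);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Six small "files" with a made-up limit of 100 bytes per batch.
        List<Long> sizes = List.of(40L, 30L, 50L, 10L, 120L, 20L);
        System.out.println(packIntoSplits(sizes, 100L));
        // prints [[40, 30], [50, 10], [120], [20]]
    }
}
```

In a real job each batch would correspond to one map task that reads its files and emits (file name, file contents) records, with the output written as a SequenceFile. That part is left out here because it depends on the Hadoop version in use.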
