hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Hadoop not splitting bzip2
Date Tue, 19 Apr 2011 19:07:11 GMT
Hello Deepak,

On Tue, Apr 19, 2011 at 9:33 PM, Deepak Diwakar <ddeepak4u@gmail.com> wrote:
> Hi,
>
>  I am using hadoop-0.20.1.
> But when I use my own InputFormat, say SafeInputFormat (extends
> FileInputFormat), and make isSplitable return true, it executes multiple
> mappers but fails when the reducers reach 33% for large (on the order of
> 2 GB) bzip2 files.

BZip2 splitting support was added in the Apache Hadoop 0.21.0 release and
isn't available in Apache Hadoop 0.20.x. Was the 0.20.1 version a
typo?
Also, what error or stack trace does the reducer report when it fails?
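
To illustrate the point about versions: on a release without bzip2
splitting support, the conservative choice is to treat a .bz2 file as a
single split, so that one mapper reads it from the start. Below is a
minimal sketch of that decision in plain Java, with no Hadoop classes on
the classpath. SplitCheck, isCompressed, isSplittable and the
bzip2SplitSupported flag are hypothetical names made up for this example,
and the extension list is an assumption, not Hadoop's codec registry.

```java
import java.util.Locale;

// Hypothetical helper, not part of Hadoop: it mirrors the yes/no decision
// that an InputFormat's isSplitable() override has to make for each file.
final class SplitCheck {
    // Illustrative list of compressed-file extensions (assumption, not exhaustive).
    private static final String[] COMPRESSED = {".bz2", ".gz", ".deflate"};

    static boolean isCompressed(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String ext : COMPRESSED) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    // bzip2SplitSupported should be true only when the running release can
    // split bzip2 input (0.21.0 onwards, per the note above).
    static boolean isSplittable(String fileName, boolean bzip2SplitSupported) {
        if (!isCompressed(fileName)) {
            return true; // uncompressed files can be cut at any byte offset
        }
        boolean isBzip2 = fileName.toLowerCase(Locale.ROOT).endsWith(".bz2");
        return isBzip2 && bzip2SplitSupported;
    }

    public static void main(String[] args) {
        System.out.println(isSplittable("logs.txt", false));  // true
        System.out.println(isSplittable("input.bz2", false)); // false
        System.out.println(isSplittable("input.bz2", true));  // true
        System.out.println(isSplittable("input.gz", true));   // false
    }
}
```

In a real job the same check would sit inside the isSplitable override of
the custom InputFormat, returning false for .bz2 input on 0.20.x so each
file goes to one mapper.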

-- 
Harsh J
