hadoop-common-user mailing list archives

From "Antonio D'Ettole" <coda...@gmail.com>
Subject Custom InputFormat, problem with constructors
Date Fri, 11 Dec 2009 19:03:01 GMT

I've been trying to code a pretty simple InputFormat. The idea is this: I
have an array of numbers (say, the range [0-5000]) and I want each mapper to
receive a split of size 500, i.e. 500 LongWritables.

This is an excerpt from the class extending InputSplit:

    import org.apache.hadoop.io.Writable;
    import org.apache.hadoop.mapreduce.InputSplit;

    public class myInputSplit extends InputSplit implements Writable {

        long[] rows;

        // no-arg constructor; Hadoop threw an exception at runtime without it
        public myInputSplit() { }

        public myInputSplit(long[] rows) {
            this.rows = rows;
        }

        // getLength(), getLocations(), write() and readFields() follow
    }
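The serialization side is the usual Writable pattern, roughly like this
(a paraphrased sketch, assuming the standard length-prefixed encoding;
java.io.DataInput, DataOutput and IOException imports not shown):

    public void write(DataOutput out) throws IOException {
        // write the array length, then each value
        out.writeInt(rows.length);
        for (long row : rows)
            out.writeLong(row);
    }

    public void readFields(DataInput in) throws IOException {
        // read the length back and rebuild the array
        rows = new long[in.readInt()];
        for (int i = 0; i < rows.length; i++)
            rows[i] = in.readLong();
    }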
I also wrote the classes myInputFormat and myRecordReader (omitted).
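To give an idea, getSplits() in myInputFormat essentially does this
(simplified sketch with the numbers hard-coded; assumes java.util.List,
java.util.ArrayList and org.apache.hadoop.mapreduce.JobContext):

    public List<InputSplit> getSplits(JobContext context)
            throws IOException, InterruptedException {
        List<InputSplit> splits = new ArrayList<InputSplit>();
        // partition [0-5000) into chunks of 500 rows each
        for (long start = 0; start < 5000; start += 500) {
            long[] rows = new long[500];
            for (int i = 0; i < 500; i++)
                rows[i] = start + i;
            // this is where the long[] constructor gets called
            splits.add(new myInputSplit(rows));
        }
        return splits;
    }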

Now, the default constructor in the class above doesn't do much, but I had to
put it there anyway because Hadoop was throwing an exception at runtime,
complaining that it couldn't find said constructor. Obviously myInputFormat
uses the right constructor with the long[] argument, but Hadoop somehow seems
to give the mapper input splits that were built with the default constructor,
which is used nowhere in my code. I can tell because I put a breakpoint in
the default constructor, and yes, it is being called. As a result, all the
input splits processed by the mappers are "broken", since the rows variable
was never set.
Interestingly, I also put a breakpoint in the _right_ constructor, and it is
also being called, by the getSplits() method in myInputFormat (which is what
one would expect).
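
My only guess so far is that the framework re-instantiates the splits on the
task side via reflection and then deserializes them, something along these
lines (pure speculation on my part; splitClass, conf and in are made-up names,
not code I've actually found in the Hadoop source):

    // guessed task-side reconstruction: no-arg constructor, then readFields()
    InputSplit split = ReflectionUtils.newInstance(splitClass, conf);
    ((Writable) split).readFields(in);

If that's what happens, calling the default constructor would be normal, but
then readFields() should repopulate rows, and apparently it doesn't.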

Does anybody have an idea why the default constructor is being called?

I hope I was clear enough; thanks for your time.
