hadoop-common-user mailing list archives

From Ahmed Eldawy <aseld...@gmail.com>
Subject Which InputFormat to use?
Date Thu, 04 Jul 2013 18:29:59 GMT
Hi, I'm developing a new set of InputFormats for a project I'm
working on. I found that there are two ways to create a new InputFormat:
1- Extend the abstract class org.apache.hadoop.mapreduce.InputFormat
2- Implement the interface org.apache.hadoop.mapred.InputFormat
I don't know why there are two incompatible versions. I found that each
one comes with its own set of interfaces and classes, such as InputSplit,
RecordReader and the MapReduce job, and that the two sets are not
compatible with each other. This means that I have to choose one of them
and stick with it to the end. I basically have two questions:
1- Which of these two interfaces should I go with? I didn't find a
deprecation notice on either of them, so they both seem legitimate. Is
there any plan to retire one of them?
2- I already have some classes implemented in one of the formats. Is it
worth refactoring these classes to use the other interface, in case I used
the old format?
Thanks in advance for your help.

Best regards,
Ahmed Eldawy
