hadoop-mapreduce-user mailing list archives

From Mohamed Riadh Trad <Mohamed.t...@inria.fr>
Subject MapRed Split Size
Date Thu, 01 Jul 2010 14:32:31 GMT

Has anyone addressed the compatibility of org.apache.hadoop.mapreduce.lib.input.TextInputFormat
with Hadoop streaming?

The new API throws the following exception when launching pipes jobs with org.apache.hadoop.mapreduce.lib.input.TextInputFormat
as the input format instead of org.apache.hadoop.mapred.TextInputFormat.

Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: class org.apache.hadoop.mapreduce.lib.input.TextInputFormat
not org.apache.hadoop.mapred.InputFormat

My problem with the deprecated classes lies in mapred.min.split.size and the number of map tasks.

I need to generate N map tasks over splits of approximately the same size. However, when I set
mapred.min.split.size to 20 MB, I get splits ranging from 6 to 64 MB.
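
For context, here is a rough sketch of how I understand the old-API FileInputFormat to pick split sizes: the target size is max(minSize, min(goalSize, blockSize)), where goalSize is the total input size divided by the requested number of maps, and each file is cut separately, so a small file or a file tail can still end up as a split below mapred.min.split.size. The class name, file sizes, and requested map count below are made up for illustration, and the 1.1 slop factor is from memory, so treat this as an approximation and not the actual Hadoop code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; it does not use any Hadoop classes.
public class SplitSizeSketch {
    // Assumed slop factor: the last chunk may be up to 10% larger than splitSize.
    static final double SPLIT_SLOP = 1.1;

    // Target split size: bounded below by minSize and above by the block size,
    // unless minSize itself exceeds the block size.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Cut a single file into splits; splits never span two files.
    static List<Long> splitFile(long fileLength, long splitSize) {
        List<Long> splits = new ArrayList<>();
        long remaining = fileLength;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits.add(splitSize);
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits.add(remaining); // the tail, which may be smaller than minSize
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long blockSize = 64 * mb;            // assumed HDFS block size
        long minSize = 20 * mb;              // mapred.min.split.size
        long[] fileLengths = {128 * mb, 6 * mb}; // hypothetical input files
        int requestedMaps = 2;               // hypothetical map count hint

        long totalSize = 0;
        for (long len : fileLengths) {
            totalSize += len;
        }
        long goalSize = totalSize / requestedMaps;
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);

        for (long len : fileLengths) {
            for (long split : splitFile(len, splitSize)) {
                System.out.println("split of " + (split / mb) + " MB");
            }
        }
    }
}
```

If this reading is right, the minimum split size only raises the target size; it never merges small files or short tails into larger splits, which would explain the 6 MB splits next to 64 MB ones.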

Any suggestions?

Trad Mohamed Riadh, M.Sc, Ing.
PhD. student

Office: 11-15
Phone: (33)-1 39 63 59 33
Fax: (33)-1 39 63 56 74
Email: Riadh.Trad(a)inria.fr
Home page: http://www-rocq.inria.fr/who/Mohamed.Trad/
