hadoop-common-user mailing list archives

From 李耀宗 <lee_yiu_ch...@yahoo.com.INVALID>
Subject undocumented hadoop streaming properties stream.map.input.ignoreKey
Date Wed, 16 Mar 2016 02:01:07 GMT

I am using Hadoop streaming and found that if I use -inputformat to specify another InputFormat
(org.apache.hadoop.mapred.lib.CombineTextInputFormat) instead of
the default org.apache.hadoop.mapred.lib.TextInputFormat, an extra key is passed to
the mapper program.

After digging through the Hadoop streaming source code, I found an undocumented job property,
stream.map.input.ignoreKey. If -inputformat is unset (or set to org.apache.hadoop.mapred.lib.TextInputFormat),
this property defaults to true; otherwise it defaults to false. So if I want to change
-inputformat, I have to set the property to true manually (-D stream.map.input.ignoreKey=true)
when issuing the hadoop streaming command.
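
A minimal sketch of such an invocation follows. The jar location, the input and output paths, and the /bin/cat mapper are placeholders chosen for illustration; only the -D and -inputformat options come from the description above. Generic options such as -D have to be placed before the streaming-specific options.

```shell
#!/usr/bin/env bash
# Assumed jar location; adjust to wherever your distribution installs the streaming jar.
STREAMING_JAR="${STREAMING_JAR:-${HADOOP_HOME:-/opt/hadoop}/share/hadoop/tools/lib/hadoop-streaming.jar}"

# Generic options (-D ...) come first, then the streaming options.
STREAM_ARGS=(
  jar "$STREAMING_JAR"
  -D stream.map.input.ignoreKey=true
  -inputformat org.apache.hadoop.mapred.lib.CombineTextInputFormat
  -input /user/example/input      # hypothetical input path
  -output /user/example/output    # hypothetical output path
  -mapper /bin/cat                # placeholder mapper that echoes its input
)

if command -v hadoop >/dev/null 2>&1; then
  hadoop "${STREAM_ARGS[@]}"
else
  # No hadoop client on PATH: just show the command that would be run.
  printf 'hadoop %s\n' "${STREAM_ARGS[*]}"
fi
```

With the property set this way, the mapper should receive only the line contents, as it does with the default input format.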

This property was actually documented before, but it has disappeared from recent documentation.
Is it deprecated, or was it simply left out of the documentation?
