hadoop-common-dev mailing list archives

From Niels Basjes <Ni...@basjes.nl>
Subject Deprecated configuration settings set from the core code / {core,hdfs,...}-default.xml ??
Date Thu, 21 Aug 2014 08:15:24 GMT
Hi,

I found this because I was wondering why simply starting something as
trivial as the Pig Grunt shell prints the following messages during startup:

2014-08-21 09:36:55,171 [main] INFO
 org.apache.hadoop.conf.Configuration.deprecation - *mapred.job.tracker is
deprecated*. Instead, use mapreduce.jobtracker.address
2014-08-21 09:36:55,172 [main] INFO
 org.apache.hadoop.conf.Configuration.deprecation - *fs.default.name is
deprecated*. Instead, use fs.defaultFS

What I found is that these settings are not part of my config but they are
part of the 'core hadoop' files.

I found that mapred.job.tracker is set from code when using the mapred
package (which is probably what Pig uses):
https://github.com/apache/hadoop-common/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/JobClient.java#L869

and that the fs.default.name is explicitly defined here as 'deprecated' in
one of the *-default.xml config files.
https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/main/resources/core-default.xml#L524
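
To illustrate why the messages appear even when the user's own config only
uses the new names, here is a toy model in Python. It is not Hadoop's
Configuration class: ToyConfiguration, start_with_clean_user_config and all
property values are made up for illustration. Only the two old/new key pairs
and the message wording come from the log output above.

```python
import logging

log = logging.getLogger("config.deprecation")

# Old name -> new name; both pairs are taken from the log messages above.
DEPRECATIONS = {
    "mapred.job.tracker": "mapreduce.jobtracker.address",
    "fs.default.name": "fs.defaultFS",
}


class ToyConfiguration:
    """Toy stand-in for a deprecation-aware configuration (not Hadoop's class)."""

    def __init__(self):
        self.values = {}
        self.messages = []  # every deprecation message emitted so far

    def set(self, key, value):
        new_key = DEPRECATIONS.get(key)
        if new_key is not None:
            # Any write to an old key is reported, whoever performs it.
            message = f"{key} is deprecated. Instead, use {new_key}"
            self.messages.append(message)
            log.info(message)
            key = new_key
        self.values[key] = value

    def load(self, properties):
        for key, value in properties.items():
            self.set(key, value)


def start_with_clean_user_config():
    conf = ToyConfiguration()
    # Shipped defaults (placeholder value), standing in for core-default.xml.
    conf.load({"fs.default.name": "file:///"})
    # Library code setting the old key (placeholder value), standing in for JobClient.
    conf.set("mapred.job.tracker", "local")
    # The user's own configuration (hypothetical host) uses only the new name.
    conf.load({"fs.defaultFS": "hdfs://example:8020"})
    return conf
```

In this model both messages are produced by the shipped defaults and the
library code, and none by the user's settings, which matches what I see.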

I did some more digging and found that there are several other properties
that have been defined as deprecated that are still present in the various
*-default.xml files throughout the hadoop code base.

I used this list as a reference:
https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/site/apt/DeprecatedProperties.apt.vm

The ones I found so far:
./hadoop-common-project/hadoop-common/src/main/resources/core-default.xml:
 <name>fs.default.name</name>
./hadoop-common-project/hadoop-common/src/main/resources/core-default.xml:
 <name>io.bytes.per.checksum</name>
./hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml:
 <name>mapreduce.job.counters.limit</name>
./hadoop-tools/hadoop-distcp/src/main/resources/distcp-default.xml:
 <name>mapred.job.map.memory.mb</name>
./hadoop-tools/hadoop-distcp/src/main/resources/distcp-default.xml:
 <name>mapred.job.reduce.memory.mb</name>
./hadoop-tools/hadoop-distcp/src/main/resources/distcp-default.xml:
 <name>mapreduce.reduce.class</name>
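
A search like the one above can be sketched as a small Python script. This
is only a rough sketch: find_deprecated_defaults is a name made up here, and
the set of deprecated names has to be supplied by the caller (for example
copied by hand from DeprecatedProperties.apt.vm; parsing that file is not
shown).

```python
import os
import xml.etree.ElementTree as ET


def find_deprecated_defaults(root_dir, deprecated_names):
    """Return sorted (path, property name) pairs for deprecated names found
    in any *-default.xml file below root_dir."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if not filename.endswith("-default.xml"):
                continue
            path = os.path.join(dirpath, filename)
            try:
                tree = ET.parse(path)
            except ET.ParseError:
                continue  # skip files that are not well-formed XML
            # Every <name> element in the file is treated as a property name.
            for name_element in tree.iter("name"):
                name = (name_element.text or "").strip()
                if name in deprecated_names:
                    hits.append((path, name))
    return sorted(hits)
```

Run from the top of a source checkout, something like
find_deprecated_defaults(".", {"fs.default.name", "io.bytes.per.checksum"})
would return the matching file/property pairs.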

It seems to me that fixing these would remove a lot of pointless clutter
from the console output for end users.

Or is there a good reason to keep it like this?

-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
