    Subject: svn commit: r1385383 - /hadoop/common/branches/branch-1.1/src/docs/releasenotes.html
    Date: Sun, 16 Sep 2012 22:02:14 -0000
    From: mattf@apache.org
    To: common-commits@hadoop.apache.org

    Author: mattf
    Date: Sun Sep 16 22:02:14 2012
    New Revision: 1385383

    URL: http://svn.apache.org/viewvc?rev=1385383&view=rev
    Log: release notes for Hadoop-1.1.0-rc4

    Modified: hadoop/common/branches/branch-1.1/src/docs/releasenotes.html

    Modified: hadoop/common/branches/branch-1.1/src/docs/releasenotes.html
    URL: http://svn.apache.org/viewvc/hadoop/common/branches/branch-1.1/src/docs/releasenotes.html?rev=1385383&r1=1385382&r2=1385383&view=diff
    ==============================================================================
    --- hadoop/common/branches/branch-1.1/src/docs/releasenotes.html (original)
    +++ hadoop/common/branches/branch-1.1/src/docs/releasenotes.html Sun Sep 16 22:02:14 2012
    @@ -36,50 +36,18 @@
    -
    
  • HADOOP-7509. - Trivial improvement reported by raviprak and fixed by raviprak
    - Improve message when Authentication is required
    -
    Thanks Aaron and Suresh!
    - -Marking as resolved fixed since changes have gone in. -
  • -
  • HADOOP-8230. Major improvement reported by eli2 and fixed by eli
    Enable sync by default and disable append
    Append is not supported in Hadoop 1.x. Please upgrade to 2.x if you need append. If you enabled dfs.support.append for HBase, you're OK, as durable sync (the reason HBase required dfs.support.append) is now enabled by default. If you really need the previous append behavior, set the flag "dfs.support.broken.append" to true, as sketched below.
    
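    A minimal hdfs-site.xml override might look like the following (a sketch; only use it if you truly depend on the old, broken append behavior):

    {code}
    <!-- hdfs-site.xml (sketch): re-enable the legacy, non-durable append path -->
    <property>
      <name>dfs.support.broken.append</name>
      <value>true</value>
    </property>
    {code}
    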
  • -
  • HADOOP-8352. - Major improvement reported by owen.omalley and fixed by owen.omalley
    - We should always generate a new configure script for the c++ code
    -
    If you are compiling c++, the configure script will now be automatically regenerated as it should be.
    - -This requires autoconf version 2.61 or greater. -
  • -
  • HADOOP-8365. Blocker improvement reported by eli2 and fixed by eli
    Add flag to disable durable sync
    This patch enables durable sync by default. Installations that did not use HBase and used to run without setting "dfs.support.append" (or set it to false explicitly) must set the new flag "dfs.durable.sync" to false to preserve the previous semantics, as sketched below.
    
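    A sketch of the opt-out described above, for installations that want the old semantics:

    {code}
    <!-- hdfs-site.xml (sketch): opt out of durable sync to keep pre-1.1 behavior -->
    <property>
      <name>dfs.durable.sync</name>
      <value>false</value>
    </property>
    {code}
    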
  • -
  • HDFS-2318. - Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)
    - Provide authentication to webhdfs using SPNEGO
    -
    Added two new conf properties dfs.web.authentication.kerberos.principal and dfs.web.authentication.kerberos.keytab for the SPNEGO servlet filter. - - -
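    A sketch of how the two new properties might be set (the principal and keytab path here are illustrative placeholder values, not defaults):

    {code}
    <!-- hdfs-site.xml (sketch): SPNEGO servlet filter settings; values are illustrative -->
    <property>
      <name>dfs.web.authentication.kerberos.principal</name>
      <value>HTTP/_HOST@EXAMPLE.COM</value>
    </property>
    <property>
      <name>dfs.web.authentication.kerberos.keytab</name>
      <value>/etc/security/keytabs/spnego.service.keytab</value>
    </property>
    {code}
    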
  • - -
  • HDFS-2338. - Major sub-task reported by jnp and fixed by jnp (webhdfs)
    - Configuration option to enable/disable webhdfs.
    -
    Added a conf property dfs.webhdfs.enabled for enabling/disabling webhdfs. - - -
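    For instance, a minimal sketch enabling WebHDFS:

    {code}
    <!-- hdfs-site.xml (sketch): turn WebHDFS on -->
    <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
    </property>
    {code}
    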
  • -
  • HDFS-2465. Major improvement reported by tlipcon and fixed by tlipcon (data-node, performance)
    Add HDFS support for fadvise readahead and drop-behind
    @@ -146,6 +114,18 @@ dfs.datanode.readahead.bytes - set to a
  • +
  • HDFS-3703. + Major improvement reported by nkeywal and fixed by jingzhao (data-node, name-node)
    + Decrease the datanode failure detection time
    +
    This jira adds a new DataNode state called "stale" at the NameNode. A DataNode is marked stale if it does not send a heartbeat message to the NameNode within the timeout configured by the parameter "dfs.namenode.stale.datanode.interval", in seconds (default: 30 seconds). The NameNode picks a stale datanode as the last target to read from when returning block locations for reads.
    
    + +
    + +This feature is turned *off* by default. To turn it on, set the HDFS configuration "dfs.namenode.check.stale.datanode" to true (see the sketch below).
    
    + + +
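    A sketch combining the two settings described above (the interval value is illustrative, following the note's stated default):

    {code}
    <!-- hdfs-site.xml (sketch): opt in to stale-DataNode detection -->
    <property>
      <name>dfs.namenode.check.stale.datanode</name>
      <value>true</value>
    </property>
    <property>
      <name>dfs.namenode.stale.datanode.interval</name>
      <value>30</value>  <!-- seconds, per the note above; 30 is the stated default -->
    </property>
    {code}
    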
  • +
  • HDFS-3814. Major improvement reported by sureshms and fixed by jingzhao (name-node)
    Make the replication monitor multipliers configurable in 1.x
    @@ -236,10 +216,10 @@ Please see hdfs-default.xml for detailed Error in the documentation regarding Checkpoint/Backup Node
    On http://hadoop.apache.org/common/docs/r0.20.203.0/hdfs_user_guide.html#Checkpoint+Node: the command bin/hdfs namenode -checkpoint required to launch the backup/checkpoint node does not exist.
    I have removed this from the docs.
  • -
  • HADOOP-7461. - Major bug reported by rbodkin and fixed by gkesavan (build)
    - Jackson Dependency Not Declared in Hadoop POM
    -
    (COMMENT: This bug still affects 0.20.205.0, four months after the bug was filed. This causes total failure, and the fix is trivial for whoever manages the POM -- just add the missing dependency! --ben)

    This issue was identified, and the fix & workaround were documented at
    

    https://issues.cloudera.org/browse/DISTRO-44

    The issue affects use of Hadoop 0.20.203.0 from the Maven central repo. I built a job using that maven repo and ran it, resulting in this failure:

    Exception in thread "main" ...
  • +
  • HADOOP-7509. + Trivial improvement reported by raviprak and fixed by raviprak
    + Improve message when Authentication is required
    +
    The message shown when security is enabled but authentication is configured as simple is not explicit enough. It simply prints "Authentication is required" along with a stack trace. The message should be "Authorization (hadoop.security.authorization) is enabled but authentication (hadoop.security.authentication) is configured as simple. Please configure another method." The sketch below shows the combination in question.
    
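    A sketch of the mismatched combination (assuming core-site.xml):

    {code}
    <!-- core-site.xml (sketch): authorization enabled but authentication still simple -->
    <property>
      <name>hadoop.security.authorization</name>
      <value>true</value>
    </property>
    <property>
      <name>hadoop.security.authentication</name>
      <value>simple</value>  <!-- the improved message points here; e.g. kerberos instead -->
    </property>
    {code}
    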
  • HADOOP-7621. Critical bug reported by tucu00 and fixed by atm (security)
    @@ -256,21 +236,11 @@ Please see hdfs-default.xml for detailed Cluster setup docs specify wrong owner for task-controller.cfg
    The cluster setup docs indicate task-controller.cfg must be owned by the user running TaskTracker but the code checks for root. We should update the docs to reflect the real requirement.
  • -
  • HADOOP-7645. - Blocker bug reported by atm and fixed by jnp (security)
    - HTTP auth tests requiring Kerberos infrastructure are not disabled on branch-0.20-security
    -
    The back-port of HADOOP-7119 to branch-0.20-security included tests which require Kerberos infrastructure in order to run. In trunk and 0.23, these are disabled unless one enables the {{testKerberos}} maven profile. In branch-0.20-security, these tests are always run regardless, and so fail most of the time.

    See this Jenkins build for an example: https://builds.apache.org/view/G-L/view/Hadoop/job/Hadoop-0.20-security/26/
  • -
  • HADOOP-7653. Minor bug reported by natty and fixed by natty (build)
    tarball doesn't include .eclipse.templates
    The hadoop tarball doesn't include .eclipse.templates. This results in a failure to successfully run ant eclipse-files:

    eclipse-files:

    BUILD FAILED
    /home/natty/Downloads/hadoop-0.20.2/build.xml:1606: /home/natty/Downloads/hadoop-0.20.2/.eclipse.templates not found.

  • -
  • HADOOP-7664. - Minor improvement reported by raviprak and fixed by raviprak (conf)
    - o.a.h.conf.Configuration complains of overriding a final parameter even when the overriding value is the same.
    
    -
    o.a.h.conf.Configuration complains of overriding a final parameter even when the overriding value is the same. A sketch of a final parameter declaration follows.
    
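    For reference, a sketch of what a final parameter looks like (the property name is just an example):

    {code}
    <!-- core-site.xml (sketch): a final parameter; re-declaring it elsewhere with
         the same value still triggered the warning this issue fixes -->
    <property>
      <name>fs.default.name</name>
      <value>hdfs://namenode:8020</value>
      <final>true</final>
    </property>
    {code}
    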
  • -
  • HADOOP-7665. Major bug reported by atm and fixed by atm (security)
    branch-0.20-security doesn't include SPNEGO settings in core-default.xml
    @@ -281,11 +251,6 @@ Please see hdfs-default.xml for detailed branch-0.20-security doesn't include o.a.h.security.TestAuthenticationFilter
    Looks like the back-port of HADOOP-7119 to branch-0.20-security missed {{o.a.h.security.TestAuthenticationFilter}}.
  • -
  • HADOOP-7674. - Major bug reported by jnp and fixed by jnp
    - TestKerberosName fails in 20 branch.
    -
    TestKerberosName fails in the 20 branch. In fact, this test has been duplicated in 20, with a small change to the rules.
    
  • -
  • HADOOP-7745. Major bug reported by raviprak and fixed by raviprak
    I switched variable names in HADOOP-7509
    @@ -336,11 +301,6 @@ Please see hdfs-default.xml for detailed UserGroupInformation fails to login if thread's context classloader can't load HadoopLoginModule
    In a few hard-to-reproduce situations, we've seen a problem where the UGI login call causes a failure to login exception with the following cause:

    Caused by: javax.security.auth.login.LoginException: unable to find
    LoginModule class: org.apache.hadoop.security.UserGroupInformation
    $HadoopLoginModule

    After a bunch of debugging, I determined that this happens when the login occurs in a thread whose Context ClassLoader has been set to null.
  • -
  • HADOOP-7987. - Major improvement reported by devaraj and fixed by jnp (security)
    - Support setting the run-as user in unsecure mode
    -
    Some applications need to be able to perform actions (such as launching MR jobs) from map or reduce tasks. In earlier unsecure versions of Hadoop (20.x), it was possible to do this by setting user.name in the configuration. But in 20.205 and 1.0, when running in unsecure mode, this does not work. (In secure mode, you can do this using the Kerberos credentials.) A sketch of the old approach follows.
    
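    A sketch of the old 20.x-style approach referred to above (a hypothetical snippet; only meaningful on an unsecure cluster):

    {code}
    // Sketch: in unsecure 20.x one could act as another user by setting
    // user.name in the job configuration before submitting.
    Configuration conf = new Configuration();
    conf.set("user.name", "otheruser");   // hypothetical user; unsecure mode only
    JobClient.runJob(new JobConf(conf));  // the job then runs as that user
    {code}
    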
  • -
  • HADOOP-7988. Major bug reported by jnp and fixed by jnp
    Upper case in hostname part of the principals doesn't work with kerberos.
    @@ -361,21 +321,11 @@ Please see hdfs-default.xml for detailed Add option to relax build-version check for branch-1
    In 1.x, DNs currently refuse to connect to NNs if their build *revisions* (i.e. svn revisions) do not match. TTs refuse to connect to JTs if their build *versions* (version, revision, user, and source checksum) do not match.

    This prevents rolling upgrades, which is intentional; see the discussion in HADOOP-5203. The primary motivation in that jira was (1) it's difficult to guarantee that every build on a large cluster got deployed correctly, that builds don't get rolled back to old versions by accident, etc. ...
    
  • -
  • HADOOP-8251. - Blocker bug reported by tlipcon and fixed by tlipcon (security)
    - SecurityUtil.fetchServiceTicket broken after HADOOP-6941
    -
    HADOOP-6941 replaced direct references to some classes with reflective access so as to support other JDKs. Unfortunately there was a mistake in the name of the Krb5Util class, which broke fetchServiceTicket. This manifests itself as the inability to run checkpoints or other krb5-SSL HTTP-based transfers:

    java.lang.ClassNotFoundException: sun.security.jgss.krb5
  • -
  • HADOOP-8269. Trivial bug reported by eli2 and fixed by eli (documentation)
    Fix some javadoc warnings on branch-1
    There are some javadoc warnings on branch-1, let's fix them.
  • -
  • HADOOP-8293. - Major bug reported by owen.omalley and fixed by owen.omalley (build)
    - The native library's Makefile.am doesn't include JNI path
    -
    When compiling on centos 6, I get the following error when compiling the native library:

    {code}
    [exec] /usr/bin/ld: cannot find -ljvm
    {code}

    The problem is simply that the Makefile.am libhadoop_la_LDFLAGS doesn't include AM_LDFLAGS.
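    A sketch of the kind of one-line fix described (automake variable names as given above; any flags already present would be kept alongside it):

    {code}
    # Makefile.am (sketch): include AM_LDFLAGS, which carries the JNI -L path,
    # so that -ljvm can be resolved when linking the native library.
    libhadoop_la_LDFLAGS = $(AM_LDFLAGS)
    {code}
    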
  • -
  • HADOOP-8314. Major bug reported by tucu00 and fixed by tucu00 (security)
    HttpServer#hasAdminAccess should return false if authorization is enabled but user is not authenticated
    @@ -386,11 +336,6 @@ Please see hdfs-default.xml for detailed Build fails with Java 7
    I am seeing the following message when running IBM Java 7 on branch-1.0 code.
    
    compile:
    [echo] contrib: gridmix
    [javac] Compiling 31 source files to /home/hadoop/branch-1.0_0427/build/contrib/gridmix/classes
    [javac] /home/hadoop/branch-1.0_0427/src/contrib/gridmix/src/java/org/apache/hadoop/mapred/gridmix/Gridmix.java:396: error: type argument ? extends T is not within bounds of type-variable E
    [javac] private <T> String getEnumValues(Enum<? extends T>[] e) {
    [javac] ^
    [javac] where T,E are ty...
  • -
  • HADOOP-8338. - Major bug reported by owen.omalley and fixed by owen.omalley (security)
    - Can't renew or cancel HDFS delegation tokens over secure RPC
    -
    The fetchdt tool is failing for secure deployments when given --renew or --cancel on tokens fetched using RPC. (The tokens fetched over HTTP can be renewed and canceled fine.)
  • -
  • HADOOP-8399. Major bug reported by cos and fixed by cos (build)
    Remove JDK5 dependency from Hadoop 1.0+ line
    @@ -421,11 +366,6 @@ Please see hdfs-default.xml for detailed backport forced daemon shutdown of HADOOP-8353 into branch-1
    The init.d service shutdown code doesn't work if the daemon is hung; backporting the portion of HADOOP-8353 that edits bin/hadoop-daemon.sh corrects this.
    
  • -
  • HDFS-1108. - Major sub-task reported by dhruba and fixed by tlipcon (ha, name-node)
    - Log newly allocated blocks
    -
    The current HDFS design says that newly allocated blocks for a file are not persisted in the NN transaction log when the block is allocated. Instead, an hflush() or a close() on the file persists the blocks into the transaction log. It would be nice if we could immediately persist newly allocated blocks (as soon as they are allocated) for specific files.
    
  • -
  • HDFS-1378. Major improvement reported by tlipcon and fixed by cmccabe (name-node)
    Edit log replay should track and report file offsets in case of errors
    @@ -436,91 +376,16 @@ Please see hdfs-default.xml for detailed when dfs.name.dir and dfs.name.edits.dir are same fsimage will be saved twice every time
    When the image and edits directories are configured to be the same, the fsimage is flushed from memory to disk twice whenever saveNamespace is done. This may impact the performance of the BackupNode/SNN, which does a saveNamespace at every checkpoint.
    
  • -
  • HDFS-2065. - Major bug reported by bharathm and fixed by umamaheswararao
    - Fix NPE in DFSClient.getFileChecksum
    -
    The following code can throw an NPE if callGetBlockLocations returns null.

    If the server returns null:

    {code}
    List<LocatedBlock> locatedblocks =
        callGetBlockLocations(namenode, src, 0, Long.MAX_VALUE).getLocatedBlocks();
    {code}

    The right fix is for the server to throw the right exception.
    
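    A minimal client-side guard in the same spirit (a sketch; the actual fix's exception choice may differ):

    {code}
    // Sketch: check for a null response before dereferencing it.
    LocatedBlocks blockLocations =
        callGetBlockLocations(namenode, src, 0, Long.MAX_VALUE);
    if (blockLocations == null) {
      throw new FileNotFoundException("File does not exist: " + src);
    }
    List<LocatedBlock> locatedblocks = blockLocations.getLocatedBlocks();
    {code}
    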

  • -
  • HDFS-2305. Major bug reported by atm and fixed by atm (name-node)
    Running multiple 2NNs can result in corrupt file system
    Here's the scenario:

    * You run the NN and 2NN (2NN A) on the same machine.
    * You don't have the address of the 2NN configured, so it's defaulting to 127.0.0.1.
    * There's another 2NN (2NN B) running on a second machine.
    * When a 2NN is done checkpointing, it says "hey NN, I have an updated fsimage for you. You can download it from this URL, which includes my IP address, which is x"

    And here are the steps that cause this issue:
    

    # Some edits happen.
    # 2NN A (on the NN machine) does a c...
  • -
  • HDFS-2317. - Major sub-task reported by szetszwo and fixed by szetszwo
    - Read access to HDFS using HTTP REST
    -
  • - -
  • HDFS-2331. - Major bug reported by abhijit.shingate and fixed by abhijit.shingate (hdfs client)
    - Hdfs compilation fails
    -
    I am trying to perform a complete build from the trunk folder but the compilation fails.
    

    *Commandline:*
    mvn clean install

    *Error Message:*

    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.
    3.2:compile (default-compile) on project hadoop-hdfs: Compilation failure
    [ERROR] \Hadoop\SVN\trunk\hadoop-hdfs-project\hadoop-hdfs\src\main\java\org
    \apache\hadoop\hdfs\web\WebHdfsFileSystem.java:[209,21] type parameters of <T>T
    cannot be determined; no unique maximal instance...
  • -
  • HDFS-2332. Major test reported by tlipcon and fixed by tlipcon (test)
    Add test for HADOOP-7629: using an immutable FsPermission as an IPC parameter
    HADOOP-7629 fixes a bug where an immutable FsPermission would throw an error if used as the argument to fs.setPermission(). This JIRA is to add a test case for the common bugfix.
  • -
  • HDFS-2333. - Major bug reported by ikelly and fixed by szetszwo
    - HDFS-2284 introduced 2 findbugs warnings on trunk
    -
    When HDFS-2284 was submitted, it made DFSOutputStream public, which triggered two SC_START_IN_CTOR findbugs warnings.
    
  • - -
  • HDFS-2340. - Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)
    - Support getFileBlockLocations and getDelegationToken in webhdfs
    -
  • - -
  • HDFS-2348. - Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)
    - Support getContentSummary and getFileChecksum in webhdfs
    -
  • - -
  • HDFS-2356. - Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)
    - webhdfs: support case insensitive query parameter names
    -
  • - -
  • HDFS-2361. - Critical bug reported by rajsaha and fixed by jnp (name-node)
    - hftp is broken
    -
    Distcp with hftp is failing.

    {noformat}
    $hadoop distcp hftp://<NNhostname>:50070/user/hadoopqa/1316814737/newtemp 1316814737/as
    11/09/23 21:52:33 INFO tools.DistCp: srcPaths=[hftp://<NNhostname>:50070/user/hadoopqa/1316814737/newtemp]
    11/09/23 21:52:33 INFO tools.DistCp: destPath=1316814737/as
    Retrieving token from: https://<NN IP>:50470/getDelegationToken
    Retrieving token from: https://<NN IP>:50470/getDelegationToken?renewer=mapred
    11/09/23 21:52:34 INFO security.TokenCache: Got dt for h...
  • - -
  • HDFS-2366. - Major sub-task reported by arpitgupta and fixed by szetszwo (webhdfs)
    - webhdfs throws a npe when ugi is null from getDelegationToken
    -
  • - -
  • HDFS-2368. - Major bug reported by arpitgupta and fixed by szetszwo
    - defaults created for web keytab and principal, these properties should not have defaults
    -
    The following defaults are set in hdfs-default.xml:
    

    <property>
    <name>dfs.web.authentication.kerberos.principal</name>
    <value>HTTP/${dfs.web.hostname}@${kerberos.realm}</value>
    <description>
    The HTTP Kerberos principal used by Hadoop-Auth in the HTTP endpoint.

    The HTTP Kerberos principal MUST start with 'HTTP/' per Kerberos
    HTTP SPNEGO specification.
    
    </description>
    </property>

    <property>
    <name>dfs.web.authentication.kerberos.keytab</name>
    <value>${user.home}/dfs.web....
  • - -
  • HDFS-2427. - Major sub-task reported by arpitgupta and fixed by szetszwo (webhdfs)
    - webhdfs mkdirs api call creates path with 777 permission, we should default it to 755
    -
  • - -
  • HDFS-2432. - Major sub-task reported by arpitgupta and fixed by szetszwo (webhdfs)
    - webhdfs setreplication api should return a 403 when called on a directory
    -
    Currently the set replication api on a directory leads to a 200.

    Request URI http://NN:50070/webhdfs/tmp/webhdfs_data/dir_replication_tests?op=SETREPLICATION&replication=5
    Request Method: PUT
    Status Line: HTTP/1.1 200 OK
    Response Content: {"boolean":false}

    Since we can determine that this call did not succeed (boolean=false), we should rather just return a 403.
    
  • - -
  • HDFS-2453. - Major sub-task reported by arpitgupta and fixed by szetszwo (webhdfs)
    - tail using a webhdfs uri throws an error
    -
    /usr//bin/hadoop --config /etc/hadoop dfs -tail webhdfs://NN:50070/file
    tail: HTTP_PARTIAL expected, received 200
  • - -
  • HDFS-2494. - Major sub-task reported by umamaheswararao and fixed by umamaheswararao (webhdfs)
    - [webhdfs] When Getting the file using OP=OPEN with DN http address, ESTABLISHED sockets are growing.
    -
    As part of the reliability test,
    Scenario:
    Initially check the socket count --- there are around 42 sockets.
    Open the file with the DataNode http address using the op=OPEN request parameter about 500 times in a loop.
    Wait for some time and check the socket count again --- thousands of ESTABLISHED sockets have accumulated (~2052).
    

    Here is the netstat result:

    C:\Users\uma>netstat | grep 127.0.0.1 | grep ESTABLISHED |wc -l
    2042
    C:\Users\uma>netstat | grep 127.0.0.1 | grep ESTABLISHED |wc -l
    2042
    C:\...
  • - -
  • HDFS-2501. - Major sub-task reported by szetszwo and fixed by szetszwo (webhdfs)
    - add version prefix and root methods to webhdfs
    -
  • -
  • HDFS-2541. Major bug reported by qwertymaniac and fixed by qwertymaniac (data-node)
    For a sufficiently large value of blocks, the DN Scanner may request a random number with a negative seed value.
    @@ -531,21 +396,6 @@ Please see hdfs-default.xml for detailed ReplicationTargetChooser has incorrect block placement comments
    {code}
    /** The class is responsible for choosing the desired number of targets
    * for placing block replicas.
    * The replica placement strategy is that if the writer is on a datanode,
    * the 1st replica is placed on the local machine,
    * otherwise a random datanode. The 2nd replica is placed on a datanode
    * that is on a different rack. The 3rd replica is placed on a datanode
    * which is on the same rack as the **first replca**.
    */
    {code}

    That should read "second replica". The test cases c...
  • -
  • HDFS-2552. - Major task reported by szetszwo and fixed by szetszwo (webhdfs)
    - Add WebHdfs Forrest doc
    -
  • - -
  • HDFS-2590. - Major bug reported by szetszwo and fixed by szetszwo (webhdfs)
    - Some links in WebHDFS forrest doc do not work
    -
    Some links are pointing to DistributedFileSystem javadoc but the javadoc of DistributedFileSystem is not generated by default.
  • - -
  • HDFS-2604. - Minor improvement reported by szetszwo and fixed by szetszwo (webhdfs)
    - Add a log message to show if WebHDFS is enabled
    -
    WebHDFS can be enabled/disabled by the conf key {{dfs.webhdfs.enabled}}. Let's add a log message to show if it is enabled.
  • -
  • HDFS-2637. Major bug reported by eli and fixed by eli (hdfs client)
    The rpc timeout for block recovery is too low
    @@ -636,6 +486,11 @@ Please see hdfs-default.xml for detailed HDFS does not use ClientProtocol in a backward-compatible way
    HDFS-617 was brought into branch-0.20-security/branch-1 to support non-recursive create, along with HADOOP-6840 and HADOOP-6886. However, the changes in HDFS were done in an incompatible way, making the client unusable against older clusters, even when plain old create() is called. This is because DFS now internally calls create() through the newly introduced method. By simply changing how the methods are wired internally, we can remove this limitation. We may eventually switch back to the app...
    
  • +
  • HDFS-3466. + Major bug reported by owen.omalley and fixed by owen.omalley (name-node, security)
    + The SPNEGO filter for the NameNode should come out of the web keytab file
    +
    Currently, the spnego filter uses the DFS_NAMENODE_KEYTAB_FILE_KEY to find the keytab. It should use the DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY to do it.
  • +
  • HDFS-3504. Major improvement reported by sseth and fixed by szetszwo
    Configurable retry in DFSClient
    @@ -651,6 +506,11 @@ Please see hdfs-default.xml for detailed WebHDFS CREATE does not use client location for redirection
    CREATE currently redirects the client to a random datanode without using the client location information.
    
  • +
  • HDFS-3617. + Major improvement reported by mattf and fixed by qwertymaniac
    + Port HDFS-96 to branch-1 (support blocks greater than 2GB)
    +
    Please see HDFS-96.
  • +
  • HDFS-3652. Blocker bug reported by tlipcon and fixed by tlipcon (name-node)
    1.x: FSEditLog failure removes the wrong edit stream when storage dirs have same name
    @@ -681,11 +541,6 @@ Please see hdfs-default.xml for detailed Job may hang if mapreduce.job.committer.setup.cleanup.needed=false and mapreduce.map/reduce.failures.maxpercent>0
    A job may hang in the RUNNING state if mapreduce.job.committer.setup.cleanup.needed=false and mapreduce.map/reduce.failures.maxpercent>0. It happens when some tasks fail but haven't reached failures.maxpercent. The sketch below shows the combination.
    
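    The combination described above, as it might appear in mapred-site.xml (a sketch; property names as written in the note, value illustrative):

    {code}
    <!-- mapred-site.xml (sketch): combination that could trigger the hang -->
    <property>
      <name>mapreduce.job.committer.setup.cleanup.needed</name>
      <value>false</value>
    </property>
    <property>
      <name>mapreduce.map.failures.maxpercent</name>
      <value>10</value>  <!-- any value > 0 -->
    </property>
    {code}
    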
  • -
  • MAPREDUCE-2289. - Major bug reported by tlipcon and fixed by ahmed.radwan (job submission)
    - Permissions race can make getStagingDir fail on local filesystem
    -
    I've observed the following race condition in TestFairSchedulerSystem which uses a MiniMRCluster on top of RawLocalFileSystem:
    - two threads call getStagingDir at the same time
    - Thread A checks fs.exists(stagingArea) and sees false
    -- Calls mkdirs(stagingArea, JOB_DIR_PERMISSIONS)
    --- mkdirs calls the Java mkdir API which makes the file with umask-based permissions
    - Thread B runs, checks fs.exists(stagingArea) and sees true
    -- checks permissions, sees the default permissions, and throws IOE...
  • -
  • MAPREDUCE-2376. Major bug reported by tlipcon and fixed by tlipcon (task-controller, test)
    test-task-controller fails if run as a userid < 1000
    @@ -741,11 +596,6 @@ Please see hdfs-default.xml for detailed Add local dir failure info to metrics and the web UI
    Like HDFS-811/HDFS-1850 but for the TT.
  • -
  • MAPREDUCE-3076. - Blocker bug reported by acmurthy and fixed by acmurthy (test)
    - TestSleepJob fails
    -
    TestSleepJob fails; it was intended to be used in other tests for MAPREDUCE-2981.
    
  • -
  • MAPREDUCE-3278. Major improvement reported by tlipcon and fixed by tlipcon (mrv1, performance, task)
    0.20: avoid a busy-loop in ReduceTask scheduling
    @@ -811,11 +661,6 @@ Please see hdfs-default.xml for detailed TestJobInProgress#testLocality uses a bogus topology
    The following in TestJobInProgress#testLocality:

    {code}
    Node r2n4 = new NodeBase("/default/rack2/s1/node4");
    nt.add(r2n4);
    {code}

    violates the check introduced by HADOOP-8159:

    {noformat}
    Testcase: testLocality took 0.005 sec
    Caused an ERROR
    Invalid network topology. You cannot have a rack and a non-rack node at the same level of the network topology.
    org.apache.hadoop.net.NetworkTopology$InvalidTopologyException: Invalid network topology. You cannot have a rack and a non-ra...
  • -
  • MAPREDUCE-4195. - Critical bug reported by jira.shegalov and fixed by (jobtracker)
    - With invalid queueName request param, jobqueue_details.jsp shows NPE
    -
    When you access /jobqueue_details.jsp manually, instead of via a link, it has queueName set to null internally, and this null goes into the lookup of the scheduling info maps as well.
    

    As a result, if using FairScheduler, a Pool with String name = null gets created and this brings the scheduler down. I have not tested what happens to the CapacityScheduler, but ideally if no queueName is set in that jsp, it should fall back to 'default'. Otherwise, this brings down the JobTracker completely.

    FairSch...
  • -
  • MAPREDUCE-4241. Major bug reported by abayer and fixed by abayer (build, examples)
    Pipes examples do not compile on Ubuntu 12.04