hadoop-mapreduce-user mailing list archives

From Caesar Samsi <caesarsa...@mac.com>
Subject RE: can't submit remote job
Date Tue, 19 May 2015 20:02:23 GMT
[I am still new to all of this, but I hope I can help some.]

 

Hello,

 

What I’ve noticed is that when the Namenode can’t write to a Datanode, it’s usually because the datanode process isn’t running there.
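
A quick way to check (standard commands):

```
# On the Datanode host: is the DataNode process up?
$ jps

# From any cluster node: how many datanodes does HDFS report as live?
$ hdfs dfsadmin -report
```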

 

I also noticed a message indicating there is only 1 replica in the system; perhaps check the hdfs-site.xml configuration?
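
For example, with a single datanode the replication factor must not exceed 1. A minimal hdfs-site.xml fragment:

```
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
```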

 

There could also be firewall issues, but I don’t see related messages such as “Connection refused” or anything else indicating a failure to connect.

 

HTH, Caesar.

 

From: xeonmailinglist-gmail [mailto:xeonmailinglist@gmail.com] 
Sent: Tuesday, May 19, 2015 1:45 PM
To: user@hadoop.apache.org
Subject: Re: can't submit remote job

 

I am still trying to figure out how to solve this. It seems that the Namenode can’t write to the Datanode, but I am not sure.
While debugging the code, I found that Hadoop keeps decreasing the number of available nodes in [1].

If this is correct, I don’t understand why it happens, because my system is configured correctly. So I still have no answer for this. If someone could give me a hint, I would appreciate it.

org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault, line 624:

protected DatanodeStorageInfo chooseRandom(...) {
    while (numOfReplicas > 0 && numOfAvailableNodes > 0) {
      DatanodeDescriptor chosenNode =
          (DatanodeDescriptor)clusterMap.chooseRandom(scope);
      if (excludedNodes.add(chosenNode)) { // was not in the excluded list
        if (LOG.isDebugEnabled()) {
          builder.append("\nNode ").append(NodeBase.getPath(chosenNode)).append(" [");
        }
        // each new candidate is consumed here, whether or not it ends up chosen
        numOfAvailableNodes--;
        (....)
    }
}

On 05/19/2015 05:44 PM, Rajesh Kartha wrote:

Wondering if you have used the REST API to submit jobs:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application

There are some issues that I have come across, but it does seem to work.
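
The flow is roughly two calls: request a new application id, then POST a submission context. A sketch with curl ("resourcemanager" and the JSON file are placeholders; the body format is described on the linked page):

```
$ curl -X POST http://resourcemanager:8088/ws/v1/cluster/apps/new-application
$ curl -X POST -H "Content-Type: application/json" \
       -d @submission-context.json \
       http://resourcemanager:8088/ws/v1/cluster/apps
```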



Also, about the message:

java.io.IOException: File /tmp/hadoop-yarn/staging/xeon/.staging/job_1432045089375_0001/job.split could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.

One node was excluded from the operation. Do you know why it is excluded?



-Rajesh

 

 

On Tue, May 19, 2015 at 7:34 AM, xeonmailinglist-gmail <xeonmailinglist@gmail.com> wrote:

This has been a real struggle to launch a remote MapReduce job. I know that there is Netflix Genie to submit jobs, but for the purposes of this application (small and personal), I want to code it from scratch.
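
For reference, a minimal from-scratch submission client looks roughly like this. This is only a sketch: "namenode" and "resourcemanager" are placeholder hostnames, the paths are placeholders, and the ports must match the cluster configuration.

```
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class RemoteSubmit {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Point the client at the remote cluster instead of the local defaults.
    conf.set("fs.defaultFS", "hdfs://namenode:9000");                 // placeholder host
    conf.set("mapreduce.framework.name", "yarn");
    conf.set("yarn.resourcemanager.address", "resourcemanager:8032"); // placeholder host

    Job job = Job.getInstance(conf, "remote-job");
    job.setJarByClass(RemoteSubmit.class);
    // Mapper/Reducer classes would be set here; the paths below are placeholders.
    FileInputFormat.addInputPath(job, new Path("/input"));
    FileOutputFormat.setOutputPath(job, new Path("/output"));
    job.submit(); // job.split and friends are written to the HDFS staging dir here
  }
}
```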

I am debugging my code to see what is going on during the submission of a remote job, and now I get the error [1]. This error happens during job submission, more precisely when it is writing to the remote HDFS. I have included the Hadoop code [2] where I get the error: it is thrown at the out.close() call in [2].

The Namenode and the datanode are working properly. I have 1 Namenode and 1 datanode, and the replication factor is set to 1.
Despite everything running OK, I get this error. Any hint so that I can see what is going on?

[1]

2015-05-19 10:21:03,147 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable:  unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2015-05-19 10:21:03,147 DEBUG org.apache.hadoop.ipc.Server: Served: addBlock queueTime= 1 procesingTime= 1 exception= IOException
2015-05-19 10:21:03,147 DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException as:xeon (auth:SIMPLE) cause:java.io.IOException: File /tmp/hadoop-yarn/staging/xeon/.staging/job_1432045089375_0001/job.split could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
2015-05-19 10:21:03,148 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 194.117.18.101:39006 Call#18 Retry#0
java.io.IOException: File /tmp/hadoop-yarn/staging/xeon/.staging/job_1432045089375_0001/job.split could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)  
2015-05-19 10:21:03,148 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: responding to org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 194.117.18.101:39006 Call#18 Retry#0

[2]

 public static <T extends InputSplit> void createSplitFiles(Path jobSubmitDir, 
      Configuration conf, FileSystem fs, T[] splits) 
  throws IOException, InterruptedException {
    FSDataOutputStream out = createFile(fs, 
        JobSubmissionFiles.getJobSplitFile(jobSubmitDir), conf);
    SplitMetaInfo[] info = writeNewSplits(conf, splits, out);
    out.close(); // <- the error [1] is thrown here
    writeJobSplitMetaInfo(fs,JobSubmissionFiles.getJobSplitMetaFile(jobSubmitDir), 
        new FsPermission(JobSubmissionFiles.JOB_FILE_PERMISSION), splitVersion,
        info);
  }

Thanks,

On 05/18/2015 03:56 PM, xeonmailinglist-gmail wrote:

Shahab, I think so, but Hadoop’s site says “The user@ mailing list is the preferred mailing list for end-user questions and discussion.” So I am using the right mailing list.

Back to my problem: I think this is a problem with HDFS security. But the strangest thing is that I have disabled it in hdfs-site.xml [1].

I think that this error happens when MapReduce is trying to write the job configuration files
in HDFS. 
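
One way to isolate it: write a small file to HDFS directly from the remote client. If that fails with the same "could only be replicated to 0 nodes" error, the problem is HDFS connectivity from the client (for example, the client cannot reach the datanode's data-transfer port), not MapReduce itself. A sketch, with "namenode" as a placeholder host:

```
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteTest {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://namenode:9000"); // placeholder host
    try (FileSystem fs = FileSystem.get(conf);
         FSDataOutputStream out = fs.create(new Path("/tmp/write-test"))) {
      // Fails the same way as job submission if no datanode is reachable.
      out.writeUTF("hello");
    }
  }
}
```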

I have set up the username of the remote client in MapReduce using the commands in [2].

Now I am looking at Netflix Genie to figure out how they do it, but I still haven’t found a solution for submitting a remote job from Java. If anyone has a hint or advice, please tell me. I really don’t understand why I get this error.

[1]

$ cat etc/hadoop/hdfs-site.xml

<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>

[2]

```
# On the Namenode host:
$ sudo adduser xeon
$ sudo adduser xeon ubuntu
```
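
Since the cluster runs SIMPLE auth (the log shows "auth:SIMPLE"), an alternative to creating OS accounts is to assert the username on the client side. A sketch, assuming the remote user is xeon:

```
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class AsXeon {
  public static void main(String[] args) throws Exception {
    // With SIMPLE auth the client simply asserts a username; exporting
    // HADOOP_USER_NAME=xeon before running achieves the same effect.
    UserGroupInformation ugi = UserGroupInformation.createRemoteUser("xeon");
    ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
      Configuration conf = new Configuration();
      conf.set("fs.defaultFS", "hdfs://namenode:9000"); // placeholder host
      FileSystem fs = FileSystem.get(conf);
      System.out.println(fs.getFileStatus(new Path("/")).getOwner());
      return null;
    });
  }
}
```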

On 05/18/2015 02:46 PM, Shahab Yunus wrote:

I think the poster wanted to unsubscribe from the mailing list?

 

Gopy, if that is the case then please see: https://hadoop.apache.org/mailing_lists.html


Regards,

Shahab

 

On Mon, May 18, 2015 at 9:42 AM, xeonmailinglist-gmail <xeonmailinglist@gmail.com> wrote:

Why "Remove"? 

 

On 05/18/2015 02:25 PM, Gopy Krishna wrote:

REMOVE

 

On Mon, May 18, 2015 at 6:54 AM, xeonmailinglist-gmail <xeonmailinglist@gmail.com> wrote:

Hi,

I am trying to submit a remote job to YARN MapReduce, but I can’t because I get the error [1]. I don’t have any other exceptions in the other logs.

My MapReduce runtime has 1 ResourceManager and 3 NodeManagers, and HDFS is running properly (all nodes are alive).

I have looked at all the logs, and I still don’t understand why I get this error. Any help in fixing this? Is it a problem with the remote job that I am submitting?

[1] 

$ less logs/hadoop-ubuntu-namenode-ip-172-31-17-45.log

2015-05-18 10:42:16,570 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /tmp/hadoop-yarn/staging/xeon/.staging/job_1431945660897_0001/job.split fileId=16394 for DFSClient_NONMAPREDUCE_-1923902075_14
2015-05-18 10:42:16,570 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: /tmp/hadoop-yarn/staging/xeon/.staging/job_1431945660897_0001/job.split inodeId 16394 for DFSClient_NONMAPREDUCE_-1923902075_14
2015-05-18 10:42:16,571 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to choose remote rack (location = ~/default-rack), fallback to local rack
org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:

        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:214)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:111)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:126)
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1545)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)





 

-- 

Thanks & Regards
Gopy
Rapisource LLC
Direct: 732-419-9663
Fax: 512-287-4047
Email: gopy@rapisource.com
www.rapisource.com
http://www.linkedin.com/in/gopykrishna

According to Bill S.1618 Title III, passed by the 105th US Congress, this message is not considered "Spam" as we have included the contact information. If you wish to be removed from our mailing list, please respond with "remove" in the subject field. We apologize for any inconvenience caused.

 
