hadoop-common-user mailing list archives
From Rajesh Kartha <karth...@gmail.com>
Subject Re: can't submit remote job
Date Tue, 19 May 2015 16:44:03 GMT
Wondering if you have used the REST API to submit jobs:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application

There are some issues that I have come across, but it does seem to work.
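For reference, the submission is a two-step flow: ask the RM for an application id, then POST the submission context. A minimal sketch (the ResourceManager host, the application id, and the AM command below are placeholders, not from a real cluster):

```shell
RM=http://resourcemanager:8088   # placeholder ResourceManager web address

# Step 1: ask the ResourceManager for a fresh application id.
# curl -s -X POST "$RM/ws/v1/cluster/apps/new-application"

# Step 2: submit, reusing the id from step 1. The payload is trimmed to a
# few fields, and the AM launch command is a placeholder.
PAYLOAD='{
  "application-id": "application_0000000000000_0001",
  "application-name": "remote-job",
  "application-type": "MAPREDUCE",
  "am-container-spec": { "commands": { "command": "placeholder-am-command" } }
}'
echo "$PAYLOAD"
# curl -s -X POST -H "Content-Type: application/json" -d "$PAYLOAD" "$RM/ws/v1/cluster/apps"
```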


Also, regarding the message:

java.io.IOException: File /tmp/hadoop-yarn/staging/xeon/.staging/job_1432045089375_0001/job.split could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.

One node was excluded from the operation. Do you know why it was excluded?
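In my experience the client excludes a DataNode after it fails to connect to it, which is common when the submitting machine sits outside the cluster network and gets the DataNode's internal address from the NameNode, or when the DataNode has no usable space. Some checks worth running from the client host (standard HDFS commands; 50010 is the Hadoop 2.x default data transfer port, an assumption to adjust for your setup):

```shell
DN_PORT=50010   # Hadoop 2.x default dfs.datanode.address port (assumption)

# Is the DataNode live, and does it report free capacity?
# hdfs dfsadmin -report
# hadoop fs -df -h /

# Can this machine reach the DataNode's data transfer port at all?
# (replace <datanode-host> with the address the NameNode reports)
# nc -vz <datanode-host> "$DN_PORT"
echo "$DN_PORT"
```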


-Rajesh


On Tue, May 19, 2015 at 7:34 AM, xeonmailinglist-gmail <
xeonmailinglist@gmail.com> wrote:

>  This has been a real struggle to launch a remote MapReduce job. I know
> that there is Netflix Genie for submitting jobs, but for the purposes of
> this application (small and personal), I want to code it from scratch.
>
> I am debugging my code to see what happens during the submission of a
> remote job, and now I get the error [1]. The error occurs during the
> submission of the job, more precisely while the client is writing to the
> remote HDFS. I have included the Hadoop code [2] where the error occurs;
> it is thrown at the out.close() call in [2].
>
> The NameNode and the DataNode are working properly. I have 1 NameNode and
> 1 DataNode, and the replication factor is set to 1.
> Despite everything running fine, I still get this error. Any hint as to
> what is going on?
>
> [1]
>
> 2015-05-19 10:21:03,147 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
> 2015-05-19 10:21:03,147 DEBUG org.apache.hadoop.ipc.Server: Served: addBlock queueTime= 1 procesingTime= 1 exception= IOException
> 2015-05-19 10:21:03,147 DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException as:xeon (auth:SIMPLE) cause:java.io.IOException: File /tmp/hadoop-yarn/staging/xeon/.staging/job_1432045089375_0001/job.split could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
> 2015-05-19 10:21:03,148 INFO org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 194.117.18.101:39006 Call#18 Retry#0
> java.io.IOException: File /tmp/hadoop-yarn/staging/xeon/.staging/job_1432045089375_0001/job.split could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)
> 2015-05-19 10:21:03,148 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 6 on 9000: responding to org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 194.117.18.101:39006 Call#18 Retry#0
>
> [2]
>
>  public static <T extends InputSplit> void createSplitFiles(Path jobSubmitDir,
>       Configuration conf, FileSystem fs, T[] splits)
>   throws IOException, InterruptedException {
>     FSDataOutputStream out = createFile(fs,
>         JobSubmissionFiles.getJobSplitFile(jobSubmitDir), conf);
>     SplitMetaInfo[] info = writeNewSplits(conf, splits, out);
>     out.close();
>     writeJobSplitMetaInfo(fs,JobSubmissionFiles.getJobSplitMetaFile(jobSubmitDir),
>         new FsPermission(JobSubmissionFiles.JOB_FILE_PERMISSION), splitVersion,
>         info);
>   }
>
> Thanks,
>
> On 05/18/2015 03:56 PM, xeonmailinglist-gmail wrote:
>
>   Shahab, I think so, but Hadoop's site says "The user@ mailing list
> is the preferred mailing list for end-user questions and discussion." So I
> am using the right mailing list.
>
> Back to my problem: I think this is a problem with HDFS security. But
> the strangest thing is that I have disabled it in hdfs-site.xml [1].
>
> I think this error happens when MapReduce tries to write the job
> configuration files to HDFS.
>
> I have created the remote client's username on the cluster using the
> commands in [2].
>
> Now I am looking at Netflix Genie to figure out how they do it, but so
> far I still haven't found a solution for submitting a remote job from
> Java. If anyone has a hint or advice, please tell me. I really don't
> understand why I get this error.
>
> [1]
>
> $ cat etc/hadoop/hdfs-site.xml
>
> <property>
>   <name>dfs.permissions</name>
>   <value>false</value>
> </property>
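> By the way, one thing worth double-checking: in Hadoop 2.x this property
> was renamed to dfs.permissions.enabled (the old name is kept as a
> deprecated alias), so the equivalent current form would be:
>
> <property>
>   <name>dfs.permissions.enabled</name>
>   <value>false</value>
> </property>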
>
> [2]
>
> ```
> # On the NameNode host:
> $ sudo adduser xeon
> $ sudo adduser xeon ubuntu
> ```
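> An alternative I have seen for SIMPLE (non-Kerberos) authentication, which
> avoids creating an OS user on the cluster, is to override the client-side
> identity via an environment variable before submitting (a sketch; assumes
> the submitting JVM inherits the environment):
>
> ```
> export HADOOP_USER_NAME=xeon
> echo "$HADOOP_USER_NAME"   # prints: xeon
> ```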
>
> On 05/18/2015 02:46 PM, Shahab Yunus wrote:
>
> I think that the poster wanted to unsubscribe from the mailing list?
>
>  Gopy, if that is the case, please see:
> https://hadoop.apache.org/mailing_lists.html
>
> Regards,
> Shahab
>
> On Mon, May 18, 2015 at 9:42 AM, xeonmailinglist-gmail <
> xeonmailinglist@gmail.com> wrote:
>
>>  Why "Remove"?
>>
>>
>> On 05/18/2015 02:25 PM, Gopy Krishna wrote:
>>
>> REMOVE
>>
>> On Mon, May 18, 2015 at 6:54 AM, xeonmailinglist-gmail <
>> xeonmailinglist@gmail.com> wrote:
>>
>>>  Hi,
>>>
>>> I am trying to submit a remote job to YARN MapReduce, but I can't,
>>> because I get the error [1]. There are no other exceptions in the other
>>> logs.
>>>
>>> My MapReduce runtime has 1 *ResourceManager* and 3 *NodeManagers*, and
>>> HDFS is running properly (all nodes are alive).
>>>
>>> I have looked at all the logs, and I still don't understand why I get
>>> this error. Any help in fixing this? Is it a problem with the remote job
>>> that I am submitting?
>>>
>>> [1]
>>>
>>> $ less logs/hadoop-ubuntu-namenode-ip-172-31-17-45.log
>>>
>>> 2015-05-18 10:42:16,570 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /tmp/hadoop-yarn/staging/xeon/.staging/job_1431945660897_0001/job.split fileId=16394 for DFSClient_NONMAPREDUCE_-1923902075_14
>>> 2015-05-18 10:42:16,570 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: /tmp/hadoop-yarn/staging/xeon/.staging/job_1431945660897_0001/job.split inodeId 16394 for DFSClient_NONMAPREDUCE_-1923902075_14
>>> 2015-05-18 10:42:16,571 DEBUG org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to choose remote rack (location = ~/default-rack), fallback to local rack
>>> org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy$NotEnoughReplicasException:
>>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRandom(BlockPlacementPolicyDefault.java:691)
>>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseRemoteRack(BlockPlacementPolicyDefault.java:580)
>>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:348)
>>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:214)
>>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:111)
>>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseTarget(BlockPlacementPolicyDefault.java:126)
>>>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1545)
>>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
>>>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
>>>
>>
>>
>>  --
>>  Thanks & Regards
>> Gopy
>> Rapisource LLC
>> Direct: 732-419-9663
>> Fax: 512-287-4047
>> Email: gopy@rapisource.com
>> www.rapisource.com
>> http://www.linkedin.com/in/gopykrishna
>>
>> According to Bill S.1618 Title III passed by the 105th US Congress,this
>> message is not considered as "Spam" as we have included the contact
>> information.If you wish to be removed from our mailing list, please respond
>> with "remove" in the subject field.We apologize for any inconvenience
>> caused.
>>
>>
>
