hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-10648) LLAP: registry; Tez attempted to schedule to daemon that didn't exist
Date Thu, 07 May 2015 22:00:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-10648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533464#comment-14533464 ]

Gopal V commented on HIVE-10648:
--------------------------------

bq. 2015-05-07 12:13:28,082 INFO [Dispatcher thread: Central] node.AMNodeTracker: Num cluster nodes = 19

That's the number of NodeManagers in YARN, AFAIK; you do have 19 of those.

> LLAP: registry; Tez attempted to schedule to daemon that didn't exist
> ---------------------------------------------------------------------
>
>                 Key: HIVE-10648
>                 URL: https://issues.apache.org/jira/browse/HIVE-10648
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Gopal V
>
> I can post logs externally; for now, the app IDs on the test cluster are application_1429683757595_0784 and application_1429683757595_0783. I also have the logs copied over.
> AM found the node (same logs for other nodes):
> {noformat}
> 2015-05-07 12:13:28,074 INFO [ServiceThread:org.apache.tez.dag.app.rm.TaskSchedulerEventHandler] impl.LlapYarnRegistryImpl: Adding new worker 342f4992-2608-43ab-a119-b50882e35f75 which mapped to DynamicServiceInstance [alive=true, host=cn059-10.l42scl.hortonworks.com:15001 with resources=<memory:20480, vCores:6>]
> ....
> 2015-05-07 12:13:28,082 INFO [Dispatcher thread: Central] node.AMNodeTracker: Num cluster nodes = 19
> {noformat}
> Trouble is, this node never actually existed... The cluster only had 15 nodes.
> As the job was progressing, the AM repeatedly tried to schedule to this node and failed. There was no other LLAP cluster running at the same time.
> In fact, given that I always start a 15-node cluster, I am not sure where the 19-node data could conceivably come from...



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
