asterixdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Taewoo Kim <wangs...@gmail.com>
Subject Re: Requesting node X to shutdown to ensure failure?
Date Sun, 30 Sep 2018 20:03:31 GMT
P.S. We set the heartbeat setting as follows.

[cc]
heartbeat.max.misses=10
heartbeat.period=8000

Best,
Taewoo

On Sun, Sep 30, 2018 at 12:50 PM, Taewoo Kim <wangsaeu@gmail.com> wrote:

> Hi Till,
>
> These files are indeed the same log file. I just extracted one line from
> the original log and placed the line at the beginning. At 9:50 pm, the CC
> initiates a task cluster and all NCs were executing. Node 2 sent the
> taskcomplete message, and somehow the CC decides to shut down the node 2.
> That's my understanding.
>
> Best,
> Taewoo
>
> On Sun, Sep 30, 2018 at 12:40 PM, Till Westmann <tillw@apache.org> wrote:
>
>> Hi Taewoo,
>>
>> I’m little confused as we seem to have 2 cc logs.
>>
>> The "Requesting node 2 to shutdown …" message is logged, when nodes are
>> considered to be dead as they have missed too many heartbeats. And you can
>> see in the later logs that indeed more than 80s passed without receiving a
>> heartbeat.
>> Could it be that the machines were too busy to send/receive heartbeats?
>>
>> Cheers,
>> Till
>>
>> On 30 Sep 2018, at 10:52, Taewoo Kim wrote:
>>
>> Hi devs,
>>>
>>> I saw the following log messages in the cc.log of the Cloudberrry
>>> cluster.
>>> Yesterday around 9:50 pm, a task cluster was created and executed. And
>>> around 9:51 pm, the cluster shows the following message. In the nc.log of
>>> the node 2, there were no noticeable log messages. I wonder what could
>>> cause this situation.
>>>
>>> -- cc.log at 9:51 pm
>>> *21:51:54.203 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Requesting node 2
>>> to
>>> shutdown to ensure failure*
>>>
>>> *-- nc-2.log around 9:50 pm*
>>>
>>> 21:50:30.314 [Worker:2] INFO
>>>  org.apache.hyracks.control.nc.work.StartTasksWork - input: 0: CDID:2
>>> 21:50:30.375 [Worker:2] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:7:0:3:0
>>> 21:50:30.382 [Worker:2] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:9:0:3:0
>>> 21:50:30.390 [Worker:2] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:9:0:2:0
>>> 21:50:30.391 [Worker:2] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:7:0:2:0
>>> 21:50:30.392 [Worker:2] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:1:1:3:0
>>> 21:50:30.392 [Worker:2] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:1:1:2:0
>>> 21:50:30.398 [Worker:2] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:1:1:3:0
>>> 21:50:30.398 [Worker:2] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> NotifyTaskCompleteWork:TAID:TID:ANID:ODID:1:1:2:0
>>> 22:02:30.290 [JVM exit thread] INFO  org.apache.hyracks.util.ExitUtil -
>>> JVM
>>> exiting with status 0; bye!
>>>
>>>
>>> -- *cc.log from 9:51 pm*
>>>
>>> 21:50:30.313 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.ActivityClusterPlanner - Built 1
>>> Task Clusters
>>> 21:50:30.313 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.ActivityClusterPlanner - Tasks:
>>> [TID:ANID:ODID:1:1:0, TID:ANID:ODID:1:1:1, TID:ANID:ODID:1:1:2,
>>> TID:ANID:ODID:1:1:3, TID:ANID:ODID:1:1:4, TID:ANID:ODID:1:1:5,
>>> TID:ANID:ODID:1:1:6, TID:ANID:ODID:1:1:7, TID:ANID:ODID:1:1:8,
>>> TID:ANID:ODID:1:1:9, TID:ANID:ODID:2:0:0, TID:ANID:ODID:2:0:1,
>>> TID:ANID:ODID:2:0:2, TID:ANID:ODID:2:0:3, TID:ANID:ODID:2:0:4,
>>> TID:ANID:ODID:2:0:5, TID:ANID:ODID:2:0:6, TID:ANID:ODID:2:0:7,
>>> TID:ANID:ODID:2:0:8, TID:ANID:ODID:2:0:9, TID:ANID:ODID:9:0:0,
>>> TID:ANID:ODID:9:0:1, TID:ANID:ODID:9:0:2, TID:ANID:ODID:9:0:3,
>>> TID:ANID:ODID:9:0:4, TID:ANID:ODID:9:0:5, TID:ANID:ODID:9:0:6,
>>> TID:ANID:ODID:9:0:7, TID:ANID:ODID:9:0:8, TID:ANID:ODID:9:0:9]
>>> 21:50:30.313 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - Runnable TC roots:
>>> [TC:[TID:ANID:ODID:1:1:0, TID:ANID:ODID:1:1:1, TID:ANID:ODID:1:1:2,
>>> TID:ANID:ODID:1:1:3, TID:ANID:ODID:1:1:4, TID:ANID:ODID:1:1:5,
>>> TID:ANID:ODID:1:1:6, TID:ANID:ODID:1:1:7, TID:ANID:ODID:1:1:8,
>>> TID:ANID:ODID:1:1:9, TID:ANID:ODID:2:0:0, TID:ANID:ODID:2:0:1,
>>> TID:ANID:ODID:2:0:2, TID:ANID:ODID:2:0:3, TID:ANID:ODID:2:0:4,
>>> TID:ANID:ODID:2:0:5, TID:ANID:ODID:2:0:6, TID:ANID:ODID:2:0:7,
>>> TID:ANID:ODID:2:0:8, TID:ANID:ODID:2:0:9, TID:ANID:ODID:9:0:0,
>>> TID:ANID:ODID:9:0:1, TID:ANID:ODID:9:0:2, TID:ANID:ODID:9:0:3,
>>> TID:ANID:ODID:9:0:4, TID:ANID:ODID:9:0:5, TID:ANID:ODID:9:0:6,
>>> TID:ANID:ODID:9:0:7, TID:ANID:ODID:9:0:8, TID:ANID:ODID:9:0:9]],
>>> inProgressTaskClusters: []
>>> 21:50:30.366 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.700:TAID:TID:ANID:ODID:7:0:5:0]
>>> 21:50:30.374 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [5[JID:0.700:TAID:TID:ANID:ODID:7:0:9:0]
>>> 21:50:30.376 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [2[JID:0.700:TAID:TID:ANID:ODID:7:0:3:0]
>>> 21:50:30.376 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [5[JID:0.700:TAID:TID:ANID:ODID:7:0:8:0]
>>> 21:50:30.377 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [1[JID:0.700:TAID:TID:ANID:ODID:7:0:0:0]
>>> 21:50:30.378 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.700:TAID:TID:ANID:ODID:7:0:4:0]
>>> 21:50:30.381 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [1[JID:0.700:TAID:TID:ANID:ODID:7:0:1:0]
>>> 21:50:30.382 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [5[JID:0.701:TAID:TID:ANID:ODID:9:0:8:0]
>>> 21:50:30.383 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [2[JID:0.701:TAID:TID:ANID:ODID:9:0:3:0]
>>> 21:50:30.384 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.701:TAID:TID:ANID:ODID:9:0:4:0]
>>> 21:50:30.386 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.701:TAID:TID:ANID:ODID:9:0:5:0]
>>> 21:50:30.386 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [5[JID:0.701:TAID:TID:ANID:ODID:9:0:9:0]
>>> 21:50:30.389 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [4[JID:0.701:TAID:TID:ANID:ODID:9:0:6:0]
>>> 21:50:30.389 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [1[JID:0.701:TAID:TID:ANID:ODID:9:0:0:0]
>>> 21:50:30.392 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [2[JID:0.701:TAID:TID:ANID:ODID:9:0:2:0]
>>> 21:50:30.392 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [4[JID:0.700:TAID:TID:ANID:ODID:7:0:7:0]
>>> 21:50:30.392 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [4[JID:0.700:TAID:TID:ANID:ODID:7:0:6:0]
>>> 21:50:30.394 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [1[JID:0.701:TAID:TID:ANID:ODID:9:0:1:0]
>>> 21:50:30.395 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [2[JID:0.700:TAID:TID:ANID:ODID:7:0:2:0]
>>> 21:50:30.395 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [4[JID:0.701:TAID:TID:ANID:ODID:9:0:7:0]
>>> 21:50:30.396 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [1[JID:0.700:TAID:TID:ANID:ODID:1:1:1:0]
>>> 21:50:30.396 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [1[JID:0.701:TAID:TID:ANID:ODID:1:1:0:0]
>>> 21:50:30.396 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [1[JID:0.700:TAID:TID:ANID:ODID:1:1:0:0]
>>> 21:50:30.396 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [1[JID:0.701:TAID:TID:ANID:ODID:1:1:1:0]
>>> 21:50:30.397 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [5[JID:0.700:TAID:TID:ANID:ODID:1:1:8:0]
>>> 21:50:30.398 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.700:TAID:TID:ANID:ODID:1:1:4:0]
>>> 21:50:30.398 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.700:TAID:TID:ANID:ODID:1:1:5:0]
>>> 21:50:30.398 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.701:TAID:TID:ANID:ODID:1:1:4:0]
>>> 21:50:30.398 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.701:TAID:TID:ANID:ODID:1:1:5:0]
>>> 21:50:30.399 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [4[JID:0.700:TAID:TID:ANID:ODID:1:1:6:0]
>>> 21:50:30.399 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [2[JID:0.700:TAID:TID:ANID:ODID:1:1:3:0]
>>> 21:50:30.399 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [2[JID:0.700:TAID:TID:ANID:ODID:1:1:2:0]
>>> 21:50:30.399 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [5[JID:0.700:TAID:TID:ANID:ODID:1:1:9:0]
>>> 21:50:30.399 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [5[JID:0.701:TAID:TID:ANID:ODID:1:1:8:0]
>>> 21:50:30.399 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [5[JID:0.701:TAID:TID:ANID:ODID:1:1:9:0]
>>> 21:50:30.402 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [4[JID:0.700:TAID:TID:ANID:ODID:1:1:7:0]
>>> 21:50:30.402 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [4[JID:0.701:TAID:TID:ANID:ODID:1:1:6:0]
>>> 21:50:30.402 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [4[JID:0.701:TAID:TID:ANID:ODID:1:1:7:0]
>>> 21:50:30.402 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.700:TAID:TID:ANID:ODID:2:0:5:0]
>>> 21:50:30.402 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.700:TAID:TID:ANID:ODID:2:0:4:0]
>>> 21:50:30.402 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.700:TAID:TID:ANID:ODID:2:0:5:0]
>>> 21:50:30.402 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [3[JID:0.700:TAID:TID:ANID:ODID:2:0:4:0]
>>> 21:50:30.404 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [2[JID:0.701:TAID:TID:ANID:ODID:1:1:3:0]
>>> 21:50:30.404 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.common.work.WorkQueue - Executing:
>>> TaskComplete: [2[JID:0.701:TAID:TID:ANID:ODID:1:1:2:0]
>>> *21:51:54.203 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Requesting node 2
>>> to
>>> shutdown to ensure failure*
>>>
>>> 21:51:54.206 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Request to shutdown
>>> failed node 2 succeeded. false positive heartbeat miss indication
>>> 21:51:54.206 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - 2 considered dead.
>>> Last heartbeat received 83799ms ago. Max miss period: 80000ms
>>> 21:51:54.206 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Requesting node 5
>>> to
>>> shutdown to ensure failure
>>> 21:51:54.206 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Request to shutdown
>>> failed node 5 succeeded. false positive heartbeat miss indication
>>> 21:51:54.206 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - 5 considered dead.
>>> Last heartbeat received 83806ms ago. Max miss period: 80000ms
>>> 21:51:54.207 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Requesting node 3
>>> to
>>> shutdown to ensure failure
>>> 21:51:54.207 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Request to shutdown
>>> failed node 3 succeeded. false positive heartbeat miss indication
>>> 21:51:54.207 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - 3 considered dead.
>>> Last heartbeat received 83804ms ago. Max miss period: 80000ms
>>> 21:51:54.207 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Requesting node 4
>>> to
>>> shutdown to ensure failure
>>> 21:51:54.207 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - Request to shutdown
>>> failed node 4 succeeded. false positive heartbeat miss indication
>>> 21:51:54.207 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.cluster.NodeManager - 4 considered dead.
>>> Last heartbeat received 83804ms ago. Max miss period: 80000ms
>>> 21:51:54.207 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.work.RemoveDeadNodesWork - Number of
>>> affected jobs: 3
>>> 21:51:54.216 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - Abort map for job:
>>> JID:0.1: {1=[TAID:TID:ANID:ODID:0:0:0:0, TAID:TID:ANID:ODID:3:0:0:0,
>>> TAID:TID:ANID:ODID:3:0:1:0]}
>>> 21:51:54.216 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - Aborting:
>>> [TAID:TID:ANID:ODID:0:0:0:0, TAID:TID:ANID:ODID:3:0:0:0,
>>> TAID:TID:ANID:ODID:3:0:1:0] at 1
>>> 21:51:54.217 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.partitions.PartitionMatchMaker - Removing
>>> uncommitted partitions: []
>>> 21:51:54.217 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.partitions.PartitionMatchMaker - Removing
>>> partition requests: []
>>> 21:51:54.218 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - aborting doomed
>>> task
>>> clusters
>>> 21:51:54.218 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - number of doomed
>>> task
>>> clusters found = 0
>>> 21:51:54.218 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - Runnable TC roots:
>>> [TC:[TID:ANID:ODID:0:0:0, TID:ANID:ODID:3:0:0, TID:ANID:ODID:3:0:1,
>>> TID:ANID:ODID:3:0:2, TID:ANID:ODID:3:0:3, TID:ANID:ODID:3:0:4,
>>> TID:ANID:ODID:3:0:5, TID:ANID:ODID:3:0:6, TID:ANID:ODID:3:0:7,
>>> TID:ANID:ODID:3:0:8, TID:ANID:ODID:3:0:9]], inProgressTaskClusters: []
>>> 21:51:54.218 [Worker:ClusterController] ERROR
>>> org.apache.hyracks.control.cc.executor.JobExecutor - Unexpected failure.
>>> Aborting job JID:0.1
>>> org.apache.hyracks.api.exceptions.HyracksException: Node 2 not live
>>>         at
>>> org.apache.hyracks.control.cc.executor.JobExecutor.assignLoc
>>> ation(JobExecutor.java:473)
>>> ~[hyracks-control-cc-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
>>>         at
>>> org.apache.hyracks.control.cc.executor.JobExecutor.assignTas
>>> kLocations(JobExecutor.java:365)
>>> ~[hyracks-control-cc-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
>>>         at
>>> org.apache.hyracks.control.cc.executor.JobExecutor.startRunn
>>> ableTaskClusters(JobExecutor.java:245)
>>> ~[hyracks-control-cc-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
>>>         at
>>> org.apache.hyracks.control.cc.executor.JobExecutor.startRunn
>>> ableActivityClusters(JobExecutor.java:209)
>>> ~[hyracks-control-cc-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
>>>         at
>>> org.apache.hyracks.control.cc.executor.JobExecutor.notifyNod
>>> eFailures(JobExecutor.java:731)
>>> [hyracks-control-cc-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
>>>         at
>>> org.apache.hyracks.control.cc.work.RemoveDeadNodesWork.run(R
>>> emoveDeadNodesWork.java:60)
>>> [hyracks-control-cc-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
>>>         at
>>> org.apache.hyracks.control.common.work.WorkQueue$WorkerThrea
>>> d.run(WorkQueue.java:127)
>>> [hyracks-control-common-0.3.4-SNAPSHOT.jar:0.3.4-SNAPSHOT]
>>> 21:51:54.218 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - Abort map for job:
>>> JID:0.700: {}
>>> 21:51:54.218 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.partitions.PartitionMatchMaker - Removing
>>> uncommitted partitions: []
>>> 21:51:54.218 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.partitions.PartitionMatchMaker - Removing
>>> partition requests: []
>>> 21:51:54.218 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - aborting doomed
>>> task
>>> clusters
>>> 21:51:54.219 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - number of doomed
>>> task
>>> clusters found = 0
>>> 21:51:54.219 [Worker:ClusterController] INFO
>>>  org.apache.hyracks.control.cc.executor.JobExecutor - Abort map for job:
>>> JID:0.700: {1=[TAID:TID:ANID:ODID:2:0:0:0, TAID:TID:ANID:ODID:2:0:1:0]}
>>>
>>> Best,
>>> Taewoo
>>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message