[ https://issues.apache.org/jira/browse/TEZ-1635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159558#comment-14159558 ]
Rajesh Balamohan edited comment on TEZ-1635 at 10/5/14 3:23 PM:
----------------------------------------------------------------
Attaching the successful and hung job details for tez_smb_1.q with additional logs in org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex.
tez_smb_1.q: (DAG snapshot is already attached)
- Map_1 [Map_2] - MultiMRInput, initializer=MRInputAMSplitGenerator
- Map_1 [s1] - MRInputLegacy, initializer=MRInputAMSplitGenerator
- (Map_1 [Map_2], Map_1 [s1]) --> Map_1[MapTezProcessor]
- Map_1[MapTezProcessor] --> Map_1[out_Map_1] MROutput
Map_1 vertexManager is org.apache.hadoop.hive.ql.exec.tez.CustomPartitionVertex
Successful job:
===============
- When CustomPartitionVertex.onRootVertexInitialized() gets called for "s1", CustomPartitionVertex.processAllEvents()
is invoked, which internally populates the bucketToTaskMap data structure.
- When CustomPartitionVertex.onRootVertexInitialized() gets called for "Map_2", CustomPartitionVertex.processAllSideEvents()
is invoked which depends on bucketToTaskMap to generate the InputDataInformationEvent.
Failure/hung job:
===============
- CPV.onRootVertexInitialized() gets called for "Map_2" first. This ends up calling CPV.processAllSideEvents().
Since the bucketToTaskMap structure is still empty at that point, it does *not* generate any InputDataInformationEvent.
- CPV.onRootVertexInitialized() gets called for "s1" later.
In this case, events pertaining to MultiMRInput (Map_2) are never sent to Tez from CustomPartitionVertex.
[~hagleitn] - Is this expected behavior of CustomPartitionVertex?
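To make the ordering dependency described above concrete, here is a minimal standalone sketch. It is not the Hive code: the class name VertexManagerOrderingSketch, the run() helper, the two-bucket layout and the event strings are all hypothetical, and only the method and field names (onRootVertexInitialized, processAllEvents, processAllSideEvents, bucketToTaskMap) are borrowed from the logs. It assumes the side-input path simply iterates over whatever bucketToTaskMap holds at the time of the call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VertexManagerOrderingSketch {
    // Hypothetical stand-in for bucketToTaskMap: bucket id -> task ids.
    private final Map<Integer, List<Integer>> bucketToTaskMap = new HashMap<>();
    // Events that this sketch would hand back to the framework.
    private final List<String> sentEvents = new ArrayList<>();

    // Stand-in for onRootVertexInitialized(): "s1" is treated as the main input,
    // every other input name as a side input (assumption for illustration).
    void onRootVertexInitialized(String inputName) {
        if (inputName.equals("s1")) {
            processAllEvents();
        } else {
            processAllSideEvents(inputName);
        }
    }

    // Main input: populates bucketToTaskMap (two buckets, one task each; made-up layout).
    private void processAllEvents() {
        for (int bucket = 0; bucket < 2; bucket++) {
            bucketToTaskMap.computeIfAbsent(bucket, b -> new ArrayList<>()).add(bucket);
            sentEvents.add("s1:bucket" + bucket);
        }
    }

    // Side input: emits one event per task already present in bucketToTaskMap.
    // If the map is empty, the loop body never runs and nothing is emitted.
    private void processAllSideEvents(String inputName) {
        for (Map.Entry<Integer, List<Integer>> entry : bucketToTaskMap.entrySet()) {
            for (int task : entry.getValue()) {
                sentEvents.add(inputName + ":task" + task);
            }
        }
    }

    // Replays the initialization callbacks in the given order and returns the emitted events.
    public static List<String> run(String... order) {
        VertexManagerOrderingSketch vm = new VertexManagerOrderingSketch();
        for (String input : order) {
            vm.onRootVertexInitialized(input);
        }
        return vm.sentEvents;
    }

    public static void main(String[] args) {
        System.out.println("s1 first:    " + run("s1", "Map_2"));
        System.out.println("Map_2 first: " + run("Map_2", "s1"));
    }
}
```

In this sketch, the "Map_2 first" ordering produces no Map_2 events at all, which matches the hung-job log: nothing replays the side-input events once bucketToTaskMap has been populated.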
> Dag gets stuck intermittently
> -----------------------------
>
> Key: TEZ-1635
> URL: https://issues.apache.org/jira/browse/TEZ-1635
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.5.0
> Reporter: Vikram Dixit K
> Priority: Blocker
> Attachments: Screen Shot 2014-10-05 at 9.46.31 AM.png, syslog_dag_1412109415326_0002_10.gz, tez_smb_1_hung_job.log, tez_smb_1_successful_job.log
>
>
> Attaching logs for the dag.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)