hive-dev mailing list archives

From "Rui Li (JIRA)" <>
Subject [jira] [Commented] (HIVE-8793) Make sure multi-insert works with map join [Spark Branch]
Date Wed, 12 Nov 2014 08:02:34 GMT


Rui Li commented on HIVE-8793:

Hi [~xuefuz],

The tests failed because I changed how {{ExplainTask}} prints maps: if the map to print
is a {{LinkedHashMap}}, it keeps the original order rather than re-ordering the map with a {{TreeMap}}.
I did this to avoid output like the following:
Reducer 3 <- Reducer 5
Reducer 4 <- Reducer 6
Reducer 5 <- Map1
Reducer 6 <- Map1
and to print something like this instead:
Reducer 5 <- Map1
Reducer 3 <- Reducer 5
Reducer 6 <- Map1
Reducer 4 <- Reducer 6
But the problem is that even the {{LinkedHashMap}}'s order can be non-deterministic. So I'll revert
this change, as it's only a printing issue and well beyond the scope here. I can create a separate
JIRA if you think it's worth the effort.
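
For illustration, the ordering difference can be reproduced with a minimal sketch. This is not the {{ExplainTask}} code; the class {{EdgeOrderDemo}} and its methods are hypothetical names made up for this example, and the edge map is assumed to be a simple String-to-String map. A {{LinkedHashMap}} iterates in insertion order, so two runs that insert the same edges in a different order print differently, while copying into a {{TreeMap}} always yields the key-sorted listing.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; not part of Hive.
final class EdgeOrderDemo {

    // Builds a LinkedHashMap from (child, parent) pairs, preserving the given order.
    public static Map<String, String> edgesInOrder(String... pairs) {
        Map<String, String> edges = new LinkedHashMap<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            edges.put(pairs[i], pairs[i + 1]);
        }
        return edges;
    }

    // Renders "child <- parent" lines in the map's own iteration order.
    public static String render(Map<String, String> edges) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : edges.entrySet()) {
            sb.append(e.getKey()).append(" <- ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // Renders the same edges after re-ordering them by key with a TreeMap.
    public static String renderSorted(Map<String, String> edges) {
        return render(new TreeMap<>(edges));
    }

    public static void main(String[] args) {
        // Same edges, two different insertion orders.
        Map<String, String> runA = edgesInOrder(
            "Reducer 5", "Map1", "Reducer 3", "Reducer 5",
            "Reducer 6", "Map1", "Reducer 4", "Reducer 6");
        Map<String, String> runB = edgesInOrder(
            "Reducer 6", "Map1", "Reducer 4", "Reducer 6",
            "Reducer 5", "Map1", "Reducer 3", "Reducer 5");

        System.out.print(render(runA));        // insertion order of run A
        System.out.print(render(runB));        // insertion order of run B
        System.out.print(renderSorted(runA));  // key order, same for both runs
    }
}
```

In this sketch the sorted rendering is identical for both maps, which is why a golden-file comparison stays stable with the {{TreeMap}} approach, while the insertion-order rendering is only as stable as the code that populates the map.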

> Make sure multi-insert works with map join [Spark Branch]
> ---------------------------------------------------------
>                 Key: HIVE-8793
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Chao
>            Assignee: Rui Li
>         Attachments: HIVE-8793.1-spark.patch
> Currently, HIVE-8622 is implemented based on the assumption that, for a map join query,
> a BaseWork would not have multiple children. Testing with subquery_multiinsert.q did
> reveal that's the case. However, we need to investigate this and make sure this won't happen
> in general.

