hive-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7731) Spark: Incorrect result returned when a map work has multiple downstream reduce works
Date Thu, 14 Aug 2014 12:25:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096910#comment-14096910 ]

Thejas M Nair commented on HIVE-7731:
-------------------------------------

Editing the title to convey that this is seen only in Spark mode. Please include "Spark"
in the title when an issue is specific to Spark mode; that also helps when reading the
commit log and release notes.


> Spark: Incorrect result returned when a map work has multiple downstream reduce works
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-7731
>                 URL: https://issues.apache.org/jira/browse/HIVE-7731
>             Project: Hive
>          Issue Type: Bug
>          Components: Spark
>            Reporter: Rui Li
>
> Encountered when running on spark. Suppose we have three tables:
> {noformat}
> table1(x int, y int);
> table2(x int);
> table3(x int);
> {noformat}
> I run the following query:
> {noformat}
> from table1
> insert overwrite table table2 select x group by x
> insert overwrite table table3 select y group by y;
> {noformat}
> The query generates 1 map and 2 reduces. The map operator has 2 RS, so I suppose it has
> output for both reduces.
> The problem is that all results (which are incorrect) go to table2, and table3 is empty.
> I tried the same query on MR and it gives correct results.
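
For reference, here is a rough sketch of what the multi-insert query in the description would be expected to produce. It is a plain Python model of the query semantics, not Hive or Spark code. The sample rows and the helper name distinct_by_group are made up for illustration, and reading "RS" as a ReduceSink operator is an assumption on my part.

```python
# Hypothetical sample rows for table1(x int, y int); not taken from the report.
table1 = [(1, 10), (1, 20), (2, 10), (3, 30)]


def distinct_by_group(rows, col):
    """Model `select <col> group by <col>`: one output row per distinct value."""
    return sorted({row[col] for row in rows})


# A single scan of table1 feeds two independent group-by branches, which is
# roughly what one map work with two reduce-side outputs should amount to.
table2 = distinct_by_group(table1, 0)  # ... insert overwrite table table2 select x group by x
table3 = distinct_by_group(table1, 1)  # ... insert overwrite table table3 select y group by y

print("table2:", table2)
print("table3:", table3)
```

With correct behavior each target table gets its own grouped column. The symptom described above would instead leave table3 empty while table2 receives the wrong rows.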



--
This message was sent by Atlassian JIRA
(v6.2#6252)
