spark-issues mailing list archives

From "Cheng Lian (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-6450) MetastoreRelation.equals doesn't compare output attributes
Date Wed, 25 Mar 2015 08:20:52 GMT

     [ https://issues.apache.org/jira/browse/SPARK-6450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Lian updated SPARK-6450:
------------------------------
    Summary: MetastoreRelation.equals doesn't compare output attributes  (was: f)

> MetastoreRelation.equals doesn't compare output attributes
> ----------------------------------------------------------
>
>                 Key: SPARK-6450
>                 URL: https://issues.apache.org/jira/browse/SPARK-6450
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.3.0
>            Reporter: Anand Mohan Tumuluri
>            Assignee: Michael Armbrust
>            Priority: Blocker
>
> The query below worked fine up to the 1.3 commit 9a151ce58b3e756f205c9f3ebbbf3ab0ba5b33fd. (Yes, it definitely works at this commit, although the commit itself is completely unrelated.)
> It broke in the 1.3.0 release with an AnalysisException: resolved attributes ... missing from .... (although this list contains the fields it reports as missing)
> {code}
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.run(Shim13.scala:189)
> 	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:231)
> 	at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:218)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:79)
> 	at org.apache.hive.service.cli.session.HiveSessionProxy.access$000(HiveSessionProxy.java:37)
> 	at org.apache.hive.service.cli.session.HiveSessionProxy$1.run(HiveSessionProxy.java:64)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
> 	at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:493)
> 	at org.apache.hive.service.cli.session.HiveSessionProxy.invoke(HiveSessionProxy.java:60)
> 	at com.sun.proxy.$Proxy17.executeStatementAsync(Unknown Source)
> 	at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:233)
> 	at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:344)
> 	at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1313)
> 	at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1298)
> 	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> 	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> 	at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:55)
> 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> {code}
> {code}
> select Orders.Country, Orders.ProductCategory, count(1)
> from Orders
> join (select Orders.Country, count(1) CountryOrderCount from Orders
>       where to_date(Orders.PlacedDate) > '2015-01-01'
>       group by Orders.Country order by CountryOrderCount DESC LIMIT 5) Top5Countries
>   on Top5Countries.Country = Orders.Country
> where to_date(Orders.PlacedDate) > '2015-01-01'
> group by Orders.Country, Orders.ProductCategory;
> {code}
> A temporary workaround is to add explicit aliases for the table Orders:
> {code}
> select o.Country, o.ProductCategory, count(1)
> from Orders o
> join (select r.Country, count(1) CountryOrderCount from Orders r
>       where to_date(r.PlacedDate) > '2015-01-01'
>       group by r.Country order by CountryOrderCount DESC LIMIT 5) Top5Countries
>   on Top5Countries.Country = o.Country
> where to_date(o.PlacedDate) > '2015-01-01'
> group by o.Country, o.ProductCategory;
> {code}
> However, this change affects not only self joins; it also seems to affect union queries. The query below, which also worked before (at commit 9a151ce), is now broken:
> {code}
> select Orders.Country,null,count(1) OrderCount from Orders group by Orders.Country,null
> union all
> select null,Orders.ProductCategory,count(1) OrderCount from Orders group by null, Orders.ProductCategory
> {code}
> This query also fails with an AnalysisException.
> The workaround is to give the tables different aliases.
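
The issue title points at an equality check that ignores output attributes. As a rough illustration only (plain Python, not Spark code), the sketch below shows how two references to the same table can compare equal while carrying different attribute ids, and how including the output in the comparison tells them apart. All names here (Attribute, expr_id, TableOnlyRelation, OutputAwareRelation) are hypothetical stand-ins, and whether Spark's analyzer relies on this comparison in exactly this way is an assumption.

```python
from dataclasses import dataclass
from itertools import count

# Hypothetical stand-in for a generator of globally unique attribute ids.
_next_id = count()


@dataclass(frozen=True)
class Attribute:
    """A named column with a unique id (hypothetical, not Spark's class)."""
    name: str
    expr_id: int


def fresh_attributes(names):
    """Return the same column names, each with a newly allocated id."""
    return tuple(Attribute(n, next(_next_id)) for n in names)


class TableOnlyRelation:
    """Hypothetical relation whose equality looks only at the table name."""

    def __init__(self, table, output):
        self.table = table
        self.output = output

    def __eq__(self, other):
        return isinstance(other, TableOnlyRelation) and self.table == other.table

    def __hash__(self):
        return hash(self.table)


class OutputAwareRelation(TableOnlyRelation):
    """Same relation, but equality also compares the output attributes."""

    def __eq__(self, other):
        return (
            isinstance(other, OutputAwareRelation)
            and self.table == other.table
            and self.output == other.output
        )

    def __hash__(self):
        return hash((self.table, self.output))


cols = ("Country", "ProductCategory", "PlacedDate")

# Two references to Orders, as in a self join; each gets its own attribute ids.
left = TableOnlyRelation("Orders", fresh_attributes(cols))
right = TableOnlyRelation("Orders", fresh_attributes(cols))

# The name-only comparison cannot tell the two references apart.
table_only_equal = left == right

# Comparing the output as well distinguishes them.
output_aware_equal = (
    OutputAwareRelation("Orders", left.output)
    == OutputAwareRelation("Orders", right.output)
)
```

With a name-only comparison, any step that matches or replaces plan nodes by equality could treat the two Orders references as interchangeable even though their attributes differ. Giving each reference its own alias, as in the workaround above, is one way to keep them distinct at the query level.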



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
