spark-issues mailing list archives

From "Shawn Lavelle (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-21212) Can't use Count(*) with Order Clause
Date Mon, 26 Jun 2017 23:27:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063952#comment-16063952 ]

Shawn Lavelle edited comment on SPARK-21212 at 6/26/17 11:26 PM:
-----------------------------------------------------------------

[~srowen], are you saying that the order by is attempting to apply itself to the
aggregate count column while ignoring columns within the table itself?
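
If that is the case, an order by placed after a global aggregate would only see the
aggregate's output, here the single column count(1). A minimal sketch of what I would then
expect, untested and using the redacted names from this issue, with cnt as an alias made up
for illustration:

{code}
-- Expected to fail: after the aggregate, value is no longer an input column.
select count(*) from table where value between 1 and 5 order by value;

-- Expected to resolve: the sort key is the aggregate's own output column.
select count(*) as cnt from table where value between 1 and 5 order by cnt;

-- Expected to resolve: value survives the aggregate as a grouping column.
select value, count(*) as cnt from table where value between 1 and 5 group by value order by value;
{code}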


was (Author: azeroth2b):
[~srowen], I can assure you that value is a column in the table.  For example, 

{code}
select * from table where value between 1 and 5 order by value;
{code}
and
{code}
select count(*) from table where value between 1 and 5
{code}
both complete successfully.

> Can't use Count(*) with Order Clause
> ------------------------------------
>
>                 Key: SPARK-21212
>                 URL: https://issues.apache.org/jira/browse/SPARK-21212
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>         Environment: Windows; external data provided through data source api
>            Reporter: Shawn Lavelle
>            Priority: Minor
>
> I don't think this should fail the query:
> _Notes: VALUE is a column of table TABLE. Column and table names are redacted. I can generate a simplified test case if needed, but this is easy to reproduce._
> {code}jdbc:hive2://user:port/> select count(*) from table where value between 1498240079000 and cast(now() as bigint)*1000 order by value;
> {code}
> {code}
> Error: org.apache.spark.sql.AnalysisException: cannot resolve '`value`' given input columns: [count(1)]; line 1 pos 113;
> 'Sort ['value ASC NULLS FIRST], true
> +- Aggregate [count(1) AS count(1)#718L]
>    +- Filter ((value#413L >= 1498240079000) && (value#413L <= (cast(current_timestamp() as bigint) * cast(1000 as bigint))))
>       +- SubqueryAlias table
>          +- Relation[field1#411L,field2#412,value#413L,field3#414,field4#415,field5#416,field6#417,field7#418,field8#419,field9#420] com.redacted@16004579 (state=,code=0)
> {code}
> Arguably, the optimizer could ignore the "order by" clause, but I leave that to more informed minds than my own.




