hive-issues mailing list archives

From "Zoltan Haindrich (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-17406) UDAF throws IllegalArgumentException for a complex input when column stats is not provided
Date Sun, 03 Sep 2017 06:17:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-17406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltan Haindrich updated HIVE-17406:
------------------------------------
    Status: Patch Available  (was: Open)

> UDAF throws IllegalArgumentException for a complex input when column stats is not provided
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-17406
>                 URL: https://issues.apache.org/jira/browse/HIVE-17406
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 2.3.0
>            Reporter: Makoto Yui
>            Assignee: Zoltan Haindrich
>            Priority: Minor
>         Attachments: HIVE-17406.1.patch
>
>
> I found that UDAFs in Hive v2.3.0 (both generic and non-generic, with or without estimable)
> throw an IllegalArgumentException for a complex input when column stats are not provided.
> The exception does not occur in v2.1.0.
> https://github.com/apache/hive/blob/34eebff194e81180202d198200e84058c4910d95/ql/src/java/org/apache/hadoop/hive/ql/stats/StatsUtils.java#L1156
> {code:sql}
> select version();
> > 2.3.0-amzn-0 rcb482944667f96f43c89932dcb66d61ee7e4ac1d
> with t2 as ( 
>   select array(1,2) as c1 
>   union all 
>   select array(2,3) as c1
> ) 
> select collect_list(c1) from t2;
> > FAILED: IllegalArgumentException Size requested for unknown type: java.util.Collection
> {code}
> On the other hand, it succeeds when column stats are provided, as follows:
> {code:sql}
> create table t1 as (
>   select array(1,2) as c1 
>   union all
>   select array(2,3) as c1
> );
> select collect_list(c1) from t1;
> > [[1,2],[2,3]]
> desc formatted t1;
> ...       
> Table Parameters:                
>         COLUMN_STATS_ACCURATE   {\"BASIC_STATS\":\"true\"}
>         numFiles                2                   
>         numRows                 2                   
>         rawDataSize             6                   
>         totalSize               8                   
>         transient_lastDdlTime   1503990290
> {code}
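
The error message in the first query suggests a size lookup that rejects a type it has no entry for. The following is a minimal Java sketch of that failure mode only. It is not Hive's `StatsUtils` code: the class name `SizeEstimatorSketch`, the methods `strictSize` and `sizeOrDefault`, and the byte sizes in the table are all hypothetical, chosen for illustration. The lenient variant is one possible way to tolerate unknown types and is not a description of the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; this is NOT Hive's StatsUtils implementation.
final class SizeEstimatorSketch {
    // Assumed per-type sizes in bytes, made up for this example.
    private static final Map<String, Long> KNOWN_SIZES = new HashMap<>();
    static {
        KNOWN_SIZES.put("int", 4L);
        KNOWN_SIZES.put("bigint", 8L);
        KNOWN_SIZES.put("double", 8L);
    }

    // Strict lookup: throws for any type name that has no table entry.
    static long strictSize(String typeName) {
        Long size = KNOWN_SIZES.get(typeName);
        if (size == null) {
            throw new IllegalArgumentException(
                "Size requested for unknown type: " + typeName);
        }
        return size;
    }

    // Lenient lookup: falls back to a caller-supplied default instead of throwing.
    static long sizeOrDefault(String typeName, long defaultSize) {
        return KNOWN_SIZES.getOrDefault(typeName, defaultSize);
    }

    public static void main(String[] args) {
        System.out.println(strictSize("int"));
        try {
            strictSize("java.util.Collection");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        System.out.println(sizeOrDefault("java.util.Collection", 16L));
    }
}
```

In this sketch a known type resolves normally, while an unlisted type such as `java.util.Collection` either aborts the lookup or receives the supplied default, depending on which method is called.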



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
