hive-dev mailing list archives

From "Gang Tim Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-4213) List bucketing error too restrictive
Date Thu, 21 Mar 2013 22:19:16 GMT

    [ https://issues.apache.org/jira/browse/HIVE-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13609586#comment-13609586 ]

Gang Tim Liu commented on HIVE-4213:
------------------------------------

[~mgrover]

I am a little confused, so please correct me if I am wrong: the current logic does not look restrictive to me.

For example, the following combination of settings is legal:
set hive.mapred.supports.subdirectories=true;
set mapred.input.dir.recursive=true;
set hive.optimize.listbucketing=false;
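
To make the rule concrete, here is a minimal sketch of the check as the error message in the issue describes it. This is not the actual Hive code: the class and method names are hypothetical, and it assumes that an unset property counts as false.

```java
import java.util.Map;

// Hypothetical sketch of the rule stated by SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING.
// Not Hive's implementation; names and the "unset means false" default are assumptions.
final class ListBucketingConfigCheck {

    // Reads a boolean property, treating a missing key as false (assumption).
    static boolean flag(Map<String, String> conf, String key) {
        return Boolean.parseBoolean(conf.getOrDefault(key, "false"));
    }

    // Legal unless one of the three settings is true while
    // hive.mapred.supports.subdirectories is not.
    static boolean isLegal(Map<String, String> conf) {
        boolean needsSubdirectories =
                flag(conf, "hive.internal.ddl.list.bucketing.enable")
                || flag(conf, "hive.optimize.listbucketing")
                || flag(conf, "mapred.input.dir.recursive");
        return !needsSubdirectories
                || flag(conf, "hive.mapred.supports.subdirectories");
    }

    public static void main(String[] args) {
        // The three settings from the example above.
        Map<String, String> example = Map.of(
                "hive.mapred.supports.subdirectories", "true",
                "mapred.input.dir.recursive", "true",
                "hive.optimize.listbucketing", "false");
        System.out.println(isLegal(example));

        // Recursive input without subdirectory support is what gets rejected.
        Map<String, String> rejected = Map.of("mapred.input.dir.recursive", "true");
        System.out.println(isLegal(rejected));
    }
}
```

Under this reading, the only combination that fails is one where hive.mapred.supports.subdirectories is not true while at least one of the other three settings is.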
                
> List bucketing error too restrictive
> ------------------------------------
>
>                 Key: HIVE-4213
>                 URL: https://issues.apache.org/jira/browse/HIVE-4213
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Mark Grover
>             Fix For: 0.11.0
>
>
> With the introduction of List bucketing, we introduced a config validation step where we say:
> {code}
>   SUPPORT_DIR_MUST_TRUE_FOR_LIST_BUCKETING(
>       10199,
>       "hive.mapred.supports.subdirectories must be true"
>           + " if any one of following is true: hive.internal.ddl.list.bucketing.enable,"
>           + " hive.optimize.listbucketing and mapred.input.dir.recursive"),
> {code}
> This seems overly restrictive because there are use cases where people may want to set {{mapred.input.dir.recursive}} to {{true}} even when they don't care about list bucketing.
> Is that not true?
> For example, here is the unit test code for {{clientpositive/recursive_dir.q}}
> {code}
> CREATE TABLE fact_daily(x int) PARTITIONED BY (ds STRING);
> CREATE TABLE fact_tz(x int) PARTITIONED BY (ds STRING, hr STRING)
> LOCATION 'pfile:${system:test.tmp.dir}/fact_tz';
> INSERT OVERWRITE TABLE fact_tz PARTITION (ds='1', hr='1')
> SELECT key+11 FROM src WHERE key=484;
> ALTER TABLE fact_daily SET TBLPROPERTIES('EXTERNAL'='TRUE');
> ALTER TABLE fact_daily ADD PARTITION (ds='1')
> LOCATION 'pfile:${system:test.tmp.dir}/fact_tz/ds=1';
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> SELECT * FROM fact_daily WHERE ds='1';
> SELECT count(1) FROM fact_daily WHERE ds='1';
> {code}
> The unit test doesn't seem to be concerned about list bucketing but wants to set {{mapred.input.dir.recursive}} to {{true}}.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
