hive-dev mailing list archives

From "Mark Grover (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-3554) Hive List Bucketing - Query logic
Date Thu, 01 Nov 2012 06:39:12 GMT

    [ https://issues.apache.org/jira/browse/HIVE-3554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488505#comment-13488505 ]

Mark Grover commented on HIVE-3554:
-----------------------------------

@Tim, I took a quick look at the patch and had a few questions for you.

1. Have you tested the following cases:
Test A: Renaming the skewed key
{code}
CREATE TABLE test (c1 STRING, c2 STRING) SKEWED BY (c1) ON ('x1');
ALTER TABLE test CHANGE c1 c STRING; -- Renaming the skewed key
SELECT * from test where c='x1'; -- query with a skewed value, using the new column name
SELECT * from test where c='x_something else'; -- query with a non-skewed value, using the new column name
{code}

Test B: Changing the type of the skewed key
{code}
CREATE TABLE test (c1 STRING, c2 STRING) SKEWED BY (c1) ON ('12');
ALTER TABLE test CHANGE c1 c1 INT; -- Changing the type of the skewed key
SELECT * from test where c1=12; -- query with a skewed value
SELECT * from test where c1=11; -- query with a non-skewed value
{code}

2. Is it possible for a user to add/remove skewed keys once the skewed table has been created? If so, would it make sense to add a test case for that?
3. Is it possible for a user to add/remove skewed values ('x1', 'x_something else', 12, 11 in the examples above) once the skewed table has been created? If so, would it make sense to add a test case for that?
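
A test for questions 2 and 3 might look roughly like the sketch below. It assumes the alter table grammar accepts a SKEWED BY ... ON clause shaped like the one in CREATE TABLE; whether the patch actually supports these statements is exactly what the questions ask, so the ALTER statements, the value 'x2', and the value 'y1' are illustrative assumptions rather than confirmed behavior.

Test C (hypothetical): Changing the skewed values and the skewed key
{code}
CREATE TABLE test (c1 STRING, c2 STRING) SKEWED BY (c1) ON ('x1');
-- Assumed syntax: redefine the skewed values for c1, adding 'x2'
ALTER TABLE test SKEWED BY (c1) ON ('x1', 'x2');
SELECT * from test where c1='x2'; -- query with the newly added skewed value
-- Assumed syntax: make c2 the skewed key instead of c1
ALTER TABLE test SKEWED BY (c2) ON ('y1');
SELECT * from test where c1='x1'; -- c1 is no longer a skewed key
SELECT * from test where c2='y1'; -- query with a skewed value of the new key
{code}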
                
> Hive List Bucketing - Query logic
> ---------------------------------
>
>                 Key: HIVE-3554
>                 URL: https://issues.apache.org/jira/browse/HIVE-3554
>             Project: Hive
>          Issue Type: New Feature
>    Affects Versions: 0.10.0
>            Reporter: Gang Tim Liu
>            Assignee: Gang Tim Liu
>         Attachments: HIVE-3554.patch.1, HIVE-3554.patch.2, HIVE-3554.patch.3, HIVE-3554.patch.4, HIVE-3554.patch.5, HIVE-3554.patch.7, HIVE-3554.patch.8, HIVE-3554.patch.9
>
>
> This is part of efforts for list bucketing feature: https://cwiki.apache.org/Hive/listbucketing.html
> This patch includes:
> 1. Query logic: Hive chooses the right sub-directory instead of the partition directory.
> 2. ALTER TABLE grammar, which is required to support the query logic.
> This patch doesn't include list bucketing DML. Main reasons:
> 1. Risk. Without DML, this patch won't impact any existing Hive regression features, since it doesn't touch any data manipulation, so it is very low risk.
> 2. Manageability. With DML, the patch gets bigger and harder to review. Without DML, it's easy to review.
> We still disable the feature in Hive by default since DML is not in yet.
> DML will be in a follow-up patch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
