hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1193) ensure sorting properties for a table
Date Fri, 26 Feb 2010 17:45:28 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838958#action_12838958 ]

He Yongqiang commented on HIVE-1193:
------------------------------------

@Zheng,
>>1. How do we make sure that the data is bucketed / sorted? By adding an additional map-reduce job?
Yes. 
>>2. What if the user already specified "CLUSTER BY key" in his query?
As in 1, a new job will be added that redistributes the data.
If the user specifies a CLUSTER BY column that differs from the table's sort and bucket properties,
we should probably let the query fail. Right now, though, that CLUSTER BY is simply ignored.
>>3. Do we disable merging of small files when we do this?
Yes. We should disable merging whenever enforceBucketing or enforceSorting is enabled.
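
To make the answer to 1 concrete, here is a rough sketch of what the additional job does conceptually: rows are routed to a bucket by hashing the bucket column, and each bucket is then sorted before it is written. This is only an illustration, not the code in the patch. The function name `enforce_bucketing_and_sorting`, the row layout, and the use of Python's built-in `hash` are assumptions made for the example; Hive's actual hash function and job plan differ.

```python
from collections import defaultdict


def enforce_bucketing_and_sorting(rows, bucket_col, sort_col, num_buckets):
    """Hypothetical helper that mimics the shuffle and sort of the extra job."""
    buckets = defaultdict(list)
    for row in rows:
        # "Map" side: pick the target bucket from the bucket column's hash.
        # Python's hash() stands in for whatever hash Hive really uses.
        buckets[hash(row[bucket_col]) % num_buckets].append(row)
    # "Reduce" side: one reducer per bucket, emitting rows in sort order.
    return {b: sorted(rs, key=lambda r: r[sort_col]) for b, rs in buckets.items()}


# Example input: rows arrive in arbitrary order, as they would from a plain INSERT.
rows = [{"key": k, "value": "v%d" % k} for k in (5, 2, 8, 1, 4)]
result = enforce_bucketing_and_sorting(rows, "key", "key", num_buckets=2)
```

Each entry in `result` corresponds to one output file per bucket, which is also why merging small files afterwards would undo the layout.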


> ensure sorting properties for a table
> -------------------------------------
>
>                 Key: HIVE-1193
>                 URL: https://issues.apache.org/jira/browse/HIVE-1193
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>             Fix For: 0.6.0
>
>         Attachments: hive.1193.1.patch
>
>
> If a table is sorted and data is being inserted into it, we currently don't make
sure that the data is sorted. Enforcing this might be useful for some downstream operations.
> This cannot be made the default due to backward compatibility, but an option can be added
for it.
