hive-dev mailing list archives

From "He Yongqiang (JIRA)" <>
Subject [jira] Commented: (HIVE-1193) ensure sorting properties for a table
Date Fri, 26 Feb 2010 17:45:28 GMT


He Yongqiang commented on HIVE-1193:

>>1. How do we make sure that the data is bucketed / sorted? By adding an additional map-reduce job?
>>2. What if the user already specified "CLUSTER BY key" in his query?
For 1, a new job will be added that redistributes the data.
If the user specifies a CLUSTER BY column that differs from the table's sort and bucket
properties, we should probably let the query fail. Right now, though, that CLUSTER BY is simply ignored.
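
To make the two answers above concrete, here is a minimal Python sketch of what the extra redistribution job amounts to, and of the two possible reactions to a conflicting CLUSTER BY. It is an illustration only, not Hive code: the function names, the `strict` flag, and the integer-modulo routing rule are all assumptions made for the example, and Hive's real hash function is not shown.

```python
from collections import defaultdict


def redistribute(rows, bucket_col, sort_col, num_buckets):
    """Simulate the extra job: route rows to buckets, then sort each bucket.

    Assumption: bucket keys are integers and routing is key % num_buckets.
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[bucket_col] % num_buckets].append(row)
    # Sort rows inside every bucket on the table's declared sort column.
    return {
        bucket_id: sorted(bucket_rows, key=lambda r: r[sort_col])
        for bucket_id, bucket_rows in sorted(buckets.items())
    }


def check_cluster_by(user_cols, table_cols, strict=False):
    """Compare a query's CLUSTER BY columns with the table's properties.

    Returns True when there is no conflict. On a conflict, raises if
    strict is set (the "let it fail" option), otherwise returns False to
    signal that the user's CLUSTER BY is ignored (the current behavior
    described in the comment). The strict flag is hypothetical.
    """
    if user_cols is None or list(user_cols) == list(table_cols):
        return True
    if strict:
        raise ValueError(
            "CLUSTER BY %s conflicts with table properties %s"
            % (list(user_cols), list(table_cols))
        )
    return False


rows = [{"key": 5}, {"key": 2}, {"key": 4}, {"key": 1}]
result = redistribute(rows, bucket_col="key", sort_col="key", num_buckets=2)
```

With two buckets, the even keys land in bucket 0 and the odd keys in bucket 1, and each bucket comes out sorted by `key`.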
>>3. Do we disable merging of small files when we do this?
Yes, we should disable it whenever enforceBucketing or enforceSorting is enabled.
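
The rule in answer 3 can be sketched as a single predicate. This is a hypothetical helper, not part of the patch; only the names enforceBucketing and enforceSorting come from the comment above. The likely motivation, stated here as an assumption, is that merging small output files would combine per-bucket files and lose the layout the enforcement job just produced.

```python
def should_merge_small_files(merge_requested, enforce_bucketing, enforce_sorting):
    """Decide whether the small-file merge step may run.

    Merging is allowed only if it was requested and neither enforcement
    option is enabled. All three parameter names are illustrative.
    """
    return merge_requested and not (enforce_bucketing or enforce_sorting)


# Merging stays on only when both enforcement options are off.
plain_insert = should_merge_small_files(True, False, False)
bucketed_insert = should_merge_small_files(True, True, False)
sorted_insert = should_merge_small_files(True, False, True)
```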

> ensure sorting properties for a table
> -------------------------------------
>                 Key: HIVE-1193
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>             Fix For: 0.6.0
>         Attachments: hive.1193.1.patch
> If a table is sorted and data is being inserted into it, we currently don't make sure that the data is sorted. Doing so might be useful for some downstream operations.
> This cannot be made the default because of backward compatibility, but an option can be added for it.

