hive-user mailing list archives

From Prasanth Jayachandran <pjayachand...@hortonworks.com>
Subject Re: Skewed Tables
Date Fri, 25 Apr 2014 22:23:38 GMT
Lefty,

I can add this information. Can you please point me to the right place to add it? Perhaps you can help review it.

Thanks
Prasanth Jayachandran

On Apr 24, 2014, at 1:13 PM, Lefty Leverenz <leftyleverenz@gmail.com> wrote:

> I'm looking at the docs and thinking of ways to include this information. But Prasanth, if you want to do it yourself, that would be great.
> 
> -- Lefty
> 
> 
> On Thu, Apr 24, 2014 at 5:33 AM, Mayur Gupta <mayur.gupta81@gmail.com> wrote:
> Thanks a lot, Prasanth, for the reply. I would never have figured that out, as the documentation on the Hive wiki DDL page and the design page doesn't mention this.
> 
> One additional point: it seems the skewed table doesn't work when the table is created with CTAS. The statement below doesn't create separate files. Is this a bug, or is it intentional?
> 
> create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as directories select r1, r2 from t2;
> 
> 
> On Thu, Apr 24, 2014 at 6:12 AM, Prasanth Jayachandran <pjayachandran@hortonworks.com> wrote:
> Hi Mayur,
> 
> You see a single file because you have not enabled storing skewed columns/values as directories.
> You can enable it as follows:
> 
> set hive.mapred.supports.subdirectories=true;
> set mapred.input.dir.recursive=true;
> create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as directories;
> 
> With this, the skewed values are stored in separate directories, as shown below:
> 
> /user/hive/warehouse/t1/r2=a/000000_0 (skewed values go here)
> /user/hive/warehouse/t1/HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME/000000_0 (all other values go here)
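
The steps in this reply can be put together as one Hive CLI session. This is only a sketch: the two settings, the DDL, and the warehouse path come from this thread, the insert statement is the one from the original question, and the final listing command is an assumption about the environment (older Hadoop releases may need `dfs -lsr` instead of `dfs -ls -R`). It assumes t1 does not already exist.

```sql
-- Enable subdirectory support (both settings are quoted from the reply).
SET hive.mapred.supports.subdirectories=true;
SET mapred.input.dir.recursive=true;

-- Skewed table whose skewed value 'a' should get its own directory.
CREATE TABLE t1 (r1 STRING, r2 STRING)
SKEWED BY (r2) ON ('a')
STORED AS DIRECTORIES;

-- Populate it from the plain table t, as in the original question.
INSERT INTO TABLE t1 SELECT * FROM t;

-- List the table directory from the Hive CLI to check the layout.
-- Assumption: the default warehouse path shown in this thread.
dfs -ls -R /user/hive/warehouse/t1;
```

If the settings took effect, the listing should show an `r2=a` subdirectory next to `HIVE_DEFAULT_LIST_BUCKETING_DIR_NAME` rather than a single `000000_0` file directly under `t1`.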
> 
> With respect to your desc extended question, where skewedColValueLocationMaps is empty: it's a bug in the implementation. I just verified that it shows up empty for unpartitioned tables, but it is displayed correctly for partitioned tables.
> I have filed a bug for unpartitioned tables, which you can track for progress on this issue: https://issues.apache.org/jira/browse/HIVE-6968
> 
> 
> Thanks
> Prasanth Jayachandran
> 
> On Apr 23, 2014, at 6:52 AM, Mayur Gupta <mayur.gupta81@gmail.com> wrote:
> 
>> Below is my skewedInfo
>> 
>> skewedInfo:SkewedInfo(skewedColNames:[r2], skewedColValues:[[a]], skewedColValueLocationMaps:{})
>> 
>> Any idea why skewedColValueLocationMaps is empty?
>> 
>> 
>> On Mon, Apr 21, 2014 at 11:19 AM, Mayur Gupta <mayur.gupta81@gmail.com> wrote:
>> Hey There,
>> 
>> I was trying to use skewed tables, but Hive is not creating separate files for the skewed data. Even with a simple example I have the same issue. The Hive version is 0.11.
>> 
>> create table t(col1 string, col2 string);
>> load data local inpath '/home/hadoop/a.txt' into table t;
>> 
>> create table t1(r1 string, r2 string) skewed by (r2) on ('a');
>> insert into table t1 select * from t;
>> 
>> The contents of a.txt are:
>> 1 ^Aa
>> 2^A b
>> 3 ^Ac
>> 4 ^Aa
>> 5 ^Ab
>> 6 ^Aa
>> 
>> I see only a single file.
>> 
>> /user/hive/warehouse/t1/000000_0
>> 
>> Any pointers on what I am doing wrong?
>> 
> 
> 
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
> 
> 

