hive-dev mailing list archives

From "Prasanth J (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-6968) list bucketing feature does not update the location map for unpartitioned tables
Date Thu, 24 Apr 2014 00:43:14 GMT
Prasanth J created HIVE-6968:
--------------------------------

             Summary: list bucketing feature does not update the location map for unpartitioned tables
                 Key: HIVE-6968
                 URL: https://issues.apache.org/jira/browse/HIVE-6968
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.12.0, 0.11.0, 0.13.0, 0.14.0
            Reporter: Prasanth J
            Assignee: Prasanth J


The list bucketing feature maintains a map from skewed column values to their storage
locations in the metastore. This map is not updated for unpartitioned tables; for partitioned
tables it is updated properly. To reproduce the issue:
{code}
hive>set hive.mapred.supports.subdirectories=true;
hive>set mapred.input.dir.recursive=true;

hive>create table t(col1 string, col2 string);
hive>load data local inpath '/home/hadoop/a.txt' into table t;
hive>select * from t;
OK
1	a
2	b
3	c
4	a
5	b
6	a

hive>create table t1(r1 string, r2 string) skewed by (r2) on ('a') stored as directories;
hive>insert into table t1 select * from t;
hive>desc extended t1;
OK
r1                  	string              	                    
r2                  	string              	                    
	 	 
Detailed Table Information	Table(tableName:t1, dbName:default, owner:pjayachandran, createTime:1398295903,
lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:r1, type:string,
comment:null), FieldSchema(name:r2, type:string, comment:null)], location:file:/app/warehouse/t1,
inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[r2],
skewedColValues:[[a]], skewedColValueLocationMaps:{}), storedAsSubDirectories:true), partitionKeys:[],
parameters:{numFiles=6, COLUMN_STATS_ACCURATE=true, transient_lastDdlTime=1398297887, numRows=6,
totalSize=72, rawDataSize=18}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)

Time taken: 0.119 seconds, Fetched: 4 row(s)
{code}

As the describe output shows, *skewedColValueLocationMaps* is empty.
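
To check for this symptom without reading the describe output by eye, a small script along the following lines could be used. This is only a sketch: the function name is hypothetical, it assumes the output of `desc extended` has been captured as text, and it relies only on the `skewedColValueLocationMaps:{...}` field appearing as printed above. It reports whether the map is empty and does not attempt to parse the entries of a populated map.

```python
import re


def skewed_location_map_is_empty(describe_output: str) -> bool:
    """Return True if skewedColValueLocationMaps is printed as an empty map.

    Hypothetical helper: it inspects captured `desc extended` text only and
    does not talk to Hive or the metastore.
    """
    # The describe output may be hard-wrapped, so join the lines first.
    flat = describe_output.replace("\n", "")
    match = re.search(r"skewedColValueLocationMaps:\{([^}]*)\}", flat)
    if match is None:
        raise ValueError("skewedColValueLocationMaps not found in describe output")
    return match.group(1).strip() == ""


# Fragment copied from the describe output in this report.
sample = (
    "skewedInfo:SkewedInfo(skewedColNames:[r2],\n"
    "skewedColValues:[[a]], skewedColValueLocationMaps:{}), storedAsSubDirectories:true)"
)
print(skewed_location_map_is_empty(sample))
```

For the fragment taken from this report the function returns True, which matches the empty map seen above.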



--
This message was sent by Atlassian JIRA
(v6.2#6252)
