hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jesus Camacho Rodriguez (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-13943) Null is inserted into an existing partition after replacing a column of int with string group type of text formatted partitioned table
Date Wed, 23 Nov 2016 10:04:58 GMT

     [ https://issues.apache.org/jira/browse/HIVE-13943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Jesus Camacho Rodriguez updated HIVE-13943:
-------------------------------------------
    Target Version/s: 1.2.1, 2.2.0, 2.1.2  (was: 1.2.1, 2.2.0, 2.1.1)

> Null is inserted into an existing partition after replacing a column of int with string
group type of text formatted partitioned table 
> ---------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-13943
>                 URL: https://issues.apache.org/jira/browse/HIVE-13943
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.2.1, 2.1.0
>            Reporter: Takahiko Saito
>            Priority: Critical
>
> Create a text formatted table with a int column partitioned by a string column.
> After replacing the columns of int with string and inserting a new row with the existing
partition ('horton' in this case), null is inserted as a value in the altered column:
> {noformat}
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> drop table if exists test;
> No rows affected (0.249 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181>
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> create table test (i int) partitioned
by (s string);
> No rows affected (0.116 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181>
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> insert into table test partition (s)
values(1, 'horton');
> INFO  : Session is already open
> INFO  : Dag name: insert into table test partition...'horton')(Stage-1)
> INFO  :
> INFO  : Status: Running (Executing on YARN cluster with App id application_1464727816747_0759)
> INFO  : Map 1: 0/1
> INFO  : Map 1: 0(+1)/1
> INFO  : Map 1: 0(+1)/1
> INFO  : Map 1: 1/1
> INFO  : Loading data to table default.test partition (s=null) from hdfs://ts-0531-5.openstacklocal:8020/apps/hive/warehouse/test/.hive-staging_hive_2016-06-03_21-09-46_619_6747668726633461908-940/-ext-10000
> INFO  : 	 Time taken for load dynamic partitions : 122
> INFO  : 	Loading partition {s=horton}
> INFO  : 	 Time taken for adding to write entity : 0
> INFO  : Partition default.test{s=horton} stats: [numFiles=1, numRows=1, totalSize=2,
rawDataSize=1]
> No rows affected (8.301 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181>
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> select * from test;
> +---------+---------+--+
> | test.i  | test.s  |
> +---------+---------+--+
> | 1       | horton  |
> +---------+---------+--+
> 1 row selected (0.184 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181>
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> alter table test replace columns (i
string);
> No rows affected (0.138 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181>
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> insert into table test partition (s)
values ('horton', 'horton');
> INFO  : Session is already open
> INFO  : Dag name: insert into table test partition...'horton')(Stage-1)
> INFO  :
> INFO  : Map 1: 1/1
> INFO  : Loading data to table default.test partition (s=null) from hdfs://ts-0531-5.openstacklocal:8020/apps/hive/warehouse/test/.hive-staging_hive_2016-06-03_21-09-55_295_8631413609124947877-940/-ext-10000
> INFO  : 	 Time taken for load dynamic partitions : 190
> INFO  : 	Loading partition {s=horton}
> INFO  : 	 Time taken for adding to write entity : 0
> INFO  : Partition default.test{s=horton} stats: [numFiles=2, numRows=2, totalSize=9,
rawDataSize=7]
> No rows affected (1.35 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181>
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> select * from test;
> +---------+---------+--+
> | test.i  | test.s  |
> +---------+---------+--+
> | 1       | horton  |
> | NULL    | horton  |
> +---------+---------+--+
> 2 rows selected (0.08 seconds)
> {noformat}
> The below is explain of insertion:
> {noformat}
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> explain insert into table test partition
(s) values ('horton', 'horton');
> +-------------------------------------------------------------------------------------------------------+--+
> |                                                Explain                            
                   |
> +-------------------------------------------------------------------------------------------------------+--+
> | STAGE DEPENDENCIES:                                                               
                   |
> |   Stage-1 is a root stage                                                         
                   |
> |   Stage-2 depends on stages: Stage-1                                              
                   |
> |   Stage-0 depends on stages: Stage-2                                              
                   |
> |   Stage-3 depends on stages: Stage-0                                              
                   |
> |                                                                                   
                   |
> | STAGE PLANS:                                                                      
                   |
> |   Stage: Stage-1                                                                  
                   |
> |     Tez                                                                           
                   |
> |       DagId: hive_20160603211130_4262d739-5bc1-4be0-95c3-2b666f5db7b8:1323        
                   |
> |       Vertices:                                                                   
                   |
> |         Map 1                                                                     
                   |
> |             Map Operator Tree:                                                    
                   |
> |                 TableScan                                                         
                   |
> |                   alias: values__tmp__table__17                                   
                   |
> |                   Statistics: Num rows: 1 Data size: 14 Basic stats: COMPLETE Column
stats: NONE      |
> |                   Select Operator                                                 
                   |
> |                     expressions: tmp_values_col1 (type: string), tmp_values_col2 (type:
string)       |
> |                     outputColumnNames: _col0, _col1                               
                   |
> |                     Statistics: Num rows: 1 Data size: 14 Basic stats: COMPLETE Column
stats: NONE    |
> |                     File Output Operator                                          
                   |
> |                       compressed: false                                           
                   |
> |                       Statistics: Num rows: 1 Data size: 14 Basic stats: COMPLETE Column
stats: NONE  |
> |                       table:                                                      
                   |
> |                           input format: org.apache.hadoop.mapred.TextInputFormat  
                   |
> |                           output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
  |
> |                           serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
                  |
> |                           name: default.test                                      
                   |
> |                                                                                   
                   |
> |   Stage: Stage-2                                                                  
                   |
> |     Dependency Collection                                                         
                   |
> |                                                                                   
                   |
> |   Stage: Stage-0                                                                  
                   |
> |     Move Operator                                                                 
                   |
> |       tables:                                                                     
                   |
> |           partition:                                                              
                   |
> |             s                                                                     
                   |
> |           replace: false                                                          
                   |
> |           table:                                                                  
                   |
> |               input format: org.apache.hadoop.mapred.TextInputFormat              
                   |
> |               output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
              |
> |               serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe           
                   |
> |               name: default.test                                                  
                   |
> |                                                                                   
                   |
> |   Stage: Stage-3                                                                  
                   |
> |     Stats-Aggr Operator                                                           
                   |
> |                                                                                   
                   |
> +-------------------------------------------------------------------------------------------------------+--+
> {noformat}
> The issue is NOT seen with orc table.
> Also I tried with another table partitioned by timestamp, but the issue was NOT seen:
> {noformat}
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> drop table if exists test;
> No rows affected (0.266 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> create table test (i int) partitioned
by (ts timestamp);
> No rows affected (0.16 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> insert into table test partition (ts)
values (1, '2034-08-04 17:42:59.0');
> INFO  : Session is already open
> INFO  : Dag name: insert into table test partit...17:42:59.0')(Stage-1)
> INFO  :
> INFO  : Status: Running (Executing on YARN cluster with App id application_1464727816747_0760)
> INFO  : Map 1: 0/1
> INFO  : Map 1: 0(+1)/1
> INFO  : Map 1: 1/1
> INFO  : Loading data to table default.test partition (ts=null) from hdfs://ts-0531-5.openstacklocal:8020/apps/hive/warehouse/test/.hive-staging_hive_2016-06-03_21-44-11_737_2024697703048602688-940/-ext-10000
> INFO  : 	 Time taken for load dynamic partitions : 170
> INFO  : 	Loading partition {ts=2034-08-04 17:42:59.0}
> INFO  : 	 Time taken for adding to write entity : 1
> INFO  : Partition default.test{ts=2034-08-04 17:42:59.0} stats: [numFiles=1, numRows=1,
totalSize=2, rawDataSize=1]
> No rows affected (5.029 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> select * from test;
> +---------+------------------------+--+
> | test.i  |        test.ts         |
> +---------+------------------------+--+
> | 1       | 2034-08-04 17:42:59.0  |
> +---------+------------------------+--+
> 1 row selected (0.158 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> alter table test replace columns (s
string);
> No rows affected (0.14 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> insert into table test partition (ts)
values ('2', '2034-08-04 17:42:59.0');
> INFO  : Session is already open
> INFO  : Dag name: insert into table test partit...17:42:59.0')(Stage-1)
> INFO  :
> INFO  : Status: Running (Executing on YARN cluster with App id application_1464727816747_0760)
> INFO  : Map 1: 0/1
> INFO  : Map 1: 0(+1)/1
> INFO  : Map 1: 0/1
> INFO  : Map 1: 1/1
> INFO  : Loading data to table default.test partition (ts=null) from hdfs://ts-0531-5.openstacklocal:8020/apps/hive/warehouse/test/.hive-staging_hive_2016-06-03_21-44-45_488_1120953773256187058-940/-ext-10000
> INFO  : 	 Time taken for load dynamic partitions : 254
> INFO  : 	Loading partition {ts=2034-08-04 17:42:59.0}
> INFO  : 	 Time taken for adding to write entity : 1
> INFO  : Partition default.test{ts=2034-08-04 17:42:59.0} stats: [numFiles=2, numRows=2,
totalSize=4, rawDataSize=2]
> No rows affected (7.127 seconds)
> 0: jdbc:hive2://ts-0531-1.openstacklocal:2181> select * from test;
> +---------+------------------------+--+
> | test.s  |        test.ts         |
> +---------+------------------------+--+
> | 1       | 2034-08-04 17:42:59.0  |
> | 2       | 2034-08-04 17:42:59.0  |
> +---------+------------------------+--+
> 2 rows selected (0.115 seconds)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message