hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-16559) Parquet schema evolution for partitioned tables may break if table and partition serdes differ
Date Mon, 26 Jun 2017 13:06:01 GMT

    [ https://issues.apache.org/jira/browse/HIVE-16559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16063052#comment-16063052 ]

Hive QA commented on HIVE-16559:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12874485/HIVE-16559.06.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 13 failed/errored test(s), 10846 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestBeeLineDriver.testCliDriver[insert_overwrite_local_directory_1] (batchId=238)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[tez_smb_main] (batchId=150)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=146)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query16] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query23] (batchId=233)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query94] (batchId=233)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testBootstrapFunctionReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionIncrementalReplication (batchId=217)
org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcrossInstances.testCreateFunctionWithFunctionBinaryJarsOnHDFS (batchId=217)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testPartitionSpecRegistrationWithCustomSchema (batchId=178)
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation (batchId=178)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/5772/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/5772/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-5772/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 13 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12874485 - PreCommit-HIVE-Build

> Parquet schema evolution for partitioned tables may break if table and partition serdes differ
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-16559
>                 URL: https://issues.apache.org/jira/browse/HIVE-16559
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Barna Zsombor Klara
>            Assignee: Barna Zsombor Klara
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16559.01.patch, HIVE-16559.02.patch, HIVE-16559.03.patch, HIVE-16559.04.patch, HIVE-16559.05.patch, HIVE-16559.06.patch
>
>
> Parquet schema evolution should make it possible to have partitions/tables backed by files with different schemas. Hive should match the table columns with file columns based on the column name if possible.
> However, if the serde for a table is missing columns from the serde of a partition, Hive fails to match the columns together.
> Steps to reproduce:
> {code}
> CREATE TABLE myparquettable_parted
> (
>   name string,
>   favnumber int,
>   favcolor string,
>   age int,
>   favpet string
> )
> PARTITIONED BY (day string)
> STORED AS PARQUET;
> INSERT OVERWRITE TABLE myparquettable_parted
> PARTITION(day='2017-04-04')
> SELECT
>    'mary' as name,
>    5 AS favnumber,
>    'blue' AS favcolor,
>    35 AS age,
>    'dog' AS favpet;
> ALTER TABLE myparquettable_parted
> REPLACE COLUMNS
> (
>   favnumber int,
>   age int
> );   -- No CASCADE option, so the partition will not be altered.
> {code}
> {{SELECT * FROM myparquettable_parted where day='2017-04-04';}}
> will fail with:
> {{java.lang.UnsupportedOperationException: Cannot inspect org.apache.hadoop.io.IntWritable}}
> Hive should either match the columns together or prevent the user from dropping columns from the table.
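
To make the expected behaviour concrete, the following is a minimal Python sketch of matching table columns to file columns by name, contrasted with matching by position. It is an illustration only and not Hive's reader code: the helper names `match_by_position` and `match_by_name` are hypothetical, and the idea that positional pairing is what goes wrong here is an assumption, not something the issue states. The column names and values are taken from the reproduction steps above.

```python
# Illustrative sketch only; this is NOT Hive's actual Parquet reader logic.

# Columns and one row as written to the partition before the ALTER TABLE.
file_columns = ["name", "favnumber", "favcolor", "age", "favpet"]
file_row = ["mary", 5, "blue", 35, "dog"]

# Table-level columns after REPLACE COLUMNS without CASCADE.
table_columns = ["favnumber", "age"]


def match_by_position(table_cols, row):
    """Pair each table column with the file value at the same index."""
    return {col: row[i] for i, col in enumerate(table_cols)}


def match_by_name(table_cols, file_cols, row):
    """Pair each table column with the file value of the same name.

    Table columns that do not exist in the file map to None.
    """
    index = {col: i for i, col in enumerate(file_cols)}
    return {col: row[index[col]] if col in index else None for col in table_cols}


by_position = match_by_position(table_columns, file_row)
by_name = match_by_name(table_columns, file_columns, file_row)

print(by_position)  # favnumber is paired with 'mary', age with 5
print(by_name)      # favnumber is paired with 5, age with 35
```

In this sketch, positional pairing hands a string to a column declared as int, while name-based pairing returns the values the table schema expects.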



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
