hive-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-8637) In insert into X select from Y, table properties from X are clobbering those from Y
Date Tue, 28 Oct 2014 20:56:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-8637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14187445#comment-14187445 ]

Alan Gates commented on HIVE-8637:
----------------------------------

The issue is that HiveOutputFormatImpl.checkOutputSpecs writes the table properties for table
X into the job conf.  When HiveInputFormat.getInputSplits later takes that same job conf
and goes to copy the table properties in, it calls Utilities.copyTableJobPropertiesToConf
(the same method that checkOutputSpecs called).  The problem is that copyTableJobPropertiesToConf
does not overwrite a table property in the job conf if it is already set.  This means
that many of the table properties from Y don't get propagated, because the values from X are
already set.
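
To make the ordering problem concrete, here is a minimal, self-contained sketch of the
"copy only if not already set" behavior described above.  It is not the Hive code: a plain
Map stands in for the job conf, copyIfAbsent is a hypothetical stand-in for the copy step,
and the property names and values ("bucket_count", "-1", "2") are assumptions chosen for
illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ClobberSketch {
    // Hypothetical stand-in for the copy step: a table property is written
    // into the job conf only when that key is not already present.
    static void copyIfAbsent(Map<String, String> tableProps, Map<String, String> jobConf) {
        for (Map.Entry<String, String> e : tableProps.entrySet()) {
            jobConf.putIfAbsent(e.getKey(), e.getValue());
        }
    }

    // Replays the order described above: output table X first, input table Y second.
    static Map<String, String> simulate() {
        // Assumed properties for X (the insert target, not bucketed).
        Map<String, String> x = new HashMap<>();
        x.put("bucket_count", "-1");

        // Assumed properties for Y (the source table, 2 buckets, transactional).
        Map<String, String> y = new HashMap<>();
        y.put("bucket_count", "2");
        y.put("transactional", "true");

        Map<String, String> jobConf = new HashMap<>();
        copyIfAbsent(x, jobConf); // output side runs first and sets bucket_count
        copyIfAbsent(y, jobConf); // input side runs later; bucket_count is skipped
        return jobConf;
    }

    public static void main(String[] args) {
        System.out.println(simulate());
    }
}
```

In this sketch the conf handed to the reader of Y still carries X's bucket_count of "-1",
while keys that X never set (here "transactional") do come through from Y.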

I do not believe this is a new problem, but it is showing up now because reading transactional
tables depends on the bucket count being accurate.  So a query like:
{code}
create table notbucketed (a string, b int);
create table transactional (a string, b int) clustered by (b) into 2 buckets stored as orc
tblproperties ('transactional' = 'true');
insert into table notbucketed select * from transactional;
{code}
results in the table 'transactional' being told it has no buckets.  Since the acid reader
depends on this value, it concludes that with no buckets it has no splits, and thus the above
insert writes nothing into 'notbucketed' regardless of how many records are in 'transactional'.


> In insert into X select from Y, table properties from X are clobbering those from Y
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-8637
>                 URL: https://issues.apache.org/jira/browse/HIVE-8637
>             Project: Hive
>          Issue Type: Task
>    Affects Versions: 0.14.0
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>            Priority: Critical
>             Fix For: 0.14.0
>
>
> With a query like:
> {code}
> insert into table X select * from Y;
> {code}
> the table properties from table X are being sent to the input formats for table Y.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
