hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-17403) Fail concatenation for unmanaged and transactional tables
Date Mon, 11 Sep 2017 20:22:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanth Jayachandran updated HIVE-17403:
-----------------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

No related test failures now. Committed the patch to master and branch-2.

> Fail concatenation for unmanaged and transactional tables
> ---------------------------------------------------------
>
>                 Key: HIVE-17403
>                 URL: https://issues.apache.org/jira/browse/HIVE-17403
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 1.3.0, 3.0.0, 2.4.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Blocker
>         Attachments: HIVE-17403.1.patch, HIVE-17403.2.patch, HIVE-17403.2.patch, HIVE-17403.3.patch
>
>
> ALTER TABLE .. CONCATENATE should fail if the table is not managed by Hive.
> For unmanaged tables, file names can be anything. Hive makes some assumptions about file names, which can result in data loss for unmanaged tables.
> An example of this is a table/partition containing 2 different files (part-m-00000__1417075294718 and part-m-00018__1417075294718). Although these are completely different files, Hive thinks they were generated by separate instances of the same task (because of failure or speculative execution). Hive will end up removing one of them:
> {code}
> 2017-08-28T18:19:29,516 WARN  [b27f10d5-d957-4695-ab2a-1453401793df main]: exec.Utilities (:()) - Duplicate taskid file removed: file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00018__1417075294718 with length 958510. Existing file: file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-10000/part-m-00000__1417075294718 with length 1123116
> {code}
> DDL should restrict concatenation for unmanaged tables. 
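
The collision described above can be reproduced with a small sketch. This is an illustration only, not Hive's code: the pattern in TASK_ID_PATTERN is an assumption about how a task id might be derived from a file name (the last run of digits, optionally followed by an attempt suffix and an extension), and task_id, dedupe_by_task_id and can_concatenate are hypothetical helpers. The "keep the larger file" rule is inferred from the log message, where the shorter file is the one removed.

```python
import re
from typing import Dict, List, Tuple

# Assumed pattern (not taken from the issue): the task id is the last run of
# digits in the name, optionally followed by "_<attempt>" and an extension.
TASK_ID_PATTERN = re.compile(r"^.*?([0-9]+)(_[0-9]{1,6})?(\..*)?$")


def task_id(file_name: str) -> str:
    """Return the assumed task id for a file name, or the name itself."""
    match = TASK_ID_PATTERN.match(file_name)
    return match.group(1) if match else file_name


def dedupe_by_task_id(files: Dict[str, int]) -> Tuple[Dict[str, int], List[str]]:
    """Keep one file per task id (the largest); return (kept, removed).

    `files` maps file name to length in bytes.
    """
    kept_by_id: Dict[str, Tuple[str, int]] = {}
    removed: List[str] = []
    for name, length in files.items():
        tid = task_id(name)
        if tid not in kept_by_id:
            kept_by_id[tid] = (name, length)
        elif length > kept_by_id[tid][1]:
            # The new file is larger: drop the one kept so far.
            removed.append(kept_by_id[tid][0])
            kept_by_id[tid] = (name, length)
        else:
            removed.append(name)
    kept = {name: length for name, length in kept_by_id.values()}
    return kept, removed


def can_concatenate(table_type: str, transactional: bool) -> bool:
    """Hypothetical guard: allow concatenation only for managed,
    non-transactional tables."""
    return table_type == "MANAGED_TABLE" and not transactional


if __name__ == "__main__":
    files = {
        "part-m-00000__1417075294718": 1123116,
        "part-m-00018__1417075294718": 958510,
    }
    kept, removed = dedupe_by_task_id(files)
    print("kept:", kept)
    print("removed:", removed)
    print("external table allowed:", can_concatenate("EXTERNAL_TABLE", False))
```

Under the assumed pattern, both file names resolve to the same id (the trailing timestamp), so two unrelated files are treated as duplicate outputs of one task. A guard along the lines of can_concatenate would reject the operation before any files are touched.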



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
