hive-issues mailing list archives

From "Sankar Hariappan (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-17367) IMPORT table doesn't load from data dump if a metadata-only dump was already imported.
Date Sat, 26 Aug 2017 00:40:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16142035#comment-16142035 ]

Sankar Hariappan edited comment on HIVE-17367 at 8/26/17 12:39 AM:
-------------------------------------------------------------------

Added 02.patch with additional handling to support retry after failure of import command.

Request [~thejas], [~anishek], [~daijy] to please review.


was (Author: sankarh):
Added 02.patch with additional handling to support retry after failure of import command.

> IMPORT table doesn't load from data dump if a metadata-only dump was already imported.
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-17367
>                 URL: https://issues.apache.org/jira/browse/HIVE-17367
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2, Import/Export, repl
>    Affects Versions: 3.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>              Labels: DR, replication
>             Fix For: 3.0.0
>
>         Attachments: HIVE-17367.01.patch, HIVE-17367.02.patch
>
>
> Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data (as per events) across clusters.
> For instance, suppose an insert generates two events:
> ALTER_TABLE (ID: 10)
> INSERT (ID: 11)
> Each event generates a set of EXPORT and IMPORT commands.
> The ALTER_TABLE event generates a metadata-only export/import.
> The INSERT event generates a metadata+data export/import.
> Because Hive always dumps the latest copy of the table during export, it records the latest notification event ID as the table's current state. So, in this example, importing the metadata dumped for the ALTER_TABLE event sets the current state of the table to 11.
> Now, when we try to import the data dumped for the INSERT event, the import is a no-op because the table's current state (11) is equal to the dump state (11). As a result, the data never gets replicated to the target cluster.
> So, it is necessary to allow overwriting a table/partition if its current state equals the dump state.
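
The scenario described above can be sketched as a small simulation. This is only an illustration of the state comparison, not Hive's actual code: the names Table, allow_import, apply_import and replay are hypothetical, and modelling the proposed change as a switch from a strict "greater than" check to "greater than or equal" is an assumption based on the description.

```python
from dataclasses import dataclass


@dataclass
class Table:
    """Hypothetical stand-in for a target table on the replica cluster."""
    repl_state: int = 0      # last notification event ID applied to the table
    has_data: bool = False   # whether the inserted rows have been loaded


def allow_import(current_state: int, dump_state: int, allow_equal: bool) -> bool:
    """Decide whether a dump may overwrite the table.

    allow_equal=False models the behaviour described in the report
    (a dump is applied only if it is strictly newer).
    allow_equal=True models the proposed behaviour (assumption: equal
    states are also allowed to overwrite).
    """
    if allow_equal:
        return dump_state >= current_state
    return dump_state > current_state


def apply_import(table: Table, dump_state: int, with_data: bool,
                 allow_equal: bool) -> bool:
    """Apply one IMPORT; return False if it was skipped as a no-op."""
    if not allow_import(table.repl_state, dump_state, allow_equal):
        return False
    table.repl_state = dump_state
    if with_data:
        table.has_data = True
    return True


def replay(allow_equal: bool) -> Table:
    """Replay the two events from the example against an empty target."""
    table = Table()
    # ALTER_TABLE (ID: 10): metadata-only dump, but stamped with the
    # latest event ID (11) because export always dumps the latest copy.
    apply_import(table, dump_state=11, with_data=False, allow_equal=allow_equal)
    # INSERT (ID: 11): metadata+data dump, also stamped with 11.
    apply_import(table, dump_state=11, with_data=True, allow_equal=allow_equal)
    return table
```

With the strict comparison, replay(False) leaves has_data unset because the second import is skipped; with equality allowed, replay(True) loads the data while the table state stays at 11.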



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
