falcon-dev mailing list archives

From "Balu Vellanki (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (FALCON-1390) Develop Auto Bootstrapping for HiveDR
Date Wed, 02 Dec 2015 00:19:11 GMT

    [ https://issues.apache.org/jira/browse/FALCON-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034945#comment-15034945 ]

Balu Vellanki edited comment on FALCON-1390 at 12/2/15 12:19 AM:
-----------------------------------------------------------------

[~peeyushb]: your design doc says:

{code}
1. Get list of tables for specified DB.
2. Check whether returned tables from step 1 exists on target Hive server.
3. If exists, don’t bootstrap and get last event id.
{code}

This will not work in all scenarios. HiveServer deletes notification log entries older than
N days (N is configurable). If the table exists on the target and its last event id is 100,
you cannot assume that all events after 100 are still available in the NOTIFICATION_LOG table
on the source. The log might only contain notifications starting at 1000, in which case the
events between 100 and 1000 can no longer be replayed.
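
The gap described above can be checked explicitly before deciding to skip the bootstrap. Here is
a minimal sketch, assuming the last replicated event id on the target and the oldest event id
still retained in the source notification log are both known, and that event ids are
consecutive integers. The function and parameter names are hypothetical and are not part of
Falcon or Hive.

```python
def can_replicate_incrementally(last_replicated_event_id: int,
                                oldest_retained_event_id: int) -> bool:
    """Return True if no event after the last replicated one has been purged.

    Both arguments are hypothetical inputs:
      last_replicated_event_id  - last event id recorded on the target table
      oldest_retained_event_id  - smallest event id still in the source log

    Assumes event ids are consecutive integers, so the next event the target
    needs is last_replicated_event_id + 1. If ids can have gaps, this check is
    conservative and may request a bootstrap that is not strictly needed.
    """
    next_needed_event_id = last_replicated_event_id + 1
    return oldest_retained_event_id <= next_needed_event_id


# Scenario from the comment: target is at event 100, source log starts at 1000.
needs_bootstrap = not can_replicate_incrementally(100, 1000)
```

With the numbers from the comment, `needs_bootstrap` is true even though the table already
exists on the target, which is why table existence alone is not a sufficient test.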

Any task that replaces the manual bootstrap will have to generate the entire table data on the
source, copy it to the target Hive, and import it there. Once this is complete, the bootstrap
process will have to set the last_event_id on the target Hive table.
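
The sequence above could be sketched as follows. This is only an illustration of the ordering:
every callable passed in is a hypothetical placeholder for the real export, copy, import and
metadata steps, and none of the names come from Falcon or Hive. Reading the source event id
before the export starts is an assumption made here, not something stated in the design doc.

```python
from typing import Callable


def bootstrap_table(current_source_event_id: Callable[[], int],
                    export_on_source: Callable[[], str],
                    copy_to_target: Callable[[str], str],
                    import_on_target: Callable[[str], None],
                    set_last_event_id: Callable[[int], None]) -> int:
    """Run a full bootstrap of one table and record the event id on the target.

    All five callables are hypothetical hooks supplied by the caller.
    Returns the event id that was recorded on the target.
    """
    # Assumption: read the id first so that changes made while the export is
    # running are picked up by later incremental replication.
    event_id = current_source_event_id()
    dump_path = export_on_source()           # generate the entire table data
    staged_path = copy_to_target(dump_path)  # copy it to the target cluster
    import_on_target(staged_path)            # import the data on the target
    set_last_event_id(event_id)              # only after the import succeeded
    return event_id
```

If any step raises an exception, the last event id is never set, so a failed bootstrap does
not leave the target claiming to be up to date.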



> Develop Auto Bootstrapping for HiveDR
> -------------------------------------
>
>                 Key: FALCON-1390
>                 URL: https://issues.apache.org/jira/browse/FALCON-1390
>             Project: Falcon
>          Issue Type: New Feature
>    Affects Versions: 0.7
>            Reporter: Peeyush Bishnoi
>            Assignee: Peeyush Bishnoi
>         Attachments: AutoBootstrap_DB_Table.pdf
>
>
> Currently, Hive DR requires a manual bootstrap of the database and table to be replicated
> if they are not available on the target cluster. It would be good to automate the database
> and table bootstrap so that users do not have to perform it manually.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
