hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17290) Potential loss of data for replication of bulk loaded hfiles
Date Thu, 05 Jan 2017 16:52:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15801852#comment-15801852 ]

Ted Yu commented on HBASE-17290:

Please enrich the javadoc with an explanation of the second component of the Pair.
+   * @param pairs list of hfile references to be added
Please make the parameter name consistent with the javadoc: it should be pairs.
+  public void addHFileRefs(String peerId, List<Pair<Path, Path>> files)
For ReplicationObserver.java, please follow the license header format used by existing classes.
Consider adding @InterfaceAudience.LimitedPrivate(HBaseInterfaceAudience.CONFIG) to this class.
+      LOG.debug("Skipping recording bulk load entries in preCommitStoreFile for bulkloaded "
+          + "data replication.");
It would be better if the case for bulk load replication and the case where pairs is empty
are logged separately.
This would facilitate troubleshooting.
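
To illustrate the suggestion, here is a minimal sketch of picking a distinct debug message per skip reason. It is not taken from the patch: the class SkipReason, the method describe, and the flag bulkLoadReplicationEnabled are hypothetical names, and it assumes the "bulk load replication" case means that replication of bulk loaded data is disabled.

```java
import java.util.List;

// Hypothetical helper, not part of the patch: returns a distinct debug message
// for each reason the observer might skip recording bulk load entries.
final class SkipReason {
    private SkipReason() {}

    /**
     * @param bulkLoadReplicationEnabled assumed flag: whether replication of
     *        bulk loaded data is turned on
     * @param pairs hfile references passed to the hook; may be null or empty
     * @return the message to log at debug level, or null when the entries
     *         should be recorded rather than skipped
     */
    static String describe(boolean bulkLoadReplicationEnabled, List<?> pairs) {
        if (!bulkLoadReplicationEnabled) {
            return "Skipping recording bulk load entries in preCommitStoreFile: "
                + "bulk loaded data replication is disabled.";
        }
        if (pairs == null || pairs.isEmpty()) {
            return "Skipping recording bulk load entries in preCommitStoreFile: "
                + "no hfile pairs were supplied.";
        }
        return null; // nothing to skip
    }
}
```

The caller would pass a non-null result to LOG.debug and return early, so each cause shows up as its own line in the log.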

Can you point me to the code which handles the case where the Path for a bulk loaded hfile is recorded
but the commit (the move of the hfile) fails?
In that scenario, the file wouldn't be found at the time of replication.

> Potential loss of data for replication of bulk loaded hfiles
> ------------------------------------------------------------
>                 Key: HBASE-17290
>                 URL: https://issues.apache.org/jira/browse/HBASE-17290
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 1.3.0
>            Reporter: Ted Yu
>            Assignee: Ashish Singhi
>             Fix For: 2.0.0, 1.4.0
>         Attachments: HBASE-17290.patch
> Currently, support for replication of bulk loaded hfiles relies on the bulk load marker written in the WAL.
> The move of bulk loaded hfile(s) (into the region directory) may succeed but the write of the bulk load marker may fail.
> This means that although the bulk loaded hfile is being served in the source cluster, replication wouldn't happen.
> Normally the operator is supposed to retry the bulk load, but relying on human retry is not a robust solution.

This message was sent by Atlassian JIRA
