hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2223) Handle 10min+ network partitions between clusters
Date Fri, 11 Jun 2010 22:34:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12878113#action_12878113 ]

HBase Review Board commented on HBASE-2223:

Message from: stack@duboce.net

First pass.  Maybe by the 3rd pass I'll have an idea of what's going on.  General comment is
that there is a lot of new code here, but the tests seem to exercise the replication system
as a whole.  There are few unit tests ensuring that the newly added methods work properly.


    For sure setConf will have been called before we get here?  So, stuff gets set up by
setConf?  Can setConf be called more than once?  How do I know how to use this class?
Not doc'd.  Doesn't have a constructor.


    The way this is done, if I didn't want to wait on the ttl, then I'd have to write a new
class.  Can't we have ttl and replication be distinct, and then if I want to delete based on
ttl and on whether the log is still up in zk, chain them?
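
A rough sketch of the chaining idea, assuming nothing about the actual patch: every name here (LogCleanerRule, TtlRule, ReplicationRule, ChainedRule) is hypothetical, and a plain Set stands in for the log names still listed in zk.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;

// Hypothetical interface: one independent rule deciding whether a log may go.
interface LogCleanerRule {
    boolean isDeletable(String logName, long logAgeMillis);
}

// Rule 1: deletable once the log is older than the ttl.
class TtlRule implements LogCleanerRule {
    private final long ttlMillis;

    TtlRule(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    public boolean isDeletable(String logName, long logAgeMillis) {
        return logAgeMillis > ttlMillis;
    }
}

// Rule 2: deletable only if replication no longer tracks the log.
// The set is a stand-in for whatever is still registered in zk.
class ReplicationRule implements LogCleanerRule {
    private final Set<String> logsStillQueued;

    ReplicationRule(Set<String> logsStillQueued) {
        this.logsStillQueued = logsStillQueued;
    }

    public boolean isDeletable(String logName, long logAgeMillis) {
        return !logsStillQueued.contains(logName);
    }
}

// The chain: a log is deletable only when every rule agrees.
class ChainedRule implements LogCleanerRule {
    private final List<LogCleanerRule> rules;

    ChainedRule(LogCleanerRule... rules) {
        this.rules = Arrays.asList(rules);
    }

    public boolean isDeletable(String logName, long logAgeMillis) {
        for (LogCleanerRule rule : rules) {
            if (!rule.isDeletable(logName, logAgeMillis)) {
                return false;
            }
        }
        return true;
    }
}
```

With this shape, someone who doesn't want to wait on the ttl just builds the chain without TtlRule instead of writing a new class.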


    I don't follow?


    Should read this out of config rather than hardcode 10.
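
Something along these lines, as a sketch only: the key name and the RetrySettings class are made up for illustration, the review does not say what the 10 stands for, and java.util.Properties is used here just to keep the example self-contained. In HBase itself the lookup would presumably go through the Hadoop Configuration object, which offers getInt(name, defaultValue).

```java
import java.util.Properties;

class RetrySettings {
    // Hypothetical key name, not taken from the patch.
    static final String LIMIT_KEY = "replication.source.limit";
    // The old hardcoded value survives only as the default.
    static final int DEFAULT_LIMIT = 10;

    // Returns the configured value, or the default when the key is
    // missing or not a valid integer.
    static int limit(Properties conf) {
        String raw = conf.getProperty(LIMIT_KEY);
        if (raw == null) {
            return DEFAULT_LIMIT;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_LIMIT;
        }
    }
}
```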


    Same here.


    Long while loop; can break it up?


    Only operate on the first kv?


    Do you have to write position back to zk?


    Can code from HLog be used here?


    This ain't a constructor?


    We have to copy?


    Not a constructor.  If the javadoc is in the interface, you don't need to reproduce it
in the implementation.


    This should be SortedSet, not TreeSet... or NavigableSet.
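
For illustration, a minimal sketch of what declaring against the interface looks like; LogQueue and sortedLogs are hypothetical names, not code from the patch.

```java
import java.util.SortedSet;
import java.util.TreeSet;

class LogQueue {
    // Callers only see SortedSet; TreeSet stays an implementation detail
    // that can be swapped without touching the signature.
    static SortedSet<String> sortedLogs(String... names) {
        SortedSet<String> logs = new TreeSet<String>();
        for (String name : names) {
            logs.add(name);
        }
        return logs;
    }
}
```

If callers also need floor/ceiling style lookups, NavigableSet would be the type to declare instead.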




    What does this class do?


    No dfs in this test.  That's intentional?


    Can't you squash some of these tests together?  They each start up their own
minidfscluster... just start it once?

- stack

> Handle 10min+ network partitions between clusters
> -------------------------------------------------
>                 Key: HBASE-2223
>                 URL: https://issues.apache.org/jira/browse/HBASE-2223
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>             Fix For: 0.21.0
>         Attachments: HBASE-2223.patch
> We need a nice way of handling long network partitions without impacting a master cluster
> (which pushes the data). Currently it will just retry over and over again.
> I think we could:
>  - Stop replication to a slave cluster if it didn't respond for more than 10 minutes
>  - Keep track of the duration of the partition
>  - When the slave cluster comes back, initiate a MR job like HBASE-2221 
> Maybe we want less than 10 minutes, maybe we want this to be all automatic or just the
> first 2 parts. Discuss.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
