hbase-dev mailing list archives

From "Billy Pearson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1295) Federated HBase
Date Mon, 11 May 2009 07:41:45 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707924#action_12707924 ]

Billy Pearson commented on HBASE-1295:

I was thinking about this, and there are some other things to consider, such as table splits:
will the regions be the same on both clusters? There is no guarantee that compactions will
happen at the same time or that a split will find the same mid key.

I would think the master would be the ideal process to pull the logs and pass them to the peer
master, which can then split the logs into regions and pass the edits on to the servers hosting
those regions. I would like to see the edits processed sequentially on the peer so that
everything is in the same order, which is how we store the WALs now.
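
To illustrate the routing step on the peer side, here is a minimal sketch in Python. It is not
HBase code: the function name, the tuple layout of an edit, and the use of region start keys as
identifiers are all assumptions made for the example. It shows that edits can be routed by the
peer's own region boundaries (which may differ from the source's) while keeping WAL order within
each region.

```python
from bisect import bisect_right
from collections import defaultdict


def split_edits_by_region(edits, region_start_keys):
    """Group WAL edits by the peer region that owns each row key.

    edits: iterable of (sequence_id, row_key, value) tuples in WAL order
        (hypothetical layout, for illustration only).
    region_start_keys: sorted list of region start keys on the PEER cluster;
        the first entry must be "" (the start of the key space).
    """
    by_region = defaultdict(list)
    for seq_id, row, value in edits:
        # Pick the rightmost region whose start key is <= the row key.
        idx = bisect_right(region_start_keys, row) - 1
        # Appending in input order preserves WAL order within each region.
        by_region[region_start_keys[idx]].append((seq_id, row, value))
    return dict(by_region)


edits = [(1, "apple", "a"), (2, "pear", "b"), (3, "banana", "c"), (4, "zebra", "d")]
peer_regions = ["", "m"]  # the peer happens to have split at "m"
result = split_edits_by_region(edits, peer_regions)
```

Because the lookup uses the peer's boundaries, it does not matter whether the two clusters
split at the same mid key.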

I am not sure what the current status of appends on HDFS is, but if we had that 100% working,
the master could just remember where in the WAL it had read up to and poll every x seconds
to see if there are any updates. Then we would not have to worry about waiting for a log to
roll, which could take a while in some cases. Waiting for a log to roll before the updates get
pushed to the peers seems like the wrong way to go, but it might be the only way we have now
if append is not working right in HDFS.
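
The "remember the position and poll" idea can be sketched as follows. This is a simplified
stand-in, not HDFS or HBase code: it assumes a local append-only file with one
newline-terminated edit per record, and the class name LogTailer is hypothetical. The point it
shows is that the reader keeps a byte offset and only consumes complete records, so a
half-written record is picked up on a later poll.

```python
import os
import tempfile


class LogTailer:
    """Reads new complete records from an append-only log on each poll."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # byte position we have read up to so far

    def poll(self):
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        # Only consume up to the last newline; a trailing partial record
        # stays unread until the writer finishes it.
        end = data.rfind(b"\n")
        if end == -1:
            return []
        complete = data[: end + 1]
        self.offset += len(complete)
        return complete.decode("utf-8").splitlines()


def demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        tailer = LogTailer(path)
        with open(path, "ab") as f:
            f.write(b"edit-1\nedit-2\nedi")  # last record is incomplete
        first = tailer.poll()
        with open(path, "ab") as f:
            f.write(b"t-3\n")  # writer finishes the record
        second = tailer.poll()
        third = tailer.poll()  # nothing new since the last poll
        return first, second, third
    finally:
        os.remove(path)
```

In a real deployment the poll would run every x seconds, and the offset would need to be
persisted so that a restarted master can resume where it left off.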

As for the first sync to a peer, it would be a huge saving if we could put the regions into
read-only mode on a rolling basis: flush the memcache, copy the needed files, unlock the region,
and start the transfer to the peer. This would allow the regions to be copied to the remote
site one by one, with site-to-site bandwidth as the only bottleneck. In the meantime, the peer
could hold incoming edits, wait for all regions to be copied, and then start replaying the
logs, skipping any edit that is older than the timestamp of the copy. I think that timestamp
could be written into the hfile now as metadata.
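
The replay step on the peer can be sketched like this. It is an illustration only: the function
name, the edit tuple layout, and the per-region dictionary of copy timestamps are assumptions,
not HBase structures. Edits stamped exactly at the copy time are applied rather than skipped,
which is a choice made here on the assumption that re-applying a cell with the same timestamp
is harmless.

```python
def replay_buffered_edits(buffered_edits, copy_timestamps):
    """Return the buffered edits that still need to be applied on the peer.

    buffered_edits: list of (region, timestamp, row_key, value) tuples held
        by the peer while the region files were being copied (hypothetical).
    copy_timestamps: dict mapping region -> timestamp at which that region's
        files were flushed and copied.
    """
    to_apply = []
    for region, ts, row, value in buffered_edits:
        # An edit older than the copy is already contained in the copied files.
        if ts < copy_timestamps[region]:
            continue
        to_apply.append((region, ts, row, value))
    return to_apply


buffered = [
    ("r1", 90, "a", "old"),    # before r1 was copied at 100: skip
    ("r1", 100, "b", "edge"),  # at the copy time: apply
    ("r2", 150, "x", "old"),   # before r2 was copied at 200: skip
    ("r2", 250, "y", "new"),   # after the copy: apply
]
applied = replay_buffered_edits(buffered, {"r1": 100, "r2": 200})
```

Keeping one copy timestamp per region matters because the regions are copied one by one, so
each has a different cut-off.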

Just some suggestions and/or other thoughts

> Federated HBase
> ---------------
>                 Key: HBASE-1295
>                 URL: https://issues.apache.org/jira/browse/HBASE-1295
>             Project: Hadoop HBase
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>         Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
> HBase should consider supporting a federated deployment where someone might have terascale
> (or beyond) clusters in more than one geography and would want the system to handle replication
> between the clusters/regions. It would be sweet if HBase had something on the roadmap to sync
> between replicas out of the box.
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally scoped
> edits to peer clusters. The HBase/Bigtable data model has convenient features (timestamps,
> multiversioning) such that simple exchange of globally scoped cells would be conflict free
> and would "just work". Implementation effort here would be in producing an efficient mechanism
> for collecting up edits from all the HRS and transmitting the edits over the network to peers
> where they would then be split out to the HRS there. Holding on to the edit trace and tracking
> it until the remote commits succeed would also be necessary. So, HLog is probably the right
> place to set up the tee. This would be filtered log shipping, basically.
> This proposal does not consider transactional tables. For transactional tables, enforcement
> of global mutation commit ordering would come into the picture if the user wants the transaction
> to span the federation. This should be an optional feature even with transactional tables
> themselves being optional because of how slow it would be.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
