hbase-issues mailing list archives

From "ryan rawson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2315) BookKeeper for write-ahead logging
Date Fri, 12 Mar 2010 19:39:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844635#action_12844635 ]

ryan rawson commented on HBASE-2315:
------------------------------------

A few more questions...

What's the persistence story? How many nodes is the log data stored on?

On performance, how about 100-200k ops/sec with data sizes of about 150
bytes? This load would be generated in aggregate across 20 nodes.
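
For scale, the load sketched in the question works out as follows (a back-of-the-envelope calculation using the numbers quoted above, not a measurement):

```python
# Hypothetical aggregate load from the question: 100-200k ops/s of
# ~150-byte entries, generated across 20 nodes. Upper end of the range:
ops_per_sec = 200_000
entry_bytes = 150
nodes = 20

total_bytes_per_sec = ops_per_sec * entry_bytes        # aggregate payload rate
per_node_bytes_per_sec = total_bytes_per_sec / nodes   # share per generating node
mbits_per_sec = total_bytes_per_sec * 8 / 1e6          # aggregate, in Mbit/s

print(f"aggregate: {mbits_per_sec:.0f} Mbit/s "
      f"({per_node_bytes_per_sec / 1e6:.2f} MB/s per node)")
```

At the top of the range this is roughly 30 MB/s aggregate, i.e. about 240 Mbit/s before replication, which is modest next to the 1k-entry saturation numbers discussed below.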

On Mar 12, 2010 2:18 PM, "Flavio Paiva Junqueira (JIRA)" <jira@apache.org>
wrote:


   [ https://issues.apache.org/jira/browse/HBASE-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12844627#action_12844627 ]


Flavio Paiva Junqueira commented on HBASE-2315:
-----------------------------------------------
Those are good questions, Stack. BookKeeper scales throughput with the
number of servers, so adding more bookies should increase your current
throughput if your traffic is not saturating the network already.
Consequently, if you don't have a constraint on the number of bookies you'd
like to use, your limitation would be the amount of bandwidth you have
available.

Just to give you some numbers: so far we have been able to saturate the
network when writing around 1k bytes per entry, and the number of writes/s
for 1k writes is on the order of tens of thousands for 3-5 bookies. Now, if
I take the largest numbers in your small example as the worst case (5 nodes,
1MB writes, 5k writes/s), we would need a 40Gbit/s network. I'm not sure you
can do that with any distributed system unless you write locally on all
nodes, in which case you can't guarantee the data will be available after a
node crash. Let me know if I'm misinterpreting your comment and you have
something else in mind.
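
The 40Gbit/s figure above follows directly from the quoted worst case (taking 1MB as a decimal megabyte; a quick arithmetic check, not a benchmark):

```python
# Worst case quoted in the thread: 5k writes/s of 1 MB entries.
entry_bytes = 1_000_000   # 1 MB per write (decimal MB)
writes_per_sec = 5_000

bytes_per_sec = entry_bytes * writes_per_sec    # 5 GB/s of log payload
gbits_per_sec = bytes_per_sec * 8 / 1e9         # convert to Gbit/s

print(f"required bandwidth: {gbits_per_sec:.0f} Gbit/s")  # -> 40 Gbit/s
```

Note this counts only one copy of the payload; replicating each entry to multiple bookies multiplies the bandwidth requirement accordingly.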

I should also mention that we added a feature to enable thousands of
concurrent ledgers with minimal performance penalty on writes, so I don't
see any trouble in increasing the number of concurrent nodes as long as the
BookKeeper cluster is provisioned accordingly. It would be great to measure
this with HBase, though.





> BookKeeper for write-ahead logging
> ----------------------------------
>
>                 Key: HBASE-2315
>                 URL: https://issues.apache.org/jira/browse/HBASE-2315
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: regionserver
>            Reporter: Flavio Paiva Junqueira
>         Attachments: HBASE-2315.patch, zookeeper-dev-bookkeeper.jar
>
>
> BookKeeper, a contrib of the ZooKeeper project, is a fault-tolerant, high-throughput
write-ahead logging service. This issue provides an implementation of write-ahead logging
for HBase using BookKeeper. Apart from the expected throughput improvements, BookKeeper also has
stronger durability guarantees than the implementation currently used by HBase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

