hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: multi-data center support
Date Thu, 03 May 2012 22:00:02 GMT
Well, if you need an MR job to aggregate the counters, maybe you have too
many data centers? :)

What I meant is that each physical counter should be tagged with the
data center it's in. You want to count page_views, so you'd have:

page_views_CA => 5
page_views_NY => 10
page_views_FL => 2

So on read you get those three and treat them as one, e.g. 5 + 10 + 2 = 17.
No need to do massive rollups unless you're really planning on having
thousands of data centers.
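
To make the idea concrete, here is a minimal sketch in plain Python. It is not the HBase client API: the Cluster class, its in-memory cells dict, and the replicate_to method are hypothetical stand-ins for a cluster, a row's columns, and replication. It assumes data center tags contain no underscore. Each cluster only ever increments its own tagged counter, so a replicated edit (which carries an absolute value, like a Put) never overwrites another cluster's increments.

```python
from collections import defaultdict


class Cluster:
    """Toy stand-in for one cluster in one data center (hypothetical)."""

    def __init__(self, dc):
        self.dc = dc
        self.cells = defaultdict(int)  # column name -> current value
        self.outbox = []               # pending edits: (column, absolute value)

    def increment(self, counter, amount=1):
        # Only the local, data-center-tagged column is ever incremented.
        column = f"{counter}_{self.dc}"
        self.cells[column] += amount
        # The edit is shipped as an absolute value, not as a delta.
        self.outbox.append((column, self.cells[column]))

    def replicate_to(self, *peers):
        # Apply every pending edit on each peer, in order, then clear.
        for column, value in self.outbox:
            for peer in peers:
                peer.cells[column] = value
        self.outbox.clear()

    def read(self, counter):
        # Sum all shards of the counter, one per data center.
        return sum(
            value
            for column, value in self.cells.items()
            if column.rsplit("_", 1)[0] == counter
        )


ca, ny, fl = Cluster("CA"), Cluster("NY"), Cluster("FL")
ca.increment("page_views", 5)
ny.increment("page_views", 10)
fl.increment("page_views", 2)

# Every cluster replicates to every other one.
ca.replicate_to(ny, fl)
ny.replicate_to(ca, fl)
fl.replicate_to(ca, ny)

print(ca.read("page_views"), ny.read("page_views"), fl.read("page_views"))
```

Once replication has caught up, every cluster holds all three shards and a read in any data center sums to the same total. Until then a read may lag, but no increment is lost, because no two clusters write the same column.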


On Thu, May 3, 2012 at 2:39 PM, Marco Villalobos
<mvillalobos@kineteque.com> wrote:
> Hence a counter should be local to a data-center, and perhaps a map
> reduce job can aggregate them later, then replicate?
> I hope something like that works.
> On Thu, May 3, 2012 at 1:23 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> Since 0.92 you can replicate in a Master-Master fashion if you want,
>> just set each cluster to be the slave of the other, but it won't work
>> for counters. The reason is that a counter is a "Put" in the end with
>> a specific value.
>> This issue is described here: https://issues.apache.org/jira/browse/HBASE-2804
>> One way to solve it is to shard your counters, on read you just sum them up.
>> J-D
>> On Thu, May 3, 2012 at 1:14 PM, Marco Villalobos
>> <mvillalobos@kineteque.com> wrote:
>>> I'm fine with replication.
>>> But does that mean I can only write from one data-center?
>>> Ideally I would want counters to work across data-center, with the
>>> correct increment eventually merging.
>>> On Thu, May 3, 2012 at 11:26 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>>>> A single HBase instance doesn't work across datacenters, maybe that's
>>>> why you haven't found any documentation.
>>>> HBase does have replication between clusters, see
>>>> http://hbase.apache.org/replication.html
>>>> J-D
>>>> On Thu, May 3, 2012 at 11:10 AM, Marco Villalobos
>>>> <mvillalobos@kineteque.com> wrote:
>>>>> I have not found any documentation on how hbase would work across
>>>>> multiple data-centers.
>>>>> In fact, I am concerned about how a centralized zookeeper would make
>>>>> multi-data center support impossible.
>>>>> How is this handled?  What if somebody needs to read and write from
>>>>> multiple data-centers?
>>>>> Any advice?
