hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: Silly question... Coprocessor writes to ZK?
Date Fri, 05 Sep 2014 19:56:14 GMT

Hmmm. Interesting. 

Let’s take a step back for a second.
Which do you prefer: a push model or a pull/poll model?

That would kind of dictate the decisions you would make in terms of design. 

To your point about moving away from ZK, it would mean putting those features into HBase
directly. It can be done, but if you won’t fix coprocessors, then why re-invent the wheel
for a system that really isn’t a standalone system like an Oracle database? (Look at it
this way… when was the last time you installed an HBase instance where you didn’t have
HDFS?)

So if you did actually do this, the features found in ZK would probably be tossed into the
HMaster and you would end up requiring a quorum of HMasters, just like you have in ZK.
The downside… by keeping ZK you have the ability to interact with other systems that
also use ZK, so you could potentially evolve into a system where you can run a single query
against multiple data sources?
(Someone is doing that, right?)

And if you think about it… a push model would be more efficient and if you’re going to
have the RS push it to something… it would most likely be ZK. 

Sorry, I’m just being lazy…


PS. On the Ganglia topic… why not just store the data in HBase and have Ganglia or your
own D3 view hit HBase instead?

On Sep 5, 2014, at 7:17 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:

> I guess my thought is that it'd be nice to minimize dependency on ZK,
> and eventually remove it altogether.  It just adds too much
> deployment complexity, and code complexity -- about 10000 lines of
> code.
> I do like the notion of HBase self-hosting its own performance data,
> it's what Oracle and other databases do.  Ganglia is annoying to
> install, and often isn't.
> On Fri, Sep 5, 2014 at 11:10 AM, Michael Segel
> <michael_segel@hotmail.com> wrote:
>> @Ted,
>> Yes, that’s the general idea or rather a specific use case for what I was thinking.
>> So it would be a different mechanism to help manage the information.
>> I would think that it would result in faster access to the information.
>> This would be very important if one were to do some query optimization… and by
>> using ZK… you could think beyond just HBase, to doing a query that joins data from both HBase
>> and non-HBase systems.
>> Just a thought… ;-)
>> -Mike
>> On Sep 5, 2014, at 2:29 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>>> This reminds me of
>>> HBASE-7958 Statistics per-column family per-region
>>> Cheers
>>> On Thu, Sep 4, 2014 at 6:23 PM, Mikhail Antonov <olorinbant@gmail.com>
>>> wrote:
>>>> I think ZK isn't the best possible storage for statistics. A separate stats
>>>> table may be a better solution.
>>>> -Mikhail
>>>> 2014-09-04 15:48 GMT-07:00 Michael Segel <michael_segel@hotmail.com>:
>>>>> So suppose I want to capture metadata about a table across all of the
>>>>> regions for that table.
>>>>> Has anyone used a coprocessor to capture a region’s statistics and
>>>>> push them up to ZK where it’s stored by (table, region, <metadata object>)
>>>>> then a table wide value is also stored based on a computational update?
>>>>> So if I wanted to store the row counts for each region of a table, each
>>>>> region would update its record in ZK on each insert / delete (can you
>>>>> easily remove a tombstone?) and then update the computational value?
>>>>> (Assuming you could lock those values for a short enough time to do the
>>>>> quick computation. If not, then it can be computed on the fly)
>>>>> Has this been done?
>>>>> Thoughts?
>>>> --
>>>> Thanks,
>>>> Michael Antonov
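
For readers following the thread, here is a rough sketch of the push model from the original question: each region pushes a delta on insert/delete, the store keeps a per-(table, region, stat) value, and a table-wide value is either maintained on each push or computed on the fly. This is only an illustration under loud assumptions: StatsStore and all of its method names are hypothetical, a plain in-memory dict stands in for ZK (or for the separate stats table suggested above), and nothing here calls a real HBase coprocessor or ZooKeeper API. It also ignores tombstones and the write load a real coordination service would see.

```python
from collections import defaultdict
from threading import Lock


class StatsStore:
    """Hypothetical in-memory stand-in for ZK or a stats table.

    Values are keyed by (table, region, stat), mirroring the
    (table, region, <metadata object>) layout described in the thread.
    """

    def __init__(self):
        self._lock = Lock()
        self._region_stats = defaultdict(int)  # (table, region, stat) -> value
        self._table_totals = defaultdict(int)  # (table, stat) -> value

    def push(self, table, region, stat, delta):
        """Push model: a region reports +1 per insert, -1 per delete."""
        with self._lock:
            # Update the region's record and the table-wide value together,
            # holding the lock only for the short computation.
            self._region_stats[(table, region, stat)] += delta
            self._table_totals[(table, stat)] += delta

    def region_value(self, table, region, stat):
        with self._lock:
            return self._region_stats.get((table, region, stat), 0)

    def table_value(self, table, stat):
        """Table-wide value maintained incrementally on every push."""
        with self._lock:
            return self._table_totals.get((table, stat), 0)

    def table_value_on_the_fly(self, table, stat):
        """Alternative: compute the table-wide value by summing region records."""
        with self._lock:
            return sum(
                value
                for (t, _region, s), value in self._region_stats.items()
                if t == table and s == stat
            )


# Example: two regions of a made-up table "orders" report row-count changes.
store = StatsStore()
for _ in range(3):
    store.push("orders", "region-a", "row_count", +1)  # three inserts
store.push("orders", "region-b", "row_count", +1)      # one insert
store.push("orders", "region-a", "row_count", -1)      # one delete

print(store.region_value("orders", "region-a", "row_count"))
print(store.table_value("orders", "row_count"))
```

Keeping the incremental total and the on-the-fly sum side by side shows the trade-off raised in the question: the first needs a short lock on every write, the second pays the cost at read time instead.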
