hbase-dev mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject Re: [DISCUSS] Multi-Cluster HBase Client
Date Tue, 30 Jun 2015 14:55:57 GMT
Sean, 

You’re a developer, or just some higher level primate that pounds code? 

I don’t want to embarrass you, but what do they teach in engineering schools these days?


An ‘elevator pitch’ is more than just some sound bites. The 30 second pitch is a high
level brief or an executive summary that is supposed to catch the attention of your audience.
 Good luck getting the attention of your business sponsor to approve your project if you can’t
explain it quickly and concisely. The reason I asked for it is that what Ted wrote had two
different connotations. 

But I digress. 

As you can see from the ensuing messages in this thread, the clarification that I requested
was needed.  (The Big Table presentation wasn’t talking about what Ted wanted to do.) 

Which brings us to the point of my questioning to Ted.

While it’s great that Ted is thinking, he needs to think about the problem and why what he
is asking for is not such a great idea. 

Ted,  
I would have to say that trying to create a fault tolerant client for connecting to paired
clusters won’t work well in practice and here’s why:

Draw out a pretty picture. You have Cluster A represented by a cloud on the left hand side.
 You have cluster B represented on the right hand side. You have an arrow going from Cluster
A to Cluster B. This represents the flow of the data replication.   Now draw below a stick
figure, or a circle to represent the client.  Now draw two bidirectional arrows between the
client and the two clusters.  You have a triangle.   So you can have either an Active/Active
or Active / Passive scenario.   

If you have Active/Passive, then you’re not really going to want to serve data from Cluster
B until Cluster A fails.  If you have an Active / Active, then you have both clusters serving
data.  Which gets us back to another point… 
You have to have another box or circle representing the data ingestion source.  You then draw
the line to cluster A.  (You said replication to Cluster B)  So now you have the following
scenario…. 

Data flows into cluster A; it’s then replicated to cluster B. That replication doesn’t happen
instantaneously. So there is some time delta where data exists in one cluster, but not the
other cluster.  So that if you did a set of simultaneous scans or ran a server side function
(using coprocessors) you would not be guaranteed the same result.  (Ooops!) 
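
As a rough illustration of that window, here is a toy Python sketch. It is not HBase code: `Cluster` and `LaggedReplication` are made-up names, and replication is modelled as a queue that is only applied when `ship()` is called.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a cluster: just a dict of row -> value."""
    name: str
    rows: dict = field(default_factory=dict)

    def scan(self):
        return sorted(self.rows.items())


class LaggedReplication:
    """Queues edits written to the source and applies them to the sink later."""

    def __init__(self, source, sink):
        self.source = source
        self.sink = sink
        self.pending = []  # edits written to source but not yet shipped

    def put(self, row, value):
        self.source.rows[row] = value
        self.pending.append((row, value))

    def ship(self):
        """Apply all queued edits to the sink, as replication eventually would."""
        for row, value in self.pending:
            self.sink.rows[row] = value
        self.pending.clear()


a, b = Cluster("A"), Cluster("B")
repl = LaggedReplication(a, b)
repl.put("row1", "v1")
repl.ship()
repl.put("row2", "v2")  # written to A, not yet shipped to B

# Two "simultaneous" scans taken inside the replication window disagree.
scan_a, scan_b = a.scan(), b.scan()
print(scan_a == scan_b)

repl.ship()  # replication catches up
print(a.scan() == b.scan())
```

The two scans only agree after the queued edit has been shipped, which is the time delta described above.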

Eventual consistency between the two clusters will kill you.  

The other issue… what constitutes a failure so that you don’t want to connect to Cluster
A, but to Cluster B? 

The safest thing is to fail, recognize that you failed, and then connect to cluster
B. 
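
A minimal sketch of that fail-then-switch behaviour, in Python and under loud assumptions: `ToyConnection`, `ClusterDown` and `FailoverClient` are hypothetical names, and "failure" is reduced to a single exception, which sidesteps the hard question of what really constitutes a failure.

```python
class ClusterDown(Exception):
    """Raised by the toy connection when its cluster cannot serve a request."""


class ToyConnection:
    """Hypothetical connection to one cluster, backed by a plain dict."""

    def __init__(self, name, rows, up=True):
        self.name = name
        self.rows = rows
        self.up = up

    def get(self, row):
        if not self.up:
            raise ClusterDown(self.name)
        return self.rows.get(row)


class FailoverClient:
    """Uses the primary until a request fails, then switches to the standby."""

    def __init__(self, primary, standby):
        self.active = primary
        self.standby = standby
        self.failed_over = False

    def get(self, row):
        try:
            return self.active.get(row)
        except ClusterDown:
            if self.failed_over:
                raise  # already on the standby; nothing left to try
            # Recognize the failure explicitly, then connect to the other cluster.
            self.active, self.failed_over = self.standby, True
            return self.active.get(row)


cluster_a = ToyConnection("A", {"row1": "v1"})
cluster_b = ToyConnection("B", {"row1": "v1"})
client = FailoverClient(cluster_a, cluster_b)

first = client.get("row1")   # served by A
cluster_a.up = False
second = client.get("row1")  # A fails, the client switches to B
print(first, second, client.active.name)
```

Note that the sketch never switches back to cluster A on its own; when and how to fail back is a separate decision.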

The problem you face is that you have a lot of variables working against you.  Are the clusters
homogeneous?  Are the pipes to either data center the same?  Is the pipe between each data
center large enough?  (e.g. you may have a 40Gb/s pipe between data centers, but if you’ve
got a lot of systems pushing data over the network… you can still see delays between the
two clusters, even assuming that the data centers are only 30km - 50km apart (or less) so that
you have a minimal amount of network latency.) 

At the same time… suppose you have your data source write to both clusters simultaneously
and you’re serving data from both clusters simultaneously. 
You may still have small windows where the clusters’ results will not be consistent. (But
let’s assume that they are consistent.) We now have your redundant client. You would therefore
run the same scan across both data centers. Assuming that you get the same results, you take
the first data set back and use it. But you’re constantly using resources on the second
cluster and wasting them.  

You’re suggesting a generic client and that means that you’re wasting cluster resources
all the time. 

If you had a specific application… then your solution is a one-off solution. So here, you
might as well just write a multi-threaded client where each connection is thread safe and
just overload the scan call so that it passes the scan to both connections. 
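
Something along these lines, as a toy Python sketch rather than real HBase client code. `make_scanner` and `scan_both` are hypothetical helpers, and each cluster is faked by a function that sleeps to mimic latency.

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def make_scanner(rows, delay_s):
    """Return a toy scan function that sleeps to mimic cluster latency."""
    def scan():
        time.sleep(delay_s)
        return sorted(rows.items())
    return scan


def scan_both(scanners):
    """Run every scan in parallel and return (name, rows) of the first to finish.

    The slower scans still run to completion in the background, which is
    the wasted cluster work described above.
    """
    pool = ThreadPoolExecutor(max_workers=len(scanners))
    futures = {pool.submit(scan): name for name, scan in scanners.items()}
    done, _ = wait(list(futures), return_when=FIRST_COMPLETED)
    pool.shutdown(wait=False)  # do not block on the slower cluster
    winner = next(iter(done))
    return futures[winner], winner.result()


data = {"row1": "v1"}
name, rows = scan_both({
    "A": make_scanner(data, 0.01),  # fast cluster
    "B": make_scanner(data, 0.2),   # slow cluster
})
print(name, rows)
```

The caller gets the faster answer, but the scan against the slower cluster is paid for on every call.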

HTH




> On Jun 29, 2015, at 3:09 PM, Sean Busbey <busbey@cloudera.com> wrote:
> 
> Michael,
> 
> This is the dev list, no sound-bite pitch is needed. We have plenty of
> features that take time to explain the nuance. Please either engage with
> the complexity of the topic or wait for the feature to land and get
> user-accessible documentation. We all get busy from time to time, but
> that's no reason to push a higher burden on those who are currently engaged
> with a particular effort, especially this early in development.
> 
> That said, the first paragraph gives a suitable brief motivation (slightly
> rephrased below):
> 
>> Some applications require response and availability SLAs that a single
>> HBase cluster cannot meet alone. Particularly for high percentiles,
>> queries to a single cluster can be delayed by e.g. GC pauses, individual
>> server process failure, or maintenance activity. By providing clients
>> with a transparent multi-cluster configuration option we can avoid these
>> outlier conditions by masking these failures from applications that are
>> tolerant to weaker consistency guarantees than HBase provides out of the
>> box.
> 
> 
> Ted,
> 
> Thanks for writing this up! We'd prefer to keep discussion of it on the
> mailing list, so please avoid moving to private webex's.
> 
> Would you mind if I or one of the other community members converted the
> design doc to pdf so that it's more accessible?
> 
> 
> 
> On Mon, Jun 29, 2015 at 4:52 PM, Ted Malaska <ted.malaska@cloudera.com>
> wrote:
> 
>> Why don't we set up a webex to talk out the detail.  What times r u open to
>> talk this week.
>> 
>> But to answer your questions.  This is for active active and active
>> failover clusters.  There is a primary and n number of fail overs per
>> client.  This is for gets and puts.
>> 
>> There r a number of configs in the doc to define how to failover.  The
>> options allow a couple different use cases.  There is a lot of detail in
>> the doc and I just didn't want to put it all in the email.
>> 
>> But honestly I put a lot of time in the doc.   I would love to know what u
>> think.
>> On Jun 29, 2015 5:46 PM, "Michael Segel" <michael_segel@hotmail.com>
>> wrote:
>> 
>>> Ted,
>>> 
>>> If you can’t do a 30 second pitch, then it’s not worth the effort. ;-)
>>> 
>>> Look, when someone says that they want to have a single client talk to
>>> multiple HBase clusters, that could mean two very different things.
>>> First, you could mean that you want a single client to connect to an
>>> active/active pair of HBase clusters where they replicate to each other.
>>> (Active / Passive would also be implied, but then you have the issue of
>>> when does the passive cluster go active? )
>>> 
>>> Then you have the issue of someone wanting to talk to multiple different
>>> clusters so that they can query the data, create local data sets which they
>>> wish to join, combining data from various sources.
>>> 
>>> The second is a different problem from the first.
>>> 
>>> -Mike
>>> 
>>>> On Jun 29, 2015, at 3:38 PM, Ted Malaska <ted.malaska@cloudera.com>
>>> wrote:
>>>> 
>>>> Hey Michael,
>>>> 
>>>> Read the doc please.  It goes through everything at a low level.
>>>> 
>>>> Thanks
>>>> Ted Malaska
>>>> 
>>>> On Mon, Jun 29, 2015 at 4:36 PM, Michael Segel <michael_segel@hotmail.com>
>>>> wrote:
>>>> 
>>>>> No down time?
>>>>> 
>>>>> So you want a client to go against a pair of active/active hbase instances
>>>>> on tied clusters?
>>>>> 
>>>>> 
>>>>>> On Jun 29, 2015, at 3:20 PM, Ted Malaska <ted.malaska@cloudera.com>
>>>>> wrote:
>>>>>> 
>>>>>> Hey Michael,
>>>>>> 
>>>>>> The use case is simple "No down time use cases" even in the case of site
>>>>>> failure.
>>>>>> 
>>>>>> Now on this statement
>>>>>> "Why not simply manage each connection/context via a threaded child?"
>>>>>> 
>>>>>> That is the point, to make that simple, tested, easy, and transparent for
>>>>>> HBase users.
>>>>>> 
>>>>>> Ted Malaska
>>>>>> 
>>>>>> On Mon, Jun 29, 2015 at 4:11 PM, Michael Segel <michael_segel@hotmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> So if I understand your goal, you want a client who can connect to one or
>>>>>>> more hbase clusters at the same time…
>>>>>>> 
>>>>>>> Ok, so let’s walk through the use case and help me understand a couple of
>>>>>>> use cases for this…
>>>>>>> 
>>>>>>> Why not simply manage each connection/context via a threaded child?
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>> On Jun 29, 2015, at 1:48 PM, Ted Malaska <ted.malaska@cloudera.com>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> Hey Dev List,
>>>>>>>> 
>>>>>>>> 
>>>>>>>> My name is Ted Malaska, long time lover and user of HBase. I would like to
>>>>>>>> discuss adding in a multi-cluster client into HBase. Here is the link for
>>>>>>>> the design doc (
>>>>>>>> https://github.com/tmalaska/HBase.MCC/blob/master/MultiHBaseClientDesignDoc.docx%20(1).docx
>>>>>>>> )
>>>>>>>> but I have pulled some parts into this main e-mail to give you a high level
>>>>>>>> understanding of its scope.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> *Goals*
>>>>>>>> 
>>>>>>>> The proposed solution is a multi-cluster HBase client that relies on the
>>>>>>>> existing HBase Replication functionality to provide an eventually consistent
>>>>>>>> solution in cases of primary cluster down time.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> https://github.com/tmalaska/HBase.MCC/blob/master/FailoverImage.png
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Additional goals are:
>>>>>>>> 
>>>>>>>> - Be able to switch from a single HBase cluster to the Multi-HBase Client
>>>>>>>>   with limited or no code changes.  This means using the HConnectionManager,
>>>>>>>>   Connection, and Table interfaces to hide complexities from the developer
>>>>>>>>   (Connection and Table are the new classes for HConnection, and
>>>>>>>>   HTableInterface in HBase version 0.99).
>>>>>>>> - Offer thresholds to allow developers to decide between degrees of
>>>>>>>>   strongly consistent and eventually consistent.
>>>>>>>> - Support N number of linked HBase Clusters
>>>>>>>> 
>>>>>>>> 
>>>>>>>> *Read-Replicas*
>>>>>>>> Also note this is in alignment with Read-Replicas and can work with that.
>>>>>>>> This client is multi-cluster where Read-Replicas help us to be multi Region
>>>>>>>> Server.
>>>>>>>> 
>>>>>>>> *Replication*
>>>>>>>> You will also see in the document that this works with current replication
>>>>>>>> and requires no changes to it.
>>>>>>>> 
>>>>>>>> *Only a Client change*
>>>>>>>> You will also see in the doc this is only a new client. Which means no
>>>>>>>> extra code for the end developer, only additional configs to set it up.
>>>>>>>> 
>>>>>>>> *Github*
>>>>>>>> This is a github project that shows that this works at:
>>>>>>>> https://github.com/tmalaska/HBase.MCC
>>>>>>>> Note this is only a prototype. When adding it to HBase we will use it as a
>>>>>>>> starting point but there will be changes.
>>>>>>>> 
>>>>>>>> *Initial Results:*
>>>>>>>> 
>>>>>>>> Red is where our primary cluster has failed and you will see from the
>>>>>>>> bottom two graphs that our puts, deletes, and gets are not interrupted.
>>>>>>>> 
>>>>>>>> https://github.com/tmalaska/HBase.MCC/blob/master/AveragePutTimeWithMultiRestartsAndShutDowns.png
>>>>>>>> 
>>>>>>>> Thanks
>>>>>>>> Ted Malaska
>>>>>>> 
>>>>>>> The opinions expressed here are mine, while they may reflect a cognitive
>>>>>>> thought, that is purely accidental.
>>>>>>> Use at your own risk.
>>>>>>> Michael Segel
>>>>>>> michael_segel (AT) hotmail.com
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> The opinions expressed here are mine, while they may reflect a cognitive
>>>>> thought, that is purely accidental.
>>>>> Use at your own risk.
>>>>> Michael Segel
>>>>> michael_segel (AT) hotmail.com
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>> 
>>> The opinions expressed here are mine, while they may reflect a cognitive
>>> thought, that is purely accidental.
>>> Use at your own risk.
>>> Michael Segel
>>> michael_segel (AT) hotmail.com
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Sean

The opinions expressed here are mine, while they may reflect a cognitive thought, that is
purely accidental. 
Use at your own risk. 
Michael Segel
michael_segel (AT) hotmail.com





