hbase-dev mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Retry HTable.put() on client-side to handle temp connectivity problem
Date Mon, 27 Jun 2011 22:23:27 GMT
If I could override the default, I'd be a hesitant +1. I'd rather see
the default be something like retry 10 times, then throw an error.
With one option being infinite retries.
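A default like the one described here (retry N times, then throw, with an opt-in for infinite retries) could be sketched as below. This is a hypothetical helper, not code from HBase or Flume; the class and method names are made up for illustration, with a negative `maxRetries` standing in for the "infinite retries" option:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryingWriter {

    /**
     * Runs the given write operation, retrying on failure.
     *
     * @param op          the write to attempt (e.g. a wrapped HTable.put)
     * @param maxRetries  retries allowed after the first failure;
     *                    a negative value means retry indefinitely
     * @param pauseMillis pause between attempts
     * @return true if the write eventually succeeded
     */
    public static boolean writeWithRetries(Callable<Void> op, int maxRetries,
                                           long pauseMillis)
            throws InterruptedException {
        int attempt = 0;
        while (true) {
            try {
                op.call();
                return true;                  // write succeeded
            } catch (Exception e) {
                if (maxRetries >= 0 && attempt >= maxRetries) {
                    return false;             // retries exhausted, caller decides
                }
                attempt++;
                Thread.sleep(pauseMillis);    // back off before the next attempt
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate a sink whose first two puts fail, then succeed.
        final int[] failuresLeft = {2};
        Callable<Void> flakyPut = () -> {
            if (failuresLeft[0]-- > 0) {
                throw new IOException("cluster temporarily unreachable");
            }
            return null;
        };
        System.out.println(writeWithRetries(flakyPut, 10, 1L)); // prints "true"
    }
}
```

With `maxRetries = -1` the loop never gives up, which matches the "just keep trying" default discussed below.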

-Joey

On Mon, Jun 27, 2011 at 2:21 PM, Stack <stack@duboce.net> wrote:
> I'd be fine with changing the default in hbase so clients just keep
> trying.  What do others think?
> St.Ack
>
> On Mon, Jun 27, 2011 at 1:56 PM, Alex Baranau <alex.baranov.v@gmail.com> wrote:
>> The code I pasted works for me: it reconnects successfully. I just thought
>> it might not be the best way to do it. I realized that by using HBase
>> configuration properties we could just say that it's up to the user to
>> configure the HBase client (created by Flume) properly (e.g. by adding an
>> hbase-site.xml with the settings to the classpath). On the other hand, it
>> looks to me like users of the HBase sink will *always* want it to retry
>> writing to HBase until it works. But the default configuration does not
>> work this way: the sink stops when HBase is temporarily down or
>> inaccessible. Hence it makes using the sink more complicated (because the
>> default configuration sucks), which I'd like to avoid here by adding the
>> code above. Ideally the default configuration should work best for the
>> general-purpose case.
>>
>> I understand now what the ways to implement/configure such behavior are. I
>> think we should discuss what the best default behavior is, and whether we
>> need to let the user override it, on the Flume ML (or directly at
>> https://issues.cloudera.org/browse/FLUME-685).
>>
>> Thank you guys,
>>
>> Alex Baranau
>> ----
>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
>>
>>
>> On Mon, Jun 27, 2011 at 11:40 PM, Stack <stack@duboce.net> wrote:
>>
>>> Either should work, Alex.  Your version will go "forever".  Have you
>>> tried yanking hbase out from under the client to see if it reconnects?
>>>
>>> Good on you,
>>> St.Ack
>>>
>>> On Mon, Jun 27, 2011 at 1:33 PM, Alex Baranau <alex.baranov.v@gmail.com>
>>> wrote:
>>> > Yes, that's what's intended, I think. To make the whole picture clear,
>>> > here's the context:
>>> >
>>> > * there's Flume's HBase sink (read: an HBase client) which writes data
>>> > from the Flume "pipe" (read: some event-based message source) to an
>>> > HTable;
>>> > * when HBase is down for some time (with the default HBase configuration
>>> > on Flume's sink side), HTable.put throws an exception and the client
>>> > exits (it usually takes ~10 min to fail);
>>> > * Flume is smart enough to accumulate the data to be written reliably if
>>> > the sink behaves badly (not writing for some time, pauses, etc.), so it
>>> > would be great if the sink kept trying to write data until HBase is up
>>> > again, BUT:
>>> > * here, since we have a complete "failure" of the sink process (the
>>> > thread needs to be restarted), the data never reaches the HTable even
>>> > after the HBase cluster is brought up again.
>>> >
>>> > So you suggest, instead of this extra construction around HTable.put,
>>> > using the configuration properties "hbase.client.pause" and
>>> > "hbase.client.retries.number"? I.e. making the number of retry attempts
>>> > (reasonably) close to "retry forever". Is that what you meant?
>>> >
>>> > Thank you,
>>> > Alex Baranau
>>> > ----
>>> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
>>> >
>>> > On Mon, Jun 27, 2011 at 11:16 PM, Ted Yu <yuzhihong@gmail.com> wrote:
>>> >
>>> >> This would retry indefinitely, right?
>>> >> Normally a maximum retry duration would govern how long the retry is
>>> >> attempted.
>>> >>
>>> >> On Mon, Jun 27, 2011 at 1:08 PM, Alex Baranau <alex.baranov.v@gmail.com
>>> >> >wrote:
>>> >>
>>> >> > Hello,
>>> >> >
>>> >> > Just wanted to confirm that I'm doing things in a proper way here.
>>> >> > How about this code to handle temporary cluster connectivity
>>> >> > problems (or cluster downtime) on the client side?
>>> >> >
>>> >> > +    // HTable.put() fails with an exception if the connection to the
>>> >> > +    // cluster is temporarily broken or the cluster is temporarily
>>> >> > +    // down. To be sure the data is written, we retry the write.
>>> >> > +    boolean dataWritten = false;
>>> >> > +    do {
>>> >> > +      try {
>>> >> > +        table.put(p);
>>> >> > +        dataWritten = true;
>>> >> > +      } catch (IOException ioe) { // indicates a cluster connectivity
>>> >> > +                                  // problem (also thrown when the
>>> >> > +                                  // cluster is down)
>>> >> > +        LOG.error("Writing data to HBase failed, will try again in "
>>> >> > +            + RETRY_INTERVAL_ON_WRITE_FAIL + " sec", ioe);
>>> >> > +        try {
>>> >> > +          // Thread.sleep(), not Object.wait(): wait() without holding
>>> >> > +          // the monitor throws IllegalMonitorStateException.
>>> >> > +          Thread.sleep(RETRY_INTERVAL_ON_WRITE_FAIL * 1000L);
>>> >> > +        } catch (InterruptedException ie) {
>>> >> > +          Thread.currentThread().interrupt();
>>> >> > +          break; // stop retrying if interrupted
>>> >> > +        }
>>> >> > +      }
>>> >> > +    } while (!dataWritten);
>>> >> >
>>> >> > Thank you in advance,
>>> >> > Alex Baranau
>>> >> > ----
>>> >> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
>>> >> >
>>> >>
>>> >
>>>
>>
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
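
For reference, the configuration-based alternative discussed in the thread (tuning "hbase.client.pause" and "hbase.client.retries.number" instead of wrapping HTable.put) would look roughly like this in an hbase-site.xml on the sink's classpath. The values below are illustrative only, not recommendations from the thread:

```xml
<configuration>
  <!-- Pause between client retries, in milliseconds (illustrative value). -->
  <property>
    <name>hbase.client.pause</name>
    <value>2000</value>
  </property>
  <!-- Number of retries before the client gives up; a large value makes the
       client keep trying for a long time (illustrative value). -->
  <property>
    <name>hbase.client.retries.number</name>
    <value>100</value>
  </property>
</configuration>
```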
