hbase-issues mailing list archives

From "Nicolas Liochon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15436) BufferedMutatorImpl.flush() appears to get stuck
Date Wed, 30 Mar 2016 09:24:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15217714#comment-15217714 ]

Nicolas Liochon commented on HBASE-15436:

bq. There should be a cap size above which we should block the writes. We should not take more than this limit. Maybe something like 1.5 times the flush size.
We definitely want to take more than this limit, but maybe not as much as we're taking today (or maybe we want to be clearer about what these settings mean).
There is a limit, given by the number of tasks executed in parallel (hbase.client.max.total.tasks).
If I understand correctly, this setting is now per client (and not per HTable).
Ideally these parameters should be hidden from the user (i.e. the defaults are OK for a standard
client without tight memory constraints).
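For reference, the parallelism cap mentioned above is a client-side setting; a related per-server knob also exists. A minimal client-side hbase-site.xml fragment might look like this (the values here are illustrative, not recommended defaults):

```xml
<!-- Client-side hbase-site.xml: caps on in-flight mutate tasks.
     Values are illustrative examples, not recommendations. -->
<property>
  <name>hbase.client.max.total.tasks</name>
  <value>100</value> <!-- max concurrent tasks across the whole client -->
</property>
<property>
  <name>hbase.client.max.perserver.tasks</name>
  <value>2</value> <!-- max concurrent tasks per region server -->
</property>
```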

bq. How long we should wait? Whether we should come out faster?
IIRC, a long time ago the buffer was attached to the Table object, so the policy (or at least
the objective :-)) when one of the puts had failed (i.e. reached the max retry count) was
simple: all the operations currently in the buffer were considered failed as well, even
if we had not yet tried to send them. As a consequence, the buffer was empty after the failure
of a single put, and it was then up to the client to continue or not. Maybe we should do the
same with the buffered mutator, in all cases, close or not? I haven't looked at the BufferedMutator
code, but I can have a look if you wish [~anoop.hbase].
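The old buffer-clearing policy described above can be sketched with a toy model (this is NOT the actual HBase client code; the class name and String-based ops are purely illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/**
 * Toy model of the old HTable write-buffer policy: once a single
 * buffered put fails for good (max retries reached), every operation
 * still in the buffer is reported as failed too, and the buffer is
 * left empty so the caller can decide whether to continue.
 */
class ToyWriteBuffer {
    private final List<String> buffer = new ArrayList<>();

    void put(String op) {
        buffer.add(op);
    }

    int size() {
        return buffer.size();
    }

    /**
     * Flush the buffer; sendOk simulates the RPC outcome for each op.
     * On the first failure, that op and every op after it (tried or not)
     * are returned as failed; the buffer is always empty afterwards.
     */
    List<String> flush(Predicate<String> sendOk) {
        List<String> failed = new ArrayList<>();
        for (int i = 0; i < buffer.size(); i++) {
            if (!sendOk.test(buffer.get(i))) {
                // One hard failure: the rest of the buffer is failed too.
                failed.addAll(buffer.subList(i, buffer.size()));
                break;
            }
        }
        buffer.clear(); // buffer is empty after a flush, success or failure
        return failed;
    }
}
```

With this policy, a flush where "b" fails reports both "b" and the never-tried "c" as failed, and the buffer ends up empty either way.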

bq. What if we were doing a multi-Get to the META table to learn the region location for N mutations at a time.
It seems like a good idea. There are many possible optimisations in how we use meta, and this
is one of them.

> BufferedMutatorImpl.flush() appears to get stuck
> ------------------------------------------------
>                 Key: HBASE-15436
>                 URL: https://issues.apache.org/jira/browse/HBASE-15436
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 1.0.2
>            Reporter: Sangjin Lee
>         Attachments: hbaseException.log, threaddump.log
> We noticed an instance where the thread executing a flush ({{BufferedMutatorImpl.flush()}})
got stuck when the (local one-node) cluster shut down, and was unable to get out of that stuck state.
> The setup is a single-node HBase cluster, and apparently the cluster went away while the
client was executing flush. The flush eventually logged a failure after 30+ minutes of retrying.
That is understandable.
> What is unexpected is that the thread remains stuck in this state (i.e. in the {{flush()}} call).
I would have expected the {{flush()}} call to return after the complete failure.

This message was sent by Atlassian JIRA
