hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Ilyin <alexan...@weborama.com>
Subject Re: Spikes in HBase response time
Date Wed, 26 Oct 2016 10:17:55 GMT
I'm still struggling with the "slow" requests described above.

As a temporary solution, I want to limit request execution time and just
drop those messages that make HBase so long to respond. Could you advise
which timeout fits best for this task. I see the following options:
* hbase.rpc.timeout
* hbase.client.operation.timeout
* something else I missed?


On Tue, Oct 25, 2016 at 1:21 PM, Alexander Ilyin <alexander@weborama.com>
wrote:

> Hi Ted, Stack,
>
> thanks for the answers.
>
> We're not using Phoenix but the server process was running on some
> datanodes. We stopped it to be sure, nothing changed.
>
> Ted, I suppose you meant region server jstack traces, here it is:
> http://pastebin.com/CHzyrBp9
> http://pastebin.com/PdNczfVa
>
> Stack, our application pulls the messages from RabbitMQ queue and
> processes them using HBase so there is no dependency on external spikes in
> load. At least no obvious one.
> It looks like random requests taking much more time to complete that the
> others. It happens all the time starting Friday, not during some specific
> time windows, several percent of requests just slow.
> Region server GC pauses look normal -- less than 0.1 second every several
> seconds.
>
> The fact that most of "long" requests have execution time in the range of
> 800-900ms makes me think that we're hitting some timeout which is not
> logged properly. But this is just an idea.
>
>
> On Mon, Oct 24, 2016 at 7:20 PM, Stack <stack@duboce.net> wrote:
>
>> Any correlation with a spike in load -- more i/os, more traffic, other
>> tasks running on the box -- or a long GC. Is it random requests or over a
>> particular period all requests suffer the 800ms penalty?
>> Thanks,
>> St.Ack
>>
>> On Mon, Oct 24, 2016 at 5:44 AM, Alexander Ilyin <alexander@weborama.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I'm experiencing some spikes in HBase response time. The problem
>> appeared
>> > in one moment: before all the requests used to take 4-30ms to complete
>> and
>> > now some of them (several percent) take as long as 800ms decreasing
>> > significantly an average processing time.
>> >
>> > This happens with the requests to all the tables I have so it doesn't
>> seem
>> > to be related to some problem in data schema. And it didn't change
>> anyway.
>> >
>> > I don't see any problems with the lack of resources: there is enough
>> CPU,
>> > memory on datanodes.
>> >
>> > Automatic splitting is off, no cluster balancing is taking place so it
>> > shouldn't be the client retries as well.
>> >
>> > There are some stacktraces in datanode log but as far as I understand
>> these
>> > are normal and anyway they had existed before the problem appeared:
>> > 2016-10-24 05:42:19,186 INFO
>> > [B.defaultRpcServer.handler=17,queue=2,port=16020]
>> > shortcircuit.ShortCircuitCache: ShortCircuitCache(0x3a130ae2): could
>> not
>> > load 1075792477_BP-415105157-10.190.240.206-1468423545420 due to
>> > InvalidToken exception.
>> > org.apache.hadoop.security.token.SecretManager$InvalidToken: access
>> > control
>> > error while attempting to set up short-circuit access to
>> > /apps/hbase/data/data/profile/DailyUserScores/
>> > 84105becaefcf071e43ebc4972d0249f/d/c7653cc98860451cb79e678e9
>> c3f0e85Block
>> > token with block_token_identifier (expiryDate=1477279957879,
>> > keyId=2039256459, userId=hbase,
>> > blockPoolId=BP-415105157-10.190.240.206-1468423545420,
>> blockId=1075792477,
>> > access modes=[READ]) is expired.
>> >
>> > At the moment I'm a bit stuck so any advices on what to check next are
>> > welcome.
>> >
>> > PS. HBase version is 1.1.2
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message