hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: sporadic hbase "outages"
Date Tue, 22 Mar 2016 18:07:53 GMT
bq. a small number will take 20 minutes or more

Were these mappers performing selective scans on big regions?

Can you pastebin the stack traces of the region server(s) that served those
regions while the mappers were running slowly?

A pastebin of the region server log would also give us more clues.
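
Before pasting a large log, it may help to summarize the responseTooSlow
entries mentioned in the original report. The following is a minimal Python
sketch, not an official HBase tool. It assumes each such log line contains the
marker "(responseTooSlow):" followed by a JSON object with "processingtimems"
and "method" fields; that layout is an assumption and should be checked
against the actual 0.94.20 log output. The function name
summarize_slow_responses is hypothetical.

```python
import json
from collections import defaultdict

# Assumed marker that precedes the JSON payload in a slow-response log line.
MARKER = "(responseTooSlow):"


def summarize_slow_responses(lines):
    """Group slow-response entries by RPC method.

    Returns a dict mapping method name -> (count, max processing time in ms).
    Lines without the marker, or with an unparsable payload, are skipped.
    """
    stats = defaultdict(lambda: [0, 0])
    for line in lines:
        idx = line.find(MARKER)
        if idx < 0:
            continue  # not a slow-response line
        payload = line[idx + len(MARKER):].strip()
        try:
            entry = json.loads(payload)
            method = entry.get("method", "unknown")
            millis = int(entry.get("processingtimems", 0))
        except (ValueError, TypeError, AttributeError):
            continue  # truncated, non-JSON, or unexpected payload shape
        stats[method][0] += 1
        stats[method][1] = max(stats[method][1], millis)
    return {method: (count, worst) for method, (count, worst) in stats.items()}
```

Passing an open log file (for example, summarize_slow_responses(open(path)),
where path is the region server log) yields per-method counts and worst-case
times, which can show whether the slow calls are mostly scans or batch writes.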

On Tue, Mar 22, 2016 at 10:57 AM, feedly team <feedlydev@gmail.com> wrote:

> Recently we have been experiencing short downtimes (~2-5 minutes) in our
> hbase cluster and are trying to understand why. We often see HLog write
> spikes around the downtimes, but not always, so we are not sure whether
> this is a red herring.
> We have looked a bit farther back in time and have noticed many metrics
> deteriorating over the past few months:
> - The compaction queue size seems to be growing.
> - The flushQueueSize and flushSizeAvgTime are growing.
> - Some map reduce tasks run extremely slowly. Maybe 90% will complete
>   within a couple of minutes, but a small number will take 20 minutes or
>   more. If I look at the slow mappers, there is a high value for the
>   MILLIS_BETWEEN_NEXTS counter (these mappers didn't run data-local).
> - We have seen application performance worsening. During slowdowns,
>   threads are usually blocked on hbase connection operations
>   (HConnectionManager$HConnectionImplementation.processBatch).
> This is a bit puzzling as our data nodes' OS load values are really low. In
> the past, we had performance issues when load got too high. The region
> server log doesn't have anything interesting; the only messages we get are
> a handful of responseTooSlow messages.
> Do these symptoms point to anything or is there something else we should
> look at? We are (still) running 0.94.20. We are going to upgrade soon, but
> we want to diagnose this issue first.
