hbase-issues mailing list archives

From "Wang, Xinglong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18764) add slow read block log entry to alert slow datanodeinfo when reading a block is slow
Date Wed, 06 Sep 2017 07:08:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16154916#comment-16154916 ]

Wang, Xinglong commented on HBASE-18764:
----------------------------------------

In our case, a 64K HBase block with Snappy compression is about 6K of data on disk, and it
normally takes around 2 ms to read the block.

> add slow read block log entry to alert slow datanodeinfo when reading a block is slow
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-18764
>                 URL: https://issues.apache.org/jira/browse/HBASE-18764
>             Project: HBase
>          Issue Type: Improvement
>          Components: HFile
>    Affects Versions: 1.1.2
>            Reporter: Wang, Xinglong
>            Priority: Minor
>         Attachments: HBASE-18764.rev1.1.2.patch
>
>
> HBase runs on top of HDFS, and both are distributed systems. HBase is also affected
> when a datanode straggles because of a network, disk, or CPU issue: all HBase reads and
> scans that hit that datanode slow down. In that situation it is not easy for an HBase
> admin to identify the straggler datanode.
> We already have a log entry known as slow sync; an example follows. It helps an HBase
> admin quickly identify the slow datanode in the pipeline when one of the 3 datanodes in
> the pipeline has a network, disk, or CPU issue.
> {noformat}
> 2017-07-08 19:11:30,538 INFO  [sync.3] wal.FSHLog: Slow sync cost: 490189 ms, current pipeline: [DatanodeInfoWithStorage[xx.xx.xx.xx:50010,DS-c391299a-aa9f-4146-ac7e-a493ae536bff,DISK], DatanodeInfoWithStorage[xx.xx.xx.xx:50010,DS-21a85f8b-f389-4f9e-95a8-b711945fd210,DISK], DatanodeInfoWithStorage[xx.xx.xx.xx:50010,DS-aa48cef2-3554-482f-b49d-be4763f4d8b8,DISK]]
> {noformat}
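
To illustrate how an entry like the one above points at specific datanodes, here is a minimal sketch that pulls the datanode addresses out of a slow sync line. The class and method names are hypothetical and not part of HBase; the regular expression assumes only the `DatanodeInfoWithStorage[address,storage-id,storage-type]` layout visible in the entry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: extracts datanode addresses
// from the "current pipeline" part of a slow sync log line.
class SlowSyncPipelineParser {
    // Assumed layout: DatanodeInfoWithStorage[<host:port>,<storage id>,<storage type>]
    private static final Pattern DATANODE = Pattern.compile(
        "DatanodeInfoWithStorage\\[([^,\\]]+),([^,\\]]+),([^,\\]]+)\\]");

    // Returns the host:port of every datanode found, in pipeline order.
    static List<String> datanodeAddresses(String logLine) {
        List<String> addresses = new ArrayList<>();
        Matcher m = DATANODE.matcher(logLine);
        while (m.find()) {
            addresses.add(m.group(1));
        }
        return addresses;
    }

    public static void main(String[] args) {
        // Made-up addresses and storage ids, for demonstration only.
        String line = "wal.FSHLog: Slow sync cost: 490189 ms, current pipeline: ["
            + "DatanodeInfoWithStorage[10.0.0.1:50010,DS-a,DISK], "
            + "DatanodeInfoWithStorage[10.0.0.2:50010,DS-b,DISK]]";
        System.out.println(datanodeAddresses(line));
    }
}
```

Counting how often each address appears across many slow sync entries is one way an admin might narrow down the slow datanode.
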
> Inspired by the slow sync log entry, I think it would also be beneficial to print a
> similar entry when we encounter a slow read, so that the slow datanode is easy to
> identify.
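
The idea of a slow read entry could be sketched roughly as follows. This is not the attached patch and does not use HBase or HDFS APIs: the class, the method names, the 100 ms threshold, and the message format are all assumptions made for illustration. The read itself is passed in as a `Supplier`, and the datanode address is assumed to be known to the caller.

```java
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical sketch, not the HBASE-18764 patch: time a block read and
// log a "slow read" entry naming the datanode when it exceeds a threshold.
class SlowReadLogSketch {
    private static final Logger LOG = Logger.getLogger("SlowReadLogSketch");

    // Assumed threshold; a real implementation would likely make it configurable.
    static final long SLOW_READ_THRESHOLD_MS = 100;

    // True when the elapsed time, converted to whole milliseconds, reaches the threshold.
    static boolean isSlow(long elapsedNanos, long thresholdMs) {
        return elapsedNanos / 1_000_000L >= thresholdMs;
    }

    // Assumed message format, loosely modelled on the slow sync entry.
    static String formatSlowRead(long elapsedMs, long offset, int onDiskSize, String datanode) {
        return "Slow read block cost: " + elapsedMs + " ms, offset: " + offset
            + ", on-disk size: " + onDiskSize + " bytes, datanode: " + datanode;
    }

    // Runs the read, measures it, and logs only if it was slow.
    static <T> T timedRead(Supplier<T> read, long offset, int onDiskSize, String datanode) {
        long start = System.nanoTime();
        try {
            return read.get();
        } finally {
            long elapsedNanos = System.nanoTime() - start;
            if (isSlow(elapsedNanos, SLOW_READ_THRESHOLD_MS)) {
                LOG.warning(formatSlowRead(elapsedNanos / 1_000_000L, offset, onDiskSize, datanode));
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for a real block read: returns a 6K buffer immediately,
        // so no slow read entry is expected here.
        byte[] block = timedRead(() -> new byte[6 * 1024], 0L, 6 * 1024, "xx.xx.xx.xx:50010");
        System.out.println("read " + block.length + " bytes");
    }
}
```

Keeping the threshold check and the message formatting in separate methods makes both easy to test without touching a real filesystem.
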



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
