hbase-issues mailing list archives

From "Guanghao Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14247) Separate the old WALs into different regionserver directories
Date Sat, 23 Sep 2017 00:46:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16177395#comment-16177395 ]

Guanghao Zhang commented on HBASE-14247:
----------------------------------------

[~davelatham] I checked HBASE-9208. If I am not wrong, the key point is that ReplicationLogCleaner
checks zk for every old log. So before HBASE-9208, the cleaner checked zk O(number of old logs)
times per chore run, and HBASE-9208 changed this to O(1). This issue will increase it to
O(number of region servers), and you think that will cause a performance problem, right?
If it is a problem, can we share the replication queue result (check zk once) across all
the per-region-server subdirectories?
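
To make the "check zk once" idea concrete, here is a rough sketch only. It is not the actual
ReplicationLogCleaner code. The class name, the method name, and the Supplier that stands in
for reading the replication queues from zk are all hypothetical, and the WAL names are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration; none of these names come from HBase.
public class SharedQueueSnapshotCleaner {

    /**
     * Returns the old WALs that are safe to delete.
     *
     * @param oldWalsByServer per-region-server directory name -> WAL file names in it
     * @param queuedWals      stand-in for reading all replication queues from zk
     */
    static List<String> deletable(Map<String, List<String>> oldWalsByServer,
                                  Supplier<Set<String>> queuedWals) {
        // Read the replication queues once per chore run and reuse the
        // result for every subdirectory, instead of once per directory.
        Set<String> stillQueued = queuedWals.get();
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> dir : oldWalsByServer.entrySet()) {
            for (String wal : dir.getValue()) {
                // A WAL still referenced by a replication queue must be kept.
                if (!stillQueued.contains(wal)) {
                    result.add(dir.getKey() + "/" + wal);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> oldWals = new TreeMap<>();
        oldWals.put("rs1", Arrays.asList("wal-a", "wal-b"));
        oldWals.put("rs2", Arrays.asList("wal-c"));

        AtomicInteger lookups = new AtomicInteger();
        Supplier<Set<String>> zk = () -> {
            lookups.incrementAndGet(); // count simulated zk reads
            return new HashSet<>(Arrays.asList("wal-b"));
        };

        System.out.println(deletable(oldWals, zk)); // prints [rs1/wal-a, rs2/wal-c]
        System.out.println(lookups.get());          // prints 1
    }
}
```

With this shape the number of zk reads per chore run stays at one no matter how many
region server directories exist; only the in-memory set lookups grow with the number of old logs.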

Meanwhile, can you share some data from when you found the problem in HBASE-9208, like how
many old logs and how many region servers there were?

> Separate the old WALs into different regionserver directories
> -------------------------------------------------------------
>
>                 Key: HBASE-14247
>                 URL: https://issues.apache.org/jira/browse/HBASE-14247
>             Project: HBase
>          Issue Type: Improvement
>          Components: wal
>            Reporter: Liu Shaohui
>            Assignee: Guanghao Zhang
>            Priority: Critical
>             Fix For: 2.0.0
>
>         Attachments: HBASE-14247.master.001.patch, HBASE-14247.master.002.patch, HBASE-14247-v001.diff,
HBASE-14247-v002.diff, HBASE-14247-v003.diff
>
>
> Currently all old WALs of regionservers are archived into the single oldWALs directory.
In big clusters, because of a long WAL TTL or disabled replication, the number of files
under oldWALs may reach the max-directory-items limit of HDFS, which will crash the hbase
cluster.
> {quote}
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.FSLimitException$MaxDirectoryItemsExceededException):
The directory item limit of /hbase/lgprc-xiaomi/.oldlogs is exceeded: limit=1048576 items=1048576
> {quote}
> A simple solution is to separate the old WALs into different directories according to
the server name of the WAL.
> Suggestions are welcome~ Thanks



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
