hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16288) HFile intermediate block level indexes might recurse forever creating multi TB files
Date Tue, 02 Aug 2016 17:53:21 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404471#comment-15404471 ]

Hudson commented on HBASE-16288:
--------------------------------

FAILURE: Integrated in HBase-0.98-matrix #379 (See [https://builds.apache.org/job/HBase-0.98-matrix/379/])
HBASE-16288 HFile intermediate block level indexes might recurse forever (apurtell: rev 7f139a64398e91a64b8af9525334c3ddc22f7841)
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterV2.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockIndex.java


> HFile intermediate block level indexes might recurse forever creating multi TB files
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-16288
>                 URL: https://issues.apache.org/jira/browse/HBASE-16288
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>            Priority: Critical
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
>         Attachments: hbase-16288_v1.patch, hbase-16288_v2.patch, hbase-16288_v3.patch, hbase-16288_v4.patch
>
>
> Mighty [~elserj] was debugging an OpenTSDB cluster where some region directory ended
> up having 5TB+ files under <regiondir>/.tmp/
> After further debugging and analysis, we were able to reproduce the problem locally: we
> were recursing forever in this code path for writing intermediate-level indices:
> {code:title=HFileBlockIndex.java}
> if (curInlineChunk != null) {
>   while (rootChunk.getRootSize() > maxChunkSize) {
>     rootChunk = writeIntermediateLevel(out, rootChunk);
>     numLevels += 1;
>   }
> }
> {code}
> The problem happens if a very large row key (larger than "hfile.index.block.max.size")
> ends up as the first key in a block and is then carried all the way up to the root-level
> index. Each pass of the loop then builds another intermediate index level that contains
> only that single very large key, so the root never shrinks below the limit and the writer
> keeps appending index blocks. This can happen during flush, compaction, or region recovery,
> and the ever-growing files can make the cluster inoperable.
> Seems the issue was also reported earlier, with a temporary workaround: 
> https://github.com/OpenTSDB/opentsdb/issues/490
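
The loop quoted above can be modelled in a few lines to show why a single oversized key never lets the root shrink. The following is a minimal sketch, not the HBase code: the class IndexLevelSketch, the helpers rootSize and buildLevels, the levelCap safety limit, and the minEntries guard are all hypothetical names introduced for illustration, and the sizes are arbitrary. Index entries are reduced to their key sizes in bytes. The minEntries guard is only one assumed way to break the loop and is not claimed to be what the attached patches do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class IndexLevelSketch {

    /** Total size of a chunk, modelled as the sum of its key sizes. */
    static long rootSize(List<Integer> chunk) {
        long total = 0;
        for (int keySize : chunk) {
            total += keySize;
        }
        return total;
    }

    /**
     * Simplified stand-in for writeIntermediateLevel: packs the parent's
     * entries into blocks of roughly maxChunkSize bytes and returns a new
     * root holding the first key of each block.
     */
    static List<Integer> writeIntermediateLevel(List<Integer> parent, int maxChunkSize) {
        List<Integer> newRoot = new ArrayList<>();
        long blockSize = 0;
        boolean blockOpen = false;
        for (int keySize : parent) {
            if (!blockOpen) {
                newRoot.add(keySize); // first key of a block is promoted to the new root
                blockOpen = true;
            }
            blockSize += keySize;
            if (blockSize >= maxChunkSize) { // block is full, close it
                blockSize = 0;
                blockOpen = false;
            }
        }
        return newRoot;
    }

    /**
     * Runs the level-building loop. minEntries = 0 models the loop as quoted
     * in the issue; a positive value adds the hypothetical guard. Returns the
     * number of levels added, or -1 if levelCap passes were not enough
     * (the cap exists only so that this sketch terminates).
     */
    static int buildLevels(List<Integer> root, int maxChunkSize, int minEntries, int levelCap) {
        int levels = 0;
        while (rootSize(root) > maxChunkSize && root.size() > minEntries) {
            if (levels == levelCap) {
                return -1;
            }
            root = writeIntermediateLevel(root, maxChunkSize);
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        int maxChunkSize = 128 * 1024; // illustrative limit, in bytes

        // Many small keys: the root shrinks quickly and the loop stops.
        List<Integer> smallKeys = Collections.nCopies(10_000, 100);
        System.out.println("small keys: " + buildLevels(smallKeys, maxChunkSize, 0, 1000));

        // One key larger than the limit: every new root is the same single key.
        List<Integer> hugeKey = Arrays.asList(200_000);
        System.out.println("huge key, no guard: " + buildLevels(hugeKey, maxChunkSize, 0, 1000));
        System.out.println("huge key, guarded: " + buildLevels(hugeKey, maxChunkSize, 16, 1000));
    }
}
```

In this model each unguarded pass over the single-key root writes one more block containing the same key, which matches the ever-growing file described in the issue. With the guard, a root that holds only a handful of entries is accepted as the root index even though it exceeds the size limit.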



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
