hbase-issues mailing list archives

From "Enis Soztutar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16288) HFile intermediate block level indexes might recurse forever creating multi TB files
Date Wed, 27 Jul 2016 20:09:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15396257#comment-15396257 ]

Enis Soztutar commented on HBASE-16288:
---------------------------------------

Index blocks still go to the block cache, so a limit based only on the number of entries is
not ideal: when the keys are large, it could produce very large index blocks. The patch now
hard-codes a minimum of 2 entries per index block. Maybe that minimum should be configurable,
and perhaps it should be at least 16 entries or so, so that we do not end up with 50-level
indices.
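
To illustrate why the minimum entry count matters for index depth, here is a rough sketch. It
assumes every index block holds exactly the given number of entries, which is a simplification;
the class and method names are hypothetical and are not part of HBase.

```java
public class IndexDepthSketch {
    /**
     * Number of index levels needed when every index block holds exactly
     * entriesPerBlock entries (a hypothetical simplification, not HBase code).
     */
    public static int levels(long leafEntries, int entriesPerBlock) {
        if (entriesPerBlock < 2) {
            throw new IllegalArgumentException("need at least 2 entries per block");
        }
        int levels = 1;
        long entries = leafEntries;
        // While the entries do not fit in one block, group them into blocks;
        // each block contributes one entry to the level above.
        while (entries > entriesPerBlock) {
            entries = (entries + entriesPerBlock - 1) / entriesPerBlock; // ceiling division
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        long leafEntries = 1_000_000L; // illustrative value only
        System.out.println("2 entries/block:  " + levels(leafEntries, 2) + " levels");
        System.out.println("16 entries/block: " + levels(leafEntries, 16) + " levels");
    }
}
```

Under this simplification, a million entries need 20 levels at 2 entries per block but only 5
at 16 entries per block, which is the kind of difference the comment above is concerned with.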

> HFile intermediate block level indexes might recurse forever creating multi TB files
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-16288
>                 URL: https://issues.apache.org/jira/browse/HBASE-16288
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>            Priority: Critical
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 1.2.3
>
>         Attachments: hbase-16288_v1.patch, hbase-16288_v2.patch, hbase-16288_v3.patch
>
>
> Mighty [~elserj] was debugging an opentsdb cluster where some region directory ended
> up having 5TB+ files under <regiondir>/.tmp/
> After further debugging and analysis, we were able to reproduce the problem locally: we
> were recursing forever in this code path for writing intermediate level indices:
> {code:title=HFileBlockIndex.java}
> if (curInlineChunk != null) {
>   while (rootChunk.getRootSize() > maxChunkSize) {
>     rootChunk = writeIntermediateLevel(out, rootChunk);
>     numLevels += 1;
>   }
> }
> {code}
> The problem happens if a very large rowKey (larger than "hfile.index.block.max.size")
> ends up being the first key in a block and is then carried all the way up to the root-level
> index. We keep writing and building the next level of intermediate level indices, each with
> that single very large key, so the root never shrinks below the limit. This can happen in
> flush / compaction / region recovery, causing cluster inoperability due to ever-growing files.
> It seems the issue was also reported earlier, with a temporary workaround:
> https://github.com/OpenTSDB/opentsdb/issues/490
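
The runaway loop quoted above can be reproduced in a toy model. The sketch below is not HBase
code: a chunk is modelled as a plain list of key sizes, and the names buildLevels, minEntries
and levelCap are hypothetical. The entry-count guard is only one possible shape for a fix, in
the spirit of the minimum-entries limit discussed in the comment; the actual patch may differ.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class IndexRecursionSketch {
    /** Total bytes of all keys in a chunk (per-entry overhead is ignored). */
    public static long size(List<Integer> chunk) {
        long total = 0;
        for (int keySize : chunk) {
            total += keySize;
        }
        return total;
    }

    /**
     * Packs entries into blocks and promotes the first key of each block to the
     * parent level. A block closes once it reaches maxChunkSize bytes AND holds
     * at least minEntries entries (minEntries = 0 means size is the only rule).
     */
    public static List<Integer> writeIntermediateLevel(
            List<Integer> chunk, int maxChunkSize, int minEntries) {
        List<Integer> parent = new ArrayList<>();
        int entriesInBlock = 0;
        long blockBytes = 0;
        for (int keySize : chunk) {
            if (entriesInBlock == 0) {
                parent.add(keySize); // first key of a new block goes up one level
            }
            entriesInBlock++;
            blockBytes += keySize;
            if (blockBytes >= maxChunkSize && entriesInBlock >= minEntries) {
                entriesInBlock = 0;
                blockBytes = 0;
            }
        }
        return parent;
    }

    /** Returns the number of levels built; levelCap only keeps the demo finite. */
    public static int buildLevels(
            List<Integer> root, int maxChunkSize, int minEntries, int levelCap) {
        int numLevels = 1;
        while (size(root) > maxChunkSize
                && root.size() > minEntries
                && numLevels < levelCap) {
            root = writeIntermediateLevel(root, maxChunkSize, minEntries);
            numLevels += 1;
        }
        return numLevels;
    }

    public static void main(String[] args) {
        int maxChunkSize = 128 * 1024; // illustrative limit
        List<Integer> oneHugeKey = Collections.singletonList(200_000);
        // Size-only rule: the single oversized key is promoted unchanged every
        // time, so only the artificial cap stops the loop.
        System.out.println(buildLevels(oneHugeKey, maxChunkSize, 0, 1000));
        // With an entry-count guard, a one-entry root is left alone.
        System.out.println(buildLevels(oneHugeKey, maxChunkSize, 2, 1000));
    }
}
```

With minEntries set to 0 the model runs until the cap, mirroring the ever-growing files in the
report; with a minimum of 2 it stops immediately because the root holds a single entry.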



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
