hbase-issues mailing list archives

From "Allan Yang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-17757) Unify blocksize after encoding to decrease memory fragment
Date Wed, 08 Mar 2017 08:16:38 GMT

    [ https://issues.apache.org/jira/browse/HBASE-17757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900895#comment-15900895 ]

Allan Yang commented on HBASE-17757:
------------------------------------

{quote}
In the case where both DBE and compression are in use, the size we track will be the size after
compression as well. And what we keep in the cache is uncompressed blocks (by default; there is
a config to keep compressed blocks too). So the math will go wrong there?
{quote}
Sorry, I don't quite follow your question. As far as I know, compression happens only after a
block has been completely written. So it is hard to unify the blocksize after compression: we don't
know when to finish a block, because its compressed size is not known until it is finished. Likewise,
unifying the blocksize only works if the encoding happens 'on the fly', so it is not possible for
encoding algorithms like prefix-tree either.
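
To illustrate what 'on the fly' buys us, here is a minimal sketch. It is not the code in the attached
patch. The class, its methods, and the toy prefix encoder are all hypothetical; the only point is
that the writer can check the encoded size after every key and close the block once that size reaches
the target, which is not possible for a size that is only known after the whole block is finished.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy sketch (hypothetical names): choose block boundaries from the encoded size written so far. */
class EncodedBoundarySketch {

    /** Hypothetical on-the-fly encoder: stores only the suffix that differs from the previous key. */
    static int encodedSize(String previousKey, String key) {
        int common = 0;
        int max = Math.min(previousKey.length(), key.length());
        while (common < max && previousKey.charAt(common) == key.charAt(common)) {
            common++;
        }
        // 1 byte for the shared-prefix length plus the bytes of the non-shared suffix.
        return 1 + (key.length() - common);
    }

    /** Builds n sample keys: row-0000, row-0001, ... */
    static List<String> sampleKeys(int n) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            keys.add(String.format("row-%04d", i));
        }
        return keys;
    }

    /**
     * Returns the number of keys placed in each block. When checkEncodedSize is true the
     * block is closed once the encoded bytes reach targetSize; otherwise the raw bytes are used.
     */
    static List<Integer> splitIntoBlocks(List<String> keys, int targetSize, boolean checkEncodedSize) {
        List<Integer> keysPerBlock = new ArrayList<>();
        int rawBytes = 0;
        int encodedBytes = 0;
        int count = 0;
        String previous = "";
        for (String key : keys) {
            rawBytes += key.length();
            encodedBytes += encodedSize(previous, key);
            previous = key;
            count++;
            int tracked = checkEncodedSize ? encodedBytes : rawBytes;
            if (tracked >= targetSize) {
                keysPerBlock.add(count);
                rawBytes = 0;
                encodedBytes = 0;
                count = 0;
                previous = ""; // each block is encoded independently of the previous one
            }
        }
        if (count > 0) {
            keysPerBlock.add(count); // trailing, partially filled block
        }
        return keysPerBlock;
    }

    public static void main(String[] args) {
        List<String> keys = sampleKeys(16);
        System.out.println("raw-size check:     " + splitIntoBlocks(keys, 32, false));
        System.out.println("encoded-size check: " + splitIntoBlocks(keys, 32, true));
    }
}
```

With the 16 sample keys and a target of 32 bytes, the raw-size check closes a block every 4 keys, so
each encoded block ends up far smaller than the target. The encoded-size check puts 12 keys in the
first block and the remaining 4 in a trailing block.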

{quote}
How about treating the block size limit as a hard limit rather than a soft one?
{quote}
That is hard. As far as I know, a single row must remain in one block (correct me if I'm wrong),
so if a single row is bigger than the blocksize, the block that holds it will exceed
our limit.
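
A minimal sketch of why the limit ends up soft, assuming (as above) that a row is never split across
blocks. The class and method names are hypothetical and this is not HBase code: the limit is only
checked after a whole row has been appended, so one oversized row produces one oversized block.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy sketch (hypothetical names): a block size limit that is checked only at row boundaries. */
class SoftLimitSketch {

    /** Returns the size of each block produced for the given row sizes. */
    static List<Integer> blockSizes(int[] rowSizes, int blockSizeLimit) {
        List<Integer> sizes = new ArrayList<>();
        int current = 0;
        for (int rowSize : rowSizes) {
            current += rowSize; // assumption: a row is always written whole into the current block
            if (current >= blockSizeLimit) {
                sizes.add(current); // close the block; it may already be over the limit
                current = 0;
            }
        }
        if (current > 0) {
            sizes.add(current); // trailing, partially filled block
        }
        return sizes;
    }

    public static void main(String[] args) {
        // Three small rows, one row much larger than the limit, then one small row.
        System.out.println(blockSizes(new int[] {10, 10, 10, 100, 5}, 30));
    }
}
```

With a limit of 30, the three 10-byte rows fill one block exactly, while the 100-byte row forces a
100-byte block, more than three times the limit.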

> Unify blocksize after encoding to decrease memory fragment 
> -----------------------------------------------------------
>
>                 Key: HBASE-17757
>                 URL: https://issues.apache.org/jira/browse/HBASE-17757
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>         Attachments: HBASE-17757.patch
>
>
> Usually, we store encoded (uncompressed) blocks in blockcache/bucketCache. Though we have
set the blocksize, the size of a block varies after encoding. Varied block sizes cause a memory
fragmentation problem, which eventually results in more FGC. To relieve the memory fragmentation,
this issue adjusts encoded blocks to a unified size.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
