hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-18116) Replication buffer quota accounting should not include bulk transfer hfiles
Date Thu, 25 May 2017 19:01:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025196#comment-16025196
] 

Andrew Purtell edited comment on HBASE-18116 at 5/25/17 7:00 PM:
-----------------------------------------------------------------

In addition, when estimating the size of a replication queue entry we only consider the WALEdit
objects, not the WALKey objects that are also associated with it.


was (Author: apurtell):
Also, when calculating the heap size of a replication queue entry we only track the WALEdit
objects, not the associated WALKey objects. 

> Replication buffer quota accounting should not include bulk transfer hfiles
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-18116
>                 URL: https://issues.apache.org/jira/browse/HBASE-18116
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Andrew Purtell
>
> In ReplicationSourceWALReaderThread we maintain a global quota on enqueued replication
work to prevent OOM caused by queuing too many edits on heap. When calculating
the size of a given replication queue entry, if it has associated hfiles (is a bulk load to
be replicated as a batch of hfiles), we get the file sizes and include the sum. We then apply
that result to the quota. This isn't quite right. Those hfiles will be pulled by the sink
as a file copy, not pushed by the source. The cells in those files are not queued in memory
at the source and therefore shouldn't be counted against the quota.
> Related, the sum of the hfile sizes is also included when checking if queued work exceeds
the configured replication queue capacity, which is by default 64 MB. HFiles are commonly
much larger than this. 
> What happens is that when we encounter a bulk load replication entry, typically both the
quota and capacity limits are exceeded, so we break out of the loops and send right away. What is
transferred on the wire via HBase RPC, though, has only a partial relationship to the calculation.

> Depending on how you look at it, it makes sense to count hfile sizes against replication
queue capacity limits. The sink will be occupied transferring those files at the HDFS level.
Anyway, this is how we have been doing it and it is too late to change now. I do not however
think it is correct to apply hfile sizes against a quota for in-memory state on the source.
The source doesn't queue or even transfer those bytes. 
> Something I noticed while working on HBASE-18027.
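
The distinction drawn in the description can be sketched roughly as follows. This is only an illustrative model, not the HBase code: the class ReplicationQuotaSketch, its Entry type, and the methods quotaBytes and capacityBytes are hypothetical names made up for this example. It assumes an entry carries some number of heap bytes for its edits and a list of hfile sizes for a bulk load. Only the heap bytes are charged to the in-memory quota, while the hfile sizes still count toward the batch capacity check, as they do today.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration only; these are not HBase's real classes.
final class ReplicationQuotaSketch {

    // Default replication queue capacity mentioned in the issue: 64 MB.
    static final long DEFAULT_CAPACITY_BYTES = 64L * 1024 * 1024;

    /** One queued entry: heap bytes held at the source, plus sizes of any bulk-loaded hfiles. */
    static final class Entry {
        final long heapBytes;        // assumed: size of the edits kept in memory
        final List<Long> hfileBytes; // assumed: on-disk sizes of referenced hfiles

        Entry(long heapBytes, List<Long> hfileBytes) {
            this.heapBytes = heapBytes;
            this.hfileBytes = hfileBytes;
        }
    }

    /** Total size of the hfiles referenced by a bulk load entry. */
    static long sumHFileBytes(Entry e) {
        long total = 0;
        for (long size : e.hfileBytes) {
            total += size;
        }
        return total;
    }

    /** Bytes charged to the in-memory quota: only what is actually queued on heap. */
    static long quotaBytes(Entry e) {
        return e.heapBytes;
    }

    /** Bytes counted toward batch capacity: heap bytes plus hfile sizes (current behavior). */
    static long capacityBytes(Entry e) {
        return e.heapBytes + sumHFileBytes(e);
    }

    public static void main(String[] args) {
        // A bulk load entry with a small edit and two large hfiles.
        Entry bulkLoad = new Entry(1024L,
            Arrays.asList(200L * 1024 * 1024, 50L * 1024 * 1024));

        System.out.println("quota bytes:    " + quotaBytes(bulkLoad));
        System.out.println("capacity bytes: " + capacityBytes(bulkLoad));
        System.out.println("exceeds capacity: "
            + (capacityBytes(bulkLoad) > DEFAULT_CAPACITY_BYTES));
    }
}
```

With this split, a bulk load entry still ends the current batch early because of the capacity check, but it no longer consumes quota for bytes the source never holds in memory.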



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
