hbase-issues mailing list archives

From "Anoop Sam John (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15669) HFile size is not considered correctly in a replication request
Date Wed, 20 Apr 2016 05:38:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249291#comment-15249291 ]

Anoop Sam John commented on HBASE-15669:

Do we need a check like hasStoreFileSize() ? getStoreFileSize() : 0?
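
The check suggested above could be sketched roughly as follows. StoreDescriptorLike is a hypothetical stand-in for the generated protobuf message, with accessor names taken from the comment; it is not the actual HBase class, and the real accessor names in the patch may differ.

```java
// Hypothetical stand-in for a protobuf message with an optional size field.
// The accessor names mirror the comment above and are assumptions.
interface StoreDescriptorLike {
    boolean hasStoreFileSize();
    long getStoreFileSize();
}

final class StoreFileSizeGuard {
    private StoreFileSizeGuard() {}

    // Treat an unset optional field explicitly as 0 bytes rather than
    // relying on whatever default the getter returns.
    static long sizeOrZero(StoreDescriptorLike store) {
        return store.hasStoreFileSize() ? store.getStoreFileSize() : 0L;
    }
}
```

The explicit guard matters mainly when the sender is an older peer that never sets the field.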

    int totalEdits = edit.size();
    for (int i = 0; i < totalEdits; i++) {
totalEdits? This is the number of cells in this edit, right? In the loop condition you can
use i < cells.size() instead. The same applies in other places.

Will this add more burden to the size calculation for other, normal edits? We now have a
qualifier check on each and every cell. Can one WALEdit contain a mix of bulk load cells
and normal cells? I don't think so. So can we exit early when the first cell in the WALEdit
is not a bulk load cell? Maybe this optimization can be applied in some other places also.
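
A minimal sketch of the early-out idea, assuming (as the comment does, without having verified it) that a single edit never mixes bulk load cells with normal cells. SimpleCell and its fields are hypothetical stand-ins for illustration, not HBase types.

```java
import java.util.List;

final class EditSizeSketch {
    private EditSizeSketch() {}

    // Hypothetical stand-in for a cell in a WAL edit.
    static final class SimpleCell {
        final boolean bulkLoadMarker; // true if this cell describes a bulk load
        final long cellBytes;         // size of the cell itself
        final long hfileBytes;        // total size of the HFiles it refers to

        SimpleCell(boolean bulkLoadMarker, long cellBytes, long hfileBytes) {
            this.bulkLoadMarker = bulkLoadMarker;
            this.cellBytes = cellBytes;
            this.hfileBytes = hfileBytes;
        }
    }

    static long sizeOf(List<SimpleCell> cells) {
        long total = 0;
        // Early out: if the first cell is not a bulk load cell, assume the
        // whole edit is a normal edit and skip the per-cell marker check.
        if (cells.isEmpty() || !cells.get(0).bulkLoadMarker) {
            for (SimpleCell c : cells) {
                total += c.cellBytes;
            }
            return total;
        }
        // Bulk load edit: count the HFile bytes instead of the cell bytes.
        for (int i = 0; i < cells.size(); i++) {
            SimpleCell c = cells.get(i);
            total += c.bulkLoadMarker ? c.hfileBytes : c.cellBytes;
        }
        return total;
    }
}
```

If mixed edits turn out to be possible, the early out would undercount them, so the assumption needs to be confirmed before relying on it.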

Can we mark the WALEdit itself as bulk load related? I am not sure. Maybe we can look at that later.

> HFile size is not considered correctly in a replication request
> ---------------------------------------------------------------
>                 Key: HBASE-15669
>                 URL: https://issues.apache.org/jira/browse/HBASE-15669
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 1.3.0
>            Reporter: Ashish Singhi
>            Assignee: Ashish Singhi
>             Fix For: 2.0.0, 1.3.0, 1.4.0
>         Attachments: HBASE-15669.patch
> In a single replication request from the source cluster, an RS can send at most either
> {{replication.source.size.capacity}} worth of data or {{replication.source.nb.capacity}} entries.
> The size is calculated from the size of the cells in each entry, which gives the wrong
> result when bulk loaded data is replicated; in that case we need to consider the size of
> the HFiles, not the cells.
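
To illustrate why the per-entry size matters for the two limits described in the issue, here is a simplified sketch of capping a batch by accumulated size or entry count. The class, method, and parameter names are hypothetical, and the real replication source may apply the limits at a slightly different point in its loop.

```java
final class BatchLimitSketch {
    private BatchLimitSketch() {}

    // Returns how many entries go into one batch. Entries are added until
    // either the accumulated size reaches sizeCapacity or the entry count
    // reaches nbCapacity (stand-ins for replication.source.size.capacity
    // and replication.source.nb.capacity).
    static int entriesThatFit(long[] entrySizes, long sizeCapacity, int nbCapacity) {
        long bytes = 0;
        int count = 0;
        for (long size : entrySizes) {
            if (count >= nbCapacity || bytes >= sizeCapacity) {
                break;
            }
            bytes += size;
            count++;
        }
        return count;
    }
}
```

If a bulk load entry is counted by the size of its cells rather than the HFiles it points to, the accumulated size stays far below the real payload and the size limit is effectively never reached.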

