hbase-issues mailing list archives

From "Ashish Singhi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15669) HFile size is not considered correctly in a replication request
Date Wed, 20 Apr 2016 09:38:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249558#comment-15249558 ]

Ashish Singhi commented on HBASE-15669:

{quote} What if there is no size for this file ?
I see LOG.warn() below. Is that enough ?{quote}
That's not possible. Even if we get an exception we set it to 0, so that should be enough.

bq. Do we need a check like hasStoreFileSize()? getStoreFileSize(): 0?
By default it's 0.

bq. totalEdits? 
totalCells :)

{quote}In the loop condition you can have i < cells.size()? Same in the other places. Will
it add more burden to the size calculation for other, normal edits? We now have a qualifier
check on each and every cell.
Can there be one WALEdit with a mix of bulk load cells and normal cells? I don't think so. So
can we exit early when the first cell in the WALEdit is not a bulk load cell? Maybe this
optimization can be applied in some other places also?{quote}
The same thing came up when we were working on the main jira (HBASE-13153), but we are not
sure whether in the future an edit could contain a mix of mutation and bulk load marker cells.
If that happens, an early exit would break replication. To avoid that, we check every cell.
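
As a rough illustration of the per-cell check discussed above (not the code from the patch), here is a minimal Java sketch. The Cell class, its fields, and the sizeOf method are hypothetical stand-ins, not HBase types. It assumes an entry's size is the heap size of its cells plus the sizes of any HFiles referenced by bulk load marker cells.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not HBase classes.
class EditSizeSketch {

    /** A cell is either a normal mutation or a bulk load marker that references HFiles. */
    static final class Cell {
        final boolean bulkLoadMarker;
        final long heapSize;
        final long[] storeFileSizes; // a size stays 0 when it could not be determined

        private Cell(boolean bulkLoadMarker, long heapSize, long[] storeFileSizes) {
            this.bulkLoadMarker = bulkLoadMarker;
            this.heapSize = heapSize;
            this.storeFileSizes = storeFileSizes;
        }

        static Cell normal(long heapSize) {
            return new Cell(false, heapSize, new long[0]);
        }

        static Cell bulkLoad(long heapSize, long... storeFileSizes) {
            return new Cell(true, heapSize, storeFileSizes);
        }
    }

    /**
     * Sums the size of one edit. Every cell is inspected, so an edit that mixes
     * normal cells and bulk load markers is still sized correctly.
     */
    static long sizeOf(List<Cell> cells) {
        long total = 0;
        int n = cells.size(); // read the size once instead of in the loop condition
        for (int i = 0; i < n; i++) {
            Cell cell = cells.get(i);
            total += cell.heapSize;
            if (cell.bulkLoadMarker) {
                for (long fileSize : cell.storeFileSizes) {
                    total += fileSize;
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Cell> mixed = Arrays.asList(Cell.normal(100), Cell.bulkLoad(50, 1000, 2000));
        System.out.println(sizeOf(mixed));
    }
}
```

An early exit on the first cell would skip the marker in the mixed edit above and count only the cell sizes.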

Thanks for the reviews.

> HFile size is not considered correctly in a replication request
> ---------------------------------------------------------------
>                 Key: HBASE-15669
>                 URL: https://issues.apache.org/jira/browse/HBASE-15669
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>    Affects Versions: 1.3.0
>            Reporter: Ashish Singhi
>            Assignee: Ashish Singhi
>             Fix For: 2.0.0, 1.3.0, 1.4.0
>         Attachments: HBASE-15669.patch
> In a single replication request, an RS in the source cluster can send either at most {{replication.source.size.capacity}}
> size of data or {{replication.source.nb.capacity}} entries.
> The size is calculated from the size of the cells in each entry, which gives the wrong
> result when replicating bulk loaded data. In that case we need to consider the size of
> the HFiles, not the cells.
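
To make the effect of the two limits in the description concrete, here is a small Java sketch. The class, method, and parameter names are hypothetical. It assumes a batch is closed once either the accumulated size reaches the size capacity or the entry count reaches the entry capacity. The real replication source may differ in detail.

```java
// Hypothetical sketch; names and the exact cut-off rule are assumptions, not HBase code.
class BatchLimitSketch {

    /**
     * Returns how many of the given entries go into one replication request.
     * sizeCapacity plays the role of replication.source.size.capacity and
     * nbCapacity the role of replication.source.nb.capacity.
     */
    static int entriesInBatch(long[] entrySizes, long sizeCapacity, int nbCapacity) {
        long batchSize = 0;
        int count = 0;
        for (long size : entrySizes) {
            batchSize += size;
            count++;
            // Stop as soon as either limit is reached.
            if (batchSize >= sizeCapacity || count >= nbCapacity) {
                break;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Second entry is a bulk load marker: 100 bytes of cells referencing 5000 bytes of HFiles.
        long[] cellSizesOnly = {100, 100, 100};
        long[] withHFileSizes = {100, 100 + 5000, 100};
        System.out.println(entriesInBatch(cellSizesOnly, 1000, 10));  // all three entries fit
        System.out.println(entriesInBatch(withHFileSizes, 1000, 10)); // batch closes after two
    }
}
```

With cell sizes only, the bulk loaded entry looks small and the request can carry far more data than the size capacity intends. Counting the HFile sizes closes the batch earlier.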

This message was sent by Atlassian JIRA
