hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-13303) Fix size calculation of results on the region server
Date Fri, 20 Mar 2015 19:50:38 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14371955#comment-14371955 ]

Andrew Purtell edited comment on HBASE-13303 at 3/20/15 7:50 PM:
-----------------------------------------------------------------

Though the change on the server side looks "reasonable", the client is using KeyValue.heapSize()
for the estimation, and we can't change this if interoperating with older clients. 
{code}
  @Override
  public long heapSize() {
    int sum = 0;
    sum += ClassSize.OBJECT;// the KeyValue object itself
    sum += ClassSize.REFERENCE;// pointer to "bytes"
    sum += ClassSize.align(ClassSize.ARRAY);// "bytes"
    sum += ClassSize.align(length);// number of bytes of data in the "bytes" array
    sum += 2 * Bytes.SIZEOF_INT;// offset, length
    sum += Bytes.SIZEOF_LONG;// memstoreTS
    return ClassSize.align(sum);
  }
{code}

Instead of the current patch, I think we would need two variations of heapSize on the server,
where we adjust the value of 'length' to subtract the size of the tags area when the
KeyValue came from a V3 HFile. I could do that.
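
To make the "two variations" idea concrete, here is a minimal sketch, not the actual patch. The class name, the method heapSizeWithoutTags, the tagsAreaLength parameter, and the constant values are all hypothetical; the real sizes come from ClassSize and Bytes and depend on the JVM. It only illustrates how subtracting the tags area from 'length' can change the aligned estimate.

```java
// Sketch only: mirrors the arithmetic of KeyValue.heapSize() quoted above,
// using assumed constants in place of ClassSize/Bytes.
class HeapSizeSketch {
    // Assumed values for illustration; the real ones are JVM-dependent.
    static final int OBJECT = 16;      // stands in for ClassSize.OBJECT
    static final int REFERENCE = 8;    // stands in for ClassSize.REFERENCE
    static final int ARRAY = 24;       // stands in for ClassSize.ARRAY
    static final int SIZEOF_INT = 4;   // stands in for Bytes.SIZEOF_INT
    static final int SIZEOF_LONG = 8;  // stands in for Bytes.SIZEOF_LONG

    // Round up to the next multiple of 8 bytes.
    static long align(long num) {
        return ((num + 7) >> 3) << 3;
    }

    // Same sum as the quoted heapSize(), with 'length' passed in.
    static long heapSize(int length) {
        long sum = 0;
        sum += OBJECT;            // the KeyValue object itself
        sum += REFERENCE;         // pointer to "bytes"
        sum += align(ARRAY);      // "bytes" array header
        sum += align(length);     // data in the "bytes" array
        sum += 2 * SIZEOF_INT;    // offset, length
        sum += SIZEOF_LONG;       // memstoreTS
        return align(sum);
    }

    // Hypothetical second variation: exclude the tags area (tags bytes plus
    // their length field), since tags are not shipped to the client in a scan.
    static long heapSizeWithoutTags(int length, int tagsAreaLength) {
        return heapSize(length - tagsAreaLength);
    }

    public static void main(String[] args) {
        System.out.println(heapSize(105));
        System.out.println(heapSizeWithoutTags(105, 2));
    }
}
```

With these assumed constants, a 105-byte KeyValue with a 2-byte tags area comes out at 176 with tags and 168 without, while many other lengths give identical results because of the 8-byte alignment. That makes the mismatch easy to miss in casual testing.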

However, looking at this from a different perspective, the change on HBASE-13297 that
updates the client is probably better. It's a cleaner change with less brittle magic. It results
in an extra RPC, but this can be fixed later by backporting proper scan state updates to the
scanner PB. We don't have to do this for 0.98.12. If we have the proper fix available in time
for 0.98.13, we can address this then. In the meantime, we can document how to mitigate the
issue with configuration.


was (Author: apurtell):
I think the problem is, though the change on the server side looks "reasonable", the client
is using KeyValue.heapSize() for the estimation, and we can't change this if interoperating
with older clients. 
{code}
  @Override
  public long heapSize() {
    int sum = 0;
    sum += ClassSize.OBJECT;// the KeyValue object itself
    sum += ClassSize.REFERENCE;// pointer to "bytes"
    sum += ClassSize.align(ClassSize.ARRAY);// "bytes"
    sum += ClassSize.align(length);// number of bytes of data in the "bytes" array
    sum += 2 * Bytes.SIZEOF_INT;// offset, length
    sum += Bytes.SIZEOF_LONG;// memstoreTS
    return ClassSize.align(sum);
  }
{code}

I think we would need two variations on heapSize on the server instead of the current patch,
where we twiddle with the value of 'length', to subtract the size of the tags area if the
KeyValue came from a V3 HFile. I could do that.

However, after looking at this from a different perspective the change on HBASE-13297 that
updates the client is probably better. It's a cleaner change with less brittle magic. It results
in an extra RPC, but this can be fixed later by backporting proper scan state updates to the
scanner PB. We don't have to do this for 0.98.12. If we have the proper fix available in time
for 0.98.13 we can address this then. In the meantime we can document how to mitigate the
issue with configuration. 

> Fix size calculation of results on the region server
> ----------------------------------------------------
>
>                 Key: HBASE-13303
>                 URL: https://issues.apache.org/jira/browse/HBASE-13303
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Client
>            Reporter: Lars Hofhansl
>             Fix For: 2.0.0, 1.1.0, 0.98.13
>
>         Attachments: HBASE-13303-0.98.patch, HBASE-13303-0.98.patch, HBASE-13303-0.98.patch, HBASE-13303.patch, HBASE-13303.patch
>
>
> One of the problems in the parent is due to different size calculation between client
> and server when HFilev3 is used.
> Since tags are _never_ shipped to the client in a scan, we can have a special size function
> (or a flag on the current one) that does not include the tags and the tags meta information
> (the length is what causes the issue), so that client and server will always calculate the
> same size.
> I'll make a patch within the hour, unless somebody beats me to it.
> [~apurtell], FYI.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
