phoenix-dev mailing list archives

From "Karan Mehta (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-3884) Correct MutationState size estimation
Date Thu, 25 May 2017 02:02:04 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024029#comment-16024029 ]

Karan Mehta edited comment on PHOENIX-3884 at 5/25/17 2:01 AM:
---------------------------------------------------------------

bq. Or just use CellUtil#estimatedSerializedSizeOf(Cell) ? It's in branch-1.1 and up.
This recomputes the values on every call. Can we use the following instead?

bq. + byteSize += c.getRowLength() + c.getFamilyLength() + c.getQualifierLength() + c.getValueLength()
+ KeyValue.KEY_INFRASTRUCTURE_SIZE + KeyValue.KEYVALUE_WITH_TAGS_INFRASTRUCTURE_SIZE;
I think the Cell will always be an instance of KeyValue here, so could KeyValue#getLength()
be used instead? The length of a KV is computed when it is created and cached in a field, so
the call will be quick.
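
To make the arithmetic in the quoted patch line concrete, here is a minimal, self-contained sketch. SimpleCell is a hypothetical stand-in for an HBase Cell, and the two constants are my assumption of how HBase 1.x defines them in KeyValue (12 and 10 bytes); please check them against the HBase version in use.

```java
class CellSizeSketch {
    // Assumed to mirror KeyValue.KEY_INFRASTRUCTURE_SIZE:
    // row length (2) + family length (1) + timestamp (8) + type (1).
    static final int KEY_INFRASTRUCTURE_SIZE = 2 + 1 + 8 + 1;

    // Assumed to mirror KeyValue.KEYVALUE_WITH_TAGS_INFRASTRUCTURE_SIZE:
    // key length int (4) + value length int (4) + tags length short (2).
    static final int KEYVALUE_WITH_TAGS_INFRASTRUCTURE_SIZE = 4 + 4 + 2;

    /** Hypothetical stand-in for an HBase Cell; only the array lengths matter here. */
    static final class SimpleCell {
        final byte[] row;
        final byte[] family;
        final byte[] qualifier;
        final byte[] value;

        SimpleCell(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** Same shape as the quoted patch line: payload lengths plus fixed per-cell overhead. */
    static long estimate(SimpleCell c) {
        return c.row.length + c.family.length + c.qualifier.length + c.value.length
                + KEY_INFRASTRUCTURE_SIZE + KEYVALUE_WITH_TAGS_INFRASTRUCTURE_SIZE;
    }

    public static void main(String[] args) {
        // Illustrative lengths only: 20-byte row key, 1-byte family, 4-byte qualifier, 5-byte value.
        SimpleCell cell = new SimpleCell(new byte[20], new byte[1], new byte[4], new byte[5]);
        System.out.println("estimated bytes: " + estimate(cell));
    }
}
```

Under these assumptions the fixed overhead is 22 bytes per cell regardless of payload, so for small cells the overhead is a large share of the estimate.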

This is how HBase does it for its own quota calculation. You might want to use that function
as well.
{code}
    for (Map.Entry<byte [], List<Cell>> entry : mutation.getFamilyCellMap().entrySet()) {
      for (Cell cell : entry.getValue()) {
        size += KeyValueUtil.length(cell);
      }
    }
{code}
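
For anyone who wants to try the shape of that loop without an HBase dependency, a rough standalone sketch follows. CellLengths and serializedLength() are hypothetical names, and the layout in serializedLength() is my assumption of what KeyValueUtil.length(cell) returns for a cell without tags; it is not taken from the HBase source.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MutationSizeSketch {
    /** Hypothetical stand-in for a Cell: just the component lengths in bytes. */
    static final class CellLengths {
        final int row;
        final int family;
        final int qualifier;
        final int value;

        CellLengths(int row, int family, int qualifier, int value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }

        // Assumed untagged KeyValue layout: two 4-byte length ints, then the key
        // (2-byte row length, row, 1-byte family length, family, qualifier,
        // 8-byte timestamp, 1-byte type), then the value.
        int serializedLength() {
            return 4 + 4 + 2 + row + 1 + family + qualifier + 8 + 1 + value;
        }
    }

    /** Sums the per-cell lengths across every column family, like the loop quoted above. */
    static long estimate(Map<String, List<CellLengths>> familyCellMap) {
        long size = 0;
        for (Map.Entry<String, List<CellLengths>> entry : familyCellMap.entrySet()) {
            for (CellLengths cell : entry.getValue()) {
                size += cell.serializedLength();
            }
        }
        return size;
    }

    public static void main(String[] args) {
        // Illustrative data only: one family holding two cells on a 20-byte row key.
        Map<String, List<CellLengths>> familyCellMap = new LinkedHashMap<>();
        familyCellMap.put("0", Arrays.asList(
                new CellLengths(20, 1, 4, 8),
                new CellLengths(20, 1, 4, 16)));
        System.out.println("estimated bytes: " + estimate(familyCellMap));
    }
}
```

If that layout assumption holds, the untagged per-cell overhead is 20 bytes, which is 2 bytes less than the constant pair in the quoted patch line because no tags length is counted.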


was (Author: karanmehta93):
bq. Or just use CellUtil#estimatedSerializedSizeOf(Cell) ? It's in branch-1.1 and up.
This recomputes the values on every call. Can we use the following instead?

bq. + byteSize += c.getRowLength() + c.getFamilyLength() + c.getQualifierLength() + c.getValueLength()
+ KeyValue.KEY_INFRASTRUCTURE_SIZE + KeyValue.KEYVALUE_WITH_TAGS_INFRASTRUCTURE_SIZE;
I think the Cell will always be an instance of KeyValue here, so could KeyValue#getLength()
be used instead? The length of a KV is computed when it is created and cached in a field, so
the call will be quick.

> Correct MutationState size estimation
> -------------------------------------
>
>                 Key: PHOENIX-3884
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3884
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.10.0
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>         Attachments: 3884.txt
>
>
> Currently the Mutation size is estimated by calling Mutation.heapSize(), which adds all the
> overhead needed to store the Mutation on the Java heap and has little to do with the actual
> size on the wire or the size on disk.
> With a sample row with a 20 byte key and 10 columns with a qualifier length and value
> length of this reports 1800 bytes, whereas the actual size is closer to 600-700 bytes.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
