hbase-issues mailing list archives

From "Xiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18573) Update Append and Delete to use Mutation#getCellList(family)
Date Wed, 16 Aug 2017 16:36:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16129044#comment-16129044 ]

Xiang Li commented on HBASE-18573:

Thanks [~jerryhe].
I studied the ArrayList code and found that:
* A new ArrayList with an initial capacity of 1 is better (it saves space) when a caller adds
only one cell for a family and then processes the mutation.
* A new ArrayList with no initial capacity specified is better when a caller adds a lot of
cells for a family.

{panel:title=More details} 
When an ArrayList needs to grow its backing array (elementData), the increment is
proportional to the current capacity, and therefore ultimately to the initial capacity.
{code:title=ArrayList#grow(int minCapacity)|borderStyle=solid}
int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
if (newCapacity - minCapacity < 0)
   newCapacity = minCapacity;
{code}
The new capacity is the larger of:
* the minimum capacity required
* the current capacity * 1.5 (rounded down)

When no initial capacity is specified, the default capacity of 10 is used, so growth is more
aggressive than with an initial capacity of 1. That is, with no initial capacity specified,
an Object array of size 10 is allocated even if you add only 1 cell. If the initial capacity
is set to 1, the array size is 1 after the first cell is added for a family, and 2 after
another cell is added to the same family.
{panel}
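
The growth rule described above can be replayed with a small stand-alone model. This is only a sketch: `GrowthSketch`, `grow` and `capacityAfter` are hypothetical names, the class tracks a capacity number rather than a real array, and the default constructor is modeled simply as an initial capacity of 10.

```java
// Simplified model of the growth rule quoted above; not the real java.util.ArrayList.
class GrowthSketch {
    // Same arithmetic as the quoted ArrayList#grow(int minCapacity) excerpt.
    static int grow(int oldCapacity, int minCapacity) {
        int newCapacity = oldCapacity + (oldCapacity >> 1);
        if (newCapacity - minCapacity < 0) {
            newCapacity = minCapacity;
        }
        return newCapacity;
    }

    // Capacity of the modeled backing array after `adds` single-element adds.
    static int capacityAfter(int initialCapacity, int adds) {
        int capacity = initialCapacity;
        for (int size = 0; size < adds; size++) {
            int minCapacity = size + 1;
            if (minCapacity > capacity) {
                capacity = grow(capacity, minCapacity);
            }
        }
        return capacity;
    }

    public static void main(String[] args) {
        // Compare an initial capacity of 1 with the modeled default of 10.
        for (int adds = 1; adds <= 12; adds++) {
            System.out.println(adds + " cells: capacity " + capacityAfter(1, adds)
                + " (initial 1) vs " + capacityAfter(10, adds) + " (default 10)");
        }
    }
}
```

Under this model, an initial capacity of 1 gives capacities 1, 2, 3, 4, 6 after the first five adds, while the default stays at 10 until the eleventh add grows it to 15.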

> Update Append and Delete to use Mutation#getCellList(family)
> ------------------------------------------------------------
>                 Key: HBASE-18573
>                 URL: https://issues.apache.org/jira/browse/HBASE-18573
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Minor
> In addxxx() of Put and Increment, Mutation#getCellList(family) is called to get the cell
> list from familyMap. But in the other 2 subclasses of Mutation, Append and Delete, logic
> like that of Mutation#getCellList(family) is repeated inline, like
> {code}
>     List<Cell> list = familyMap.get(family);
>     if(list == null) {
>       list = new ArrayList<>(1);
>     }
> {code}
> in
> {code}
> public Delete addColumn(byte [] family, byte [] qualifier, long timestamp)
> {code}
> of Delete
> We could make them call Mutation#getCellList(family) for better encapsulation.
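
As an illustration of the encapsulation proposed in the description, here is a minimal stand-alone sketch. It is not HBase code: `MutationSketch` is a hypothetical class, `String` stands in for both the `byte[]` family and `Cell`, and the body of `getCellList` (including registering the new list in the map and using no initial capacity) is an assumption rather than the actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Mutation and its subclasses; String replaces byte[] and Cell.
class MutationSketch {
    private final Map<String, List<String>> familyMap = new HashMap<>();

    // Single place that looks up, and lazily creates, the cell list for a family.
    List<String> getCellList(String family) {
        List<String> list = familyMap.get(family);
        if (list == null) {
            list = new ArrayList<>();
            familyMap.put(family, list);
        }
        return list;
    }

    // Subclass-style add method that delegates instead of repeating the null check.
    MutationSketch addColumn(String family, String qualifier) {
        getCellList(family).add(family + ":" + qualifier);
        return this;
    }

    public static void main(String[] args) {
        MutationSketch m = new MutationSketch().addColumn("cf", "q1").addColumn("cf", "q2");
        System.out.println(m.getCellList("cf"));
    }
}
```

With the lookup in one helper, the initial-capacity choice discussed in the comment is made in a single place instead of in each add method.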

This message was sent by Atlassian JIRA
