hbase-issues mailing list archives

From "Xiang Li (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-18573) Update Append and Delete to use Mutation#getCellList(family)
Date Wed, 23 Aug 2017 16:46:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138622#comment-16138622
] 

Xiang Li edited comment on HBASE-18573 at 8/23/17 4:45 PM:
-----------------------------------------------------------

Hi [~Jan Hentschel], thanks for the reply! I see your point, but I have a different opinion.
You mentioned 
bq. As you see, there's only a single call to add(), which makes it safe to assume that the size of the list will be 1
That might not be the case. add() can be called several times on an Append object to add
several cells. Similarly, addColumn() can be called several times on an Append object
to add several families, qualifiers and values.
{code}
Append a1 = new Append(row1);
a1.add(c1);
a1.add(c2);
a1.add(c3);
...
table1.append(a1);
{code}

So an initial capacity of 1 is good when only one cell or one family/qualifier/value is added
to the Append object before HTable processes it. But when multiple cells are added, an initial
capacity of 1 makes the backing array grow more often than a larger initial capacity would.
It depends on the use scenario. Does that make sense to you?
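
To make the trade-off concrete, here is a rough, self-contained sketch that counts backing-array reallocations for different initial capacities. It does not use HBase and does not inspect a real ArrayList. The names CapacityGrowthSketch and countResizes are made up for illustration, and the growth rule (grow by about half, but at least to the size needed) is an assumption modelled on the OpenJDK ArrayList implementation; other JDKs may behave differently.

```java
class CapacityGrowthSketch {

    /**
     * Counts how often a backing array would have to be reallocated while
     * cellsToAdd elements are appended one by one.
     * ASSUMPTION: the growth rule mimics OpenJDK's ArrayList
     * (newCapacity = old + old / 2, but never less than the size needed).
     */
    static int countResizes(int initialCapacity, int cellsToAdd) {
        int capacity = initialCapacity;
        int resizes = 0;
        for (int size = 0; size < cellsToAdd; size++) {
            if (size == capacity) {
                // The array is full: grow before storing the next element.
                int grown = capacity + (capacity >> 1);
                capacity = Math.max(grown, size + 1);
                resizes++;
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        // Compare a capacity of 1 with a larger, arbitrarily chosen capacity of 10.
        for (int cells : new int[] {1, 3, 10}) {
            System.out.println(cells + " cell(s): capacity 1 -> "
                + countResizes(1, cells) + " resize(s), capacity 10 -> "
                + countResizes(10, cells) + " resize(s)");
        }
    }
}
```

Under that assumed growth rule, three cells added to a list created with capacity 1 cost two reallocations, while a list created with capacity 10 needs none. A single cell needs no reallocation in either case, which is why the answer depends on the use scenario.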



> Update Append and Delete to use Mutation#getCellList(family)
> ------------------------------------------------------------
>
>                 Key: HBASE-18573
>                 URL: https://issues.apache.org/jira/browse/HBASE-18573
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Xiang Li
>            Assignee: Xiang Li
>            Priority: Minor
>             Fix For: 3.0.0, 1.4.0, 1.5.0, 2.0.0-alpha-3
>
>         Attachments: HBASE-18573.master.000.patch
>
>
> In addxxx() of Put and Increment, Mutation#getCellList(family) is called to get the cell
> list from familyMap. But in the other two subclasses of Mutation, Append and Delete, logic
> equivalent to Mutation#getCellList(family) is written out inline, for example
> {code}
>     List<Cell> list = familyMap.get(family);
>     if(list == null) {
>       list = new ArrayList<>(1);
>     }
> {code}
> in
> {code}
> public Delete addColumn(byte [] family, byte [] qualifier, long timestamp)
> {code}
> of Delete.
> We could make them call Mutation#getCellList(family) instead, for better encapsulation.
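
A minimal sketch of the proposed refactoring, under simplifying assumptions: MutationSketch and DeleteSketch are hypothetical stand-ins rather than the real HBase classes, cells are plain Strings, addColumn takes fewer parameters than the real method, and the default-capacity ArrayList inside getCellList is only a placeholder, because the right initial capacity is exactly what the comments on this issue discuss.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for Mutation: it owns familyMap and the lookup helper.
abstract class MutationSketch {
    // byte[] has no natural ordering, so the map needs an explicit comparator
    // (Arrays.compareUnsigned requires Java 9 or later).
    protected final NavigableMap<byte[], List<String>> familyMap =
        new TreeMap<byte[], List<String>>((a, b) -> Arrays.compareUnsigned(a, b));

    // Single place that knows how to find or create the list for a family.
    List<String> getCellList(byte[] family) {
        List<String> list = familyMap.get(family);
        if (list == null) {
            list = new ArrayList<>(); // placeholder capacity, see lead-in
        }
        return list;
    }
}

// Hypothetical stand-in for Delete: it reuses the helper instead of
// repeating the null check inline.
class DeleteSketch extends MutationSketch {
    DeleteSketch addColumn(byte[] family, String cell) {
        List<String> list = getCellList(family);
        list.add(cell);
        familyMap.put(family, list);
        return this;
    }
}
```

With the lookup in one place, a later change to how the list is created (for example its initial capacity) would only have to be made in getCellList.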



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
