hbase-issues mailing list archives

From "ramkrishna.s.vasudevan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15554) StoreFile$Writer.appendGeneralBloomFilter generates extra KV
Date Fri, 20 May 2016 17:59:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293863#comment-15293863 ]

ramkrishna.s.vasudevan commented on HBASE-15554:
------------------------------------------------

Thanks for the comments.
bq.The HasKey goes too far
I can try pushing this under CellUtils.
But the idea of treating the key as one consecutive entity helps everywhere we build
an index. Every index in the system (root index, bloom index) is built from the key part.
Knowing that the key is consecutive lets us avoid the multiple copies, or part-by-part
copies, that we do today.
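
To make the copy argument concrete, here is a minimal sketch. It assumes a simplified key that is just row + family + qualifier concatenated (the real KeyValue key also carries lengths, timestamp and type), and the class and method names are hypothetical, not taken from the patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustration only: hypothetical helpers, not HBase classes. The key layout
// here is a plain concatenation, not the real KeyValue key format.
class ContiguousKeySketch {

    // Parts stored separately: assembling the key takes one copy per part.
    static byte[] keyFromParts(byte[] row, byte[] family, byte[] qualifier) {
        byte[] key = new byte[row.length + family.length + qualifier.length];
        int pos = 0;
        System.arraycopy(row, 0, key, pos, row.length);
        pos += row.length;
        System.arraycopy(family, 0, key, pos, family.length);
        pos += family.length;
        System.arraycopy(qualifier, 0, key, pos, qualifier.length);
        return key;
    }

    // Key already consecutive in the backing array: one range copy is enough.
    // A caller that only reads could use (backing, keyOffset, keyLength)
    // directly and skip the copy altogether.
    static byte[] keyFromContiguous(byte[] backing, int keyOffset, int keyLength) {
        return Arrays.copyOfRange(backing, keyOffset, keyOffset + keyLength);
    }

    public static void main(String[] args) {
        byte[] row = "r1".getBytes(StandardCharsets.UTF_8);
        byte[] family = "cf".getBytes(StandardCharsets.UTF_8);
        byte[] qualifier = "q".getBytes(StandardCharsets.UTF_8);
        byte[] fromParts = keyFromParts(row, family, qualifier);

        // The same key bytes embedded in a larger buffer at offset 2.
        byte[] backing = "XXr1cfqYY".getBytes(StandardCharsets.UTF_8);
        byte[] fromRange = keyFromContiguous(backing, 2, 5);

        System.out.println(Arrays.equals(fromParts, fromRange));
    }
}
```

Both paths yield the same bytes. The difference is that the contiguous form needs a single range copy, or none at all if the consumer accepts an offset and length.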
bq.How many Interfaces do we have currently around Cell and KeyValue? It might be worth listing
them?
Sure, we can. In fact, if we accept HasKey we can avoid the KeyOnlyKV and BufferedKeyonlyKV concepts.
bq.Up to this we had Cell and we had Cell with empty value. Key is new concept, or rather,
it is an old one in that we always just kept Key in indices and blooms... and you are trying
to formalize it now?
Yes. We are only trying to make it possible to identify keys directly wherever we can. In fact,
in another issue [~anoop.hbase] was also saying that getKeyBuffer cannot be deprecated and
that it is better to keep it. So this HasKey makes that more formal.
bq.Does a Cell have a Key? 
No. A Cell need not have a key. If you look at the current patch, HasKey is attributed only to
the Cell types. I don't think Cells need to have a key. Let Cell keep the notion that
row, family, qualifier, tags and value are all independent.
bq.KeyValue has a Key (makes sense). If anything the Interface should be called Key? So, we
have getKeyArray and offset and length. Do the latter work if a byte [] or BB? In same way
as ServerCell?
Yes, we can call it Key. No problem. And yes, the Key interface will also have both byte[] and
BB based APIs. Let us discuss how to handle getKeyBB being called on a byte[] cell and
getKeyArray being called on a BB cell. But remember, this Key interface is going to be private
and will not be exposed to users or to clients.
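
As a starting point for that discussion, here is one possible shape, sketched under assumptions. The interface and class names are hypothetical and not taken from the patch. The choice made here (wrap the array for getKeyBB, copy out of the buffer for getKeyArray) is only one option; throwing UnsupportedOperationException on the mismatched call is another.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; the method names echo the discussion, but this is
// not the interface from the patch.
interface Key {
    byte[] getKeyArray();  // array holding the key bytes
    int getKeyOffset();    // offset of the key inside getKeyArray()
    int getKeyLength();
    ByteBuffer getKeyBB(); // buffer covering exactly the key bytes
}

// Key backed by an on-heap byte[].
final class ArrayBackedKey implements Key {
    private final byte[] buf;
    private final int offset;
    private final int length;

    ArrayBackedKey(byte[] buf, int offset, int length) {
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    public byte[] getKeyArray() { return buf; }
    public int getKeyOffset() { return offset; }
    public int getKeyLength() { return length; }

    // Assumed option: wrap the array, which shares the bytes and copies nothing.
    public ByteBuffer getKeyBB() {
        return ByteBuffer.wrap(buf, offset, length).slice();
    }
}

// Key backed by a ByteBuffer, which may be direct (off-heap).
final class BufferBackedKey implements Key {
    private final ByteBuffer buf;
    private final int offset;
    private final int length;

    BufferBackedKey(ByteBuffer buf, int offset, int length) {
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    // Duplicate so the caller's position and limit are never disturbed.
    public ByteBuffer getKeyBB() {
        ByteBuffer dup = buf.duplicate();
        dup.clear();
        dup.limit(offset + length);
        dup.position(offset);
        return dup.slice();
    }

    // Assumed option: copy on demand, so this call costs an allocation.
    public byte[] getKeyArray() {
        byte[] copy = new byte[length];
        getKeyBB().get(copy, 0, length);
        return copy;
    }

    // The copy returned by getKeyArray() starts at index 0.
    public int getKeyOffset() { return 0; }
    public int getKeyLength() { return length; }
}
```

With this choice every caller gets an answer from either API, at the price of a hidden copy when getKeyArray is called on a BB cell. Throwing instead would make that cost visible to the caller.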
bq.You can ask it for family and qualifier pieces? And timestamps? You use the Cell APIs to
do this against a Key?
But how do we construct the index we have now without copying them every
time? Can you say more about what you have in mind here? Maybe I am not getting your bigger idea.
bq.Maybe there are some type refactorings we could do that could get rid of a bunch of Interfaces?
We could surely look at that. In fact, last time we took up that task but could not find much.
But we can revisit it.
Thanks once again for the comments.



> StoreFile$Writer.appendGeneralBloomFilter generates extra KV
> ------------------------------------------------------------
>
>                 Key: HBASE-15554
>                 URL: https://issues.apache.org/jira/browse/HBASE-15554
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: Vladimir Rodionov
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15554.patch, HBASE-15554_3.patch, HBASE-15554_4.patch
>
>
> Accounts for 10% memory allocation in compaction thread when BloomFilterType is ROWCOL.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
