hbase-dev mailing list archives

From "Erik Holstad (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1249) Rearchitecting of server, client, API, key format, etc for 0.20
Date Wed, 18 Mar 2009 17:57:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683111#action_12683111 ]

Erik Holstad commented on HBASE-1249:

Yeah Jonathan, that sounds like a really good approach: we get all the benefits of splitting
deletes and puts without having to pay the cost of doing so. Very nice.

On the whole put/get/delete issue:

When looking for a value in HBase we have 3 lists that need to be compared to
each other: the list of data in the storefile, da; the list of keys to look
for, k; and the list of deletes, de.

Today, as far as I can tell, we compare every da with every k, and every match
with every de. That gives a complexity of roughly da*k + k*de, which might be OK
when all those values are small, but with da = k = de = 10 you already have
10*10 + 10*10 = 200 comparisons to do.
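
To make that count concrete, here is a minimal Python sketch of the compare-everything
approach described above. It is an illustration only, not HBase code: plain integers stand
in for keys, the function name naive_lookup is made up for this example, and no early exit
is taken so that the worst case is counted.

```python
def naive_lookup(data, keys, deletes):
    """Compare every key with every datum, and every match with every delete.

    Returns (surviving matches, number of comparisons performed).
    Integers stand in for real keys; this is a counting sketch only.
    """
    comparisons = 0
    result = []
    for k in keys:
        for d in data:
            comparisons += 1          # one da-vs-k comparison
            if d != k:
                continue
            deleted = False
            for de in deletes:
                comparisons += 1      # one match-vs-de comparison
                if de == d:
                    deleted = True
            if not deleted:
                result.append(d)
    return result, comparisons


if __name__ == "__main__":
    data = list(range(10))            # da = 10
    keys = list(range(10))            # k = 10, every key matches one datum
    deletes = list(range(10, 20))     # de = 10, none of them hit
    result, comparisons = naive_lookup(data, keys, deletes)
    print(comparisons)                # 10*10 + 10*10 = 200
```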

I'm proposing something more like a merge approach: merge-compare da and de to
produce a survivor list, then compare that list to k. This results in at most
(de + da) + (da + k) = 40 comparisons in the worst case, which seems like a much
better way to go. It can even be made more efficient.
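
A rough Python sketch of the two merge passes proposed above, under some loud assumptions:
all three lists are sorted, plain integers stand in for keys, each key appears at most once,
and the names merge_survivors and merge_match are invented for this example (they are not
HBase functions). One loop iteration is counted as one three-way comparison.

```python
def merge_survivors(data, deletes):
    """Single pass over sorted data and sorted deletes; returns the data
    entries that are not deleted, plus the number of comparisons."""
    survivors = []
    comparisons = 0
    i = j = 0
    while i < len(data) and j < len(deletes):
        comparisons += 1
        if data[i] < deletes[j]:
            survivors.append(data[i])   # no delete can cover this entry
            i += 1
        elif data[i] > deletes[j]:
            j += 1                      # this delete is behind us, move on
        else:
            i += 1                      # entry is deleted, drop it
    survivors.extend(data[i:])          # deletes exhausted, the rest survives
    return survivors, comparisons


def merge_match(survivors, keys):
    """Single pass over sorted survivors and sorted keys; returns the
    survivors that were asked for, plus the number of comparisons."""
    matches = []
    comparisons = 0
    i = j = 0
    while i < len(survivors) and j < len(keys):
        comparisons += 1
        if survivors[i] < keys[j]:
            i += 1
        elif survivors[i] > keys[j]:
            j += 1
        else:
            matches.append(survivors[i])
            i += 1
            j += 1
    return matches, comparisons


if __name__ == "__main__":
    data = list(range(0, 20, 2))        # da = 10
    deletes = list(range(1, 20, 2))     # de = 10, interleaved with the data
    keys = list(range(0, 20, 2))        # k = 10
    survivors, c1 = merge_survivors(data, deletes)
    matches, c2 = merge_match(survivors, keys)
    print(c1 + c2 <= (len(deletes) + len(data)) + (len(data) + len(keys)))
```

Every loop iteration advances at least one of the two indexes, so each pass is bounded by
the combined length of its two inputs, which is where the (de + da) + (da + k) bound comes from.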

I think we should add get types to KeyValue, so we can tell the difference between getting
a value for a specific ts and getting all values after a specific ts.

> Rearchitecting of server, client, API, key format, etc for 0.20
> ---------------------------------------------------------------
>                 Key: HBASE-1249
>                 URL: https://issues.apache.org/jira/browse/HBASE-1249
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
> To discuss all the new and potential issues coming out of the change in key format (HBASE-1234):
> zero-copy reads, client binary protocol, update of API (HBASE-880), server optimizations,

