hbase-dev mailing list archives

From "Jonathan Gray (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1249) Rearchitecting of server, client, API, key format, etc for 0.20
Date Sun, 29 Mar 2009 18:56:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693602#action_12693602 ]

Jonathan Gray commented on HBASE-1249:

I'm a little confused about what you're saying.  First, I don't think either of us has described
the recent changes we've made to the design since the currently posted docs.  Namely, separating
the deletes out after all the puts turned out not to be a good idea.

Rather than separating them out and having a KV sort order like (row,type,column,ts), it would be
(row,column,ts,type).  You still end up building your delete list as you go, but now you can early
out in more cases.  I will update the pdfs later in the week.
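
To make the two orderings concrete, here is a rough sketch of both comparators.  None of
this is the actual KeyValue code: the Key class, the Type enum and the comparator names are
made up for illustration, and I'm assuming timestamps sort newest-first and that a delete
sorts ahead of a put at the same (row,column,ts).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class KeyOrderSketch {
    // Hypothetical type codes; DELETE is declared first so it sorts ahead of PUT (assumption).
    enum Type { DELETE, PUT }

    // Hypothetical stand-in for a KV key; strings are used only to keep the sketch short.
    static final class Key {
        final String row;
        final String column;
        final long ts;
        final Type type;

        Key(String row, String column, long ts, Type type) {
            this.row = row;
            this.column = column;
            this.ts = ts;
            this.type = type;
        }

        @Override
        public String toString() {
            return row + "/" + column + "/" + ts + "/" + type;
        }
    }

    // Newest timestamp first (assumption).
    static final Comparator<Key> NEWEST_FIRST =
        Comparator.comparingLong((Key k) -> k.ts).reversed();

    // (row,column,ts,type): each delete sits right next to the puts it can affect.
    static final Comparator<Key> INTERLEAVED =
        Comparator.comparing((Key k) -> k.row)
                  .thenComparing(k -> k.column)
                  .thenComparing(NEWEST_FIRST)
                  .thenComparing(k -> k.type);

    // (row,type,column,ts): all deletes of a row are grouped apart from its puts.
    static final Comparator<Key> SEPARATED =
        Comparator.comparing((Key k) -> k.row)
                  .thenComparing(k -> k.type)
                  .thenComparing(k -> k.column)
                  .thenComparing(NEWEST_FIRST);

    // Returns a sorted copy of the given keys under the chosen ordering.
    static List<Key> sorted(Comparator<Key> order, Key... keys) {
        List<Key> out = new ArrayList<>(Arrays.asList(keys));
        out.sort(order);
        return out;
    }
}
```

With INTERLEAVED, a reader sees the delete for a column immediately before the puts in that
same column, so the delete list only has to cover the column currently being read.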

Erik has also simplified the return codes of Get.compareTo.

Regarding the DeleteFamily issue... There's some new DeleteSet object now that handles the
merging of deletes and containment checking, right?  It should be very simple for it to keep
around a single, optional DeleteFamily.  That is really just a timestamp: either it defaults to
0L and is checked every time, or it is set to null and skipped if none was found (we should be
able to get away with the overhead of a single if == null check when no DeleteFamily is present).
It would get set when reading a DeleteFamily, and after that it's just a single long comparison
for each timestamp.
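
Roughly what I have in mind, as a sketch only.  This is not the real DeleteSet; the class
and method names are invented, and I'm assuming a DeleteFamily at timestamp T covers every
cell with a timestamp at or before T, and that only the newest DeleteFamily needs to be kept.

```java
class DeleteFamilySketch {
    // null means no DeleteFamily has been read yet (the "set to null and skip it" variant).
    private Long familyDeleteTs = null;

    // Called when a DeleteFamily marker is read; keeps only the newest one (assumption).
    void onDeleteFamily(long ts) {
        if (familyDeleteTs == null || ts > familyDeleteTs) {
            familyDeleteTs = ts;
        }
    }

    // Called per cell: one null check when no DeleteFamily is present,
    // otherwise a single long comparison against the stored timestamp.
    boolean isDeletedByFamily(long cellTs) {
        if (familyDeleteTs == null) {
            return false;
        }
        return cellTs <= familyDeleteTs;
    }
}
```

The 0L-default variant would drop the null check and always do the long comparison instead.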

Regarding #1 above, I don't follow why they could only be set to now.  The rule can and should
be: you can do anything for now or anything in the past.  How would setting something in the
past break anything here?

2.  This is what I'm proposing I guess?  What's the downside?  If no deletefamily, you have
one line of code, a single instruction comparison.  This is the least complex and seems efficient.

3.  I think you're saying the entire row should be timestamp ordered here?  As you know, I'm
against that.  :)

> Rearchitecting of server, client, API, key format, etc for 0.20
> ---------------------------------------------------------------
>                 Key: HBASE-1249
>                 URL: https://issues.apache.org/jira/browse/HBASE-1249
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>         Attachments: HBASE-1249-Example-v1.pdf, HBASE-1249-Example-v2.pdf, HBASE-1249-GetQuery-v1.pdf,
HBASE-1249-GetQuery-v2.pdf, HBASE-1249-GetQuery-v3.pdf, HBASE-1249-StoreFile-v1.pdf
> To discuss all the new and potential issues coming out of the change in key format (HBASE-1234):
zero-copy reads, client binary protocol, update of API (HBASE-880), server optimizations,
