hbase-issues mailing list archives

From "Jonathan Gray (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2406) Define semantics of cell timestamps/versions
Date Mon, 19 Jul 2010 17:25:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889926#action_12889926 ]

Jonathan Gray commented on HBASE-2406:
--------------------------------------

bq. Gets lack "...the ability to retrieve the latest version less than or equal to a given timestamp, thus giving the 'latest' state of the record at a certain point in time."
I commented on this on the blog post.  This is not the case: we do support this, by setting
the time range max to timestamp+1.
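
To make the timestamp+1 trick concrete, here is a minimal sketch in plain Java. It does not use the HBase client; it only models a cell's versions as a sorted map and assumes, as the comment above implies, that the upper bound of the time range is exclusive. The class and method names (TimeRangeSketch, newestInRange, latestAtOrBefore) are made up for illustration, and overflow at Long.MAX_VALUE is ignored.

```java
import java.util.Map;
import java.util.TreeMap;

public class TimeRangeSketch {
    // Returns the newest value whose timestamp lies in [minStamp, maxStamp),
    // or null if no version falls in that range.
    static String newestInRange(TreeMap<Long, String> versions, long minStamp, long maxStamp) {
        // lowerEntry gives the greatest key strictly below maxStamp (exclusive upper bound).
        Map.Entry<Long, String> e = versions.lowerEntry(maxStamp);
        return (e != null && e.getKey() >= minStamp) ? e.getValue() : null;
    }

    // "Latest version at or before ts": set the exclusive upper bound to ts + 1.
    static String latestAtOrBefore(TreeMap<Long, String> versions, long ts) {
        return newestInRange(versions, 0L, ts + 1);
    }

    public static void main(String[] args) {
        TreeMap<Long, String> versions = new TreeMap<>();
        versions.put(100L, "a");
        versions.put(200L, "b");
        versions.put(300L, "c");
        System.out.println(latestAtOrBefore(versions, 200L)); // b
        System.out.println(latestAtOrBefore(versions, 250L)); // b
        System.out.println(latestAtOrBefore(versions, 50L));  // null
    }
}
```

With an exclusive upper bound, asking for [0, ts) would miss a version written exactly at ts, which is why the bound has to be ts+1.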

bq. Major compactions are not invisible to the user
This is hard to fix and it's not clear what "expected" behavior should be.  Do you ever re-surface
a Put once it's been hidden?  Seems like there's an argument on both sides of this.  If I
want to keep the latest two versions, I might have accidentally inserted a bad version, so
I want to delete it and resurface an older one.  But maybe someone else has an argument that
they never want something to be able to re-appear after being shadowed?
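
The resurfacing question can be shown with a toy model. The sketch below is plain Java, not HBase code: it assumes a cell that keeps at most maxVersions versions, a per-version delete recorded as a tombstone, and a "major compaction" that physically drops deleted and excess versions. All names (ResurfaceSketch, deleteVersion, majorCompact) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.TreeMap;
import java.util.TreeSet;

public class ResurfaceSketch {
    private final int maxVersions;
    private final TreeMap<Long, String> puts = new TreeMap<>(); // timestamp -> value
    private final TreeSet<Long> tombstones = new TreeSet<>();   // deleted timestamps

    ResurfaceSketch(int maxVersions) { this.maxVersions = maxVersions; }

    void put(long ts, String value) { puts.put(ts, value); }

    void deleteVersion(long ts) { tombstones.add(ts); }

    // Read path: newest first, skip deleted versions, stop at maxVersions.
    List<String> get() {
        List<String> out = new ArrayList<>();
        for (Long ts : puts.descendingKeySet()) {
            if (tombstones.contains(ts)) continue;
            out.add(puts.get(ts));
            if (out.size() == maxVersions) break;
        }
        return out;
    }

    // Toy major compaction: drop deleted versions and anything beyond maxVersions.
    void majorCompact() {
        int kept = 0;
        Iterator<Long> it = puts.descendingKeySet().iterator();
        while (it.hasNext()) {
            long ts = it.next();
            if (tombstones.contains(ts) || kept >= maxVersions) it.remove();
            else kept++;
        }
        tombstones.clear();
    }

    public static void main(String[] args) {
        ResurfaceSketch a = new ResurfaceSketch(2);
        a.put(1L, "v1"); a.put(2L, "v2"); a.put(3L, "v3");
        a.deleteVersion(3L);
        System.out.println(a.get()); // [v2, v1]  (v1 resurfaces)

        ResurfaceSketch b = new ResurfaceSketch(2);
        b.put(1L, "v1"); b.put(2L, "v2"); b.put(3L, "v3");
        b.majorCompact();            // v1 is physically dropped here
        b.deleteVersion(3L);
        System.out.println(b.get()); // [v2]
    }
}
```

In this model the same sequence of puts and deletes gives different answers depending on whether a compaction ran in between, which is the sense in which the compaction is visible to the user.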

I think the most important fix is to handle duplicate versions (ordered by insertion time,
using memstoreTS and storefile stamps).
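
As a rough illustration of ordering duplicate versions by insertion time, the sketch below sorts cells by timestamp and breaks ties with an insertion sequence number. This is plain Java, not the actual HBase comparator; the seq field is a hypothetical stand-in for whatever memstoreTS and the store file stamps would provide.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class DuplicateVersionSketch {
    static final class Cell {
        final long timestamp; // user-visible cell timestamp
        final long seq;       // hypothetical insertion order (higher = written later)
        final String value;

        Cell(long timestamp, long seq, String value) {
            this.timestamp = timestamp;
            this.seq = seq;
            this.value = value;
        }
    }

    // Newest timestamp first; among equal timestamps, the latest insertion first.
    static final Comparator<Cell> NEWEST_FIRST =
        Comparator.comparingLong((Cell c) -> c.timestamp).reversed()
                  .thenComparing(Comparator.comparingLong((Cell c) -> c.seq).reversed());

    // The value a reader would see: the first cell under NEWEST_FIRST.
    static String visibleValue(List<Cell> cells) {
        return cells.stream().min(NEWEST_FIRST).map(c -> c.value).orElse(null);
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell(100L, 1L, "first write"));
        cells.add(new Cell(100L, 2L, "second write")); // same timestamp, written later
        cells.add(new Cell(50L, 3L, "older timestamp"));
        System.out.println(visibleValue(cells)); // second write
    }
}
```

Under this ordering, two writes with the same timestamp resolve deterministically to the later insert, regardless of where each one physically lives.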

For the other issues, it's less clear what the "right" answer should be.  I also don't think we
can completely nail this stuff down until we make a strong determination about what should and
should not be processed during minor compactions.  I did some preliminary benchmarking work on
minor compactions a couple of months back, and I'm hoping to have an intern pick that work back
up so we can make a decision here.

> Define semantics of cell timestamps/versions
> --------------------------------------------
>
>                 Key: HBASE-2406
>                 URL: https://issues.apache.org/jira/browse/HBASE-2406
>             Project: HBase
>          Issue Type: Task
>          Components: documentation
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.90.0
>
>
> There is a lot of general confusion over the semantics of the cell timestamp. In particular,
> a couple questions that often come up:
> - If multiple writes to a cell have the same timestamp, are all versions maintained or
>   just the last?
> - Is it OK to write cells in a non-increasing timestamp order?
> Let's discuss, figure out what semantics make sense, and then move towards (a) documentation,
> (b) unit tests that prove we have those semantics.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
