hbase-dev mailing list archives

From "ryan rawson (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2265) HFile and Memstore should maintain minimum and maximum timestamps
Date Thu, 25 Feb 2010 08:44:27 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838251#action_12838251 ]

ryan rawson commented on HBASE-2265:

I'm not sure this will make gets faster. There are two get cases:

- get a single column for a row.  In this case, if timestamps are written out of order, we
don't know which hfile to start with.  Let's say we start with the 'newest' one, and it returns
a keyvalue with timestamp TS[1].  Does the fact that an older file has start < TS[1] < end
mean we should consult that file?  I suppose if end < TS[1] (that is, the timestamp we already
have is newer than anything in the file), we'd know the file holds nothing newer and we could
conclusively rule it out.  If TS[1] < the beginning of a file, we'd have to consider the file.
With a big spread of timestamps and keys, we wouldn't get much of an optimization.

- for a complete column family get, we'll have to touch every file, every time. This is because
you are never sure if the next file contains another key/value for the result.  A bloom filter
would help here.
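
To make the single-column case concrete, here is a minimal Java sketch of the pruning check. It is only an illustration: FileRange, canSkip and filesToConsult are hypothetical names, not HBase classes, and it uses the X >= maxTimestamp form from the issue description (the stricter end < TS[1] form would differ only when the timestamps are equal).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TimestampPruningSketch {

    /** Hypothetical per-file metadata; not an actual HBase class. */
    static final class FileRange {
        final String name;
        final long minTimestamp;
        final long maxTimestamp;

        FileRange(String name, long minTimestamp, long maxTimestamp) {
            this.name = name;
            this.minTimestamp = minTimestamp;
            this.maxTimestamp = maxTimestamp;
        }
    }

    /** A file can be skipped when nothing in it can be newer than the value already found. */
    static boolean canSkip(long foundTimestamp, FileRange file) {
        return foundTimestamp >= file.maxTimestamp;
    }

    /** Returns the names of the files that still have to be read. */
    static List<String> filesToConsult(long foundTimestamp, List<FileRange> files) {
        List<String> result = new ArrayList<>();
        for (FileRange file : files) {
            if (!canSkip(foundTimestamp, file)) {
                result.add(file.name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<FileRange> files = Arrays.asList(
            new FileRange("A", 10, 50),    // entirely older than 100: skipped
            new FileRange("B", 60, 150),   // range straddles 100: must be read
            new FileRange("C", 120, 200)); // entirely newer than 100: must be read
        System.out.println(filesToConsult(100L, files));
    }
}
```

Only file A is ruled out here. With widely spread, overlapping timestamp ranges, most files look like B or C, which is the concern raised above.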

As for the scan, we already know which files are 'newer'.  However, during a compaction, this
information is collapsed, and we end up with the duplicate key/values sitting next to each
other.  We might be able to establish an invariant that during compaction the 'newer' one
comes first. The compaction might be able to help straighten this out, since I think we do
minor compactions 'in order', with older files first. Seems like a tricky bit.
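
One way to picture the 'newer one comes first' invariant is a comparator that breaks timestamp ties by the age of the source file. The Java sketch below is an assumption-laden illustration: Cell, sourceOrder and merge are hypothetical names, not the KeyValue format, and a real compaction would stream a k-way merge rather than sort in memory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class NewerFirstMergeSketch {

    /** Hypothetical cell; sourceOrder is higher for more recently flushed storage. */
    static final class Cell {
        final String key;
        final long timestamp;
        final long sourceOrder;
        final String value;

        Cell(String key, long timestamp, long sourceOrder, String value) {
            this.key = key;
            this.timestamp = timestamp;
            this.sourceOrder = sourceOrder;
            this.value = value;
        }
    }

    /** Key ascending, then timestamp descending, then newer source first on ties. */
    static final Comparator<Cell> NEWER_FIRST = (a, b) -> {
        int c = a.key.compareTo(b.key);
        if (c != 0) {
            return c;
        }
        c = Long.compare(b.timestamp, a.timestamp);
        if (c != 0) {
            return c;
        }
        return Long.compare(b.sourceOrder, a.sourceOrder);
    };

    /** Merges the inputs by sorting; kept simple for illustration only. */
    static List<Cell> merge(List<List<Cell>> inputs) {
        List<Cell> out = new ArrayList<>();
        for (List<Cell> input : inputs) {
            out.addAll(input);
        }
        out.sort(NEWER_FIRST);
        return out;
    }

    public static void main(String[] args) {
        List<Cell> older = Arrays.asList(
            new Cell("row1/cf:q", 9, 1, "old-ts9"),
            new Cell("row1/cf:q", 5, 1, "old-ts5"));
        List<Cell> newer = Arrays.asList(
            new Cell("row1/cf:q", 5, 2, "new-ts5"));
        for (Cell cell : merge(Arrays.asList(older, newer))) {
            System.out.println(cell.value);
        }
    }
}
```

The sketch carries the source order as an explicit field to keep the tie-break visible. Whether that ordering can instead be implied by position in the compacted file, with no format change, is the tricky bit.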

Generally, the ideal solution would involve no change to the KeyValue serialization format,
since such a change could require rewriting the store files.

> HFile and Memstore should maintain minimum and maximum timestamps
> -----------------------------------------------------------------
>                 Key: HBASE-2265
>                 URL: https://issues.apache.org/jira/browse/HBASE-2265
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Todd Lipcon
> In order to fix HBASE-1485 and HBASE-29, it would be very helpful to have HFile and Memstore
> track their maximum and minimum timestamps. This has the following nice properties:
> - for a straight Get, if an entry has already been found with timestamp X, and
> X >= HFile.maxTimestamp, the HFile doesn't need to be checked. Thus, the current fast behavior
> of get can be maintained for those who use strictly increasing timestamps, with "correct"
> behavior for those who sometimes write out-of-order.
> - for a scan, the "latest timestamp" of the storage can be used to decide which cell
> wins, even if the timestamp of the cells is equal. In essence, rather than comparing timestamps,
> instead you are able to compare tuples of (row timestamp, storage.max_timestamp)
> - in general, min_timestamp(storage A) >= max_timestamp(storage B) if storage A was
> flushed after storage B.

