hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13389) [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
Date Tue, 07 Apr 2015 00:46:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482322#comment-14482322 ]

Lars Hofhansl commented on HBASE-13389:
---------------------------------------

I think we had comment overlap. :)

bq. ...you are not against changing sort order so that seqid prevails over type are you...?

I would actually be against it, since it would break the property that all mutations in HBase
are idempotent: when the client encounters any problem with a batch of updates, it can simply
apply them again, and the outcome is identical (within the limits of what HBase defines, i.e.
with ms resolution). Changing the sort order would complicate that, and we would have some
explaining to do.
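
To illustrate the concern, here is a minimal sketch (not HBase code) of one column at one
timestamp. The Cell class, the two sort-key functions, and the replay scenario are all
hypothetical and exist only for illustration; the type codes merely mimic the idea that a
delete sorts ahead of a put. It assumes a reader sees whichever cell sorts first.

```python
from dataclasses import dataclass

# Illustrative type codes (assumption): the larger code sorts first.
PUT, DELETE = 4, 8


@dataclass(frozen=True)
class Cell:
    ts: int          # timestamp in ms
    type: int        # PUT or DELETE
    seqid: int       # order in which the server received the edit
    value: str = ""


def type_first_key(c):
    # Ordering discussed as the existing one: ts, then type, then seqid.
    return (-c.ts, -c.type, -c.seqid)


def seqid_first_key(c):
    # Proposed alternative: seqid prevails over type.
    return (-c.ts, -c.seqid, -c.type)


def visible_value(cells, key):
    """Value a reader would see for this column, or None if it is deleted."""
    first = min(cells, key=key)
    return None if first.type == DELETE else first.value


# A put and a delete at the same ms, then the client re-sends the put.
original = [Cell(100, PUT, 1, "v"), Cell(100, DELETE, 2)]
replayed = original + [Cell(100, PUT, 3, "v")]

print(visible_value(original, type_first_key), visible_value(replayed, type_first_key))
print(visible_value(original, seqid_first_key), visible_value(replayed, seqid_first_key))
```

With type ahead of seqid the replayed put changes nothing; with seqid ahead of type the
replayed put resurfaces the value, so the outcome depends on arrival order.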

So with the discussion above in place, can we lower the default time to 3 days, so that we
can be reasonably sure that major compactions would purge the mvcc cruft?
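
As a rough sketch of what "purging the mvcc cruft" could look like, assuming a compaction
zeroes the seqid of any cell that no open reader can still need once the data is older than
the keep period: the function name, its parameters, and the (key, seqid) tuple layout are
hypothetical and are not the HBase API.

```python
DAY_MS = 24 * 60 * 60 * 1000


def purge_seqids(cells, smallest_read_point, age_ms, keep_period_ms=3 * DAY_MS):
    """Return cells as (key, seqid) pairs with unneeded seqids reset to 0.

    A seqid is kept when the data is younger than the keep period, or when
    the seqid is above the smallest read point of any open scanner.
    """
    if age_ms < keep_period_ms:
        return list(cells)
    return [(key, 0 if seqid <= smallest_read_point else seqid)
            for key, seqid in cells]


cells = [("row1", 5), ("row2", 12)]
young = purge_seqids(cells, smallest_read_point=10, age_ms=1 * DAY_MS)
old = purge_seqids(cells, smallest_read_point=10, age_ms=4 * DAY_MS)
```

A shorter keep period means more cells reach the second branch at the next major compaction,
which is the effect being asked for here.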

> [REGRESSION] HBASE-12600 undoes skip-mvcc parse optimizations
> -------------------------------------------------------------
>
>                 Key: HBASE-13389
>                 URL: https://issues.apache.org/jira/browse/HBASE-13389
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Performance
>            Reporter: stack
>         Attachments: 13389.txt
>
>
> HBASE-12600 moved the edit sequenceid from tags to instead exploit the mvcc/sequenceid
slot in a key. Now Cells near-always have an associated mvcc/sequenceid, where previously it
was rare or the mvcc was kept up at the file level. This is sort of how it should be, many
of us would argue, but as a side-effect of this change, read-time optimizations that helped
speed scans were undone.
> In this issue, let's see if we can get the optimizations back -- or just remove the optimizations
altogether.
> The parse of mvcc/sequenceid is expensive. It was noticed over in HBASE-13291.
> The optimizations undone by this change are (to quote the optimizer himself, Mr [~lhofhansl]):
> {quote}
> Looks like this undoes all of HBASE-9751, HBASE-8151, and HBASE-8166.
> We're always storing the mvcc readpoints, and we never compare them against the actual
smallestReadpoint, and hence we're always performing all the checks, tests, and comparisons
that these jiras removed in addition to actually storing the data - which with up to 8 bytes
per Cell is not trivial.
> {quote}
> This is the 'breaking' change: https://github.com/apache/hbase/commit/2c280e62530777ee43e6148fd6fcf6dac62881c0#diff-07c7ac0a9179cedff02112489a20157fR96



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
