accumulo-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-4629) Seeking in timestamp range is slow
Date Tue, 25 Apr 2017 20:05:04 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983522#comment-15983522
] 

Keith Turner commented on ACCUMULO-4629:
----------------------------------------

In order to make it sane, I think enabling scanners to bypass deletes would require a new
table permission.  The option for changing delete behavior at table creation time would not
require a permission, because deletes still can not be bypassed.

Enabling scanners to disable deletes may be useful for debugging?  Not sure about this.  However
it could really gum up scan time iterators.  All iterators written for scan time assumed they
would never see deletes.  Deletes trips up lots of people (myself included) when writing iterators
for compaction.  So if it were useful for debugging, it would be with caveats.

> Seeking in timestamp range is slow
> ----------------------------------
>
>                 Key: ACCUMULO-4629
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4629
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Keith Turner
>             Fix For: 2.0.0
>
>
> Fluo's internal schema uses the first 4 bits of the timestamp to store different types
of information per column.  These first 4 bits divide the timestamp space up into 16 ranges.
 Fluo has server side iterators that consider information in one of these 16 ranges and then
seek forward to another of the 16 ranges.
> Unfortunately, Accumulo's built in iterator that processes delete marker makes seeking
within the timestamps of a column an O(N^2) operation.  This is because of the way deletes
work in Accumulo.  A delete marker for timestamp X in Accumulo deletes anything with a timestamp
<= X.  
> When seeking to timestamp Y, the Accumulo iterator that handles deletes will scan from
MAX_LONG to Y looking for any deletes that may keep you from seeing data at timestamp Y. 
The following example shows what the delete iterator will do when a user iterator does some
seeks.
>  * User iterator seeks to stamp 1,000,000.  This causes the delete iter to scan from
MAX_LONG to 1,000,000 looking for delete markers.
>  * User iterator seeks to stamp 900,000.  This causes the delete iter to scan from MAX_LONG
to 900,000 looking for delete markers.
>  * User iterator seeks to stamp 500,000.  This causes the delete iter to scan from MAX_LONG
to 500,000 looking for delete markers.
> So Fluo can seek efficiently, it has done some [serious shenanigans|https://github.com/apache/incubator-fluo/blob/rel/fluo-1.0.0-incubating/modules/accumulo/src/main/java/org/apache/fluo/accumulo/iterators/TimestampSkippingIterator.java#L164]
using reflection to remove  the DeleteIterator.  The great work being done on ACCUMULO-3079
will likely break this crazy reflection code.  So I would like to make a change to upstream
Accumulo that allows efficient seeking in the timestamp range.  I have thought of the following
two possible solutions.
>  * Make the DeleteIterator stateful so that it remember what ranges it has scanned for
deletes.  I don't really like this solution because it will add expense to every seek in Accumulo
for an edge case.
>  * Make it possible to create tables with an exact delete behavior. Meaning a delete
for timestamp X will only delete an existing row column with that exact timestamp.  This option
could only be chosen at table creation time and could not be changed.  For this delete behavior,
the delete iterator doesnot need to scan for every seek.
> Are there other possible solutions?



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message