cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-8458) Don't give out positions in an sstable beyond its first/last tokens
Date Mon, 15 Dec 2014 16:31:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14246786#comment-14246786
] 

Benedict edited comment on CASSANDRA-8458 at 12/15/14 4:30 PM:
---------------------------------------------------------------

I'm pretty sure this should throw an AssertionError given our new checks on first/last, instead
of simply continuing, and it simplifies the code a little. Also, technically not related to
this ticket, but doesn't it strike you as a bug to use Operation.GT, instead of Operation.GE?
If so, I'm not sure how we don't have tests to catch it.

{code}
            RowIndexEntry idxLeft = getPosition(leftBound, Operator.GT);
            long left = idxLeft == null ? -1 : idxLeft.position;
            if (left == -1)
                // left is past the end of the file
                continue;
{code}

Couldn't we also simplify correcting "right" to a similar approach as you've taken for left,
and use the fact that we know the positions we're looking for exist in both cases? I think
the following snippet for the guts of the method should work but is (perhaps?) a little clearer:

{code}
            // range is never wrap around, unless right is the minimum token
            assert !range.isWrapAround() || range.right.isMinimum();

            // truncate the range so it at most covers the sstable
            AbstractBounds<RowPosition> bounds = range.toRowBounds();
            RowPosition leftBound = bounds.left.compareTo(first) > 0 ? bounds.left : first.getToken().minKeyBound();
            RowPosition rightBound = bounds.right.isMinimum() ? last.getToken().maxKeyBound()
: bounds.right;

            // skip non-overlapping ranges
            if (leftBound.compareTo(last) > 0 || rightBound.compareTo(first) < 0)
                continue;

            // transform into positions known to exist in the file
            long left = getPosition(leftBound, Operator.GE).position;
            long right = (rightBound.compareTo(last) >= 0)
                         ? (openReason == OpenReason.EARLY
                            // if opened early, we overlap with the old sstables by one key,
so we know that the last
                            // (and further) key(s) will be streamed from these if necessary
                            ? getPosition(last.getToken().maxKeyBound(), Operator.GE).position
                            : uncompressedLength())
                         : getPosition(rightBound, Operator.GT).position;
{code}



was (Author: benedict):
There would be a bug if ranges could wrap with the first "continue" potentially filtering
files we actually need to visit. However Range.normalize() _claims_ to remove all wrapped
ranges, in which case we can actually simplify the second check to remove !isWrapRound(),
making it a little clearer.

I'm pretty sure this should throw an AssertionError given our new checks on first/last, and
again it simplifies the code a little. Also, technically not related to this ticket, but doesn't
it strike you as a bug to use Operation.GT, instead of Operation.GE? If so, I'm not sure how
we don't have tests to catch it.

{code}
            RowIndexEntry idxLeft = getPosition(leftBound, Operator.GT);
            long left = idxLeft == null ? -1 : idxLeft.position;
            if (left == -1)
                // left is past the end of the file
                continue;
{code}

Couldn't we also simplify correcting "right" to a similar approach as you've taken for left,
and use the fact that we know the positions we're looking for exist in both cases? I think
the following snippet for the guts of the method should work but is (perhaps?) a little clearer:

{code}
            // range is never wrap around, unless right is the minimum token
            assert !range.isWrapAround() || range.right.isMinimum();

            // truncate the range so it at most covers the sstable
            AbstractBounds<RowPosition> bounds = range.toRowBounds();
            RowPosition leftBound = bounds.left.compareTo(first) > 0 ? bounds.left : first.getToken().minKeyBound();
            RowPosition rightBound = bounds.right.isMinimum() ? last.getToken().maxKeyBound()
: bounds.right;

            // skip non-overlapping ranges
            if (leftBound.compareTo(last) > 0 || rightBound.compareTo(first) < 0)
                continue;

            // transform into positions known to exist in the file
            long left = getPosition(leftBound, Operator.GE).position;
            long right = (rightBound.compareTo(last) >= 0)
                         ? (openReason == OpenReason.EARLY
                            // if opened early, we overlap with the old sstables by one key,
so we know that the last
                            // (and further) key(s) will be streamed from these if necessary
                            ? getPosition(last.getToken().maxKeyBound(), Operator.GE).position
                            : uncompressedLength())
                         : getPosition(rightBound, Operator.GT).position;
{code}


> Don't give out positions in an sstable beyond its first/last tokens
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-8458
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8458
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Marcus Eriksson
>            Assignee: Marcus Eriksson
>             Fix For: 2.1.3
>
>         Attachments: 0001-Make-sure-we-don-t-give-out-positions-from-an-sstabl.patch
>
>
> Looks like we include tmplink sstables in streams in 2.1+, and when we do, sometimes
we get this error message on the receiving side: {{java.io.IOException: Corrupt input data,
block did not start with 2 byte signature ('ZV') followed by type byte, 2-byte length)}}.
I've only seen this happen when a tmplink sstable is included in the stream.
> We can not just exclude the tmplink files when starting the stream - we need to include
the original file, which we might miss since we check if the requested stream range intersects
the sstable range.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message