cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8993) EffectiveIndexInterval calculation is incorrect
Date Thu, 26 Mar 2015 17:44:53 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382302#comment-14382302 ]

Tyler Hobbs commented on CASSANDRA-8993:
----------------------------------------

I'll try to explain a bit about how downsampling works overall so that more people than just
me understand it :)

I can put whatever info is useful into comments for posterity.

bq. If I print out the "original indices" and "effective intervals", it seems that at the
first downsampling level (64)

The sampling level after minimal downsampling is 127, not 64.  The sampling level can be anywhere
between 0 and BASE_SAMPLING_LEVEL.  When a summary moves from sampling level 128 to level
127, it will drop one summary entry with an index between \[0, 127\], one entry between \[128,
255\], and so on for the rest of the summary.  The index to drop is determined by {{Downsampling.getSamplingPattern()}}.
The list of integers returned from {{Downsampling.getSamplingPattern(BASE_SAMPLING_LEVEL)}}
gives, in order, the indexes that we'll drop in each round of downsampling.
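
As a rough illustration of the arithmetic (a Python sketch, not Cassandra code): each round drops one entry per block of BASE_SAMPLING_LEVEL entries, so at sampling level L every full block keeps L of its entries.  The function names are made up for this example, and it assumes the summary size is a whole number of blocks.

```python
BASE_SAMPLING_LEVEL = 128  # value used in this discussion


def entries_remaining(full_size: int, sampling_level: int,
                      base_level: int = BASE_SAMPLING_LEVEL) -> int:
    """Entries left when each round drops one entry per block of base_level.

    Assumption for this sketch: full_size is a multiple of base_level,
    so every block is complete.
    """
    if not 0 <= sampling_level <= base_level:
        raise ValueError("sampling_level must be between 0 and base_level")
    if full_size % base_level != 0:
        raise ValueError("this sketch only handles whole blocks")
    blocks = full_size // base_level
    return blocks * sampling_level


def average_spacing_factor(sampling_level: int,
                           base_level: int = BASE_SAMPLING_LEVEL) -> float:
    """How much wider the average gap between entries is than at full sampling."""
    if not 0 < sampling_level <= base_level:
        raise ValueError("sampling_level must be in (0, base_level]")
    return base_level / sampling_level


# A 1024-entry summary has 8 blocks of 128, so one round drops 8 entries.
print(entries_remaining(1024, 127))  # 1016
```

Note that the average hides the detail that matters for this ticket: after a drop, the gap around the dropped entry is wider than the gaps elsewhere, so the interval after a given index depends on which entries are gone.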

As an example, suppose BASE_SAMPLING_LEVEL is 16 instead of 128.  {{Downsampling.getSamplingPattern(16)}}
returns the following pattern:

{noformat}
15, 7, 11, 3, 13, 5, 9, 1, 14, 6, 10, 2, 12, 4, 8, 0
{noformat}

So, when we move from sampling level 16 to 15, we'll drop the entry at index 15 (and repeat
that for indexes 15 + (16 * 1), 15 + (16 * 2), 15 + (16 * 3), etc).  When we move from sampling
level 15 to 14, we'll drop the entry at index 7 (and repeat as before, but take into account
the fact that we've already dropped the entry at index 15).  This pattern of dropping minimizes
the maximum distance between remaining summary entries.
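
The pattern and the drops described above can be reproduced with a short Python sketch.  The odd/even recursion below is an assumption: it yields the same 16-element pattern as listed, but it is not copied from {{Downsampling.getSamplingPattern()}}, and it only handles power-of-two levels.  All helper names are hypothetical.

```python
from typing import List


def sampling_pattern(level: int) -> List[int]:
    """Order in which offsets within a block are dropped (assumed recursion).

    Only meaningful for power-of-two levels in this sketch.
    """
    if level <= 1:
        return [0]
    odds = list(range(1, level, 2))
    evens = list(range(0, level, 2))
    # Reuse the half-size ordering so drops stay spread out within odds/evens.
    ordering = sampling_pattern(level // 2)
    return [odds[i] for i in ordering] + [evens[i] for i in ordering]


def remaining_indexes(base_level: int, sampling_level: int,
                      summary_size: int) -> List[int]:
    """Original indexes still present after downsampling to sampling_level."""
    rounds = base_level - sampling_level
    dropped_offsets = set(sampling_pattern(base_level)[:rounds])
    # Each dropped offset repeats in every block: offset + base_level * k.
    return [i for i in range(summary_size)
            if i % base_level not in dropped_offsets]


def max_gap(indexes: List[int]) -> int:
    """Largest distance between neighbouring remaining entries."""
    return max(b - a for a, b in zip(indexes, indexes[1:]))


print(sampling_pattern(16))
# [15, 7, 11, 3, 13, 5, 9, 1, 14, 6, 10, 2, 12, 4, 8, 0]

for level in (15, 14, 8):
    kept = remaining_indexes(16, level, 32)
    print(level, len(kept), max_gap(kept))
```

For a 32-entry summary, level 15 removes indexes 15 and 31, and level 14 additionally removes 7 and 23.  In this sketch the maximum gap stays at 2 all the way down to level 8, because every odd offset is dropped before any even one.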

Now, in practice, we will never move from sampling level 128 directly to level 127 because
of IndexSummaryManager's {{DOWNSAMPLE_THRESHOLD}}.  However, an index summary could go through
multiple rounds of downsampling and upsampling and arrive at level 127, so we need to be able
to handle that.

bq. Further confusion to understanding Downsampling as a whole stems from the permission of
a -1 index into getEffectiveIndexIntervalAfterIndex without explanation

Hmm, yeah, looking at the code, I don't think we actually need to handle that.  I believe
it is leftover logic from earlier in the development of the code when downsampling would remove
the 0th index in an earlier round.  With the current code, the 0th index entry should always
be present.  I'll make some changes to remove that.

bq. and the fact that every effective interval is the same despite there being multiple avenues
for calculating it

I'm not sure what you mean here. 

> EffectiveIndexInterval calculation is incorrect
> -----------------------------------------------
>
>                 Key: CASSANDRA-8993
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8993
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Blocker
>             Fix For: 2.1.4
>
>         Attachments: 8993-2.1-v2.txt, 8993-2.1.txt, 8993.txt
>
>
> I'm not familiar enough with the calculation itself to understand why this is happening,
> but see discussion on CASSANDRA-8851 for the background. I've introduced a test case to look
> for this during downsampling, but it seems to pass just fine, so it may be an artefact of
> upgrading.
> The problem was, unfortunately, not manifesting directly because it would simply result
> in a failed lookup. This was only exposed when early opening used firstKeyBeyond, which does
> not use the effective interval, and provided the result to getPosition().
> I propose a simple fix that ensures a bug here cannot break correctness. Perhaps [~thobbs]
> can follow up with an investigation as to how it actually went wrong?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
