Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Reply-To: dev@cassandra.apache.org
Date: Thu, 26 Mar 2015 17:44:53 +0000 (UTC)
From: "Tyler Hobbs (JIRA)"
To: commits@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-8993) EffectiveIndexInterval calculation is incorrect
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8


    [ https://issues.apache.org/jira/browse/CASSANDRA-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382302#comment-14382302 ]

Tyler Hobbs commented on CASSANDRA-8993:
----------------------------------------

I'll try to explain a bit about how downsampling works overall so that more people besides myself understand how it works :)  I can put whatever info is useful into comments for posterity.

bq. If I print out the "original indices" and "effective intervals", it seems that at the first downsampling level (64)

The sampling level after minimal downsampling is 127, not 64.
The sampling level can be anywhere between 0 and BASE_SAMPLING_LEVEL.  When a summary moves from sampling level 128 to level 127, it will drop one summary entry with an index between \[0, 127\], one entry between \[128, 255\], and so on for the rest of the summary.  The index to drop is determined by {{Downsampling.getSamplingPattern()}}.  The list of integers returned from {{Downsampling.getSamplingPattern(BASE_SAMPLING_LEVEL)}} gives, in order, the index that we'll drop in each round of downsampling.

As an example, suppose BASE_SAMPLING_LEVEL is 16 instead of 128.  {{Downsampling.getSamplingPattern(16)}} returns the following pattern:

{noformat}
15, 7, 11, 3, 13, 5, 9, 1, 14, 6, 10, 2, 12, 4, 8, 0
{noformat}

So, when we move from sampling level 16 to 15, we'll drop the entry at index 15 (and repeat that for indexes 15 + (16 * 1), 15 + (16 * 2), 15 + (16 * 3), etc).  When we move from sampling level 15 to 14, we'll drop the entry at index 7 (and repeat as before, but take into account the fact that we've already dropped the entry at index 15).  This pattern of dropping minimizes the maximum distance between remaining summary entries.

Now, in practice, we will never move from sampling level 128 directly to level 127 because of IndexSummaryManager's {{DOWNSAMPLE_THRESHOLD}}.  However, an index summary could go through multiple rounds of downsampling and upsampling and arrive at level 127, so we need to be able to handle that.

bq. Further confusion to understanding Downsampling as a whole stems from the permission of a -1 index into getEffectiveIndexIntervalAfterIndex without explanation

Hmm, yeah, looking at the code, I don't think we actually need to handle that.  I believe it is leftover logic from earlier in the development of the code, when downsampling would remove the 0th index in an earlier round.  With the current code, the 0th index entry should always be present.  I'll make some changes to remove that.

bq.
and the fact that every effective interval is the same despite there being multiple avenues for calculating it

I'm not sure what you mean here.

> EffectiveIndexInterval calculation is incorrect
> -----------------------------------------------
>
>                 Key: CASSANDRA-8993
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8993
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Benedict
>            Priority: Blocker
>             Fix For: 2.1.4
>
>         Attachments: 8993-2.1-v2.txt, 8993-2.1.txt, 8993.txt
>
>
> I'm not familiar enough with the calculation itself to understand why this is happening, but see discussion on CASSANDRA-8851 for the background. I've introduced a test case to look for this during downsampling, but it seems to pass just fine, so it may be an artefact of upgrading.
> The problem was, unfortunately, not manifesting directly because it would simply result in a failed lookup. This was only exposed when early opening used firstKeyBeyond, which does not use the effective interval, and provided the result to getPosition().
> I propose a simple fix that ensures a bug here cannot break correctness. Perhaps [~thobbs] can follow up with an investigation as to how it actually went wrong?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
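
For readers who want to experiment with the drop order described in the comment, here is a minimal Python sketch. It is not Cassandra's Java code: the function names (sampling_pattern, remaining_indexes, max_gap) are hypothetical, and the recursive "odds first, then evens" construction is an assumption chosen because it reproduces the 16-entry pattern quoted in the comment. It only handles power-of-two base levels.

```python
def sampling_pattern(level):
    """Within-block indexes in the order they are dropped, one per round.

    Assumption: `level` is a power of two (16 in the example, 128 for the
    real BASE_SAMPLING_LEVEL mentioned in the comment).
    """
    if level <= 1:
        return [0]
    odds = list(range(1, level, 2))
    evens = list(range(0, level, 2))
    # Reuse the half-sized pattern to spread the drops out: all odd
    # indexes go first, then the even ones, each in the recursive order.
    ordering = sampling_pattern(level // 2)
    return [odds[i] for i in ordering] + [evens[i] for i in ordering]


def remaining_indexes(base_level, sampling_level, summary_size):
    """Original summary indexes still present at `sampling_level`."""
    rounds = base_level - sampling_level
    dropped = set(sampling_pattern(base_level)[:rounds])
    # The same within-block index is dropped in every block of base_level
    # entries (index, index + base_level, index + 2 * base_level, ...).
    return [i for i in range(summary_size) if i % base_level not in dropped]


def max_gap(indexes):
    """Largest distance between neighbouring remaining entries."""
    return max(b - a for a, b in zip(indexes, indexes[1:]))


if __name__ == "__main__":
    print(sampling_pattern(16))
    for level in (16, 15, 14, 8, 1):
        kept = remaining_indexes(16, level, 64)
        print(level, len(kept), max_gap(kept))
```

In this sketch, index 0 is the last element of the pattern, so the 0th entry of each block survives at every sampling level above 0, which is consistent with the remark that the 0th index entry should always be present.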