Date: Wed, 11 Nov 2015 17:47:11 +0000 (UTC)
From: "Sam Tunnicliffe (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-8505) Invalid results are returned while secondary index are being build

    [ https://issues.apache.org/jira/browse/CASSANDRA-8505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000767#comment-15000767 ]

Sam Tunnicliffe commented on CASSANDRA-8505:
--------------------------------------------

I think the various index states can be reduced to a simple ready/not ready check.
What's more, unless we intend to change the established behaviour fairly significantly, once an index moves to a ready state it never moves back to being not ready. The only times when we modify the status in the system table are when the index is removed (in which case we have no problem with being able to query using it) or during a rebuild. In the latter case though, we probably shouldn't reject queries (and we don't currently), as an index rebuild is incremental. That is, we don't scrap the existing index tables and rebuild everything from scratch, just write new index SSTables to supersede the old ones. So although it's certainly possible to get incorrect results during a rebuild (because of missing/stale entries), the results only get more correct as the rebuild progresses. Changing this so that all queries against that index return errors until all rebuilds complete seems like a step backwards. It seems more reasonable to reject queries until the initial build has been performed, as per the example in the description, but this only requires a simple boolean to track state between instantiating/registering the index and its initial build task completing (if one is required).

It would be good to have some test coverage of this, although the best I could come up with is a dtest which inserts many rows, then adds the index and queries immediately expecting ReadFailureException, which is fairly lame and fragile.

A couple of points specific to the 3.0 patch:
* The fix for CASSANDRA-10595 has been lost. If an index doesn't register itself in {{createIndex}}, don't ask it for an initialization task, just set {{initialBuildTask == null}}.
* {{SIM::reloadIndex}} has changed since the patch was created (due to CASSANDRA-10604) - I think that no changes to this method are now required.
I did notice though that the current implementation actually makes a redundant call to {{getMetadataReloadTask}}, so if you could fix that while you're here, that'd be great.

bq. Secondary index and their build/not build status are node-local. By consequence it is not possible to know on a coordinator node if the index is fully build. It can be built on the coordinator but still building on other nodes

For future reference on this point, we also have CASSANDRA-9967, which has a very similar intent.

> Invalid results are returned while secondary index are being build
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-8505
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8505
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>             Fix For: 2.2.x, 3.0.x
>
>
> If you request an index creation and then execute a query that uses the index, the results returned might be invalid until the index is fully built. This is caused by the fact that the table column will be marked as indexed before the index is ready.
> The following unit tests can be used to reproduce the problem:
> {code}
>     @Test
>     public void testIndexCreatedAfterInsert() throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>
>         createIndex("CREATE INDEX ON %s(b)");
>
>         assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                    row(0, 1, 1),
>                    row(1, 1, 4));
>     }
>
>     @Test
>     public void testIndexCreatedBeforeInsert() throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, c int, primary key((a, b)))");
>         createIndex("CREATE INDEX ON %s(b)");
>
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 0, 0);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 1, 1);");
>         execute("INSERT INTO %s (a, b, c) VALUES (0, 2, 2);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 0, 3);");
>         execute("INSERT INTO %s (a, b, c) VALUES (1, 1, 4);");
>         assertRows(execute("SELECT * FROM %s WHERE b = ?;", 1),
>                    row(0, 1, 1),
>                    row(1, 1, 4));
>     }
> {code}
> The first test will fail while the second will work.
> In my opinion the first test should reject the request as invalid (as if the index did not exist) until the index is fully built.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
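
For illustration only, the "simple boolean" proposed in the comment could look roughly like the sketch below. It is not code from the patch or from Cassandra: the class IndexReadinessSketch and its methods register, rebuild, checkQueryable and isReady are hypothetical names, and a CompletableFuture stands in for the initial build task. The flag starts as not ready, flips once when the initial build completes (or immediately when no build is required), and is never reset by a rebuild.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch of a ready/not-ready flag; not Cassandra's actual index manager code. */
final class IndexReadinessSketch
{
    // False from registration until the initial build completes; never set back to false.
    private final AtomicBoolean ready = new AtomicBoolean(false);

    /**
     * Registers the index. A null task is taken to mean that no initial
     * build is required, so the index is queryable straight away.
     */
    void register(CompletableFuture<?> initialBuildTask)
    {
        if (initialBuildTask == null)
        {
            ready.set(true);
            return;
        }
        // thenRun fires only on normal completion, so a failed build leaves the index not ready.
        initialBuildTask.thenRun(() -> ready.set(true));
    }

    /** Rebuilds are incremental, so this deliberately leaves the flag untouched. */
    void rebuild()
    {
    }

    /** Rejects a query while the initial build is still pending. */
    void checkQueryable()
    {
        if (!ready.get())
            throw new IllegalStateException("index is not ready: initial build has not completed");
    }

    boolean isReady()
    {
        return ready.get();
    }
}
```

A caller would invoke checkQueryable() before serving a query from the index. Because the flag is node-local, this says nothing about whether the index is built on other replicas.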