Date: Tue, 5 Feb 2013 14:45:40 +0100
Subject: Re: neither 'nodetool repair' nor 'hinted handoff/read repair' work for secondary indexes
From: Alexei Bakanov
To: user@cassandra.apache.org

Made a d-test for easier reproduction and created
https://issues.apache.org/jira/browse/CASSANDRA-5223

On 1 February 2013 15:14, Alexei Bakanov wrote:
> Hi again,
>
> Once you start playing with CCM it's hard to stop, such a great tool.
> My issue with secondary indexes is the following: neither explicit
> 'nodetool repair' nor implicit 'hinted handoffs/read repairs' resolve
> inconsistencies in the data I get from secondary indexes.
> I observe this for both one- and two-datacenter deployments, independent
> of caching settings. Rebuilding the index, dropping and recreating it, or
> restarting nodes doesn't help.
>
> In the following scenario I start up 2 nodes and insert some rows with
> CL.ONE. During this process I deliberately stop and start the nodes in
> order to trigger inconsistencies.
> I then query all data by its index with read CL.ONE and stop if I see
> that data is missing. I see that none of the C* repair mechanisms work
> for secondary indexes.
>
> $ ccm create --cassandra-version 1.2.1 --nodes 2 -p RandomPartitioner test2ndIndexRepair
> $ ccm start
> $ ccm node1 cli
> -> create keyspace and column family (please find schemas attached)
> $ python populate_repair.py (in first terminal)
> $ ccm node1 stop; sleep 10; ccm node1 start (in second terminal,
> while populate_repair.py runs)
> $ ccm node2 stop; sleep 10; ccm node2 start (in second terminal,
> while populate_repair.py runs. Hinted handoffs do the work but
> unfortunately not on secondary indexes)
>
> $ python fetcher_repair.py
> ....
> 254
> 255
> 256
> Traceback (most recent call last):
>   File "fetcher_repair.py", line 19, in <module>
>     raise Exception('missing rows for userId %s, data length is %d'%(userId, len(data)))
> Exception: missing rows for userId 256, data length is 0
>
> $ ccm cli
> [default@unknown] use testks;
> Authenticated to keyspace: testks
> [default@testks] get cf1 where 'indexedColumn'='userId_256';
>
> 0 Row Returned.
> Elapsed time: 47 msec(s).
>
> $ python fetcher_repair.py (running one more time in the hope that 'read
> repair' kicked in after the last query, but unfortunately no)
> ....
> 254
> 255
> 256
> Traceback (most recent call last):
>   File "fetcher_repair.py", line 19, in <module>
>     raise Exception('missing rows for userId %s, data length is %d'%(userId, len(data)))
> Exception: missing rows for userId 256, data length is 0
>
> $ ccm node1 repair
> $ ccm node2 repair
> $ ccm cli
>
> [default@unknown] use testks;
> Authenticated to keyspace: testks
> [default@testks] get cf1 where 'indexedColumn'='userId_256';
>
> 0 Row Returned.
>
>
> Both Cassandra instances run with -Xms1927M -Xmx1927M -Xmn400M
>
> Thanks for your help.
>
> Best regards,
> Alexei
>
> ------START cassandra-cli schemas ------------
> create keyspace testks
>   with placement_strategy = 'NetworkTopologyStrategy'
>   and strategy_options = {datacenter1 : 2}
>   and durable_writes = true;
>
> use testks;
>
> create column family cf1
>   with column_type = 'Standard'
>   and comparator = 'AsciiType'
>   and default_validation_class = 'UTF8Type'
>   and key_validation_class = 'UTF8Type'
>   and read_repair_chance = 1.0
>   and dclocal_read_repair_chance = 1.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and column_metadata = [
>     {column_name : 'indexedColumn',
>     validation_class : UTF8Type,
>     index_name : 'INDEX1',
>     index_type : 0}]
>   and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
> ------FINISH cassandra-cli schemas ------------
>
> ------START populate_repair.py ----------
> import datetime
> from pycassa.batch import Mutator
>
> import pycassa
>
> pool = pycassa.ConnectionPool('testks', timeout=5,
>     server_list=['127.0.0.1:9160', '127.0.0.2:9160'])
> cf = pycassa.ColumnFamily(pool, 'cf1')
>
> for userId in xrange(0, 2000):
>     print userId
>     b = Mutator(pool, queue_size=200)
>     # 20 rows per user, all sharing the same indexed value
>     for itemId in xrange(20):
>         rowKey = 'userId_%s:itemId_%s'%(userId, itemId)
>         for message_number in xrange(10):
>             b.insert(cf, rowKey, {'indexedColumn': 'userId_%s'%userId,
>                 str(message_number): str(message_number)})
>     b.send()
>
> pool.dispose()
> ------FINISH populate_repair.py ----------
>
> ------START fetcher_repair.py ----------
> import pycassa
> from pycassa.columnfamily import ColumnFamily
> from pycassa.pool import ConnectionPool
> from pycassa.index import *
>
> pool = pycassa.ConnectionPool('testks', server_list=['127.0.0.1:9160',
>     '127.0.0.2:9160'])
> cf = pycassa.ColumnFamily(pool, 'cf1')
>
> for userId in xrange(2000):
>     print userId
>     index_expr = create_index_expression('indexedColumn', 'userId_%s'%userId)
>     index_clause = create_index_clause([index_expr], count=10000000)
>     data = list(cf.get_indexed_slices(index_clause=index_clause))
>     # each user should have exactly 20 rows behind the index
>     if len(data) != 20:
>         raise Exception('missing rows for userId %s, data length is %d'%(userId, len(data)))
> pool.dispose()
>
> ------FINISH fetcher_repair.py ----------
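
fetcher_repair.py stops at the first userId with missing rows. When comparing results before and after a repair, it may help to see every affected userId at once. A minimal sketch of that variation, not part of the original post: `find_incomplete_users`, `fetch_rows` and `fake_fetch_rows` are hypothetical names. In the setup above, `fetch_rows` would presumably wrap the `cf.get_indexed_slices` call from fetcher_repair.py; the sketch itself does not talk to Cassandra.

```python
def find_incomplete_users(fetch_rows, user_ids, expected_rows=20):
    """Return {user_id: row_count} for every user whose index lookup
    does not yield exactly expected_rows rows."""
    incomplete = {}
    for user_id in user_ids:
        # fetch_rows is any callable returning an iterable of rows
        count = len(list(fetch_rows(user_id)))
        if count != expected_rows:
            incomplete[user_id] = count
    return incomplete


def fake_fetch_rows(user_id):
    """Stand-in for the index query (assumption, for illustration only):
    userId 256 has lost its index entries, every other user has 20 rows."""
    if user_id == 256:
        return []
    return [('userId_%s:itemId_%s' % (user_id, i), {}) for i in range(20)]


if __name__ == '__main__':
    print(find_incomplete_users(fake_fetch_rows, range(300)))
```

Running the same check before and after 'ccm node1 repair' and comparing the two dictionaries would show whether the repair changed anything for the index.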