Date: Tue, 21 Mar 2017 19:29:41 +0000 (UTC)
From: "Alex Petrov (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-12962) SASI: Index are rebuilt on restart

    [ https://issues.apache.org/jira/browse/CASSANDRA-12962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15935163#comment-15935163 ]

Alex Petrov commented on CASSANDRA-12962:
-----------------------------------------

[~jjirsa] thanks for the hint. I did indeed miss that little insert with two columns that do not include {{c}} (and ran tests that didn't account for this edge case). I think in this case we have to build a placeholder index or something similar that would make sure we don't rebuild it.

[~iksaif] should I take it, or did you plan to work on it?

> SASI: Index are rebuilt on restart
> ----------------------------------
>
>                 Key: CASSANDRA-12962
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12962
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: sasi
>            Reporter: Corentin Chary
>            Priority: Minor
>             Fix For: 3.11.x
>
>
> Apparently, when Cassandra restarts, any index that does not index a value in *every* live SSTable gets rebuilt. The offending code can be found in the constructor of SASIIndex.
> You can easily reproduce it:
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true;
> CREATE TABLE test.test (
>     a text PRIMARY KEY,
>     b text,
>     c text
> ) WITH bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
> CREATE CUSTOM INDEX test_b_idx ON test.test (b) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> CREATE CUSTOM INDEX test_c_idx ON test.test (c) USING 'org.apache.cassandra.index.sasi.SASIIndex';
> INSERT INTO test.test (a, b) VALUES ('a', 'b');
> {code}
> Log (I added additional traces):
> {code}
> INFO [main] 2016-11-28 15:32:21,191 ColumnFamilyStore.java:406 - Initializing test.test
> DEBUG [SSTableBatchOpen:1] 2016-11-28 15:32:21,192 SSTableReader.java:505 - Opening /mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big (0.034KiB)
> DEBUG [main] 2016-11-28 15:32:21,194 SASIIndex.java:118 - index: org.apache.cassandra.schema.IndexMetadata@2f661b1a[id=6b00489b-7010-396e-9348-9f32f5167f88,name=test_b_idx,kind=CUSTOM,options={class_name=org.a\
> pache.cassandra.index.sasi.SASIIndex, target=b}], base CFS(Keyspace='test', ColumnFamily='test'), tracker org.apache.cassandra.db.lifecycle.Tracker@15900b83
> INFO [main] 2016-11-28 15:32:21,194 DataTracker.java:152 - SSTableIndex.open(column: b, minTerm: value, maxTerm: value, minKey: key, maxKey: key, sstable: BigTableReader(path='/mnt/ssd/tmp/data/data/test/test\
> -229e6380b57711e68407158fde22e121/mc-1-big-Data.db'))
> DEBUG [main] 2016-11-28 15:32:21,195 SASIIndex.java:129 - Rebuilding SASI Indexes: {}
> DEBUG [main] 2016-11-28 15:32:21,195 ColumnFamilyStore.java:895 - Enqueuing flush of IndexInfo: 0.386KiB (0%) on-heap, 0.000KiB (0%) off-heap
> DEBUG [PerDiskMemtableFlushWriter_0:1] 2016-11-28 15:32:21,204 Memtable.java:465 - Writing Memtable-IndexInfo@748981977(0.054KiB serialized bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223\
> 372036854775808), max(9223372036854775807)]
> DEBUG [PerDiskMemtableFlushWriter_0:1] 2016-11-28 15:32:21,204 Memtable.java:494 - Completed flushing /mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4256-big-Data.db (0.035KiB) for\
> commitlog position CommitLogPosition(segmentId=1480343535479, position=15652)
> DEBUG [MemtableFlushWriter:1] 2016-11-28 15:32:21,224 ColumnFamilyStore.java:1200 - Flushed to [BigTableReader(path='/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4256-big-Data.db\
> ')] (1 sstables, 4.838KiB), biggest 4.838KiB, smallest 4.838KiB
> DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:118 - index: org.apache.cassandra.schema.IndexMetadata@12f3d291[id=45fcb286-b87a-3d18-a04b-b899a9880c91,name=test_c_idx,kind=CUSTOM,options={class_name=org.a\
> pache.cassandra.index.sasi.SASIIndex, target=c}], base CFS(Keyspace='test', ColumnFamily='test'), tracker org.apache.cassandra.db.lifecycle.Tracker@15900b83
> DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:121 - to rebuild: index: BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db'), sstable: org.apache.cassa\
> ndra.index.sasi.conf.ColumnIndex@6cbb6b0e
> DEBUG [main] 2016-11-28 15:32:21,224 SASIIndex.java:129 - Rebuilding SASI Indexes: {BigTableReader(path='/mnt/ssd/tmp/data/data/test/test-229e6380b57711e68407158fde22e121/mc-1-big-Data.db')={c=org.apache.cassa\
> ndra.index.sasi.conf.ColumnIndex@6cbb6b0e}}
> DEBUG [main] 2016-11-28 15:32:21,225 ColumnFamilyStore.java:895 - Enqueuing flush of IndexInfo: 0.386KiB (0%) on-heap, 0.000KiB (0%) off-heap
> DEBUG [PerDiskMemtableFlushWriter_0:2] 2016-11-28 15:32:21,235 Memtable.java:465 - Writing Memtable-IndexInfo@951411443(0.054KiB serialized bytes, 1 ops, 0%/0% of on/off-heap limit), flushed range = (min(-9223\
> 372036854775808), max(9223372036854775807)]
> DEBUG [PerDiskMemtableFlushWriter_0:2] 2016-11-28 15:32:21,235 Memtable.java:494 - Completed flushing /mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4257-big-Data.db (0.035KiB) for\
> commitlog position CommitLogPosition(segmentId=1480343535479, position=15720)
> DEBUG [MemtableFlushWriter:2] 2016-11-28 15:32:21,254 ColumnFamilyStore.java:1200 - Flushed to [BigTableReader(path='/mnt/ssd/tmp/data/data/system/IndexInfo-9f5c6374d48532299a0a5094af9ad1e3/mc-4257-big-Data.db\
> ')] (1 sstables, 4.836KiB), biggest 4.836KiB, smallest 4.836KiB
> {code}
> I think a better behavior would be to ask users to explicitly rebuild indexes if they remove the files; that's fine as long as we correctly handle the case of new indexes.
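
The behaviour in the report and the placeholder idea from the comment can be modelled with a small toy in Python. This is only a sketch under stated assumptions: `SSTable`, `flush_index` and `sstables_to_rebuild` are hypothetical names invented for illustration and are not Cassandra's API, and the real check lives in Java in the SASIIndex constructor. The model assumes that an SSTable with no index file for a column is treated as "needs rebuild" at startup, which is what the log above suggests for `test_c_idx`.

```python
from dataclasses import dataclass, field


@dataclass
class SSTable:
    """Hypothetical stand-in for an SSTable and the per-column index files next to it."""
    name: str
    index_files: set = field(default_factory=set)  # columns with an index file on disk


def flush_index(sstable, column, values, write_placeholder):
    """Write an index file for `column` when the SSTable is flushed.

    Assumption: without placeholders, no file is written when there is
    nothing to index (the row inserted in the report has no value for c).
    """
    if values or write_placeholder:
        sstable.index_files.add(column)


def sstables_to_rebuild(sstables, column):
    """Startup check in this toy model: a missing index file means 'rebuild'."""
    return [s.name for s in sstables if column not in s.index_files]


# Reproduce the report: INSERT INTO test.test (a, b) VALUES ('a', 'b')
current = SSTable("mc-1-big")
flush_index(current, "b", ["b"], write_placeholder=False)
flush_index(current, "c", [], write_placeholder=False)

# Same flush, but an empty placeholder is written for the column without values.
patched = SSTable("mc-1-big")
flush_index(patched, "b", ["b"], write_placeholder=True)
flush_index(patched, "c", [], write_placeholder=True)
```

In this model `sstables_to_rebuild([current], "c")` keeps returning the SSTable on every restart, while `sstables_to_rebuild([patched], "c")` returns nothing, because the placeholder records that the column was already processed for that SSTable.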