Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 7CCDC17791 for ; Thu, 4 Jun 2015 13:59:43 +0000 (UTC)
Received: (qmail 63883 invoked by uid 500); 4 Jun 2015 13:59:38 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 63845 invoked by uid 500); 4 Jun 2015 13:59:38 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 63828 invoked by uid 99); 4 Jun 2015 13:59:38 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 04 Jun 2015 13:59:38 +0000
Date: Thu, 4 Jun 2015 13:59:38 +0000 (UTC)
From: "Philip Thompson (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Updated] (CASSANDRA-9540) Cql query doesn't return right information when using IN on columns for some keys
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/CASSANDRA-9540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philip Thompson updated CASSANDRA-9540:
---------------------------------------
    Assignee: Carl Yeksigian

> Cql query doesn't return right information when using IN on columns for some keys
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-9540
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9540
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>         Environment: Cassandra 2.1.5
>            Reporter: Mathijs Vogelzang
>            Assignee: Carl Yeksigian
>             Fix For: 2.1.x
>
>
> We are investigating a weird issue where one of our clients doesn't get data on his dashboard. It seems Cassandra is not returning data for a particular key ("brokenkey" from now on).
> Some background:
> We have a row where we store a "metadata" column and data in columns "bucket/0", "bucket/1", "bucket/2", etc. Depending on the date selection of the UI, we know that we only need to retrieve bucket/0, bucket/0 and bucket/1 etc. (we always need to retrieve "metadata").
> A typical query may look like this (using SELECT column1 to just show what is returned, normally we would of course do SELECT value):
> {noformat}
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey');
>  blobAsText(column1)
> ---------------------
>  bucket/0
>  metadata
> (2 rows)
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey');
>  blobAsText(column1)
> ---------------------
>  bucket/0
>  metadata
> (2 rows)
> {noformat}
> These two queries work as expected, and return the information that we actually stored.
> However, when we "filter" for certain columns, the brokenkey starts behaving very weird:
> {noformat}
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));
>  blobAsText(column1)
> ---------------------
>  bucket/0
>  metadata
> (2 rows)
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/workingkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));
>  blobAsText(column1)
> ---------------------
>  bucket/0
>  metadata
> (2 rows)
> *** As expected, querying for more information doesn't really matter for the working key ***
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'));
>  blobAsText(column1)
> ---------------------
>  bucket/0
> (1 rows)
> *** Cassandra stops giving us the metadata column when asking for a few more columns! ***
> cqlsh:AppBrain> select blobAsText(column1) from "GroupedSeries" where key=textAsBlob('install/brokenkey') and column1 IN (textAsBlob('metadata'),textAsBlob('bucket/0'),textAsBlob('bucket/1'),textAsBlob('bucket/2'),textAsBlob('asdfasdfasdf'));
>  key | column1 | value
> -----+---------+-------
> (0 rows)
> *** Adding the bogus column name even makes it return nothing from this row anymore! ***
> {noformat}
> There are at least two rows that malfunction like this in our table (which is quite old already and has gone through a bunch of Cassandra upgrades). I've upgraded our whole cluster to 2.1.5 (we were on 2.1.2 when I discovered this problem) and compacted, repaired and scrubbed this column family, which hasn't helped.
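
The filtered reads quoted above follow a fixed pattern: always ask for "metadata", then for bucket/0 up to the last bucket the date selection needs. As a rough sketch only (not code from the reporter's application), the query text could be generated like this. The function names `column_names` and `build_filtered_query` are hypothetical, and plain string formatting is used purely to reproduce the cqlsh statements for testing; a real client would bind the values instead.

```python
def column_names(num_buckets):
    """Columns to request: "metadata" plus bucket/0 .. bucket/(num_buckets - 1)."""
    return ["metadata"] + [f"bucket/{i}" for i in range(num_buckets)]


def build_filtered_query(row_key, num_buckets):
    """Build the cqlsh statement for a filtered read of one "GroupedSeries" row.

    Hypothetical helper for reproducing the report; the inputs are trusted
    literals, so no escaping is attempted here.
    """
    in_list = ",".join(f"textAsBlob('{name}')" for name in column_names(num_buckets))
    return (
        'select blobAsText(column1) from "GroupedSeries" '
        f"where key=textAsBlob('{row_key}') and column1 IN ({in_list});"
    )


if __name__ == "__main__":
    # Regenerates the third query from the report (brokenkey, three buckets).
    print(build_filtered_query("install/brokenkey", 3))
```

Running the same helper for a working key and a broken key, with increasing bucket counts, gives a quick way to see at which IN-list size the returned columns start to go missing.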
> Our table structure is:
> {noformat}
> cqlsh:AppBrain> describe table "GroupedSeries";
> CREATE TABLE "AppBrain"."GroupedSeries" (
>     key blob,
>     column1 blob,
>     value blob,
>     PRIMARY KEY (key, column1)
> ) WITH COMPACT STORAGE
>     AND CLUSTERING ORDER BY (column1 ASC)
>     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>     AND comment = ''
>     AND compaction = {'min_threshold': '4', 'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32'}
>     AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 1.0
>     AND speculative_retry = 'NONE';
> {noformat}
> Let me know if I can give more information that may be helpful.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)