Return-Path: X-Original-To: apmail-cassandra-commits-archive@www.apache.org Delivered-To: apmail-cassandra-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id BFB1710FC0 for ; Mon, 7 Oct 2013 20:42:53 +0000 (UTC) Received: (qmail 80025 invoked by uid 500); 7 Oct 2013 20:42:49 -0000 Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org Received: (qmail 80002 invoked by uid 500); 7 Oct 2013 20:42:47 -0000 Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@cassandra.apache.org Delivered-To: mailing list commits@cassandra.apache.org Received: (qmail 79861 invoked by uid 99); 7 Oct 2013 20:42:45 -0000 Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 07 Oct 2013 20:42:45 +0000 Date: Mon, 7 Oct 2013 20:42:45 +0000 (UTC) From: "Constance Eustace (JIRA)" To: commits@cassandra.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (CASSANDRA-6137) CQL3 SELECT IN CLAUSE inconsistent MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/CASSANDRA-6137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Constance Eustace updated CASSANDRA-6137: ----------------------------------------- Description: We are encountering inconsistent results from CQL3 queries with column keys using IN clause in WHERE. This has been reproduced in cqlsh and the jdbc driver. Rowkey is e_entid Column key is p_prop This returns roughly 21 rows for 21 column keys that match p_prop. cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'; These three queries each return one row for the requested single column key in the IN clause: SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count'); SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count'); SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:fail:count'); This query returns ONLY ONE ROW (one column key), not three as I would expect from the three-column-key IN clause: cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count'); This query does return two rows however for the requested two column keys: cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ( 'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count'); cqlsh> describe table internal_submission.entity_job; CREATE TABLE entity_job ( e_entid text, p_prop text, describes text, dndcondition text, e_entlinks text, e_entname text, e_enttype text, ingeststatus text, ingeststatusdetail text, p_flags text, p_propid text, p_proplinks text, p_storage text, p_subents text, p_val text, p_vallang text, p_vallinks text, p_valtype text, p_valunit text, p_vars text, partnerid text, referenceid text, size int, sourceip text, submitdate bigint, submitevent text, userid text, version text, PRIMARY KEY (e_entid, p_prop) ) WITH bloom_filter_fp_chance=0.010000 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.000000 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.100000 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='NONE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes); CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job (dndcondition); CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job (ingeststatus); CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job (ingeststatusdetail); CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job (referenceid); CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid); CREATE INDEX internal_submission__JobVersionIDX ON entity_job (version); ------------------------------- My suspicion is that the three-column-key IN Clause is translated (improperly or not) to a two-column key range with the assumption that the third column key is present in that range, but it isn't... was: We are encountering inconsistent results from CQL3 queries with column keys using IN clause in WHERE. This has been reproduced in cqlsh. Rowkey is e_entid Column key is p_prop This returns roughly 21 rows for 21 column keys that match p_prop. cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'; These three queries each return one row for the requested single column key in the IN clause: SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count'); SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count'); SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:fail:count'); This query returns ONLY ONE ROW (one column key), not three as I would expect from the three-column-key IN clause: cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count'); This query does return two rows however for the requested two column keys: cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ( 'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count'); cqlsh> describe table internal_submission.entity_job; CREATE TABLE entity_job ( e_entid text, p_prop text, describes text, dndcondition text, e_entlinks text, e_entname text, e_enttype text, ingeststatus text, ingeststatusdetail text, p_flags text, p_propid text, p_proplinks text, p_storage text, p_subents text, p_val text, p_vallang text, p_vallinks text, p_valtype text, p_valunit text, p_vars text, partnerid text, referenceid text, size int, sourceip text, submitdate bigint, submitevent text, userid text, version text, PRIMARY KEY (e_entid, p_prop) ) WITH bloom_filter_fp_chance=0.010000 AND caching='KEYS_ONLY' AND comment='' AND dclocal_read_repair_chance=0.000000 AND gc_grace_seconds=864000 AND index_interval=128 AND read_repair_chance=0.100000 AND replicate_on_write='true' AND populate_io_cache_on_flush='false' AND default_time_to_live=0 AND speculative_retry='NONE' AND memtable_flush_period_in_ms=0 AND compaction={'class': 'SizeTieredCompactionStrategy'} AND compression={'sstable_compression': 'LZ4Compressor'}; CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes); CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job (dndcondition); CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job (ingeststatus); CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job (ingeststatusdetail); CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job (referenceid); CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid); CREATE INDEX internal_submission__JobVersionIDX ON entity_job (version); ------------------------------- My suspicion is that the three-column-key IN Clause is translated (improperly or not) to a two-column key range with the assumption that the third column key is present in that range, but it isn't... > CQL3 SELECT IN CLAUSE inconsistent > ---------------------------------- > > Key: CASSANDRA-6137 > URL: https://issues.apache.org/jira/browse/CASSANDRA-6137 > Project: Cassandra > Issue Type: Bug > Components: Core > Environment: Ubuntu AWS Cassandra 2.0.1 SINGLE NODE > Reporter: Constance Eustace > Fix For: 2.0.1 > > > We are encountering inconsistent results from CQL3 queries with column keys using IN clause in WHERE. This has been reproduced in cqlsh and the jdbc driver. > Rowkey is e_entid > Column key is p_prop > This returns roughly 21 rows for 21 column keys that match p_prop. > cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB'; > These three queries each return one row for the requested single column key in the IN clause: > SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count'); > SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:all:count'); > SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:fail:count'); > This query returns ONLY ONE ROW (one column key), not three as I would expect from the three-column-key IN clause: > cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ('urn:bby:pcm:job:ingest:content:complete:count','urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count'); > This query does return two rows however for the requested two column keys: > cqlsh> SELECT e_entid,e_entname,e_enttype,p_prop,p_flags,p_propid,e_entlinks,p_proplinks,p_subents,p_val,p_vallinks,p_vars FROM internal_submission.Entity_Job WHERE e_entid = '845b38f1-2b91-11e3-854d-126aad0075d4-CJOB' AND p_prop in ( 'urn:bby:pcm:job:ingest:content:all:count','urn:bby:pcm:job:ingest:content:fail:count'); > cqlsh> describe table internal_submission.entity_job; > CREATE TABLE entity_job ( > e_entid text, > p_prop text, > describes text, > dndcondition text, > e_entlinks text, > e_entname text, > e_enttype text, > ingeststatus text, > ingeststatusdetail text, > p_flags text, > p_propid text, > p_proplinks text, > p_storage text, > p_subents text, > p_val text, > p_vallang text, > p_vallinks text, > p_valtype text, > p_valunit text, > p_vars text, > partnerid text, > referenceid text, > size int, > sourceip text, > submitdate bigint, > submitevent text, > userid text, > version text, > PRIMARY KEY (e_entid, p_prop) > ) WITH > bloom_filter_fp_chance=0.010000 AND > caching='KEYS_ONLY' AND > comment='' AND > dclocal_read_repair_chance=0.000000 AND > gc_grace_seconds=864000 AND > index_interval=128 AND > read_repair_chance=0.100000 AND > replicate_on_write='true' AND > populate_io_cache_on_flush='false' AND > default_time_to_live=0 AND > speculative_retry='NONE' AND > memtable_flush_period_in_ms=0 AND > compaction={'class': 'SizeTieredCompactionStrategy'} AND > compression={'sstable_compression': 'LZ4Compressor'}; > CREATE INDEX internal_submission__JobDescribesIDX ON entity_job (describes); > CREATE INDEX internal_submission__JobDNDConditionIDX ON entity_job (dndcondition); > CREATE INDEX internal_submission__JobIngestStatusIDX ON entity_job (ingeststatus); > CREATE INDEX internal_submission__JobIngestStatusDetailIDX ON entity_job (ingeststatusdetail); > CREATE INDEX internal_submission__JobReferenceIDIDX ON entity_job (referenceid); > CREATE INDEX internal_submission__JobUserIDX ON entity_job (userid); > CREATE INDEX internal_submission__JobVersionIDX ON entity_job (version); > ------------------------------- > My suspicion is that the three-column-key IN Clause is translated (improperly or not) to a two-column key range with the assumption that the third column key is present in that range, but it isn't... -- This message was sent by Atlassian JIRA (v6.1#6144)