Date: Thu, 14 Apr 2011 23:43:00 -0700 (PDT)
From: sam_
To: cassandra-user@incubator.apache.org
Subject: Duplicate result of get_indexed_slices, depending on indexClause.count

Hi
All,

I have been using Cassandra 0.7.2 and 0.7.4 with the Thrift API (from Java). I noticed that when I query a Column Family with indexed columns, I sometimes get a duplicate result from get_indexed_slices. Whether it happens depends on the number of rows in the CF, the order of the rows in the CF, and the value I set in IndexClause.count.

For example, consider the following CF, which I call Attributes:

    create column family Attributes
        with comparator=UTF8Type
        and column_metadata=[
            {column_name: range_id, validation_class: LongType, index_type: KEYS},
            {column_name: attr_key, validation_class: UTF8Type, index_type: KEYS},
            {column_name: attr_val, validation_class: BytesType, index_type: KEYS}
        ];

And suppose I have the following rows in the CF:

    key        range_id  attr_key  attr_val
    "1/@1/0"   1         "A"       "1"
    "1/5/0"    1         "B"       "1000"
    "3/@1/0"   2         "A"       "1"
    "3/5/0"    2         "B"       "1001"
    "5/@1/0"   3         "A"       "2"
    "5/5/0"    3         "B"       "1002"
    "7/@1/0"   4         "A"       "2"
    "7/5/0"    4         "B"       "1003"

Now if I run a query with an IndexClause like this (in pseudocode):

    attr_key == "A" AND attr_val == "1"
    with indexClause.count = 4;

then get_indexed_slices returns the rows with the following keys:

    "1/@1/0", "3/@1/0", "3/@1/0"

Only two rows match the query, so the last key is a duplicate!

This is very sensitive to the order of the rows in the CF, the number of rows, and the value set in indexClause.count. I noticed that when the number of rows in the CF is twice indexClause.count, the issue may occur, depending on the order of the rows in the CF.

This seems to be a bug, and it occurs in both 0.7.2 and 0.7.4. Is there a solution to this problem?

Many Thanks,
Sam
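Until the cause is identified, one possible client-side stopgap is to drop repeated row keys from the returned result before using it. The sketch below is only an illustration: IndexedSliceDedup and dedupeKeys are hypothetical names, not part of the Cassandra or Thrift API, and it assumes the row keys have already been decoded to strings. It hides the duplicate but does not tell you whether the server also skipped a matching row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper for illustration; not part of the Cassandra or Thrift API.
public class IndexedSliceDedup {

    // Returns the row keys in first-seen order with repeated keys removed.
    // Assumes the keys were already decoded to strings by the caller.
    static List<String> dedupeKeys(List<String> rowKeys) {
        return new ArrayList<String>(new LinkedHashSet<String>(rowKeys));
    }

    public static void main(String[] args) {
        // The keys reported above for count = 4.
        List<String> returned = Arrays.asList("1/@1/0", "3/@1/0", "3/@1/0");
        System.out.println(dedupeKeys(returned)); // prints [1/@1/0, 3/@1/0]
    }
}
```

A LinkedHashSet is used so that the order in which the keys came back is preserved, which matters if the last key is later used as the start key for the next page.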