From: "Pushkar Prasad"
Reply-To: dev@cassandra.apache.org
Subject: Slow search on secondary index
Date: Thu, 14 Mar 2013 13:22:25 +0530
Message-ID: <50A20D426EE2450197DABADD2DDAE750@pune.wibhu.com>

Hi,

I have the following schema in Cassandra 1.2.1:

  + TimeStamp
  + MACAddress
  + Data Transfer
  + LocationID
  + MacAddressCopy   // Copy of MAC address
  ** PRIMARY KEY (TimeStamp, MACAddress)   // Composite key, partitioned on TimeStamp

There are close to 500K different MAC addresses and 10K timestamps, for a total of about 5 billion records. Each record is 50 bytes, so the total size of the data is 250 GB. All of this data is stored on a 4-node cluster with no replication.

I created a secondary index on MacAddressCopy. When I search for a particular MAC address, I expect to get back 10K records (one per timestamp) for that MAC address. Since the column is indexed, I expect a quick response. Instead, I am getting RPC timeouts and the query never returns.

Is there any reason why this should be so slow? Is too much disk seeking causing the timeouts? Is fetching 10K records asking for too much?

- Pushkar
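
The sizing figures quoted in the message can be checked with a short sketch. The function name `estimate` and the dictionary keys are hypothetical, chosen only for illustration. The per-node figure assumes the data is spread evenly across the 4 nodes, which the message does not state. The last value follows from the schema as described: with TimeStamp as the partition key, the 10K rows for one MAC address would sit in 10K different partitions.

```python
def estimate(macs, timestamps, record_bytes, nodes):
    """Rough sizing from the figures in the message (decimal GB)."""
    records = macs * timestamps              # one row per (TimeStamp, MAC) pair
    total_bytes = records * record_bytes
    total_gb = total_bytes / 10**9
    return {
        "records": records,
        "total_gb": total_gb,
        # Assumption: data is evenly distributed, no replication.
        "gb_per_node": total_gb / nodes,
        # One matching row per timestamp; each timestamp is its own partition.
        "rows_per_mac": timestamps,
    }


if __name__ == "__main__":
    for key, value in estimate(500_000, 10_000, 50, 4).items():
        print(f"{key}: {value}")
```

With the numbers from the message this gives 5 billion records and 250 GB in total, matching the figures stated there.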