From: "Hiller, Dean"
To: "user@cassandra.apache.org"
Date: Wed, 20 Mar 2013 14:07:50 -0600
Subject: Re: Unable to fetch large amount of rows

Is your use case reading from a single partition? If so, you may want to switch to something like playorm which does virtual partitions so you still get the performance of multiple disks when reading from a single partition. My understanding is a single cassandra partition exists on a single node. Anyways, just an option if that is your use-case.

Later,
Dean

From: Pushkar Prasad
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, March 20, 2013 11:41 AM
To: "user@cassandra.apache.org"
Subject: RE: Unable to fetch large amount of rows

Hi aaron.

I added pagination, and things seem to have started performing much better. With 1000 page size, now able to fetch 500K records in 25-30 seconds. However, I'd like to point you to some interesting observations:

+ Did run cfhistograms, the results are interesting (Note: row cache is disabled):
+++ When query made on node on which all the records are present
    + 75% time is spent on disk latency
    + Example: When 50 K entries were fetched, it took 2.65 seconds, out of which 1.92 seconds were spent in disk latency
+++ When query made on node on which all the records are not present
    + Considerable amount of time is spent on things other than disk latency (probably deserialization/serialization, network, etc.)
    + Example: When 50 K entries were fetched, it took 5.74 seconds, out of which 2.21 seconds were spent in disk latency.

I've used Astyanax to run the above queries. The results were same when run with different data points. Compaction has not been done after data population yet.

I've a few questions:

1) Is it necessary to fetch the records in natural order of comparator column in order to get a high throughput? I'm trying to fetch all the records for a particular partition ID without any ordering on comparator column. Would that slow down the response?
Consider that timestamp is partitionId, and MacAddress is natural comparator column.
    + If my query is
        - select * from schema where timestamp = '..' ORDER BY MacAddress,
      would that be faster than, say
        - select * from schema where timestamp = '..'

2) Why does response time suffer when query is made on a node on which records to be returned are not present? In order to be able to get better response when queried from a different node, can something be done?

Thanks
Pushkar

________________________________
From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: 20 March 2013 15:02
To: user@cassandra.apache.org
Subject: Re: Unable to fetch large amount of rows

> The query returns fine if I request for lesser number of entries (takes 15 seconds for returning 20K records).

That feels a little slow, but it depends on the data model, the query type and the server and a bunch of other things.

> However, as I increase the limit on number of entries, the response begins to slow down. It results in TimedOutException.

Make many smaller requests. This is often faster.

> Isn't it the case that all the data for a partitionID is stored sequentially in disk?

Yes and no.
In each file all the columns on one partition / row are stored in comparator order. But there may be many files.

> If that is so, then why does fetching this data take such a long amount of time?

You need to work out where the time is being spent.
Add timing to your app, use nodetool proxyhistograms to see how long the requests take at the co-ordinator, use nodetool cfhistograms to see how long it takes at the disk level.
Look at your data model: are you reading data in the natural order of the comparator?

> If disk throughput is 40 MB/s, then assuming sequential reads, the response should come pretty quickly.

There is more involved than doing one read from disk and returning it.

> If it is stored sequentially, why does C* take so much time to return the records?
It is always going to take time to read 500,000 columns. It will take time on the client to allocate the 2 to 4 million objects needed to represent them. And once it comes to allocating those objects it will probably take more than 40MB in ram.

Do some tests at a smaller scale, start with 500 or 1000 columns then get bigger, to get a feel for what is practical in your environment. Often it's better to make many smaller / constant size requests.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/03/2013, at 9:38 PM, Pushkar Prasad wrote:

Aaron,

Thanks for your reply. Here are the answers to questions you had asked:

I am trying to read all the rows which have a particular TimeStamp. In my data base, there are 500 K entries for a particular TimeStamp. That means about 40 MB of data.

The query returns fine if I request for lesser number of entries (takes 15 seconds for returning 20K records). However, as I increase the limit on number of entries, the response begins to slow down. It results in TimedOutException.

Isn't it the case that all the data for a partitionID is stored sequentially in disk? If that is so, then why does fetching this data take such a long amount of time? If disk throughput is 40 MB/s, then assuming sequential reads, the response should come pretty quickly. Is it not the case that the data I am trying to fetch would be sequentially stored? If it is stored sequentially, why does C* take so much time to return the records? And if data is stored sequentially, is there any alternative that would allow me to fetch all the records quickly (by sequential disk fetch)?

Thanks
Pushkar

-----Original Message-----
From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: 19 March 2013 13:11
To: user@cassandra.apache.org
Subject: Re: Unable to fetch large amount of rows

> I have 1000 timestamps, and for each timestamp, I have 500K different MACAddress.
So you are trying to read about 2 million columns? 500K MACAddresses each with 3 other columns?

> When I run the following query, I get RPC Timeout exceptions:

What is the exception? Is it a client side socket timeout or a server side TimedOutException?

If my understanding is correct then try reading fewer columns and/or check the server side for logs.

It sounds like you are trying to read too much though.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 19/03/2013, at 3:51 AM, Pushkar Prasad wrote:

Hi,

I have following schema:

TimeStamp
MACAddress
Data Transfer
Data Rate
LocationID

PKEY is (TimeStamp, MACAddress). That means partitioning is on TimeStamp, and data is ordered by MACAddress, and stored together physically (let me know if my understanding is wrong). I have 1000 timestamps, and for each timestamp, I have 500K different MACAddress.

When I run the following query, I get RPC Timeout exceptions:

Select * from db_table where Timestamp='...'

From my understanding, this should give all the rows with just one disk seek, as all the records for a particular timeStamp are stored together. This should be very quick, however, clearly, that doesn't seem to be the case.

Is there something I am missing here? Your help would be greatly appreciated.

Thanks
PP
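
For readers of the archive: the "make many smaller / constant size requests" advice in this thread can be sketched roughly as below. This is only an illustration, not the code anyone in the thread ran. It uses an in-memory stand-in for a single partition rather than a real Cassandra client, and the names InMemoryPartition, fetch_page and iter_partition, as well as the (mac_address, data_rate) row shape, are assumptions made up for the example. The idea is to read the partition in comparator order, one bounded page at a time, resuming each request strictly after the last MACAddress seen.

```python
from bisect import bisect_right
from typing import Iterator, List, Optional, Tuple

# Hypothetical row shape: (mac_address, data_rate). Not the poster's real schema.
Row = Tuple[str, int]


class InMemoryPartition:
    """Stand-in for one partition whose rows are kept in comparator order."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = sorted(rows)                  # ordered by mac_address
        self.keys = [mac for mac, _ in self.rows]
        self.requests = 0                         # number of bounded reads issued

    def fetch_page(self, after_mac: Optional[str], page_size: int) -> List[Row]:
        """One bounded read: rows with mac_address > after_mac, at most page_size."""
        self.requests += 1
        start = 0 if after_mac is None else bisect_right(self.keys, after_mac)
        return self.rows[start:start + page_size]


def iter_partition(partition: InMemoryPartition, page_size: int = 1000) -> Iterator[Row]:
    """Yield every row of the partition, one constant-size page at a time."""
    last_mac: Optional[str] = None
    while True:
        page = partition.fetch_page(last_mac, page_size)
        yield from page
        if len(page) < page_size:   # a short (or empty) page means we are done
            return
        last_mac = page[-1][0]      # resume strictly after the last key seen


if __name__ == "__main__":
    demo = InMemoryPartition([(f"{i:012x}", i % 100) for i in range(2500)])
    total = sum(1 for _ in iter_partition(demo, page_size=1000))
    print(total, "rows in", demo.requests, "requests")
```

Against a real cluster, fetch_page would be replaced by a bounded query on the clustering column, and the page size would be tuned by starting small (500 or 1000 columns) and growing it, as suggested above.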