From: "NEWBROUGH, JONATHAN"
To: user@cassandra.apache.org
Subject: RE: cassandra read performance jumps from one row to next
Date: Thu, 23 Jan 2014 00:05:53 +0000

Trying to find out why a Cassandra read is taking so long, I used tracing and limited the number of rows. Strangely, when I query 600 rows, I get results in ~50 milliseconds. But 610 rows takes nearly 1 second!
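
For completeness, tracing was enabled in the same cqlsh session before running the queries:

cqlsh> TRACING ON;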

cqlsh> select containerdefinitionid from containerdefinition limit 600;

... lots of output ...

Tracing session: 6b506cd0-83bc-11e3-96e8-e182571757d7

activity | timestamp | source | source_elapsed
---------+--------------+---------------+----------------
execute_cql3_query | 15:25:02,878 | 130.4.147.116 | 0
Parsing statement | 15:25:02,878 | 130.4.147.116 | 39
Preparing statement | 15:25:02,878 | 130.4.147.116 | 101
Determining replicas to query | 15:25:02,878 | 130.4.147.116 | 152
Executing seq scan across 1 sstables for [min(-9223372036854775808), min(-9223372036854775808)] | 15:25:02,879 | 130.4.147.116 | 1021
Scanned 755 rows and matched 755 | 15:25:02,933 | 130.4.147.116 | 55169
Request complete | 15:25:02,934 | 130.4.147.116 | 56300

cqlsh> select containerdefinitionid from containerdefinition limit 610;

... just about the same output and trace info, except...

Scanned 766 rows and matched 766 | 15:25:58,908 | 130.4.147.116 | 739141

There seems to be nothing unusual about the data in those particular rows:
- values are similar to those before and after.
- using the COPY command (sketched below) I can export the whole table and import it on a different cluster, and performance is fine.
- these rows are the first example, but there seem to be other places where query time jumps as well. The whole table is only ~3000 rows but takes ~15sec to list all primary keys.
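
For reference, the COPY round trip was along these lines (the file path is just an example):

cqlsh> COPY containerdefinition TO '/tmp/containerdefinition.csv';
  ... then, on the other cluster, after creating the same table:
cqlsh> COPY containerdefinition FROM '/tmp/containerdefinition.csv';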

There does seem to be something unusual about the data STORAGE:
- a snapshot copied to another cluster and imported gives the same results with the same limits (see below).
- COPYing the data to CSV and loading it into another cluster does not reproduce it; performance is great.
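
By "snapshot copied to another cluster" I mean roughly the following; the keyspace name ("mykeyspace") and paths are placeholders for my actual setup:

nodetool snapshot mykeyspace
  # copy the snapshot's SSTable files for the table, e.g. from
  #   /var/lib/cassandra/data/mykeyspace/containerdefinition/snapshots/<tag>/
  # into the same table's data directory on the target node, then:
nodetool refresh mykeyspace containerdefinition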

Have tried compaction, repair, reindex, cleanup and refresh. No effect.
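
Concretely, that was along the lines of (keyspace and index names are placeholders):

nodetool compact mykeyspace containerdefinition
nodetool repair mykeyspace containerdefinition
nodetool rebuild_index mykeyspace containerdefinition <index_name>   # once per secondary index
nodetool cleanup mykeyspace
nodetool refresh mykeyspace containerdefinition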

I realize I could "fix" by copying data out and in, but I'm trying to figure out what is going on here to avoid it happening in production on a table too big to fix with COPY.

Table has 17 columns, 3 indices, TEXT primary key, two LIST columns and two TIMESTAMP columns; the rest are TEXT. Can reproduce the issue with both SimpleStrategy and DC-aware replication. Can reproduce with 4 copies of data on 4 servers, 2 copies on 2 servers, and 1 copy on 2 servers (so it doesn't matter whether the query is performed locally or involves multiple servers). Cassandra 1.2 with cqlsh.
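
To give a concrete picture of the shape (NOT the real schema; column names here are made up), it is something like:

CREATE TABLE containerdefinition (
    containerdefinitionid text PRIMARY KEY,
    created timestamp,
    updated timestamp,
    tags list<text>,
    hosts list<text>,
    owner text,
    status text,
    region text
    -- ... remaining columns are all text ...
);
CREATE INDEX ON containerdefinition (owner);
CREATE INDEX ON containerdefinition (status);
CREATE INDEX ON containerdefinition (region);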

Any ideas? Suggestions?