Date: Fri, 20 Jun 2014 07:50:46 -0400
Subject: Re: Batch of prepared statements exceeding specified threshold
From: Pavel Kogan
To: user@cassandra.apache.org

The cluster is new, so no updates were done. The version is 2.0.8.
It happened while I was doing many writes (no reads). The writes are done in small batches of 2 inserts (writing to 2 column families). The values are big blobs (up to 100 KB).

Any clues?

Pavel

On Thu, Jun 19, 2014 at 8:07 PM, Marcelo Elias Del Valle <marcelo@s1mbi0se.com.br> wrote:

> Pavel,
>
> Out of curiosity, did it start to happen before some update? Which version
> of Cassandra are you using?
>
> []s
>
>
> 2014-06-19 16:10 GMT-03:00 Pavel Kogan :
>
>> What a coincidence! Today it happened in my cluster of 7 nodes as well.
>>
>> Regards,
>> Pavel
>>
>>
>> On Wed, Jun 18, 2014 at 11:13 AM, Marcelo Elias Del Valle <marcelo@s1mbi0se.com.br> wrote:
>>
>>> I have a 10 node cluster with Cassandra 2.0.8.
>>>
>>> I am getting these exceptions in the log when I run my code. All my code
>>> does is read data from a CF and, in some cases, write new data.
>>>
>>> WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 6165, exceeding specified threshold of 5120 by 1045.
>>> WARN [Native-Transport-Requests:583] 2014-06-18 11:05:01,152 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 21266, exceeding specified threshold of 5120 by 16146.
>>> WARN [Native-Transport-Requests:581] 2014-06-18 11:05:20,229 BatchStatement.java (line 228) Batch of prepared statements for [identification1.entity, identification1.entity_lookup] is of size 22978, exceeding specified threshold of 5120 by 17858.
>>> INFO [MemoryMeter:1] 2014-06-18 11:05:32,682 Memtable.java (line 481) CFS(Keyspace='OpsCenter', ColumnFamily='rollups300') liveRatio is 14.249755859375 (just-counted was 9.85302734375). calculation took 3ms for 1024 cells
>>>
>>> After some time, one node of the cluster goes down. Then it comes back
>>> after a few seconds and another node goes down. This keeps happening, so
>>> there is always one node down in the cluster: when one comes back, another
>>> one falls.
>>>
>>> The only exception I see in the log is "connected reset by the peer",
>>> which seems to be related to the gossip protocol, when a node goes down.
>>>
>>> Any hint on what I could do to investigate this problem further?
>>>
>>> Best regards,
>>> Marcelo Valle.
>>>
>>
>>
>
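A note on the numbers in the warnings quoted in this thread: the sizes appear to be reported in bytes against a threshold of 5120 bytes (5 KiB), so a batch of two inserts carrying blobs of up to 100 KB would be expected to exceed it by a wide margin. The following is a minimal Python sketch for pulling those figures out of a log. The function name `parse_batch_warnings` and the `SAMPLE` text are illustrative assumptions, and the pattern is written only against the message format quoted in this thread, so it may not match other Cassandra versions.

```python
import re

# Pattern for the BatchStatement warning exactly as quoted in this thread.
WARN_RE = re.compile(
    r"Batch of prepared statements for \[(?P<tables>[^\]]+)\] "
    r"is of size (?P<size>\d+), "
    r"exceeding specified threshold of (?P<threshold>\d+) by (?P<excess>\d+)"
)


def parse_batch_warnings(log_text):
    """Return one dict per batch-size warning found in log_text."""
    # Mail clients wrap long log lines, so collapse whitespace runs first.
    flat = " ".join(log_text.split())
    rows = []
    for match in WARN_RE.finditer(flat):
        size = int(match.group("size"))
        threshold = int(match.group("threshold"))
        rows.append({
            "tables": [t.strip() for t in match.group("tables").split(",")],
            "size": size,
            "threshold": threshold,
            # Recomputed rather than trusted, as a sanity check on the log.
            "excess": size - threshold,
            "reported_excess": int(match.group("excess")),
        })
    return rows


# Hypothetical sample: the three WARN lines from this thread, wrapped as in mail.
SAMPLE = """
WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391
BatchStatement.java (line 228) Batch of prepared statements for
[identification1.entity, identification1.entity_lookup] is of size 6165,
exceeding specified threshold of 5120 by 1045.
WARN [Native-Transport-Requests:583] 2014-06-18 11:05:01,152
BatchStatement.java (line 228) Batch of prepared statements for
[identification1.entity, identification1.entity_lookup] is of size 21266,
exceeding specified threshold of 5120 by 16146.
WARN [Native-Transport-Requests:581] 2014-06-18 11:05:20,229
BatchStatement.java (line 228) Batch of prepared statements for
[identification1.entity, identification1.entity_lookup] is of size 22978,
exceeding specified threshold of 5120 by 17858.
"""

if __name__ == "__main__":
    for row in parse_batch_warnings(SAMPLE):
        print(row["tables"], row["size"], row["excess"])
```

Sorting the parsed rows by `size` is one way to see which tables receive the largest batches before deciding whether to shrink the batches or adjust the warning threshold.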