From: Kani
To: user@cassandra.apache.org
Date: Thu, 16 Dec 2010 15:57:06 -0300
Subject: Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

Yes, that happens when an operation times out or fails in some other way (connection refused, etc.). There is fallback logic that tries to discover all the nodes in the cluster (not only the ones you configured) so that it can still reach the cluster and execute the operation.

Have you looked at the Cassandra log? You might be running into a problem on your client when a compaction kicks in. While it runs, all your connections to the node will be much slower. This may lead to a file handle problem, since your client will keep stacking up connections against Cassandra.

If this turns out to be the problem, see whether you can raise the open-file limit to a value high enough to get you through a Cassandra compaction.
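
As a rough illustration of what "raising the limit" means at the process level, here is a minimal Python sketch that reads the open-file limit and raises the soft limit up to the hard limit. The helper names `nofile_limits` and `raise_soft_limit` are made up for this example, and the script only affects the process that runs it; for the Cassandra JVM the limit is normally set in the environment that launches it (for example with `ulimit -n` in the startup shell).

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_limit(target):
    """Raise the soft open-file limit toward `target`, capped at the hard limit.

    Never lowers the limit. Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed the hard limit
    if soft == resource.RLIM_INFINITY or target <= soft:
        return soft  # already high enough
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
```

Calling `raise_soft_limit(32768)` would request 32768 descriptors (an arbitrary example value) and silently settle for the hard limit if that is lower.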

Kani

On Thu, Dec 16, 2010 at 2:48 PM, Germán Kondolf <german.kondolf@gmail.com> wrote:
Indeed, Hector has a connection pool behind it; I think it uses 50
connections per node.
It also seems to use one node to discover the others. I assume so because I saw
connections from my app to nodes that I didn't configure in Hector.

So you may want to check the fds at the OS level to see if there is a bottleneck there.
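
One way to check descriptors at the OS level on Linux is to look at `/proc/<pid>/fd`. The sketch below is only an illustration under that assumption (it will not work on systems without `/proc`), and the function names are invented for the example. It counts a process's open descriptors and splits them into sockets, pipes, and files, which helps tell client connections apart from data files.

```python
import os


def count_open_fds(pid="self"):
    """Count descriptors currently open in a process (Linux /proc only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fds_by_kind(pid="self"):
    """Group a process's descriptors into sockets, pipes, files, and other."""
    counts = {"socket": 0, "pipe": 0, "file": 0, "other": 0}
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed while we were scanning
        if target.startswith("socket:"):
            counts["socket"] += 1
        elif target.startswith("pipe:"):
            counts["pipe"] += 1
        elif target.startswith("/"):
            counts["file"] += 1
        else:
            counts["other"] += 1
    return counts


if __name__ == "__main__":
    print("open descriptors:", count_open_fds())
    print("by kind:", fds_by_kind())
```

Passing the Cassandra process ID instead of `"self"` (with sufficient permissions) would show whether sockets or files dominate; comparing the total against the process's open-file limit shows how close it is to the ceiling.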

On Thu, Dec 16, 2010 at 2:39 PM, Amin Sakka, Novapost
<amin.sakka@novapost.fr> wrote:
>
> I'm using a single client instance (using Hector) and a single connection to
> Cassandra.
> For each insertion I create a new mutator and then release it.
> I have 473 SSTable "Data.db" files; the average size of each is 30 MB.
>
>
>
> 2010/12/16 Ryan King <ryan@twitter.com>
>>
>> Are you creating a new connection for each row you insert (and if so
>> are you closing it)?
>>
>> -ryan
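
To make the question above concrete: a client that opens a connection per row and never closes it leaks one descriptor per insert, while a bounded pool reuses a fixed number. The following is a minimal sketch of such a pool, not Hector's actual implementation; `BoundedPool` and `FakeConnection` are hypothetical names, and `FakeConnection` merely stands in for a real client connection.

```python
import queue
import threading
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for a client connection; tracks how many are open."""
    open_count = 0

    def __init__(self):
        FakeConnection.open_count += 1

    def close(self):
        FakeConnection.open_count -= 1


class BoundedPool:
    """Hands out at most `max_size` connections at a time and reuses idle ones."""

    def __init__(self, connect, max_size):
        self._connect = connect                    # factory returning an object with .close()
        self._idle = queue.LifoQueue()             # connections waiting to be reused
        self._slots = threading.Semaphore(max_size)
        self._lock = threading.Lock()
        self.created = 0                           # total connections ever opened

    @contextmanager
    def connection(self):
        self._slots.acquire()                      # blocks while max_size are in use
        try:
            try:
                conn = self._idle.get_nowait()
            except queue.Empty:
                conn = self._connect()
                with self._lock:
                    self.created += 1
            try:
                yield conn
            except Exception:
                conn.close()                       # do not reuse a connection that failed
                raise
            else:
                self._idle.put(conn)               # return it for the next caller
        finally:
            self._slots.release()

    def close(self):
        """Close every idle connection."""
        while True:
            try:
                self._idle.get_nowait().close()
            except queue.Empty:
                return
```

With `pool = BoundedPool(FakeConnection, max_size=2)`, running a million sequential `with pool.connection() as conn:` blocks opens a single connection, whereas calling `FakeConnection()` per row without `close()` would hold a million.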
>>
>> On Wed, Dec 15, 2010 at 8:13 AM, Amin Sakka, Novapost
>> <amin.sakka@novapost.fr> wrote:
>> > Hello,
>> > I'm using Cassandra 0.7.0 rc1 in a single-node configuration, with replication
>> > factor 1, the random partitioner, and a 2 GB heap.
>> > I ran my Hector client to insert 5,000,000 rows, but after a couple of
>> > hours
>> > the following exception occurs:
>> >
>> > WARN [main] 2010-12-15 16:38:53,335 CustomTThreadPoolServer.java (line 104) Transport error occurred during acceptance of message.
>> > org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
>> >     at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
>> >     at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:67)
>> >     at org.apache.cassandra.thrift.TCustomServerSocket.acceptImpl(TCustomServerSocket.java:38)
>> >     at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
>> >     at org.apache.cassandra.thrift.CustomTThreadPoolServer.serve(CustomTThreadPoolServer.java:98)
>> >     at org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:120)
>> >     at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:229)
>> >     at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
>> > Caused by: java.net.SocketException: Too many open files
>> >     at java.net.PlainSocketImpl.socketAccept(Native Method)
>> >     at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:384)
>> >     at java.net.ServerSocket.implAccept(ServerSocket.java:453)
>> >     at java.net.ServerSocket.accept(ServerSocket.java:421)
>> >     at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)
>> >
>> > When I try to restart Cassandra, I get the following exception:
>> >
>> > ERROR 16:42:26,573 Exception encountered during startup.
>> > java.lang.ArithmeticException: / by zero
>> >     at org.apache.cassandra.io.sstable.SSTable.estimateRowsFromIndex(SSTable.java:233)
>> >     at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:284)
>> >     at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:200)
>> >     at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:225)
>> >     at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
>> >     at org.apache.cassandra.db.ColumnFamilyStore.addIndex(ColumnFamilyStore.java:306)
>> >     at org.apache.cassandra.db.ColumnFamilyStore.<init>(ColumnFamilyStore.java:246)
>> >     at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:449)
>> >     at org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:437)
>> >     at org.apache.cassandra.db.Table.initCf(Table.java:341)
>> >     at org.apache.cassandra.db.Table.<init>(Table.java:283)
>> >     at org.apache.cassandra.db.Table.open(Table.java:114)
>> >     at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:138)
>> >     at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
>> >     at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:216)
>> >     at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
>> >
>> > I am looking for advice on how to debug this.
>> >
>> > Thanks,
>> > --
>> >
>> > Amin
>> >
>> >
>> >
>> >
>> >
>
>
>
> --
> Amin
>
>
>
>



