From: Drew Dahlke <drew.dahlke@bronto.com>
To: user@cassandra.apache.org
Date: Wed, 18 Aug 2010 09:08:43 -0400
Subject: Re: Pig + Cassandra = Connection errors

What's your Cassandra timeout configured to? It's not uncommon to raise it to 30 seconds if you're getting timeouts.
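For reference, the setting in question is the per-node RPC timeout. The exact name depends on your Cassandra version, so treat this as a sketch and double-check against your own config files; a 30-second value matching the suggestion above would look like:

```yaml
# Cassandra 0.7+: conf/cassandra.yaml (restart each node after changing).
# How long a coordinator waits for replicas before throwing TimedOutException.
rpc_timeout_in_ms: 30000
```

On 0.6.x the equivalent lives in conf/storage-conf.xml as `<RpcTimeoutInMillis>30000</RpcTimeoutInMillis>`. Note this only addresses the TimedOutException; the "Connection refused" errors suggest the Hadoop tasks are also hitting nodes whose Thrift port isn't reachable.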
On Wed, Aug 18, 2010 at 8:17 AM, Christian Decker wrote:
> Hi all,
> I'm trying to get Pig scripts to work on data in Cassandra. Right now I
> want to simply run example-script.pig against a different Keyspace/CF
> containing ~6,000,000 entries. I got it running, but the job aborts
> after quite some time, and when I look at the logs I see hundreds of these:
>
>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:133)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:224)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:101)
>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:95)
>>     at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown Source)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:142)
>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>>     at org.apache.thrift.transport.TSocket.open(TSocket.java:185)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:129)
>>     ... 13 more
>> Caused by: java.net.ConnectException: Connection refused
>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:310)
>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:176)
>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:163)
>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:381)
>>     at java.net.Socket.connect(Socket.java:537)
>>     at java.net.Socket.connect(Socket.java:487)
>>     at org.apache.thrift.transport.TSocket.open(TSocket.java:180)
>>     ... 14 more
>
> and:
>
>> java.lang.RuntimeException: TimedOutException()
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:174)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:224)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:101)
>>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
>>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:95)
>>     at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown Source)
>>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:142)
>>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>>     at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> Caused by: TimedOutException()
>>     at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11030)
>>     at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:623)
>>     at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:597)
>>     at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:151)
>>     ... 13 more
>
> I checked that the Cassandra cluster is running and all three of my nodes
> are up and working. As far as I can tell, the JobTracker retries when it
> gets these errors but aborts once a large portion of the tasks have failed.
> Any idea why the cluster keeps dropping connections or timing out?
> Regards,
> Chris
> --
> Christian Decker
> Software Architect
> http://blog.snyke.net