From: aaron morton
To: user@cassandra.apache.org
Subject: Re: Cassandra hadoop Thrift Time out
Date: Sat, 25 Sep 2010 13:09:20 +1200

There is some information on the wiki (http://wiki.apache.org/cassandra/HadoopSupport) about a resource leak in versions before 0.6.2 that can result in a TimeoutException. But you're on 0.6.5, so you should be OK.

I had a quick look at the Hadoop code and could not see where to change the timeout (that would be the obvious thing to try). If you have a look in ConfigHelper.java, though, it says:

    /**
     * The number of rows to request with each get range slices request.
     * Too big and you can get timeouts when it takes Cassandra too
     * long to fetch all the data. Too small and the performance
     * will be eaten up by the overhead of each request.
     *
     * @param conf      Job configuration you are about to run
     * @param batchsize Number of rows to request each time
     */
    public static void setRangeBatchSize(Configuration conf, int batchsize)
    {
        conf.setInt(RANGE_BATCH_SIZE_CONFIG, batchsize);
    }

The config item name is "cassandra.range.batch.size".
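To make the tradeoff in that javadoc concrete: with N rows to scan and a batch size B, the record reader issues roughly ceil(N/B) get_range_slices calls, so shrinking the batch makes each response smaller (less likely to time out) at the cost of more round trips. A self-contained sketch of just that arithmetic, with illustrative numbers (not the actual Hadoop API):

```java
// Sketch of the batch-size tradeoff: smaller batches mean more
// get_range_slices round trips, but each response is smaller and
// less likely to exceed Cassandra's rpc timeout.
public class RangeBatchSizeSketch {
    // Rough number of get_range_slices calls needed to scan
    // `totalRows` rows at a given batch size (ceiling division).
    static long requestCount(long totalRows, int batchSize) {
        return (totalRows + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        long rows = 100_000; // illustrative row count
        for (int batch : new int[] {4096, 1024, 256}) {
            System.out.println("batch=" + batch
                + " -> requests=" + requestCount(rows, batch));
        }
    }
}
```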
Try reducing the batch size first and see if the timeouts go away. Though it does not sound like you have a lot of data.

0.7 beta2 may be out this week. But it's still beta.

Hope that helps.

Aaron

On 25 Sep 2010, at 07:17, Saket Joshi wrote:

> Hi Experts,
>
> I need help with an exception integrating Cassandra and Hadoop. I am getting the following exception when running a Hadoop map/reduce job: http://pastebin.com/RktaqDnj
> I am using Cassandra 0.6.5 on a 3-node cluster. I don't get any exception when the data I am processing is very small (< 5 rows and 100 columns), but I get the error with modest data (> 5 rows, 500 columns). I went through some of the forums where people have experienced the same issue:
> http://www.listware.net/201005/cassandra-user/21897-timeout-while-running-simple-hadoop-job.html . Is this a bug with the Cassandra-Hadoop classes, and is it fixed in 0.7 for sure? How stable is 0.7 beta? In the system.log I see a lot of "index has reached its threshold; switching in a fresh Memtable" messages.
>
> Has anyone faced a similar issue and solved it? Is migrating to 0.7 the only solution?
>
> Thanks,
> Saket
>
> Stack trace of the exception:
> java.lang.RuntimeException: TimedOutException()
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:186)
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:236)
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.computeNext(ColumnFamilyRecordReader.java:104)
>         at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:135)
>         at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:130)
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.nextKeyValue(ColumnFamilyRecordReader.java:98)
>         at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>         at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:583)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>         at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: TimedOutException()
>         at org.apache.cassandra.thrift.Cassandra$get_range_slices_result.read(Cassandra.java:11094)
>         at org.apache.cassandra.thrift.Cassandra$Client.recv_get_range_slices(Cassandra.java:628)
>         at org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:602)
>         at org.apache.cassandra.hadoop.ColumnFamilyRecordReader$RowIterator.maybeInit(ColumnFamilyRecordReader.java:164)
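A footnote on the "reduce the batch size until the timeouts go away" advice above: the tuning loop amounts to halving the batch until a fetch completes within the timeout. A self-contained illustration of that loop; `fetchBatch` is a hypothetical stand-in for the real get_range_slices call, here simulated to fail above an arbitrary threshold so the loop is observable:

```java
// Illustrative sketch of the tuning advice: halve the batch size
// until a fetch stops timing out. fetchBatch() is a hypothetical
// stand-in for get_range_slices, not the real Thrift call.
public class BatchSizeTuner {
    // Pretend any batch larger than this exceeds the rpc timeout.
    static final int SIMULATED_TIMEOUT_THRESHOLD = 1000;

    // Returns true when the fetch "completes" within the timeout.
    static boolean fetchBatch(int batchSize) {
        return batchSize <= SIMULATED_TIMEOUT_THRESHOLD;
    }

    // Halve the batch size until a fetch succeeds; stop below 1 row.
    static int tune(int initialBatchSize) {
        int batch = initialBatchSize;
        while (batch >= 1 && !fetchBatch(batch)) {
            batch /= 2;
        }
        return batch;
    }

    public static void main(String[] args) {
        // Starting from an illustrative initial batch size of 4096.
        System.out.println("usable batch size: " + tune(4096));
    }
}
```

The resulting value would then be passed to ConfigHelper.setRangeBatchSize (or set directly as "cassandra.range.batch.size") in the job configuration.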