incubator-cassandra-user mailing list archives
From Jeremy Hanna <jeremy.hanna1...@gmail.com>
Subject Hadoop settings if running into blacklisted task trackers with Cassandra
Date Sun, 25 Sep 2011 02:51:32 GMT
I thought I would share something valuable that Jacob Perkins (who recently started with us)
found.  We were seeing blacklisted task trackers and occasionally failed jobs, almost always
caused by TimedOutExceptions from Cassandra.  We've been fixing the underlying reasons for
those exceptions.  However, Jacob had hit similar timeout errors with Elasticsearch + Hadoop,
and when he gave Elasticsearch a few more tries before failing the job, things finished.  So
he cranked those settings up.  Granted, if you crank them too high, jobs that ought to fail
never get a chance to fail.  But in our case we just needed to give Cassandra a few more tries
in general.  We're still getting the gremlins out here and there, but you can set this at the
job level or on the task trackers themselves.  It gives Cassandra a few more tries for each
task, so Hadoop doesn't blacklist a node for the job as quickly and doesn't fail the job as
easily.  An example configuration (for the job configuration or for the task trackers'
mapred-site.xml) is:

<property>
  <name>mapred.max.tracker.failures</name>
  <value>20</value>
</property>
<property>
  <name>mapred.map.max.attempts</name>
  <value>20</value>
</property>
<property>
  <name>mapred.reduce.max.attempts</name>
  <value>20</value>
</property>
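
If you go the job-level route instead of editing mapred-site.xml, here's a minimal sketch of
setting the same values programmatically on a Hadoop job.  The class name, job name, and the
elided mapper/reducer setup are placeholders, not part of the original message:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class TimeoutTolerantJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Allow more task failures per tracker before the tracker is
        // blacklisted for this job, and more attempts per task before
        // the job itself is failed.
        conf.setInt("mapred.max.tracker.failures", 20);
        conf.setInt("mapred.map.max.attempts", 20);
        conf.setInt("mapred.reduce.max.attempts", 20);

        Job job = new Job(conf, "cassandra-input-job");
        // ... set mapper/reducer, input/output formats, etc. as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}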

Just thought I would share this because I've seen others run into the same problem.  It's not
a complete solution, but it can come in handy if you want to make Hadoop more fault tolerant
with Cassandra.