cassandra-dev mailing list archives

From Michael Moores
Subject Throttling ColumnFamilyRecordReader
Date Tue, 19 Oct 2010 20:22:54 GMT
Does it make sense to add some kind of throttling capability to the ColumnFamilyRecordReader
for Hadoop?

If I have 60 or so Map tasks running at the same time while the cluster is already heavily
loaded with OLTP operations, on-line performance can degrade to a point
that may not be acceptable.  (I'm loading an 8-node cluster with 2000 TPS.)  By default, my
cluster of 8 nodes (which are also the Hadoop JobTracker nodes) runs 8 Map tasks per node, each making
get_range_slices calls, based on the splits the ColumnFamilyInputFormat has calculated from my
token ranges.
I can increase the input split size (ConfigHelper.setInputSplitSize()) so that there
is only one Map task per node, and this helps quite a bit.

But would it be reasonable to provide a configurable sleep that pauses between smaller
range queries?  That would stretch out the Map time
and leave the OLTP processing less affected.
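
To make the proposal concrete, here is a rough sketch of the idea in plain Java. It is not the ColumnFamilyRecordReader code and uses no Cassandra or Hadoop classes: ThrottledRangeReader, BatchFetcher, batchSize and sleepMillis are all hypothetical names, and the fetch callback merely stands in for one small get_range_slices call.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one small range query (e.g. a get_range_slices call). */
interface BatchFetcher<T> {
    List<T> fetch(int start, int count);
}

/** Hypothetical sketch: reads a range in small batches and sleeps between them. */
final class ThrottledRangeReader {
    private final int batchSize;    // rows requested per range query (assumed knob)
    private final long sleepMillis; // configurable pause between queries (assumed knob)
    private int pauses;             // how many times the reader slept, for inspection

    ThrottledRangeReader(int batchSize, long sleepMillis) {
        if (batchSize <= 0 || sleepMillis < 0) {
            throw new IllegalArgumentException("batchSize must be > 0 and sleepMillis >= 0");
        }
        this.batchSize = batchSize;
        this.sleepMillis = sleepMillis;
    }

    /** Fetches totalRows rows in batches, pausing after every batch except the last. */
    <T> List<T> readAll(BatchFetcher<T> fetcher, int totalRows) {
        List<T> rows = new ArrayList<>();
        int start = 0;
        while (start < totalRows) {
            int count = Math.min(batchSize, totalRows - start);
            rows.addAll(fetcher.fetch(start, count));
            start += count;
            if (start < totalRows && sleepMillis > 0) {
                try {
                    Thread.sleep(sleepMillis); // yield capacity to the OLTP workload
                    pauses++;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                    break;
                }
            }
        }
        return rows;
    }

    int pauses() {
        return pauses;
    }
}
```

With a batch size of 4 and 10 rows to read, this issues three queries and sleeps twice, so the total Map time grows by roughly twice the configured sleep. A sleep of 0 disables the throttle, which would keep the current behaviour as the default.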

