accumulo-user mailing list archives

From "Slater, David M." <>
Subject Straggler problem in Accumulo BatchScans
Date Wed, 21 Aug 2013 23:09:05 GMT
Hey, I have a 7-node cluster running Accumulo 1.4.1 and Hadoop 1.0.4.

When I run large BatchScanner operations, the number of tablets scanned per node is not uniform,
so the overloaded nodes take much longer to finish than the others. For queries
that require all of the scans to finish before returning, this is a major latency issue. What
are some practical means of load-balancing the scans to reduce this delay?
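
To make the straggler effect concrete, here is a minimal sketch that quantifies the skew as the ratio of the busiest node's tablet count to the mean count. It assumes per-tablet scan cost is roughly equal, so a query that waits for every node finishes at the pace of the busiest one. The class name, method name, and tablet counts are all made up for illustration; none of this is part of the Accumulo API.

```java
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of the Accumulo API.
final class StragglerSketch {

    // Ratio of the busiest server's tablet count to the mean count.
    // 1.0 means a perfectly even spread; larger values mean a worse straggler.
    static double imbalance(int[] tabletsPerServer) {
        int max = Arrays.stream(tabletsPerServer).max().orElse(0);
        double mean = Arrays.stream(tabletsPerServer).average().orElse(0.0);
        // Treat an empty or all-zero input as balanced to avoid dividing by zero.
        return mean == 0.0 ? 1.0 : max / mean;
    }

    public static void main(String[] args) {
        // Made-up tablet counts touched by one BatchScanner query on 7 nodes.
        int[] counts = {12, 9, 30, 11, 10, 8, 11};
        System.out.printf("imbalance = %.2f%n", imbalance(counts));
    }
}
```

With these made-up counts the busiest node handles about 2.3 times the average load, so under the equal-cost assumption the whole query takes roughly that much longer than it would with an even spread.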

Is it possible for tablets to be hosted on multiple tablet servers, up to the replication
factor of the underlying HDFS? Are there reasons this might be an undesirable design?

Thanks in advance,
