accumulo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James Hughes <jn...@virginia.edu>
Subject Re: Straggler problem in Accumulo BatchScans
Date Wed, 21 Aug 2013 23:29:04 GMT
David,

Each tablet is hosted by one tablet server, and there's no way around
that.  (This is actually quite reasonably; otherwise, we would receive
duplicate results from multiple tablet servers.)

One strategy to deal with imbalanced data is to add a random partition
prefix to your row Ids.  This does complicate building queries, but in
general, you'll be able to leverage all of your nodes.  I did some testing
with the nodes of such random shard ids, and it seems like having 1-2x as
many shards as tablet servers worked pretty well.  (I'd suggest 2x in case
you ever grow your cloud.)

In particular, if you can reingest your data, prepend a random "01-14~" to
the beginning of each row Id, and see if that helps.  After that, you can
"help" Accumulo decide where it should split tablets with addSplits 01 02
<etc> 14 from the Accumulo shell (or programmatically with the addSplits).
After that, you can make sure that your 14+ splits are distributed across
the 7 nodes in a reasonable way.

I hope that helps,

Jim

http://accumulo.apache.org/1.4/apidocs/org/apache/accumulo/core/client/admin/TableOperations.html#addSplits%28java.lang.String,%20java.util.SortedSet%29



On Wed, Aug 21, 2013 at 7:09 PM, Slater, David M.
<David.Slater@jhuapl.edu>wrote:

> Hey, I have a 7 node network running accumulo 1.4.1 and hadoop 1.0.4.****
>
> ** **
>
> When I run large BatchScanner operations, the number of tablets scanned
> per node is not uniform, leading to the overloaded nodes taking much longer
> to finish than the others. For queries that require all of the scans to
> finish before returning, this is a major latency issue. What are some
> practical means of load-balancing this to reduce delay?****
>
> ** **
>
> Is it possible for tablets to be hosted on multiple tablet servers, up to
> the replication factor of the underlying hdfs? Are there reasons this might
> be an undesirable design?****
>
> ** **
>
> Thanks in advance,
> David ****
>

Mime
View raw message