accumulo-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Krishmin Rai <>
Subject Re: Number of partitions for sharded table
Date Tue, 30 Oct 2012 19:28:15 GMT
I should clarify that I've been pre-splitting tables at each shard so that each tablet consists
of a single row.

On Oct 30, 2012, at 3:06 PM, Krishmin Rai wrote:

> Hi All, 
>  We're working with an index table whose row is a shardId (an integer, like the wiki-search
or IndexedDoc examples). I was just wondering what the right strategy is for choosing a number
of partitions, particularly given a cluster that could potentially grow.
>  If I simply set the number of shards equal to the number of slave nodes, additional
nodes would not improve query performance (at least over the data already ingested). But starting
with more partitions than slave nodes would result in multiple tablets per tablet server…
I'm not really sure how that would impact performance, particularly given that all queries
against the table will be batchscanners with an infinite range.
>  Just wondering how others have addressed this problem, and if there are any performance
rules of thumb regarding the ratio of tablets to tablet servers.
> Thanks!
> Krishmin

View raw message