Subject: Re: Number of partitions for sharded table
From: Krishmin Rai
Date: Tue, 30 Oct 2012 15:28:15 -0400
To: user@accumulo.apache.org

I should clarify that I've been pre-splitting tables at each shard so that each tablet consists of a single row.

On Oct 30, 2012, at 3:06 PM, Krishmin Rai wrote:

> Hi All,
> We're working with an index table whose row is a shardId (an integer, like the wiki-search or IndexedDoc examples). I was just wondering what the right strategy is for choosing a number of partitions, particularly given a cluster that could potentially grow.
>
> If I simply set the number of shards equal to the number of slave nodes, additional nodes would not improve query performance (at least over the data already ingested). But starting with more partitions than slave nodes would result in multiple tablets per tablet server... I'm not really sure how that would impact performance, particularly given that all queries against the table will be batchscanners with an infinite range.
>
> Just wondering how others have addressed this problem, and if there are any performance rules of thumb regarding the ratio of tablets to tablet servers.
>
> Thanks!
> Krishmin
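
For anyone following the thread, here is a rough Python sketch of the arithmetic behind the question. It is not taken from the original messages, and it makes several assumptions: shard rows are zero-padded decimal strings (so lexicographic order matches numeric order), documents are assigned to shards with CRC32, and tablets end up spread as evenly as possible across tablet servers. The function names (`shard_id`, `split_points`, `tablets_per_server`) are hypothetical helpers, not part of any Accumulo API. The returned split points are the kind of values one might pass when pre-splitting a table so that each shard row lands in its own tablet.

```python
import zlib
from typing import List


def shard_id(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard using a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def split_points(num_shards: int) -> List[str]:
    """Return one split point per shard row except the last.

    Assumes a tablet's end row is inclusive, so a split at every shard row
    but the last yields num_shards tablets holding one row each.
    Rows are zero-padded (an assumption) so that they sort numerically.
    """
    width = len(str(num_shards - 1))
    return [str(s).zfill(width) for s in range(num_shards - 1)]


def tablets_per_server(num_shards: int, num_servers: int) -> List[int]:
    """Tablet count per server if tablets are balanced as evenly as possible."""
    base, extra = divmod(num_shards, num_servers)
    return [base + 1 if i < extra else base for i in range(num_servers)]


if __name__ == "__main__":
    # Shards equal to the original 4 nodes: growing to 6 nodes leaves 2 idle.
    print(tablets_per_server(4, 6))    # [1, 1, 1, 1, 0, 0]
    # Over-partitioning to 32 shards: every node in a 6-node cluster gets work.
    print(tablets_per_server(32, 6))   # [6, 6, 5, 5, 5, 5]
    print(split_points(4))             # ['0', '1', '2']
```

Under these assumptions, choosing more shards than nodes up front trades a few tablets per server today for the ability to spread existing data over new servers later. Whether several tablets per server helps or hurts BatchScanner queries is exactly the open question in the message above, and this sketch does not answer it.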