Date: Mon, 16 Apr 2012 19:09:01 +0000 (GMT+00:00)
From: Billie J Rinaldi
To: user@accumulo.apache.org
Subject: Re: Using AccumuloOutputFormat, All Records Stored In One Tablet (Node)

On Monday, April 16, 2012 2:55:48 PM, "David Medinets" wrote:
> argh ... Just to be clear. The splits are essentially partitions of
> the row id?

Yes, specified by the end of the range.

> Can I add splits after the data is ingested? If so, how can I
> redistribute?

Yes. You can either add specific split points, or you can lower the
split threshold based on the size of the table. For example, if the
table size is S bytes, and you ideally want to have T tablets, then set
the table's split threshold to S/T. These calculations are rarely
exact, so I would start high on the split threshold, let it split out,
see if the number of tablets is ok, then lower again if necessary.

Billie

> On Mon, Apr 16, 2012 at 2:45 PM, Eric Newton wrote:
> > Create the table with splits, but this requires you to know
> > something about the distribution of your data.
> >
> > -Eric
> >
> > On Mon, Apr 16, 2012 at 2:38 PM, David Medinets wrote:
> >> Hopefully I am doing something wrong that can be easily rectified.
> >> I have an hadoop job that is sending well over 200M entries into
> >> accumulo. But every entry is being sent to a single node. The table
> >> was created by the hadoop job.
> >>
> >> How can I get the entries to be spread over several nodes?
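
The S/T rule of thumb in the reply can be sketched as a small calculation. This is only an illustration, not part of the original message: the function names, the 64 GiB table size, the target of 32 tablets, and the "start at 4x and halve" schedule are all assumptions chosen for the example. The only Accumulo-specific name used is the table property table.split.threshold, and the value is printed in plain bytes; check the value format your version accepts before applying it.

```python
def split_threshold(table_size_bytes: int, target_tablets: int) -> int:
    """Return the S/T split threshold in bytes, rounded up."""
    if table_size_bytes <= 0 or target_tablets <= 0:
        raise ValueError("table size and tablet count must be positive")
    # Ceiling division, so a small table never yields a threshold of zero.
    return -(-table_size_bytes // target_tablets)


def threshold_schedule(table_size_bytes: int, target_tablets: int,
                       start_factor: int = 4) -> list:
    """Thresholds to try in order: start high, then halve down to S/T.

    The halving schedule is an assumption for illustration; the reply only
    says to start high and lower the threshold again if necessary.
    """
    final = split_threshold(table_size_bytes, target_tablets)
    steps = []
    factor = start_factor
    while factor > 1:
        steps.append(final * factor)
        factor //= 2
    steps.append(final)
    return steps


if __name__ == "__main__":
    size = 64 * 1024**3   # hypothetical table size S: 64 GiB
    tablets = 32          # hypothetical target tablet count T
    for value in threshold_schedule(size, tablets):
        # Apply one value, let the table split out, count tablets, then
        # move to the next value only if there are still too few tablets.
        print(f"table.split.threshold={value}")
```

Because the estimate is rarely exact, the intermediate values matter more than the final one: stop at the first threshold that produces an acceptable number of tablets.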