From: Keith Turner
To: accumulo-user@incubator.apache.org
Subject: Re: Scanning for rows using columnfamily only
Date: Wed, 2 Nov 2011 14:36:42 -0400
In-Reply-To: <4EB0528C.6090909@digitalreasoning.com>

On Tue, Nov 1, 2011 at 4:11 PM, Keith Massey wrote:
> Thanks for the tips. We tried using one locality group per column family (I
> think there are 20-25). It has definitely sped up queries for all data in a
> single column family. The first batch comes back in about 5 seconds rather
> than 120 seconds without the locality groups. Our data load time doubled
> though from 7 hours to 14 hours. I don't have any evidence at this point
> that it is related to the locality groups. But there were very few
> differences between the 7-hour load and the 14-hour load. Any thoughts about
> whether this could be a side effect of loading data into 25 locality groups?
> Or am I looking in the wrong place?
> Thanks again.
>
> Keith
>

I ran some experiments w/ different numbers of locality groups; the number of
groups had a noticeable effect on minor compaction times. The results are in a
comment on ticket ACCUMULO-112. I suspect the locality group change is behind
the slowdown in ingest.

https://issues.apache.org/jira/browse/ACCUMULO-112
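
If it helps to repeat that kind of experiment, here is a rough sketch (not from the ticket) that spreads a list of column families over a chosen number of locality groups and prints the matching Accumulo shell `setgroups` command. The table name, family names, and `lg<N>` group names are made up for illustration, and the command form (`setgroups -t <table> <group>=<cf>,<cf> ...`) is assumed from the Accumulo shell help, so check it against your version before running it.

```python
def group_families(families, n_groups):
    """Spread column families round-robin over at most n_groups locality groups.

    Group names (lg0, lg1, ...) are hypothetical placeholders.
    """
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    # Never create more groups than families, so no group ends up empty.
    n_groups = min(n_groups, len(families))
    groups = {f"lg{i}": [] for i in range(n_groups)}
    for i, cf in enumerate(families):
        groups[f"lg{i % n_groups}"].append(cf)
    return groups


def setgroups_command(table, groups):
    """Render the shell command text for a group -> column families mapping."""
    assignments = " ".join(
        f"{group}={','.join(cfs)}" for group, cfs in groups.items()
    )
    return f"setgroups -t {table} {assignments}"


if __name__ == "__main__":
    # Hypothetical family names; the thread mentions roughly 20-25 families.
    families = [f"cf{i}" for i in range(1, 7)]
    for n in (1, 3, 6):  # 6 groups here means one group per family
        print(setgroups_command("mytable", group_families(families, n)))
```

Using as many groups as families reproduces the one-group-per-family layout from the quoted message. Smaller values bundle several families per group, which may be worth timing against ingest if minor compactions turn out to be the bottleneck.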