incubator-accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: Scanning for rows using columnfamily only
Date Wed, 02 Nov 2011 18:36:42 GMT
On Tue, Nov 1, 2011 at 4:11 PM, Keith Massey
<keith.massey@digitalreasoning.com> wrote:
> Thanks for the tips. We tried using one locality group per column family (I
> think there are 20-25). It has definitely sped up queries for all data in a
> single column family. The first batch comes back in about 5 seconds rather
> than 120 seconds without the locality groups. Our data load time doubled
> though from 7 hours to 14 hours. I don't have any evidence at this point
> that it is related to the locality groups. But there were very few
> differences between the 7-hour load and the 14-hour load. Any thoughts about
> whether this could be a side effect of loading data into 25 locality groups?
> Or am I looking in the wrong place?
> Thanks again.
>
> Keith
>

I ran some experiments with different numbers of locality groups, and the
number of groups had a noticeable effect on minor compaction times. The
results are in a comment on ticket ACCUMULO-112. I suspect the locality
group change is behind the slowdown in ingest.

https://issues.apache.org/jira/browse/ACCUMULO-112
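
For anyone trying to reproduce the setup discussed above, here is a rough sketch of what "one locality group per column family" amounts to. It only builds the group-to-family mapping and formats it as an Accumulo shell `setgroups` command; it does not talk to a cluster. The table name `mytable`, the family names, the `lg_` prefix, and the class and method names are all hypothetical, and the `setgroups` argument layout is written from memory, so check it against the shell's built-in help before relying on it.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

// Hypothetical helper, not part of Accumulo: plans locality groups in memory.
final class LocalityGroupPlan {

    // Build one locality group per column family, preserving input order.
    // The "lg_" group-name prefix is an arbitrary choice for this sketch.
    static Map<String, Set<String>> onePerFamily(List<String> families) {
        Map<String, Set<String>> groups = new LinkedHashMap<>();
        for (String family : families) {
            groups.put("lg_" + family, Collections.singleton(family));
        }
        return groups;
    }

    // Format the plan as a shell command of the assumed form:
    //   setgroups -t <table> <group>=<cf>[,<cf>...] ...
    static String toSetGroupsCommand(String table, Map<String, Set<String>> groups) {
        StringJoiner cmd = new StringJoiner(" ");
        cmd.add("setgroups").add("-t").add(table);
        for (Map.Entry<String, Set<String>> entry : groups.entrySet()) {
            cmd.add(entry.getKey() + "=" + String.join(",", entry.getValue()));
        }
        return cmd.toString();
    }

    public static void main(String[] args) {
        // Hypothetical column families; the thread mentions 20-25 in practice.
        List<String> families = Arrays.asList("name", "geo", "date");
        Map<String, Set<String>> groups = onePerFamily(families);
        System.out.println(toSetGroupsCommand("mytable", groups));
    }
}
```

Because the group count appears to affect minor compaction time, the same map could instead assign several column families to each group, which trades some scan speed on a single family for fewer groups to write during ingest.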
