accumulo-user mailing list archives

From Keith Turner
Subject Re: Scanning for rows using columnfamily only
Date Wed, 02 Nov 2011 18:36:42 GMT
On Tue, Nov 1, 2011 at 4:11 PM, Keith Massey wrote:
> Thanks for the tips. We tried using one locality group per column family (I
> think there are 20-25). It has definitely sped up queries for all data in a
> single column family. The first batch comes back in about 5 seconds rather
> than 120 seconds without the locality groups. Our data load time doubled
> though from 7 hours to 14 hours. I don't have any evidence at this point
> that it is related to the locality groups. But there were very few
> differences between the 7-hour load and the 14-hour load. Any thoughts about
> whether this could be a side effect of loading data into 25 locality groups?
> Or am I looking in the wrong place?
> Thanks again.
> Keith

I ran some experiments with different numbers of locality groups, and the
number of groups had a noticeable effect on minor compaction times. The
results are in a comment on ticket ACCUMULO-112. I suspect the locality
group change is behind the slowdown in ingest.
