accumulo-dev mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: Many locality groups
Date Wed, 18 Sep 2013 15:35:26 GMT
I ran some tests before and after partitioning tablet memory in
ACCUMULO-112.  I commented on the performance numbers I saw.  I checked in
the code I used to test.

test/src/main/java/org/apache/accumulo/test/IMMLGBenchmark.java

Looking back at the test, one thing I did not time was reading all of the
locality groups in a scan.


On Wed, Sep 18, 2013 at 11:02 AM, Josh Elser <josh.elser@gmail.com> wrote:

> I have a use case in which I'm investigating setting a locality group on
> every column family in a table which has very "dense" rows (many columns
> appear within the same tablet).
>
> When scanning over a single column, I see a slow-down as one might expect
> (filtering out the columns I don't care about). Setting each column into
> its own locality group helps speed things up again for that single column
> query case.
>
> I'm curious if anyone has any insight into when/if I'm going to start paying
> a penalty for having many locality groups. Glancing back over RFile.Reader,
> I have to read each LocalityGroupMetadata and its multi-level index (which
> shouldn't be bad if I remember Keith's talks) and then I should get log(n)
> reads across the locality groups I need to open.
>
> Is the same true for writing data to a table with many locality
> groups? Nothing terrible pops out at me looking at the code.
>
> I was planning to write some tests to try and simulate this, but figured I
> can poll the community as well to see if anyone has experimented in this
> scenario before.
>
> Thanks!
>
> - Josh
>
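
For anyone wanting to reproduce the setup described in the quoted message (one locality group per column family), here is a minimal sketch that builds such a mapping and renders it as an Accumulo shell `setgroups` command. The class name `LocalityGroupPlan`, the `lg_` group-name prefix, and the table and column family names are hypothetical and chosen only for illustration. The sketch does not connect to Accumulo; it only produces the command text.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LocalityGroupPlan {

    // Put every column family into its own locality group.
    // The "lg_" prefix is an illustrative naming choice, not an Accumulo convention.
    public static Map<String, List<String>> onePerFamily(List<String> families) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String cf : families) {
            groups.put("lg_" + cf, Arrays.asList(cf));
        }
        return groups;
    }

    // Render the mapping as a shell command of the form:
    //   setgroups <group>=<cf>{,<cf>} ... -t <table>
    public static String toSetgroupsCommand(String table, Map<String, List<String>> groups) {
        StringBuilder sb = new StringBuilder("setgroups");
        for (Map.Entry<String, List<String>> e : groups.entrySet()) {
            sb.append(' ').append(e.getKey()).append('=')
              .append(String.join(",", e.getValue()));
        }
        return sb.append(" -t ").append(table).toString();
    }

    public static void main(String[] args) {
        // Hypothetical column families for a "dense" row layout.
        List<String> families = Arrays.asList("name", "age", "address");
        System.out.println(toSetgroupsCommand("mytable", onePerFamily(families)));
    }
}
```

Locality group settings apply to files written after the change, so a compaction of the table is typically needed before existing data is reorganized into the new groups.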
