accumulo-dev mailing list archives

From Josh Elser
Subject Many locality groups
Date Wed, 18 Sep 2013 15:02:10 GMT
I have a use case in which I'm investigating setting a locality group on
every column family in a table which has very "dense" rows (many columns
appear within the same tablet).

When scanning over a single column, I see a slow-down as one might expect
(filtering out the columns I don't care about). Setting each column into
its own locality group helps speed things up again for that single column
query case.
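
For reference, a minimal sketch of the one-group-per-column-family layout described above. It only builds the group-to-family mapping and renders it as per-table properties; it does not talk to Accumulo. The class name, the "lg_" prefix, and the example family names are hypothetical. The property names (table.group.<name> and table.groups.enabled) are my understanding of how Accumulo stores locality groups, and the sketch assumes plain ASCII family names that need no encoding.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class LocalityGroupPlan {

    // Build one locality group per column family: "lg_<cf>" -> {cf}.
    // The "lg_" naming scheme is an illustrative assumption.
    public static Map<String, Set<String>> onePerFamily(List<String> families) {
        Map<String, Set<String>> groups = new LinkedHashMap<>();
        for (String cf : families) {
            Set<String> members = new TreeSet<>();
            members.add(cf);
            groups.put("lg_" + cf, members);
        }
        return groups;
    }

    // Render the plan as table properties (assumed names, see lead-in):
    //   table.group.<group> = comma-separated column families
    //   table.groups.enabled = comma-separated group names
    public static Map<String, String> toTableProperties(Map<String, Set<String>> groups) {
        Map<String, String> props = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : groups.entrySet()) {
            props.put("table.group." + e.getKey(), String.join(",", e.getValue()));
        }
        props.put("table.groups.enabled", String.join(",", groups.keySet()));
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical column families for a "dense" table.
        Map<String, Set<String>> groups = onePerFamily(Arrays.asList("attr", "geo", "text"));
        toTableProperties(groups).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With a live connection, the same mapping (using Text instead of String for the families) is what TableOperations.setLocalityGroups(tableName, groups) expects. Existing files only pick up the new grouping once they are rewritten by a compaction.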

I'm curious if anyone has any insight into when/if I'm going to start paying
a penalty for having many locality groups. Glancing back over RFile.Reader,
I have to read each LocalityGroupMetadata and its multi-level index (which
shouldn't be bad if I remember Keith's talks) and then I should get log(n)
reads across the locality groups I need to open.

Is the same true for writing data to a table with many locality
groups? Nothing terrible pops out at me looking at the code.

I was planning to write some tests to try and simulate this, but figured I
could poll the community as well to see if anyone has experimented in this
scenario before.


- Josh
