accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3918) Different locality groups can compact with different iterator stacks
Date Thu, 25 Jun 2015 18:02:05 GMT


Christopher Tubbs commented on ACCUMULO-3918:

bq. you can log the iterator stack for the file (which would be follow on work)

This sounds interesting for debugging purposes. Is there an existing ticket for this? Also,
it seems you could just do the same thing per locality group, too.

bq. files are apparent to users. It's how users think about the data on disk.

I disagree. Files have nothing to do with the user's data model, and unless they are administering
their own cluster, I don't know many users who even realize there is a relationship between
their data and the files stored on disk. Locality groups, on the other hand, at least have
a loose relationship with the data model, because they are sets of column families. That's
not to say they're not important... especially for performance and administration/debugging,
of course. In any case, in your scenario, you're still going to have multiple current files
for a tablet which have been potentially compacted with different iterator stacks. I don't
see how that's fundamentally different than locality groups being compacted with different
iterator stacks, unless you're only talking about full major compactions. Are you only concerned
about those? (If so, that would go a long way towards my understanding your scenario.)

bq. And telling people to offline their tables is horrible advise. That's a issue we should
strengthen, not tell people to suffer through.

I don't think it's "horrible advise [sic]". It's the best advice possible, given that it's
pretty much the best guarantee we can give. I do agree it's insufficient, since there's not
an online sync option available. I believe it's been proposed before to offer an online solution
for ZK property sync, via the API... maybe an option to clear ZooCache for all the per-table

> Different locality groups can compact with different iterator stacks
> --------------------------------------------------------------------
>                 Key: ACCUMULO-3918
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.6.0
>            Reporter: John Vines
> While looking through the compactor code, I noticed that we load the iterator stack for
> each locality group written and drop it when we're done. This means if a user reconfigures
> iterators while a locality group is being written, the following locality groups will be compacted
> inconsistently with the rest of the file.
> We should really read the stack once and be consistent for the entire file written.
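
As a rough illustration of the issue summary above, the following sketch contrasts reloading the iterator stack for each locality group with reading it once per file. It is a toy model, not Accumulo code: `TableConfig`, `compact_per_group`, `compact_once`, the `after_group` hook, and the iterator names are all hypothetical and exist only to simulate a user reconfiguring iterators partway through a compaction.

```python
from typing import Callable, Dict, List, Optional, Tuple

Stack = Tuple[str, ...]


class TableConfig:
    """Hypothetical stand-in for a table's iterator settings (not an Accumulo API)."""

    def __init__(self, iterators: List[str]) -> None:
        self._iterators = list(iterators)

    def set_iterators(self, iterators: List[str]) -> None:
        self._iterators = list(iterators)

    def load_iterator_stack(self) -> Stack:
        # Return an immutable snapshot of the current setting.
        return tuple(self._iterators)


def compact_per_group(config: TableConfig, groups: List[str],
                      after_group: Optional[Callable[[str], None]] = None) -> Dict[str, Stack]:
    """Reload the stack for every locality group, as the ticket describes."""
    used: Dict[str, Stack] = {}
    for group in groups:
        used[group] = config.load_iterator_stack()
        if after_group:
            after_group(group)  # simulated concurrent reconfiguration
    return used


def compact_once(config: TableConfig, groups: List[str],
                 after_group: Optional[Callable[[str], None]] = None) -> Dict[str, Stack]:
    """Read the stack once and reuse it for every group in the file."""
    stack = config.load_iterator_stack()
    used: Dict[str, Stack] = {}
    for group in groups:
        used[group] = stack
        if after_group:
            after_group(group)  # a config change here no longer affects this file
    return used


def demo() -> Tuple[Dict[str, Stack], Dict[str, Stack]]:
    groups = ["lg1", "lg2", "default"]

    def reconfigure_after_first(config: TableConfig) -> Callable[[str], None]:
        def hook(group: str) -> None:
            if group == "lg1":
                config.set_iterators(["versioning", "age-off"])
        return hook

    cfg_a = TableConfig(["versioning"])
    per_group = compact_per_group(cfg_a, groups, reconfigure_after_first(cfg_a))

    cfg_b = TableConfig(["versioning"])
    once = compact_once(cfg_b, groups, reconfigure_after_first(cfg_b))
    return per_group, once
```

In this model, `demo()` returns two mappings from locality group to the stack used for it. With per-group loading, `lg1` is written with the old stack and the later groups with the new one; with a single read, every group in the file shares the stack that was in effect when the compaction started.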

