accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-4692) CompactionDriver leaves abandoned metadata scans
Date Fri, 04 Aug 2017 18:49:00 GMT


Christopher Tubbs commented on ACCUMULO-4692:

[~afuchs] do you have a sense of the performance impact this has on a cluster, and whether
that impact is large enough to justify the seemingly significant increase in code complexity
required to handle this edge case?

It seems there are some complexity trade-offs in the REPO changes you suggest (tracking merges,
falling back to a larger scan, etc.). The iterator solution or closing the scanner seems like
a simpler way to support this edge case, and might be more maintainable in the long term.
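
For illustration, the "closing the scanner" option can be sketched in plain Java. This is a minimal toy model under stated assumptions, not the CompactionDriver code: ToyScanner, readUpTo and the row strings are hypothetical names, and the scanner is assumed to hold some server-side resource until close() is called.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class CloseScannerSketch {

    /** Hypothetical stand-in for a scanner that holds a server-side session until closed. */
    static final class ToyScanner implements AutoCloseable, Iterable<String> {
        private final List<String> rows;
        private boolean open = true;

        ToyScanner(List<String> rows) {
            this.rows = rows;
        }

        @Override
        public Iterator<String> iterator() {
            return rows.iterator();
        }

        @Override
        public void close() {
            // In this toy model, closing just flips a flag; a real scanner would release its session.
            open = false;
        }

        boolean isOpen() {
            return open;
        }
    }

    /** Reads rows up to endRow (inclusive), then closes the scanner even though rows remain unread. */
    static ToyScanner readUpTo(List<String> rows, String endRow) {
        ToyScanner scanner = new ToyScanner(rows);
        try (ToyScanner s = scanner) {
            for (String row : s) {
                if (row.compareTo(endRow) > 0) {
                    break; // stop consuming early; try-with-resources still closes the scanner
                }
            }
        }
        return scanner;
    }

    public static void main(String[] args) {
        ToyScanner s = readUpTo(Arrays.asList("a", "b", "c", "d"), "b");
        System.out.println("scanner still open: " + s.isOpen());
    }
}
```

The point of the sketch is only that an explicit close releases the scan regardless of how much of the range was consumed, without any range bookkeeping.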

> CompactionDriver leaves abandoned metadata scans
> ------------------------------------------------
>                 Key: ACCUMULO-4692
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: fate
>            Reporter: Adam Fuchs
> We wrote a tool to kick off tablet compactions in the background while minimizing compaction
> load per-server. The tool uses range compaction on one tablet per call. We're seeing a high
> number of scans on the metadata table (~7,000 on a ~100 node cluster).
> The metadata query in the isReady() method of CompactionDriver that is used to see if
> the compaction has completed uses a range that goes to the end of the metadata entries for
> the given table, but it stops consuming the results of the scanner at the end of the compaction
> range. isReady gets called in a pretty tight loop, especially with hundreds of compactions
> running concurrently. Seems like we should limit the scan to the metadata range associated
> with the compaction so that the scan can get cleaned up quickly.
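
The difference between the two scan ranges described above can be illustrated with a small toy model. This is a sketch under assumptions, not Accumulo code: a TreeMap stands in for the sorted metadata table, the "tableId;endRow" row format is simplified, and all method names are hypothetical. It counts how many rows a scan leaves unread when the reader stops at the end of the compaction range.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

public class MetadataScanSketch {

    /** Toy metadata table: 26 sorted rows "t1;a" .. "t1;z" (simplified, hypothetical row format). */
    static NavigableMap<String, Boolean> toyMetadata() {
        NavigableMap<String, Boolean> rows = new TreeMap<>();
        for (char c = 'a'; c <= 'z'; c++) {
            rows.put("t1;" + c, Boolean.TRUE);
        }
        return rows;
    }

    /** Scan range that runs from startRow to the end of the table's metadata entries. */
    static NavigableMap<String, Boolean> wideRange(NavigableMap<String, Boolean> rows, String startRow) {
        return rows.tailMap(startRow, true);
    }

    /** Scan range limited to the metadata rows covered by the compaction. */
    static NavigableMap<String, Boolean> boundedRange(
            NavigableMap<String, Boolean> rows, String startRow, String endRow) {
        return rows.subMap(startRow, true, endRow, true);
    }

    /** Rows in the scan range that a reader stopping at compactionEndRow never consumes. */
    static int unreadRows(NavigableMap<String, Boolean> scanRange, String compactionEndRow) {
        int unread = 0;
        for (String row : scanRange.keySet()) {
            if (row.compareTo(compactionEndRow) > 0) {
                unread++;
            }
        }
        return unread;
    }

    public static void main(String[] args) {
        NavigableMap<String, Boolean> rows = toyMetadata();
        String start = "t1;c";
        String end = "t1;e";
        System.out.println("wide range, unread rows: " + unreadRows(wideRange(rows, start), end));
        System.out.println("bounded range, unread rows: " + unreadRows(boundedRange(rows, start, end), end));
    }
}
```

In this model the wide range always leaves a tail of unread rows, so the scan never reaches its natural end; the bounded range is fully consumed by the time the reader stops.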

This message was sent by Atlassian JIRA
