accumulo-notifications mailing list archives

From "ASF subversion and git services (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-1696) deep copy in the compaction scope iterators can throw off the stats
Date Fri, 06 Sep 2013 22:38:51 GMT


ASF subversion and git services commented on ACCUMULO-1696:

Commit de24f8322f1ef2d4817b335ae90b131f0a7b2c1c in branch refs/heads/master from [~keith_turner]

ACCUMULO-1696 fixed compaction debug counts for deep copies

> deep copy in the compaction scope iterators can throw off the stats
> -------------------------------------------------------------------
>                 Key: ACCUMULO-1696
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.5.0
>            Reporter: Adam Fuchs
>            Priority: Minor
> When application-level iterators deep copy the source iterator in a major compaction,
> the stats can be significantly off. We count two things in a major compaction:
> 1. Entries read. This is done using a counting iterator sitting just above the system
> 2. Entries written. This is done by counting the entries that are written to the RFile.
> Here's an example of what we see in the Accumulo logs:
> {code}
> 2013-09-06 11:53:31,371 [tabletserver.Compactor] DEBUG: Compaction k;row11;row10 20 read | 382,629 written |      3 entries/sec |  5.337 secs
> {code}
> In this case, we're only counting 20 entries read, presumably because the iterators have
> been deep copied and the counting iterator that is being polled does not get a complete view
> of how many entries were read. Instead of 3 entries/sec we should have registered close to
> 72k entries/sec.
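
As a quick check of the figures above, the two rates can be recomputed from the numbers in the log line. This is only an arithmetic sketch: the class and method names are made up for illustration, and it assumes the rate is simply entries divided by elapsed seconds, which the ticket does not state explicitly.

```java
// Hypothetical helper, not Accumulo code: recomputes the rates from the log line.
final class CompactionRate {
    // Assumed definition of the logged rate: entries divided by elapsed seconds.
    static double entriesPerSec(long entries, double seconds) {
        return entries / seconds;
    }

    public static void main(String[] args) {
        double secs = 5.337;
        // Rate based on the 20 entries the counting iterator saw.
        System.out.println("from entries read:    " + entriesPerSec(20, secs));
        // Rate based on the 382,629 entries actually written.
        System.out.println("from entries written: " + entriesPerSec(382_629, secs));
    }
}
```

The first value is about 3.7, consistent with the logged 3 entries/sec if the log truncates; the second is roughly 71,700, which matches the "close to 72k" estimate.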
> To fix this, should we be counting all reads coming from any of the deep copies of the
> source iterators? This could be done by using a CountingIterator that keeps one counter for
> all deep copies. Thread-level counters could be used for lock-free counts in case multiple
> threads are ever used.
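
The shared-counter idea proposed above might look roughly like the following. This is a minimal sketch under stated assumptions, not the committed fix: `CopyableSource`, `ListSource`, and `SharedCountingSource` are hypothetical stand-ins invented for illustration and do not reproduce Accumulo's real iterator interfaces. `java.util.concurrent.atomic.LongAdder` is used as one possible way to get low-contention counting; it stripes updates internally, which approximates the thread-level counters mentioned in the description.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, simplified stand-in for a source iterator that supports deep copies.
interface CopyableSource<T> extends Iterator<T> {
    CopyableSource<T> deepCopy();
}

// Trivial in-memory source used only to drive the example.
final class ListSource<T> implements CopyableSource<T> {
    private final List<T> data;
    private int pos = 0;

    ListSource(List<T> data) { this.data = data; }

    @Override public boolean hasNext() { return pos < data.size(); }

    @Override public T next() {
        if (!hasNext()) throw new NoSuchElementException();
        return data.get(pos++);
    }

    // A deep copy starts again from the beginning with its own position.
    @Override public ListSource<T> deepCopy() { return new ListSource<>(data); }
}

// Counting wrapper whose deep copies all increment the same counter.
final class SharedCountingSource<T> implements CopyableSource<T> {
    private final CopyableSource<T> source;
    private final LongAdder entriesRead; // shared by the original and every copy

    SharedCountingSource(CopyableSource<T> source) { this(source, new LongAdder()); }

    private SharedCountingSource(CopyableSource<T> source, LongAdder shared) {
        this.source = source;
        this.entriesRead = shared;
    }

    @Override public boolean hasNext() { return source.hasNext(); }

    @Override public T next() {
        T value = source.next();
        entriesRead.increment();
        return value;
    }

    // Copy the underlying source, but hand the copy the same counter.
    @Override public SharedCountingSource<T> deepCopy() {
        return new SharedCountingSource<>(source.deepCopy(), entriesRead);
    }

    long getCount() { return entriesRead.sum(); }
}

final class SharedCounterDemo {
    public static void main(String[] args) {
        SharedCountingSource<String> top =
            new SharedCountingSource<>(new ListSource<>(Arrays.asList("a", "b", "c")));
        SharedCountingSource<String> copy = top.deepCopy();

        top.next();                            // one read through the original
        while (copy.hasNext()) copy.next();    // three reads through the deep copy

        System.out.println(top.getCount());    // prints 4
    }
}
```

With a per-instance counter, polling only `top` would report 1 read here; sharing the counter makes reads through any deep copy visible to whoever polls the original.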

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators