accumulo-notifications mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (ACCUMULO-1696) deep copy in the compaction scope iterators can throw off the stats
Date Fri, 06 Sep 2013 22:40:52 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-1696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner resolved ACCUMULO-1696.
------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.6.0
         Assignee: Keith Turner
    
> deep copy in the compaction scope iterators can throw off the stats
> -------------------------------------------------------------------
>
>                 Key: ACCUMULO-1696
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1696
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>    Affects Versions: 1.5.0
>            Reporter: Adam Fuchs
>            Assignee: Keith Turner
>            Priority: Minor
>             Fix For: 1.6.0
>
>
> When application-level iterators deep copy the source iterator in a major compaction, the stats can be significantly off. We count two things in a major compaction:
> 1. Entries read. This is done using a counting iterator sitting just above the system iterators.
> 2. Entries written. This is done by counting the entries that are written to the RFile.
> Here's an example of what we see in the Accumulo logs:
> {code}
> 2013-09-06 11:53:31,371 [tabletserver.Compactor] DEBUG: Compaction k;row11;row10 20 read | 382,629 written |      3 entries/sec |  5.337 secs
> {code}
> In this case, we're only counting 20 entries read, presumably because the iterators have been deep copied and the counting iterator that is being polled does not see the reads made through the copies. Instead of 3 entries/sec we should have registered close to 72k entries/sec (382,629 written / 5.337 secs is about 71.7k).
> To fix this, should we be counting all reads coming from any of the deep copies of the source iterators? This could be done by using a CountingIterator that keeps one counter for all deep copies. Thread-level counters could be used for lock-free counts in case multiple threads are ever used.
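
The proposal above (one counter shared by all deep copies) could be sketched roughly as follows. This is a minimal illustration, not the change committed for ACCUMULO-1696. The `CopyableSource` interface, `ListSource`, and `SharedCountingIterator` are hypothetical, simplified stand-ins and do not reproduce Accumulo's real iterator API. `LongAdder` (Java 8+) is assumed here as one way to get low-contention counts if several threads ever read.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, simplified stand-in for a source iterator (not Accumulo's real API).
interface CopyableSource {
    boolean hasTop();
    String getTop();
    void next();
    CopyableSource deepCopy();
}

// Hypothetical in-memory source used only to drive the example.
final class ListSource implements CopyableSource {
    private final List<String> entries;
    private int pos = 0;

    ListSource(List<String> entries) { this.entries = entries; }

    public boolean hasTop() { return pos < entries.size(); }
    public String getTop() { return entries.get(pos); }
    public void next() { pos++; }
    // Each copy gets its own position over the same data.
    public CopyableSource deepCopy() { return new ListSource(entries); }
}

// Counting wrapper whose deep copies all increment the same counter.
final class SharedCountingIterator implements CopyableSource {
    private final CopyableSource source;
    private final LongAdder count; // shared by this iterator and every deep copy

    SharedCountingIterator(CopyableSource source) { this(source, new LongAdder()); }

    private SharedCountingIterator(CopyableSource source, LongAdder count) {
        this.source = source;
        this.count = count;
    }

    public boolean hasTop() { return source.hasTop(); }
    public String getTop() { return source.getTop(); }

    // Count one entry read each time the iterator advances.
    public void next() {
        source.next();
        count.increment();
    }

    // Copy the underlying source but hand the copy the same counter.
    public CopyableSource deepCopy() {
        return new SharedCountingIterator(source.deepCopy(), count);
    }

    long getCount() { return count.sum(); }
}

final class SharedCountDemo {
    public static void main(String[] args) {
        SharedCountingIterator top =
            new SharedCountingIterator(new ListSource(Arrays.asList("a", "b", "c")));
        CopyableSource copy = top.deepCopy();

        top.next();                          // one read through the original
        while (copy.hasTop()) copy.next();   // three reads through the deep copy

        System.out.println("entries read: " + top.getCount());
    }
}
```

With a per-instance counter, polling `top` in this example would report only the single read made through the original. Sharing the counter makes the three reads through the copy visible as well.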

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
