accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-3079) improve system iterator performance by collapsing call stack
Date Tue, 02 Sep 2014 18:51:20 GMT


Josh Elser commented on ACCUMULO-3079:

bq. I moved the synchronization to the filter, so the functionality is still there

Right, I was just noting that you came to the same conclusion about being able to move it.

bq. I did play around with removing the SynchronizedIterator altogether in the case where no
client iterators are present

Hrmm. I wonder how that works for compaction time iterators? Mostly a rhetorical question
since you didn't include that change here.

bq. I could subjectively say that the performance difference between not having the Synchronized
iterator at all and moving its functionality into the VisibilityFilter was small enough to
be hidden in the testing noise, but I don't have evidence to objectively state that.

Ok, I didn't necessarily expect you to have specifics for that either. I brought it up mostly
because I was curious. While having the synchronized "barrier" in the iterator stack does
isolate the system iterators from user iterators mucking them up, I think I was really wondering
if it even makes sense to "encourage" users to write multi-threaded iterators. I know I personally
haven't come across any use cases where multiple threads are used effectively by an iterator.
Would it make sense to remove the synchronized iterator (filter) from the default "pipeline"
and wire up something through the client API and/or table configuration that lets users
enable it?

I'm not implying that this should be done for this ticket; it's just a parallel thought that has
some relevance here.
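
To make the opt-in idea concrete, here is a minimal sketch in plain Java. It does not use the Accumulo API: `OptInSyncSketch`, `SynchronizedIteratorWrapper`, `systemStack`, and the `syncEnabled` flag are all hypothetical names standing in for whatever the client API or table configuration would actually expose.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class OptInSyncSketch {

    /** Hypothetical barrier: serializes every call made into the wrapped source. */
    static final class SynchronizedIteratorWrapper<T> implements Iterator<T> {
        private final Iterator<T> source; // assigned once

        SynchronizedIteratorWrapper(Iterator<T> source) {
            this.source = source;
        }

        @Override
        public synchronized boolean hasNext() {
            return source.hasNext();
        }

        @Override
        public synchronized T next() {
            return source.next();
        }
    }

    /**
     * Builds the top of a (toy) system stack. The barrier is only added when
     * the caller asks for it; syncEnabled is an assumed, illustrative switch.
     */
    static <T> Iterator<T> systemStack(Iterator<T> source, boolean syncEnabled) {
        return syncEnabled ? new SynchronizedIteratorWrapper<>(source) : source;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("row1", "row2");

        // Default pipeline: no barrier, so no monitor acquisition per call.
        Iterator<String> plain = systemStack(keys.iterator(), false);

        // Opt-in pipeline for users who really do write multi-threaded iterators.
        Iterator<String> guarded = systemStack(keys.iterator(), true);

        System.out.println(plain.getClass() == guarded.getClass());
    }
}
```

With the flag off, the caller gets the source back unchanged, so single-threaded users pay nothing for the barrier.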

> improve system iterator performance by collapsing call stack
> ------------------------------------------------------------
>                 Key: ACCUMULO-3079
>                 URL:
>             Project: Accumulo
>          Issue Type: Improvement
>            Reporter: Adam Fuchs
>            Assignee: Adam Fuchs
>             Fix For: 1.6.1, 1.7.0
>         Attachments: iterator_performance_20140822_1.patch, iterator_performance_test_harness.tar.gz
> System iterators are at the core of the tightest loops in Accumulo, handling every key/value
> pair that traverses through a scan or a compaction. In many cases, iterators are the current
> performance bottleneck for Accumulo. Every bit that we can improve performance in the iterators
> translates into better performance for Accumulo.
> There are several strategies that can be applied to the current code base to improve
> performance, including:
>  # Inlining calls that are hard for the JVM to inline at runtime
>  # Moving checks for null outside of tight loops when they are invariants within the loop
>  # Eliminating "no-op" iterators at iterator tree construction time
>  # Making frequently used and assigned-once objects final (like iterator sources)
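
Three of the strategies listed above (hoisting an invariant null check, dropping no-op stages at construction time, and making assigned-once sources final) can be illustrated with a small sketch in plain Java. This is not the attached patch and does not use Accumulo classes; `CollapsedStackSketch`, `Stage`, `ListStage`, `FilterStage`, and `build` are hypothetical names, and the sketch assumes the values themselves are never null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class CollapsedStackSketch {

    /** Hypothetical stand-in for one stage of an iterator stack. */
    interface Stage extends Iterator<String> {}

    /** Leaf stage reading from an in-memory list. */
    static final class ListStage implements Stage {
        private final Iterator<String> data; // assigned once, so final

        ListStage(List<String> values) {
            this.data = values.iterator();
        }

        @Override public boolean hasNext() { return data.hasNext(); }
        @Override public String next() { return data.next(); }
    }

    /** Filtering stage; its predicate is known to be non-null by construction. */
    static final class FilterStage implements Stage {
        private final Stage source;             // final: never reassigned
        private final Predicate<String> accept; // final: never reassigned
        private String nextValue;               // null means exhausted

        FilterStage(Stage source, Predicate<String> accept) {
            this.source = source;
            this.accept = accept;
            advance();
        }

        // Tight loop: no null check on 'accept' in here.
        private void advance() {
            nextValue = null;
            while (source.hasNext()) {
                String candidate = source.next();
                if (accept.test(candidate)) {
                    nextValue = candidate;
                    return;
                }
            }
        }

        @Override public boolean hasNext() { return nextValue != null; }

        @Override public String next() {
            if (nextValue == null) {
                throw new NoSuchElementException();
            }
            String result = nextValue;
            advance();
            return result;
        }
    }

    /**
     * The null check happens once, at construction time. A missing predicate
     * means the filter would be a no-op, so the stage is simply not added.
     */
    static Stage build(List<String> values, Predicate<String> acceptOrNull) {
        Stage leaf = new ListStage(values);
        return acceptOrNull == null ? leaf : new FilterStage(leaf, acceptOrNull);
    }

    static List<String> drain(Stage stage) {
        List<String> out = new ArrayList<>();
        while (stage.hasNext()) {
            out.add(stage.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "bb", "c");
        System.out.println(drain(build(keys, null)));                 // [a, bb, c]
        System.out.println(drain(build(keys, k -> k.length() == 1))); // [a, c]
    }
}
```

When no predicate is supplied, the stack is one call shallower per key, which is the kind of saving the ticket title refers to as collapsing the call stack.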

This message was sent by Atlassian JIRA
