accumulo-notifications mailing list archives

From "Adam Fuchs (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ACCUMULO-3067) scan performance degrades after compaction
Date Thu, 21 Aug 2014 14:47:11 GMT

     [ https://issues.apache.org/jira/browse/ACCUMULO-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Fuchs updated ACCUMULO-3067:
---------------------------------

    Attachment: jit_log_during_compaction.txt

Attached is a sample from tserver_localhost.out when -XX:+PrintCompilation is enabled. There
are many compilation notifications early on, then a quiet period. At around 192 seconds into
the execution of the tserver I kick off a compaction. This log includes all of the compilation
events around that time. It shows several methods being deoptimized ("made not entrant") and
then some being optimized again. See https://gist.github.com/chrisvest/2932907 for a pretty
good explanation of how to read this log.
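
One quick way to summarize a log like the attached one is to count deoptimization events per method. The sketch below assumes the usual -XX:+PrintCompilation layout, in which the method appears as a token containing "::" and a deoptimized entry carries the text "made not entrant". The class name DeoptCounter and the sample lines are invented for illustration and are not taken from the attachment.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DeoptCounter {
    /** Counts "made not entrant" events per method name, in first-seen order. */
    public static Map<String, Integer> countDeopts(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            if (!line.contains("made not entrant")) {
                continue; // ordinary compilation event, not a deoptimization
            }
            // Assumption: the first token containing "::" is Class::method.
            for (String token : line.trim().split("\\s+")) {
                if (token.contains("::")) {
                    counts.merge(token, 1, Integer::sum);
                    break;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented sample lines in the assumed PrintCompilation layout.
        List<String> sample = Arrays.asList(
            " 192034 4321       4       org.example.Foo::next (42 bytes)   made not entrant",
            " 192040 4400       3       org.example.Foo::next (42 bytes)",
            " 192051 4322       4       org.example.Bar::seek (97 bytes)   made not entrant");
        countDeopts(sample).forEach((method, n) -> System.out.println(n + "  " + method));
    }
}
```

To run it against a real log, read the file into a list of lines (for example with java.nio.file.Files.readAllLines) and pass that list to countDeopts.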

> scan performance degrades after compaction
> ------------------------------------------
>
>                 Key: ACCUMULO-3067
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3067
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>         Environment: Macbook Pro 2.6 GHz Intel Core i7, 16GB RAM, SSD, OSX 10.9.4, single tablet server process, single client process
>            Reporter: Adam Fuchs
>         Attachments: Screen Shot 2014-08-19 at 4.19.37 PM.png, accumulo_query_perf_test.tar.gz, jit_log_during_compaction.txt
>
>
> I've been running some scan performance tests on 1.6.0, and I'm running into an interesting situation in which query performance starts at a certain level and then degrades by ~15% after an event. The test roughly follows this scenario:
>  # Single tablet server instance
>  # Load 100M small (~10-byte) key/values into a tablet and let it finish major compacting
>  # Disable the garbage collector (this makes the time to _the event_ longer)
>  # Restart the tablet server
>  # Repeatedly scan from the beginning to the end of the table in a loop
>  # Something happens on the tablet server, like one of {idle compaction of metadata table, forced flush of metadata table, forced compaction of metadata table, forced flush of trace table}
>  # Observe that scan rates drop by 15-20%
>  # Observe that restarting the scan does not bring performance back to the original level. Performance only recovers after restarting the tablet server.
> I've been able to prevent this by removing iterators from the iterator tree. It doesn't seem to matter which iterators, but removing a certain number both improves performance (significantly) and eliminates the degradation problem. The default iterator tree includes:
>  * SourceSwitchingIterator
>  ** VersioningIterator
>  *** SynchronizedIterator
>  **** VisibilityFilter
>  ***** ColumnQualifierFilter
>  ****** ColumnFamilySkippingIterator
>  ******* DeletingIterator
>  ******** StatsIterator
>  ********* MultiIterator
>  ********** MemoryIterator
>  ********** ProblemReportingIterator
>  *********** HeapIterator
>  ************ RFile.LocalityGroupReader
> We can eliminate the weird condition by narrowing the set of iterators to:
>  * SourceSwitchingIterator
>  ** VisibilityFilter
>  *** ColumnFamilySkippingIterator
>  **** DeletingIterator
>  ***** StatsIterator
>  ****** MultiIterator
>  ******* MemoryIterator
>  ******* ProblemReportingIterator
>  ******** HeapIterator
>  ********* RFile.LocalityGroupReader
> There are other combinations that also perform much better than the default. I haven't been able to isolate this problem to a single iterator, despite removing each iterator one at a time.
> Anybody know what might be happening here? Best theory so far: the JVM learns that iterators can be used in a different way after a compaction, and some JVM optimization like JIT compilation, branch prediction, or automatic inlining stops happening.
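
The theory above can be illustrated with a toy iterator stack. This is only a sketch under stated assumptions: Source, ListSource, Wrapper, and EvenFilter are hypothetical stand-ins, not Accumulo classes, and the code shows the shape of the situation rather than measuring it. Every call inside Wrapper is a virtual call on whatever sits below it. If only one kind of stack has ever run, those call sites have seen few receiver types. Once a differently shaped stack runs through the same Wrapper code, the call sites have seen more types, which is the kind of change that may lead a JIT compiler to discard earlier inlining decisions. Whether that is what happens in the tablet server is not established here.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorStackSketch {
    /** Minimal stand-in for a sorted key/value iterator (hypothetical interface). */
    interface Source {
        boolean hasTop();
        long top();
        void next();
    }

    /** Leaf that replays an in-memory list. */
    static final class ListSource implements Source {
        private final Iterator<Long> it;
        private Long current;
        ListSource(List<Long> data) { it = data.iterator(); advance(); }
        private void advance() { current = it.hasNext() ? it.next() : null; }
        public boolean hasTop() { return current != null; }
        public long top() { return current; }
        public void next() { advance(); }
    }

    /** Pass-through layer: each method is a virtual call on whatever 'source' is. */
    static class Wrapper implements Source {
        protected final Source source;
        Wrapper(Source source) { this.source = source; }
        public boolean hasTop() { return source.hasTop(); }
        public long top() { return source.top(); }
        public void next() { source.next(); }
    }

    /** Layer that skips odd values, standing in for a filtering iterator. */
    static final class EvenFilter extends Wrapper {
        EvenFilter(Source source) { super(source); skip(); }
        private void skip() {
            while (source.hasTop() && source.top() % 2 != 0) source.next();
        }
        @Override public void next() { source.next(); skip(); }
    }

    /** Drains a stack and sums the values it returns; this loop plays the role of a scan. */
    public static long scan(Source top) {
        long sum = 0;
        while (top.hasTop()) { sum += top.top(); top.next(); }
        return sum;
    }

    /** Wraps a leaf in 'depth' pass-through layers. */
    public static Source stack(Source leaf, int depth) {
        Source s = leaf;
        for (int i = 0; i < depth; i++) s = new Wrapper(s);
        return s;
    }

    public static void main(String[] args) {
        List<Long> data = Arrays.asList(1L, 2L, 3L, 4L);
        // First shape: wrappers directly over a plain leaf.
        System.out.println(scan(stack(new ListSource(data), 8)));
        // Second shape: the same Wrapper code now has a different receiver type below it.
        System.out.println(scan(stack(new EvenFilter(new ListSource(data)), 8)));
    }
}
```

To look for an effect, one could run many iterations of the first shape, then a few of the second, then time the first shape again with -XX:+PrintCompilation enabled and watch for "made not entrant" entries on the Wrapper methods.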



--
This message was sent by Atlassian JIRA
(v6.2#6252)
