Date: Tue, 19 Aug 2014 21:37:19 +0000 (UTC)
From: "Josh Elser (JIRA)"
To: notifications@accumulo.apache.org
Subject: [jira] [Commented] (ACCUMULO-3067) scan performance degrades after compaction

    [ https://issues.apache.org/jira/browse/ACCUMULO-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102888#comment-14102888 ]

Josh Elser commented on ACCUMULO-3067:
--------------------------------------

bq. in your screen shot it looks like you have nearly filled DFS. That usually leads to bad things.

Of the HDFS space being used, Accumulo is using 99.9% of it. This is different from HDFS being full.

> scan performance degrades after compaction
> -------------------------------------------
>
>                 Key: ACCUMULO-3067
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3067
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>         Environment: MacBook Pro 2.6 GHz Intel Core i7, 16 GB RAM, SSD, OS X 10.9.4, single tablet server process, single client process
>            Reporter: Adam Fuchs
>         Attachments: Screen Shot 2014-08-19 at 4.19.37 PM.png, accumulo_query_perf_test.tar.gz
>
>
> I've been running some scan performance tests on 1.6.0, and I'm running into an interesting situation in which query performance starts at a certain level and then degrades by ~15% after an event. The test follows roughly this scenario:
> # Single tablet server instance
> # Load 100M small (~10-byte) key/values into a tablet and let it finish major compacting
> # Disable the garbage collector (this makes the time to _the event_ longer)
> # Restart the tablet server
> # Repeatedly scan from the beginning to the end of the table in a loop
> # Something happens on the tablet server, like one of {idle compaction of the metadata table, forced flush of the metadata table, forced compaction of the metadata table, forced flush of the trace table}
> # Observe that scan rates drop by 15-20%
> # Observe that restarting the scan does not restore performance to its original level; performance only recovers upon restarting the tablet server.
> I've been able to get this not to happen by removing iterators from the iterator tree. It doesn't seem to matter which iterators, but removing a certain number both improves performance (significantly) and eliminates the degradation problem.
> The default iterator tree includes (VersioningIterator, SynchronizedIterator, VisibilityFilter, ColumnQualifierFilter, ColumnFamilySkippingIterator, DeletingIterator, StatsIterator, MultiIterator, (MemoryIterator*, RFile.Reader*)). Narrowing this to (VisibilityFilter, ColumnFamilySkippingIterator, DeletingIterator, StatsIterator, MultiIterator, (MemoryIterator*, RFile.Reader*)) eliminates the weird condition. There are also other combinations that perform much better than the default. I haven't been able to isolate this problem to a single iterator, despite removing each iterator one at a time.
> Anybody know what might be happening here? Best theory so far: the JVM learns that the iterators can be used in a different way after a compaction, and some JVM optimization like JIT compilation, branch prediction, or automatic inlining stops being applied.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
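
As a rough sketch of the scan loop in step 5 of the scenario above, a minimal client against the 1.6 API could look like the following. The instance name, ZooKeeper address, credentials, and table name are placeholders rather than values taken from this report; the actual harness is presumably the attached accumulo_query_perf_test.tar.gz.

{code:java}
import java.util.Map;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class ScanLoop {
  public static void main(String[] args) throws Exception {
    // Placeholder instance name, ZooKeeper host, credentials, and table name.
    Connector conn = new ZooKeeperInstance("test", "localhost")
        .getConnector("root", new PasswordToken("secret"));

    // Repeatedly scan the whole table, reporting entries/sec per pass so a
    // drop after the metadata-table compaction shows up between passes.
    for (int pass = 1; ; pass++) {
      Scanner scanner = conn.createScanner("perftest", Authorizations.EMPTY);
      long count = 0;
      long start = System.currentTimeMillis();
      for (Map.Entry<Key,Value> entry : scanner) {
        count++;
      }
      long elapsed = System.currentTimeMillis() - start;
      System.out.printf("pass %d: %d entries in %d ms (%.0f entries/sec)%n",
          pass, count, elapsed, 1000.0 * count / Math.max(1, elapsed));
    }
  }
}
{code}

Timing each full pass separately should make the 15-20% degradation visible as a step change between passes around the time of the metadata table event.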