Date: Thu, 4 May 2017 18:50:04 +0000 (UTC)
From: "Ben Manes (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-4626) improve cache hit rate via weak reference map

    [ https://issues.apache.org/jira/browse/ACCUMULO-4626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15997213#comment-15997213 ]

Ben Manes commented on ACCUMULO-4626:
-------------------------------------

While immature, [Apache Mnemonic|https://github.com/apache/incubator-mnemonic] could be really nice one day. It allows for off-heap without the serialization penalty, though if blocks are byte[] already that wouldn't matter much.

> improve cache hit rate via weak reference map
> ---------------------------------------------
>
>                 Key: ACCUMULO-4626
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4626
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: tserver
>            Reporter: Adam Fuchs
>              Labels: performance, stability
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> When a single iterator tree references the same RFile blocks in different branches we sometimes get cache misses for one iterator even though the requested block is held in memory by another iterator.
> This is particularly important when using something like the IntersectingIterator to intersect many deep copies. Instead of evicting blocks completely, moving evicted blocks into a map with WeakReference values can avoid re-reading blocks that are still referenced by another deep-copied source iterator.
> We've seen this in the field for some of Sqrrl's queries against very large tablets. When cache miss rates are high, the total memory usage for these queries can be equal to the size of all the iterator block reads times the number of readahead threads times the number of files times the number of IntersectingIterator children. This might work out to something like:
> {code}
> 16 readahead threads * 200 deep copied children * 99% cache miss rate * 20 files * 252KB per reader = ~16GB of memory
> {code}
> In most cases, evicting to a weak reference value map changes the cache miss rate from very high to very low and has a dramatic effect on total memory usage.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
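
For illustration, the eviction scheme from the issue description might be sketched roughly as below. This is a minimal sketch and not Accumulo's actual block cache code: the class name WeakEvictionCache, its methods, the weakHits counter, and the LinkedHashMap-based LRU are all assumptions made for the example. Blocks evicted from the bounded strong tier are kept behind a WeakReference, so a later lookup still hits as long as some other holder (for example another deep-copied iterator) keeps the block alive.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical two-tier cache: a bounded LRU of strong references backed by weak references. */
final class WeakEvictionCache<K, V> {
    // Second tier: evicted entries, reachable only while someone else still holds the value.
    private final Map<K, WeakReference<V>> evicted = new HashMap<>();
    // First tier: access-ordered LinkedHashMap used as a simple LRU.
    private final LinkedHashMap<K, V> lru;
    private long weakHits = 0;

    WeakEvictionCache(final int maxEntries) {
        this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() <= maxEntries) {
                    return false;
                }
                // Naive cleanup of entries whose values were already collected.
                // A ReferenceQueue would avoid this linear scan.
                evicted.values().removeIf(ref -> ref.get() == null);
                // Demote the evicted block to the weak tier instead of dropping it.
                evicted.put(eldest.getKey(), new WeakReference<>(eldest.getValue()));
                return true;
            }
        };
    }

    synchronized void put(K key, V value) {
        evicted.remove(key);
        lru.put(key, value);
    }

    /** Returns the cached value, or null if it is in neither tier or was garbage collected. */
    synchronized V get(K key) {
        V value = lru.get(key);
        if (value != null) {
            return value;
        }
        WeakReference<V> ref = evicted.remove(key);
        if (ref == null) {
            return null;
        }
        value = ref.get();
        if (value != null) {
            weakHits++;
            lru.put(key, value); // promote back into the strong tier
        }
        return value;
    }

    synchronized long weakHits() {
        return weakHits;
    }
}
```

The sketch uses explicit WeakReference values because java.util.WeakHashMap holds its keys weakly, not its values. Whether a weak-tier lookup succeeds depends on the garbage collector and on who else references the block, so the weak tier can only ever raise the hit rate; it never guarantees a hit.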