Date: Fri, 11 Oct 2013 17:28:43 +0000 (UTC)
From: "Josh Elser (JIRA)"
To: notifications@accumulo.apache.org
Reply-To: jira@apache.org
Subject: [jira] [Commented] (ACCUMULO-1770) out of memory error on very long running tablet server

    [ https://issues.apache.org/jira/browse/ACCUMULO-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792838#comment-13792838 ]

Josh Elser commented on ACCUMULO-1770:
--------------------------------------

I re-ran my tiny, contrived test and definitely see excessive RSS usage. I haven't dug into it yet; I wanted to post these first.

{code:java}
// c is an existing Accumulo Connector; the table "foo" must already exist.
BatchWriter bw = c.createBatchWriter("foo", new BatchWriterConfig());
// 2.5M rows, each with 10 column families x 10 qualifiers and empty values
for (int i = 0; i < 2500000; i++) {
  Mutation m = new Mutation(Integer.toString(i));
  for (int j = 0; j < 10; j++) {
    for (int k = 0; k < 10; k++) {
      m.put(Integer.toString(j), Integer.toString(k), "");
    }
  }
  bw.addMutation(m);
}
bw.close();
{code}

I recorded the initial, cold memory usage ("Start"). I then started the code above and recorded the usage again at around 150M entries ("During"). Then, I waited for minor compaction to finish ("End Pre-MajC"). Finally, I issued a major compaction for the table ("End Post-MajC").

||Time||Virtual||Resident||
|Start|26535192|550236|
|During|41551148|15791996|
|End Pre-MajC|42466608|16690456|
|End Post-MajC|40567092|14770068|

Virtual and Resident are in KB. I think I only had one or two minor compactions with 16G given to the memory maps.

I also grabbed the output of 'pmap -x' for each of the timings in the table above.

Perhaps the size of the value isn't the issue?

> out of memory error on very long running tablet server
> ------------------------------------------------------
>
>                 Key: ACCUMULO-1770
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1770
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>         Attachments: FragmentTest.java, memory-usage.png
>
>
> On a large cluster it was noticed that a few of the tablet servers had been pushed into swap. This didn't affect the performance of the server until it ran out of memory, and the process was killed. The gc reports in the debug log showed the system had plenty of heap space for the JVM. The number of threads in the server was not excessive (dozens). This cluster ingests some large values (megabytes). The tablet server had been up for a month prior to running out of memory. MALLOC_ARENA_MAX had already been set to 1.
> * Investigate the effect of fragmentation on memory usage for large value inserts.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
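
As a rough reading aid for the memory table in the comment above, the resident-size changes between measurements can be computed directly. This is only a sketch of the arithmetic: the class name RssDelta and its helper methods are hypothetical and are not part of the original test or of Accumulo. The four constants are the Resident column values (in KB) copied from the table.

```java
// Hypothetical helper: arithmetic on the Resident (KB) column of the table.
class RssDelta {
    // Resident set sizes in KB, copied from the table in the comment.
    static final long START = 550236L;
    static final long DURING = 15791996L;
    static final long END_PRE_MAJC = 16690456L;
    static final long END_POST_MAJC = 14770068L;

    // Difference between two measurements, in KB.
    static long growthKb(long from, long to) {
        return to - from;
    }

    // Convert KB to GiB (1 GiB = 1024 * 1024 KB).
    static double kbToGiB(long kb) {
        return kb / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long ingestGrowth = growthKb(START, END_PRE_MAJC);
        long afterMajc = growthKb(END_PRE_MAJC, END_POST_MAJC);
        System.out.printf("Start -> End Pre-MajC: %d KB (%.2f GiB)%n",
                ingestGrowth, kbToGiB(ingestGrowth));
        System.out.printf("End Pre-MajC -> End Post-MajC: %d KB (%.2f GiB)%n",
                afterMajc, kbToGiB(afterMajc));
    }
}
```

Run with `java RssDelta.java`. The first line reports how much resident memory grew over ingest and minor compaction; the second is negative, showing how much was returned after the major compaction.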