accumulo-notifications mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1770) out of memory error on very long running tablet server
Date Fri, 11 Oct 2013 17:28:43 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792838#comment-13792838 ]

Josh Elser commented on ACCUMULO-1770:
--------------------------------------

I re-ran my tiny, contrived test and definitely see excessive RSS usage. I haven't dug into
it yet; I wanted to post these numbers first.

{code:java}
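      // "c" is an existing Connector; the table "foo" must already exist.
      BatchWriter bw = c.createBatchWriter("foo", new BatchWriterConfig());
      // 2.5M rows x 10 column families x 10 qualifiers = 250M entries, all with empty values.
      for (int i = 0; i < 2500000; i++) {
        Mutation m = new Mutation(Integer.toString(i));
        for (int j = 0; j < 10; j++) {
          for (int k = 0; k < 10; k++) {
            m.put(Integer.toString(j), Integer.toString(k), "");
          }
        }
        bw.addMutation(m);
      }

      // Flush any buffered mutations and release the writer's resources.
      bw.close();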
{code}

I recorded the initial cold memory usage ("Start"). I then started the above code and recorded
the usage again at around 150M entries ("During"). Next, I waited for minor compaction to
finish ("End Pre-MajC"). Finally, I issued a major compaction for the table ("End Post-MajC").
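
The comment does not say how the Virtual and Resident figures were collected. One possible way to take such samples on Linux is to read the VmSize and VmRSS fields from /proc/<pid>/status, which are reported in kB. The sketch below is only an illustration under that assumption; the class name ProcMemorySample and the method fieldKb are hypothetical and are not part of the original test.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: samples virtual and resident size of a process on Linux.
public class ProcMemorySample {

    // Returns the kB value of a field such as "VmRSS" from the lines of
    // /proc/<pid>/status, or -1 if the field is not present.
    public static long fieldKb(List<String> statusLines, String field) {
        String prefix = field + ":";
        for (String line : statusLines) {
            if (line.startsWith(prefix)) {
                // A line looks like "VmRSS:\t  550236 kB"; the first token is the number.
                String[] parts = line.substring(prefix.length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        // Pass the tablet server's pid as the first argument; defaults to this JVM.
        String pid = args.length > 0 ? args[0] : "self";
        List<String> lines =
            Files.readAllLines(Paths.get("/proc", pid, "status"), StandardCharsets.UTF_8);
        System.out.println("Virtual (KB):  " + fieldKb(lines, "VmSize"));
        System.out.println("Resident (KB): " + fieldKb(lines, "VmRSS"));
    }
}
```

Running this once per measurement point (Start, During, End Pre-MajC, End Post-MajC) against the tablet server's pid would yield one row of the table each time.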

||Time||Virtual||Resident||
|Start|26535192|550236|
|During|41551148|15791996|
|End Pre-MajC|42466608|16690456|
|End Post-MajC|40567092|14770068|

Virtual and Resident are in KB. I think I only had one or two minor compactions, with 16G given
to the memory maps. I also grabbed the output of 'pmap -x' at each of the measurement points in
the table above.
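
To put the KB figures from the table above on a more readable scale, the following sketch converts the Resident column to GiB and prints the growth relative to the Start row. The class name MemoryDelta and its methods are hypothetical; only the four numbers come from the table.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: converts the table's KB figures to GiB and shows growth.
public class MemoryDelta {

    // Converts a size in KB to GiB (1 GiB = 1024 * 1024 KB).
    public static double kbToGiB(long kb) {
        return kb / (1024.0 * 1024.0);
    }

    // Growth between two samples, in GiB; negative if usage shrank.
    public static double growthGiB(long beforeKb, long afterKb) {
        return kbToGiB(afterKb - beforeKb);
    }

    public static void main(String[] args) {
        // Resident sizes in KB, copied from the table; insertion order is preserved.
        Map<String, Long> residentKb = new LinkedHashMap<>();
        residentKb.put("Start", 550236L);
        residentKb.put("During", 15791996L);
        residentKb.put("End Pre-MajC", 16690456L);
        residentKb.put("End Post-MajC", 14770068L);

        long startKb = residentKb.get("Start");
        for (Map.Entry<String, Long> e : residentKb.entrySet()) {
            System.out.printf("%-14s %6.2f GiB (%+.2f GiB vs. Start)%n",
                e.getKey(), kbToGiB(e.getValue()), growthGiB(startKb, e.getValue()));
        }
    }
}
```

By this arithmetic the resident size grows from about half a GiB at Start to between 15 and 16 GiB before the major compaction, which is close to the 16G given to the memory maps.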

Perhaps the size of the value isn't the issue? Every value written by this test is empty.

> out of memory error on very long running tablet server
> ------------------------------------------------------
>
>                 Key: ACCUMULO-1770
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1770
>             Project: Accumulo
>          Issue Type: Bug
>          Components: tserver
>            Reporter: Eric Newton
>            Assignee: Eric Newton
>         Attachments: FragmentTest.java, memory-usage.png
>
>
> On a large cluster it was noticed that a few of the tablet servers had been pushed into
> swap.  This didn't affect the performance of the server until it ran out of memory, and the
> process was killed.  The gc reports in the debug log showed the system had plenty of heap
> space for the JVM.  The number of threads in the server was not excessive (dozens).  This
> cluster ingests some large values (megabytes).  The tablet server had been up for a month
> prior to running out of memory.  MALLOC_ARENA_MAX had already been set to 1.
> * Investigate the effect of fragmentation on memory usage for large value inserts.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
