accumulo-user mailing list archives

From Jayesh Patel <jpa...@keywcorp.com>
Subject RE: Fwd: why compaction failure on one table brings other tables offline, how to recover
Date Tue, 12 Apr 2016 15:35:49 GMT
Well, the total RAM on the VM is 6GB with no swap space, so the OS and the other 
Accumulo processes have enough.  I meant that 300MB is currently available for 
the tserver process to use, as reported by 'free'.  tserver.sort.buffer.size is 
set to 100MB.  I was able to start it up today; some change in the dynamics, I 
guess.

-----Original Message-----
From: Josh Elser [mailto:josh.elser@gmail.com]
Sent: Tuesday, April 12, 2016 11:11 AM
To: user@accumulo.apache.org
Subject: Re: Fwd: why compaction failure on one table brings other tables 
offline, how to recover

Jayesh Patel wrote:
> Josh, The OOM tserver process was killed by the kernel, it didn't hang
> around.  I tried restarting it manually, but it ran out of memory
> right away and was killed again leaving the tablet offline.  It must
> have a huge "recovery" log to go through.  HDFS
> /accumulo/wal/instance-accumulo+9997/24e08581-a081-4b41-afc5-d75bdda6cf15
> is about 42MB, and machine has about 300MB free and apparently not
> enough for tserver.
>

Ok, cool. If you're that constrained on resources, you can also try reducing 
the property tserver.sort.buffer.size in accumulo-site.xml. It defaults to 
200M; you could try 25M or 50M instead.

This is a buffer size that is used for sorting log edits during the recovery 
process. This might help if you never make it through the recovery process.
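
As a rough sketch, the override in accumulo-site.xml could look something like 
the fragment below. The 50M value is only one of the two values suggested 
above, not a tested recommendation, and the fragment assumes the usual 
Hadoop-style property layout of that file. The tserver would need a restart 
to read the file again.

```xml
<configuration>
  <!-- Buffer used for sorting log edits during recovery.
       Defaults to 200M; 50M here is an illustrative value only. -->
  <property>
    <name>tserver.sort.buffer.size</name>
    <value>50M</value>
  </property>
</configuration>
```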

300MB is a little low in general as far as headroom goes (especially when 
you're already not giving Accumulo enough RAM). Typically, you want to ensure 
that you give the operating system at least 1G of memory for itself.
