From: Jonathan Ellis
Date: Wed, 28 Apr 2010 23:46:16 -0500
Subject: Re: error during snapshot
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org

Interesting. Googling your error turns up
http://stackoverflow.com/questions/1124771/how-to-solve-java-io-ioexception-error12-cannot-allocate-memory-calling-runt

Why not just leave the swap on? It's usually a Good Thing to be able to
page out unused memory, and use the ram for buffer cache instead.

On Wed, Apr 28, 2010 at 9:46 AM, Lee Parker wrote:
> The thing is, that I'm not running close to being out of memory.  The data
> from nodetool info is showing that only about half of the available heap
> space is being used and running free from the command line shows that I have
> plenty of RAM available and some usage of the 1G swap space which is always
> on.
> nodetool info:
> Load             : 73.24 GB
> Generation No    : 1271626230
> Uptime (seconds) : 839414
> Heap Memory (MB) : 2584.36 / 5461.38
> free -m:
>              total       used       free     shared    buffers     cached
> Mem:          7680       7640         39          0          8       2364
> -/+ buffers/cache:       5266       2413
> Swap:         1023        388        635
> Lee Parker
>
> On Wed, Apr 28, 2010 at 9:18 AM, Jonathan Ellis wrote:
>>
>> If you're running so close to the edge of running out of memory that
>> creating a ln process pushes you over the edge, you should fix the
>> broader cause instead of the specific symptom. :)
>>
>> On Tue, Apr 27, 2010 at 10:09 PM, Lee Parker wrote:
>> > So, after reading the thread which Eric posted earlier, I have created a
>> > workaround for the issue.  In my backup script, I add a swapfile with
>> > swapon, tell cassandra to create the snapshots, then remove the swapfile
>> > with swapoff.  Then I continue with the rest of the work the backup
>> > script needs to do in gathering up the snapshots into a tarball and
>> > pushing it to S3.
>> >
>> > Lee Parker
>> >
>> > On Tue, Apr 27, 2010 at 9:01 PM, Lee Parker wrote:
>> >>
>> >> The system is a ubuntu server running 8.04 LTS.  Now, I'm getting the
>> >> problem again this evening even with the addition of the swap space.
>> >>
>> >> Lee Parker
>> >>
>> >> On Tue, Apr 27, 2010 at 1:13 PM, Jonathan Shook wrote:
>> >>>
>> >>> The allocation of memory may have failed depending on the available
>> >>> virtual memory, whether or not the memory would have been subsequently
>> >>> accessed by the process.  Some systems do the work of allocating
>> >>> physical pages only when they are accessed for the first time. I'm not
>> >>> sure if yours is one of them.
>> >>>
>> >>> On Tue, Apr 27, 2010 at 10:45 AM, Lee Parker wrote:
>> >>>>
>> >>>> Adding a swapfile fixed the error, but it doesn't look as though the
>> >>>> process is even using the swap file at all.
>> >>>>
>> >>>> Lee Parker
>> >>>>
>> >>>> On Tue, Apr 27, 2010 at 9:49 AM, Eric Hauser wrote:
>> >>>>>
>> >>>>> Have you read this?
>> >>>>> http://forums.sun.com/thread.jspa?messageID=9734530
>> >>>>> I don't think EC2 instances have any swap.
>> >>>>>
>> >>>>>
>> >>>>> On Tue, Apr 27, 2010 at 10:16 AM, Lee Parker wrote:
>> >>>>>>
>> >>>>>> Can anyone help with this?  It is preventing me from getting
>> >>>>>> backups of our cluster.
>> >>>>>>
>> >>>>>> Lee Parker
>> >>>>>>
>> >>>>>> On Mon, Apr 26, 2010 at 10:02 PM, Lee Parker wrote:
>> >>>>>>>
>> >>>>>>> I was attempting to get a snapshot on our cassandra nodes.  I get
>> >>>>>>> the following error every time I run nodetool ... snapshot.
>> >>>>>>> Exception in thread "main" java.io.IOException: Cannot run program
>> >>>>>>> "ln": java.io.IOException: error=12, Cannot allocate memory
>> >>>>>>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
>> >>>>>>>     at org.apache.cassandra.io.util.FileUtils.createHardLink(FileUtils.java:221)
>> >>>>>>>     at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1060)
>> >>>>>>>     at org.apache.cassandra.db.Table.snapshot(Table.java:256)
>> >>>>>>>     at org.apache.cassandra.service.StorageService.takeAllSnapshot(StorageService.java:1005)
>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>> >>>>>>>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
>> >>>>>>>     at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
>> >>>>>>>     at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
>> >>>>>>>     at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
>> >>>>>>>     at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
>> >>>>>>>     at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
>> >>>>>>>     at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
>> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1426)
>> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
>> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1264)
>> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1359)
>> >>>>>>>     at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>> >>>>>>>     at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
>> >>>>>>>     at sun.rmi.transport.Transport$1.run(Transport.java:159)
>> >>>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>> >>>>>>>     at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
>> >>>>>>>     at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
>> >>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
>> >>>>>>>     at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
>> >>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>> >>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>> >>>>>>>     at java.lang.Thread.run(Thread.java:619)
>> >>>>>>> Caused by: java.io.IOException: java.io.IOException: error=12,
>> >>>>>>> Cannot allocate memory
>> >>>>>>>     at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>> >>>>>>>     at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>> >>>>>>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:452)
>> >>>>>>>     ... 34 more
>> >>>>>>> The nodes are both Amazon EC2 Large instances with 7.5G RAM (6
>> >>>>>>> allocated for Java heap) with two cores and only 70G of data in
>> >>>>>>> cassandra.  They have plenty of available RAM and HD space.  Has
>> >>>>>>> anyone else run into this error?
>> >>>>>>>
>> >>>>>>> Lee Parker
>> >>>>>
>> >>>>
>> >>>
>> >>
>> >
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
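
A note on the trace in this thread: the failing frame is
FileUtils.createHardLink, which starts an external "ln" through
ProcessBuilder. On Linux, starting a child process from a JVM with a
multi-gigabyte heap may be refused with error=12 when the kernel will not
commit address space for the new process, even though free shows RAM
available. Below is a minimal sketch of one way to avoid the child process
entirely, assuming Java 7 or later (newer than the JVM used in this thread):
create the hard link in-process with java.nio.file.Files.createLink. The
names HardLinkSketch, createHardLink and demo are hypothetical and chosen
for illustration; this is not Cassandra's code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not Cassandra's FileUtils: creates a hard link in-process
// through java.nio.file (Java 7+), so no "ln" child process is started.
class HardLinkSketch {

    // Creates `link` as a hard link to `existing`; both must be on the same file system.
    static void createHardLink(Path existing, Path link) throws IOException {
        Files.createLink(link, existing); // argument order: new link first, then target
    }

    // Self-check on temporary files: the link must refer to the same file as the source.
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("hardlink-sketch");
            Path data = dir.resolve("data.db");
            Path snapshot = dir.resolve("snapshot-data.db");
            Files.write(data, "sstable bytes".getBytes(StandardCharsets.UTF_8));

            createHardLink(data, snapshot);
            boolean same = Files.isSameFile(data, snapshot)
                    && Files.size(snapshot) == Files.size(data);

            // Clean up the temporary files and directory.
            Files.delete(snapshot);
            Files.delete(data);
            Files.delete(dir);
            return same;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("hard link created in-process: " + demo());
    }
}
```

Because nothing is forked, this path does not depend on swap being enabled
or on the kernel's overcommit settings. It does require a file system that
supports hard links.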