From: Jonathan Ellis
Date: Sun, 11 Sep 2011 17:35:39 -0500
Subject: Re: SIGSEGV during compaction?
To: user@cassandra.apache.org

The problem would be because we're trying to access a mappedbytebuffer
that has been unmapped, not because of OS deletion semantics.

On Sun, Sep 11, 2011 at 2:12 PM, Yang wrote:
> unfortunately it appeared again, I confirmed it appeared with the
> -XX:-UseCompressedOops .
>
> so, if it's due to accessing unmapped file, the problem should
> disappear if I use direct random access file, I'll try that.
>
> also if a process A (not only java) has a file mmapped, and another
> process deletes the file, I thought Unix should keep the file open for
> A, no ?? if this is true, then we shouldn't get a SEGV due to forcing
> unmmap ????
>
>
> Thanks
> Yang
> On Thu, Sep 8, 2011 at 12:07 AM, Sylvain Lebresne wrote:
>> Are you using current trunk ? Or 0.8 ?
>>
>> Because if on trunk, a SIGSEGV could also be due to CASSANDRA-2521,
>> if we happen to force the unmapping of a file but tries to access it afterwards
>> (which shouldn't happen but ...).
>>
>> --
>> Sylvain
>>
>> On Thu, Sep 8, 2011 at 7:36 AM, Yang wrote:
>>> hmmmm, all other things remaining the same, I put jna.jar into classpath,
>>> now it successfully completed a compaction without problems
>>>
>>> On Wed, Sep 7, 2011 at 10:06 PM, Yang wrote:
>>>> thanks Jonathan.
>>>>
>>>> I tried openJdk too, same , filed bug to both Oracle and openJdk
>>>>
>>>>
>>>> tried -XX:-UseCompressedOops , same SEGV
>>>>
>>>> Oracle bug site asks "does it appear with -server and -Xint", I tried
>>>> these options, so far no SEGV yet, maybe slower, but haven't measured
>>>> exactly
>>>>
>>>>
>>>>
>>>> On Wed, Sep 7, 2011 at 8:56 PM, Jonathan Ellis wrote:
>>>>> You should report a bug to Oracle.
>>>>>
>>>>> In the meantime you could try turning off compressed oops -- that's
>>>>> been a source of a lot of GC bugs in the past.
>>>>>
>>>>> On Wed, Sep 7, 2011 at 8:22 PM, Yang wrote:
>>>>>> some info in the debug file that JVM exported:
>>>>>>
>>>>>> #
>>>>>> # A fatal error has been detected by the Java Runtime Environment:
>>>>>> #
>>>>>> #  SIGSEGV (0xb) at pc=0x00002aaaab37cbfa, pid=7236, tid=1179806016
>>>>>> #
>>>>>> # JRE version: 6.0_27-b07
>>>>>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.2-b06 mixed mode
>>>>>> linux-amd64 compressed oops)
>>>>>> # Problematic frame:
>>>>>> # J  com.cgm.whisky.filter.WithinLimitIpFrequencyCap.isValid(Lorg/apache/avro/specific/SpecificRecord;)Lcom/cgm/whisky/EventsFilter$ValidityCode;
>>>>>> #
>>>>>> # If you would like to submit a bug report, please visit:
>>>>>> #   http://java.sun.com/webapps/bugreport/crash.jsp
>>>>>> #
>>>>>>
>>>>>> ---------------  T H R E A D  ---------------
>>>>>>
>>>>>> Current thread (0x00002aaab80e2800):  JavaThread "pool-3-thread-8"
>>>>>> [_thread_in_Java, id=7669,
>>>>>> stack(0x0000000046426000,0x0000000046527000)]
>>>>>>
>>>>>> siginfo:si_signo=SIGSEGV: si_errno=0, si_code=1 (SEGV_MAPERR),
>>>>>> si_addr=0x00002aaabc000000
>>>>>>
>>>>>> Registers:
>>>>>> RAX=0x00000007914355e8, RBX=0x000000000000058a,
>>>>>> RCX=0x0000000791461b38, RDX=0x0000000000000000
>>>>>> RSP=0x00000000465259f0, RBP=0x00000000f222b894,
>>>>>> RSI=0x0000000791433f20, RDI=0x00002aaaab37ca60
>>>>>> R8 =0x00000000d0931f61, R9 =0x00000000f2286ab2,
>>>>>> R10=0x0000000000000000, R11=0x00002aaabc000000
>>>>>> R12=0x0000000000000000, R13=0x00000000465259f0,
>>>>>> R14=0x0000000000000002, R15=0x00002aaab80e2800
>>>>>> RIP=0x00002aaaab37cbfa, EFLAGS=0x0000000000010202,
>>>>>> CSGSFS=0x0100000000000033, ERR=0x0000000000000004
>>>>>>  TRAPNO=0x000000000000000e
>>>>>>
>>>>>> Top of Stack: (sp=0x00000000465259f0)
>>>>>> 0x00000000465259f0:   000000068a828dc8 0000000791433f20
>>>>>> 0x0000000046525a00:   000000079145ee60 000005890000058a
>>>>>>
>>>>>>
>>>>>> On Wed, Sep 7, 2011 at 6:21 PM, Yang wrote:
>>>>>>> I started compaction using nodetool,
>>>>>>> then always reproducibly, I get a SEGV in a code that I added to the
>>>>>>> Cassandra code, which simply calls get_slice().
>>>>>>>
>>>>>>> have you seen SEGV associated with compaction? anyone could suggest a
>>>>>>> route on how to debug this?
>>>>>>>
>>>>>>> I filed a bug on sun website, right now the only possible approach I
>>>>>>> can try is to use another JDK
>>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> Yang
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>> http://www.datastax.com
>>>>>
>>>>
>>>
>>
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
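
A small illustration of the distinction drawn at the top of this message, offered only as a sketch: on a POSIX system, deleting a file that is still mapped does not invalidate the mapping, so reads through the MappedByteBuffer keep working. The class and method names (MappedDeleteDemo, readAfterDelete) are hypothetical and not part of Cassandra, and the code uses java.nio.file, which needs Java 7 or later. It deliberately does not reproduce the forced unmapping discussed in the thread, since touching a buffer after that would crash the JVM rather than throw.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedDeleteDemo {

    // Hypothetical helper: writes a non-empty payload to a temp file, maps it,
    // deletes the file, then reads the payload back through the mapping.
    static String readAfterDelete(String payload) throws IOException {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        Path file = Files.createTempFile("mapped-demo", ".bin");
        Files.write(file, bytes);

        MappedByteBuffer buf;
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, bytes.length);
        } // closing the channel does not unmap the buffer

        // Assumption: POSIX unlink semantics. The directory entry goes away,
        // but the mapping keeps the file's pages reachable.
        Files.delete(file);

        byte[] out = new byte[bytes.length];
        buf.get(out); // still valid: the mapping was never torn down
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterDelete("still readable"));
    }
}
```

On Windows the delete step would typically fail while the mapping is live, so the sketch applies to Linux-style systems such as the linux-amd64 host in the crash log.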