From: Daniel Hobi
To: users@jackrabbit.apache.org
Subject: Re: JVM hangs during startup (indexing) (2)
Date: Fri, 8 Mar 2013 11:09:00 +0100

> Does it happen also on another machine?

Yes, but only once (out of 5…6 reindexing runs).

> Try latest Oracle JDK to see if it makes any difference

Same effect with the latest Oracle JVM (JRE).

CPU and I/O look OK to me as long as the JVM is running. As soon as the JVM freezes/hangs, CPU and I/O are idle.

root@miraculix:~# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15774
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 9000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 15774
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Thanks for your input so far!
Daniel

On 08.03.2013 at 10:49, wrote:

> It is really weird that the JVM process is frozen and you cannot get a thread
> dump. I never saw such a problem. It looks like a HW or OS problem.
> Does it happen also on another machine? From your log I also see you use
> OpenJDK. Try the latest Oracle JDK to see if it makes any difference.
>
> If you monitor the frozen JVM process, is there any activity (CPU/IO)? Is there
> any OS limit on process memory size? Check ulimit settings.
> (ulimit -a)
>
> Marek
>
>
> "Hi Marek
>
> I use jstack -F to get thread dumps.
> - Except for one occasion, the jstack output is the only thing I get (= the
>   first dump I provided).
> - Once (the JVM was not completely frozen, the indexing just made no further
>   progress) jstack also wrote a dump to our log file (= second dump).
>
> Another result after killing the JVM process with "kill -4 " (kill -3
> does not work due to the frozen JVM):
> http://pastebin.com/qc2bUkbk
>
> Daniel
>
> On 08.03.2013 at 10:16, wrote:
>
>> I checked the 2 thread dumps you provided, so I can only comment on what I
>> saw there. I did not see any deadlock. The only problem I saw was in the
>> main thread, where it was waiting for a connection.
>>
>> "main" prio=10 tid=0x0000000040feb000 nid=0x4fb0 in Object.wait() [0x00007fefd3812000]
>>    java.lang.Thread.State: WAITING (on object monitor)
>>     at java.lang.Object.$$YJP$$wait(Native Method)
>>     at java.lang.Object.wait(Object.java)
>>     at java.lang.Object.wait(Object.java:485)
>>     at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1104)
>>     - locked <0x00000000d9b90c40> (a org.apache.commons.pool.impl.GenericObjectPool$Latch)
>>     at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
>>     at org.apache.commons.dbcp.BasicDataSource.getConnection(BasicDataSource.java:1044)
>>     at org.apache.jackrabbit.core.util.db.ConnectionHelper.getConnection(ConnectionHelper.java:439)
>>
>> You should be able to get a thread dump at any time unless the JVM process
>> is completely frozen, IMO. Do you use jstack to get thread dumps?
>>
>> Marek
>>
>>
>> ---------- Original message ----------
>> From: Daniel Hobi
>> Date: 8. 3. 2013
>> Subject: Re: JVM hangs during startup (indexing) (2)
>>
>> "Hi Marek
>>
>> Thanks for your suggestions!
>>
>> Unfortunately, changing the JVM GC options so that the JVM will no longer
>> hang worked just ONCE.
>> Therefore I was not able to get a "nice" stack trace anymore :-(
>>
>> Connection pool:
>> We had set the max connection pool size to 5. As you suggested, I set it to
>> 15. Same effect. Even 50 or 100 did not help.
>> Interestingly, postgres shows just 3 open connections for jackrabbit. Is
>> this suspicious behavior, or does it simply not need more connections and
>> is therefore not opening new ones?
>>
>> Enabling debug and setting login and socket timeouts on the
>> postgres-jdbc-driver did not help either.
>> I even changed the pool implementation from dbcp to c3p0 in
>> jackrabbit-core-2.6.0 to be "sure" it is not a pool bug.
>> Further, we are not able to reproduce the hang with:
>> - Slower machine, postgres 8.3
>> - Very fast machine, postgres 8.4 (only hung 1 of 5 times so far)
>> - Faster machine, mysql
>>
>> This issue is slowly driving me nuts ;-)
>>
>> Thanks!
>> Daniel
>>
>>
>> On 07.03.2013 at 16:24, mslama@email.cz wrote:
>>
>>> We also randomly had a problem with unreleased connections in the JBoss
>>> connection pool and Postgres. It has not occurred for a long time now.
>>> In any case, you can check the number/list of open connections in psql.
>>> In our case it was 97 open connections (by default postgres allows 100
>>> open connections and 3 are reserved for admin). It happened even though
>>> we had a much smaller max connection pool size. As it happened randomly,
>>> we did not find any solution. We tried changing the connection pool size
>>> and it does not happen any more. Still no idea what the cause was. In any
>>> case, using psql you can easily find out whether this is also your problem.
>>>
>>> Query is for 9.x: SELECT * FROM pg_stat_activity;
>>> Marek
>>>
>>> --
>>> Marek Slama
>>> mslama@email.cz
>>>
>>>
>>> ---------- Original message ----------
>>> From: mslama@email.cz
>>> Date: 7. 3. 2013
>>> Subject: Re: JVM hangs during startup (indexing) (2)
>>>
>>> "Quick look: It hangs in the connection pool. If the connection pool
>>> cannot return a free connection to the caller, it waits forever by default.
>>> How big is your connection pool? (Maximum size? It would be good to patch
>>> the connection pool to see who uses connections, how many are used, and
>>> why used connections are not released. In our config we use a max
>>> connection pool size of 15 and it is enough.) So the first fast fix would
>>> be to increase the max connection pool size, but if there is a problem
>>> with unreleased connections it will not help for long.
>>>
>>> Good news is it is not a problem with memory or gc (at least IMO).
>>>
>>> Marek
>>>
>>> --
>>> Marek Slama
>>> mslama@email.cz
>>>
>>>
>>> ---------- Original message ----------
>>> From: Daniel Hobi
>>> Date: 7. 3. 2013
>>> Subject: JVM hangs during startup (indexing) (2)
>>>
>>> "Hi there
>>>
>>> Some updates on our issue started here:
>>>
>>> http://mail-archives.apache.org/mod_mbox/jackrabbit-users/201301.mbox/%3C20130115135117.186960%40gmx.net%3E
>>>
>>> We're still facing the JVM freeze (possibly during minor gcs) even though...
>>>
>>> ...we updated Jackrabbit from 2.2.13 to 2.6.0.
>>>
>>> ...we updated the Sun JVM to the latest version (1.6.39) and even tried
>>> OpenJDK.
>>>
>>> ...we increased the JVM heap size (up to 1.5gb).
>>>
>>> ...we upgraded PostgreSQL to the latest version (8.4.x / 9.2.x) available.
>>>
>>> But there is good news too:
>>>
>>> - I've just played around with the JVM's garbage collection options.
>>> Even if the index process still hangs, the JVM will no longer block.
>>> I was able to get an interesting thread dump:
>>> http://pastebin.com/uSrXpej8
>>>
>>> - With the additional JVM param -XX:MaxNewSize=10m we can make the
>>> hang occur faster (less than 100 minor gcs)
>>>
>>> I would be very grateful if someone could have a look at this. We're
>>> running (slowly but surely) out of ideas :-(
>>>
>>> Thanks!
>>>
>>> Daniel""""
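
A note on the pool wait discussed in this thread: the "main" thread in the quoted dump is parked in GenericObjectPool.borrowObject, which fits Marek's remark that the pool waits forever by default when no connection is free. The following is only a minimal sketch of that behaviour, using a plain java.util.concurrent.Semaphore as a stand-in for the pool. BoundedPool, borrow, giveBack and idle are hypothetical names invented for illustration and are not DBCP or Jackrabbit API. It shows the difference between an unbounded wait (negative value) and a bounded wait that lets the caller fail fast and log the situation instead of hanging. In commons-dbcp 1.x the comparable setting is, to my knowledge, the maxWait property of BasicDataSource; how it would be passed through the Jackrabbit configuration is not covered in this thread.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy stand-in for a bounded connection pool (hypothetical, not DBCP code). */
final class BoundedPool {
    private final Semaphore slots;

    BoundedPool(int maxActive) {
        // One permit per connection the pool is allowed to hand out.
        this.slots = new Semaphore(maxActive, true);
    }

    /**
     * Tries to take a slot. A negative maxWaitMillis waits indefinitely,
     * which is the situation seen in the thread dump. Returns false if no
     * slot became free in time or the waiting thread was interrupted.
     */
    boolean borrow(long maxWaitMillis) {
        try {
            if (maxWaitMillis < 0) {
                slots.acquire();
                return true;
            }
            return slots.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return false;
        }
    }

    /** Returns a slot to the pool; a caller that forgets this leaks it. */
    void giveBack() {
        slots.release();
    }

    /** Number of slots currently free. */
    int idle() {
        return slots.availablePermits();
    }

    public static void main(String[] args) {
        BoundedPool pool = new BoundedPool(1);
        System.out.println(pool.borrow(100)); // true: the single slot is free
        System.out.println(pool.borrow(100)); // false: exhausted, gives up after 100 ms
        pool.giveBack();
        System.out.println(pool.borrow(100)); // true: slot was released
    }
}
```

With a bounded wait, a leaked connection shows up as a failed borrow that can be logged together with the current pool usage, rather than as a thread that never returns.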