Return-Path: X-Original-To: apmail-incubator-jspwiki-dev-archive@minotaur.apache.org Delivered-To: apmail-incubator-jspwiki-dev-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E09CFD7CA for ; Tue, 26 Jun 2012 22:02:22 +0000 (UTC) Received: (qmail 30518 invoked by uid 500); 26 Jun 2012 22:02:22 -0000 Delivered-To: apmail-incubator-jspwiki-dev-archive@incubator.apache.org Received: (qmail 30481 invoked by uid 500); 26 Jun 2012 22:02:22 -0000 Mailing-List: contact jspwiki-dev-help@incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: jspwiki-dev@incubator.apache.org Delivered-To: mailing list jspwiki-dev@incubator.apache.org Received: (qmail 30472 invoked by uid 99); 26 Jun 2012 22:02:22 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 26 Jun 2012 22:02:22 +0000 X-ASF-Spam-Status: No, hits=2.5 required=5.0 tests=FREEMAIL_REPLY,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of juanpablo.santos@gmail.com designates 209.85.214.175 as permitted sender) Received: from [209.85.214.175] (HELO mail-ob0-f175.google.com) (209.85.214.175) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 26 Jun 2012 22:02:15 +0000 Received: by obcva7 with SMTP id va7so497163obc.6 for ; Tue, 26 Jun 2012 15:01:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=LrYayu4uK+dxq+Ykrbk8wwQdkmsW58VxhacgIZ4w+Po=; b=YxtQe5/ABSFHRCTuFpk/E001vBxaHzodqPU97yues7yWzOd83EFoke1GpF4+yhb2Me nK16r9/1V6MLT9BLA8e9vg++YU3lW7cFVfzn0VOJpxGpL8LtJXIAO0ky0uN4Q0XOgHD3 r1hkoxf7qqhvmYik71MvqIYv9pEsgJK4KL+xw3g1aowGQdUlx4lvjGTD1csuMDP8+xSh UPxpMg1XCiIiDSe/lAbr7kNU/RH7cDyj1XUlmWpznsS9DFlM2WoWxEB4do77yK2GoOWW rDZp4HpBlAO7TVb0jmnhKKYz86yaSdUUMPUp4BiJV0yRizcXdi2f7mG0GvO645B7a77P APCg== MIME-Version: 1.0 Received: by 10.182.72.42 with SMTP id a10mr17777710obv.22.1340748114498; Tue, 26 Jun 2012 15:01:54 -0700 (PDT) Received: by 10.76.99.239 with HTTP; Tue, 26 Jun 2012 15:01:54 -0700 (PDT) In-Reply-To: References: <17150735.64.1340310410178.JavaMail.florian@ultron> Date: Wed, 27 Jun 2012 00:01:54 +0200 Message-ID: Subject: Re: 3 out of 6 tests from SearchManagerTest failing if run all together From: =?UTF-8?Q?Juan_Pablo_Santos_Rodr=C3=ADguez?= To: jspwiki-dev@incubator.apache.org Content-Type: multipart/alternative; boundary=f46d04462ff01915ad04c3673f20 --f46d04462ff01915ad04c3673f20 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Hi, today I was able to run the build at work on a RedHat x64 + JDK6 (1.6.0_22), 64 bits and the build went ok. I'll try in the next days with a 32bit JDK to see if I can reproduce the exception, although I believe I won't have any problems.. I've been digging a bit through lucene src and found that it behaves here and there differently depending on 32 or 64bit, OS, and JDK -don't know why I'm having the exceptions, and being the only one with them :-? Most probably something with my environment, as I get the exceptions with a clean build. regards, juan pablo On Fri, Jun 22, 2012 at 1:38 AM, Juan Pablo Santos Rodr=C3=ADguez < juanpablo.santos@gmail.com> wrote: > Hallo, > > @Harry: I've been comparing bot setInfoStream outputs. They're basically > the same output, only a couple of differences > - when running all 6 tests I only get up to 15 IW, instead of 18 > - the exceptions either when writing the segments file or when closing th= e > index (and therefore committing changes) > IW 8 [Fri Jun 22 00:36:22 CEST 2012; JSPWiki Lucene Indexer]: commit: > wrote segments file "segments_1" > IFD [Fri Jun 22 00:36:22 CEST 2012; JSPWiki Lucene Indexer]: now > checkpoint "segments_1" [0 segments ; isCommit =3D true] > IFD [Fri Jun 22 00:36:22 CEST 2012; JSPWiki Lucene Indexer]: delete > pending file _0.tis > IFD [Fri Jun 22 00:36:22 CEST 2012; JSPWiki Lucene Indexer]: delete > "_0.tis" > IFD [Fri Jun 22 00:36:22 CEST 2012; JSPWiki Lucene Indexer]: unable to > remove file "_0.tis": java.io.IOException: Cannot delete > E:\tmp\testworkdir\lucene\_0.tis; Will re-try later. > IFD [Fri Jun 22 00:36:22 CEST 2012; JSPWiki Lucene Indexer]: delete > pending file _0.nrm > IFD [Fri Jun 22 00:36:22 CEST 2012; JSPWiki Lucene Indexer]: delete > "_0.nrm" > IFD [Fri Jun 22 00:36:22 CEST 2012; JSPWiki Lucene Indexer]: unable to > remove file "_0.nrm": java.io.IOException: Cannot delete > E:\tmp\testworkdir\lucene\_0.nrm; Will re-try later. > > My environment: Win 7 x64, with or without cygwin, with JDK 1.6.0_19 firs= t > and 1.7.0_3 secondly, 64 bit edition of both JDKs. I can try to run the a= nt > tests at work (JDK 1.6, 32-bit) but most probably not until next Monday / > Tuesday. Also, I'll try to run with newer JDK6/7 versions but again, not > until next Monday/Tuesday. > > @Florian: it seems to me that the problem is caused due to a concurrency > issue, as the index is updated in a separate thread, so it would be > difficult to serialize its execution. Also, being a concurrency issue mak= es > it hard to reproduce in a unit test. > > The singleton approach would be feasible too and most probably would also > get rid off the (my) exceptions problem. I preferred the re-schedule > approach because if an exception occurs while doing the index operation, = we > are now logging it and then acting as if nothing had happened, with the > page not being indexed; re-scheduling the index operation simply places t= he > WikiPage back in the index queue (as the indexing hasn't been performed) = so > the index operation can be retried later on. > > > rgrds, > juan pablo > > > > On Thu, Jun 21, 2012 at 10:28 PM, Florian Holeczek w= rote: > >> Hi Juan Pablo, >> >> I'd rather not fix working code only because some tests are not working >> due to (test-)architectural issues. >> Not sure if I understood the problem completely though. >> Wouldn't the right way be serializing the tests or using a singleton for >> the Lucene index? >> >> Regards >> Florian >> >> >> ----- Urspr=C3=BCngliche Mail ----- >> Von: "Juan Pablo Santos Rodr=C3=ADguez" >> An: jspwiki-dev@incubator.apache.org >> Gesendet: Donnerstag, 21. Juni 2012 18:55:35 >> Betreff: Re: 3 out of 6 tests from SearchManagerTest failing if run all >> together >> >> Ok, so finally, I think I got it. >> >> Updates to lucene Index are done in a separate thread, via the >> LuceneSearchProvider.LuceneUpdater inner class, which is executed on a >> different thread. This may lead to situations where there are more than >> one >> IndexWriter instances working on the same directory, provoking the >> exception noted on the first email. >> >> The proposed fix is relatively easy, on >> LuceneSearchProvider.luceneIndexPage( WikiPage page, String text, >> IndexWriter writer ) method, if an update index operation fails, simply >> re-schedule the index operation: >> >> protected synchronized void updateLuceneIndex( WikiPage page, String >> text ) >> { >> [...] >> try >> { >> [...] >> luceneIndexPage( page, text, writer ); >> } >> catch ( IOException e ) >> { >> log.error("Unable to update page '" + page.getName() + "' fro= m >> Lucene index", e); >> reindexPage( page ); >> } >> catch( Exception e ) >> { >> log.error("Unexpected Lucene exception - please check >> configuration!",e); >> reindexPage( page ); >> } >> finally >> [...] >> >> This approach could cause some sort of infinite loop if there's somethin= g >> severe happening (i.e.: disk full, no permission to write on the work di= r, >> etc.), but I think that those kind of situations would be noted througho= ut >> the whole application, so there would be little to do on those cases >> anyway. >> >> WDYT about the suggested change? >> >> >> br, >> juan pablo >> >> >> On Thu, Jun 21, 2012 at 1:00 AM, Juan Pablo Santos Rodr=C3=ADguez < >> juanpablo.santos@gmail.com> wrote: >> >> > Hello Harry, >> > >> > I've increased the time up to 100s and still the same tests failing / >> > exceptions. I think it has to be something related to index updating, = as >> > the IOExceptions noted by enabling writer.setInfoStream( System.out ); >> > occur every time a page is updated. >> > >> > Also, the exceptions are caught always at >> > LuceneSearchProvider.LuceneUpdater threads so maybe they are trying to >> open >> > an IndexWriter while the JUnit test has his own IndexWriter opened. >> Anyway, >> > I'll continue doing more tests to see if I can isolate the cause of th= e >> > exception. >> > >> > >> > regards, >> > juan pablo >> > >> > >> > >> > On Wed, Jun 20, 2012 at 6:21 AM, Harry Metske > >wrote: >> > >> >> Wat happens of you increase the waittime specified in the >> >> SearchManagerTest? >> >> >> >> I think I had these ones when this time was too short. >> >> >> >> regards, >> >> Harry >> >> Op 20 jun. 2012 00:45 schreef "Juan Pablo Santos Rodr=C3=ADguez" < >> >> juanpablo.santos@gmail.com> het volgende: >> >> >> >> > Hi, >> >> > >> >> > I'm stuck running the tests from SearchManagerTest. If I run them >> >> > individually, then all is OK. >> >> > >> >> > However, if I run them all together (via Ant task or via Eclipse), >> then >> >> the >> >> > following tests fail: testSimplesearch3, testTitleSearch, >> >> testTitleSearch2, >> >> > at the assertEquals() regarding res.size() >> >> > >> >> > jspwikitests.log yields several exceptions like this one: >> >> > >> >> > java.io.IOException: Cannot overwrite: >> E:\tmp\testworkdir\lucene\_0.fdt >> >> > at >> >> > >> org.apache.lucene.store.FSDirectory.ensureCanWrite(FSDirectory.java:316) >> >> > at >> >> > >> org.apache.lucene.store.FSDirectory.createOutput(FSDirectory.java:305) >> >> > at >> org.apache.lucene.index.FieldsWriter.(FieldsWriter.java:83) >> >> > at >> >> > >> >> > >> >> >> org.apache.lucene.index.StoredFieldsWriter.initFieldsWriter(StoredFields= Writer.java:64) >> >> > at >> >> > >> >> > >> >> >> org.apache.lucene.index.StoredFieldsWriter.finishDocument(StoredFieldsWr= iter.java:107) >> >> > at >> >> > >> >> > >> >> >> org.apache.lucene.index.StoredFieldsWriter$PerDoc.finish(StoredFieldsWri= ter.java:151) >> >> > at >> >> > >> >> > >> >> >> org.apache.lucene.index.DocumentsWriter$WaitQueue.writeDocument(Document= sWriter.java:1404) >> >> > at >> >> > >> >> > >> >> >> org.apache.lucene.index.DocumentsWriter$WaitQueue.add(DocumentsWriter.ja= va:1424) >> >> > at >> >> > >> >> > >> >> >> org.apache.lucene.index.DocumentsWriter.finishDocument(DocumentsWriter.j= ava:1043) >> >> > at >> >> > >> >> > >> >> >> org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.j= ava:772) >> >> > at >> >> > >> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2060) >> >> > at >> >> > >> org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:2034) >> >> > at >> >> > >> >> > >> >> >> org.apache.wiki.search.LuceneSearchProvider.luceneIndexPage(LuceneSearch= Provider.java:485) >> >> > at >> >> > >> >> > >> >> >> org.apache.wiki.search.LuceneSearchProvider.updateLuceneIndex(LuceneSear= chProvider.java:377) >> >> > [...] >> >> > >> >> > Also, if I place writer.setInfoStream( System.out ); just after the >> >> writer >> >> > is instantiated, I get several messages similar to this one: >> >> > IFD [Wed Jun 20 00:20:07 CEST 2012; JSPWiki Lucene Indexer]: unable >> to >> >> > remove file "_0.fdt": java.io.IOException: Cannot delete >> >> > E:\tmp\testworkdir\lucene\_0.fdt; Will re-try later. >> >> > >> >> > Done a little googling and found that may be related to opening an >> >> > IndexWriter on the index while there's another one open [1] or >> removing >> >> > files while the index is open at that directory [2]. I'll continue >> >> looking >> >> > at this tomorrow, but I was wondering if anyone else is experiencin= g >> >> this >> >> > problem too. >> >> > >> >> > >> >> > br, >> >> > juan pablo >> >> > >> >> > [1]: http://www.gossamer-threads.com/lists/lucene/java-user/92262 >> >> > [2]: >> http://www.gossamer-threads.com/lists/lucene/java-user/62201#62201 >> >> > >> >> >> > >> > >> > > --f46d04462ff01915ad04c3673f20--