From: "Michael McCandless (JIRA)"
To: java-dev@lucene.apache.org
Date: Wed, 13 Sep 2006 12:27:28 -0700 (PDT)
Subject: [jira] Commented: (LUCENE-665) temporary file access denied on Windows
Message-ID: <18162846.1158175648155.JavaMail.jira@brutus>
In-Reply-To: <24993893.1156548203672.JavaMail.jira@brutus>

[ http://issues.apache.org/jira/browse/LUCENE-665?page=comments#action_12434527 ]

Michael McCandless
commented on LUCENE-665:
-------------------------------------------

I do think we should make Lucene robust to "Windows change log" software. We could take the position that users have to uninstall such software because it "conflicts" with Lucene, but I don't think that's realistic. Apparently many packages use this convenient API, and that will only get worse with time.

I would put this under the "Lucene should assume the least common denominator of filesystem capabilities" umbrella. Meaning: Lucene now assumes it can rename files right after closing them, but on Windows this isn't a safe assumption, so if possible we should change the index format to not require this.

I will try to reproduce this bug with my [upcoming] changes for lockless commits (numbered segments files) -- the lockless commits changes do much less file renaming, so the issue should be rarer (but could still occur).

> temporary file access denied on Windows
> ---------------------------------------
>
>                 Key: LUCENE-665
>                 URL: http://issues.apache.org/jira/browse/LUCENE-665
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 2.0.0
>         Environment: Windows
>            Reporter: Doron Cohen
>         Attachments: FSDirectory_Retry_Logic.patch, FSDirs_Retry_Logic_3.patch, Test_Output.txt, TestInterleavedAddAndRemoves.java
>
>
> When interleaving adds and removes there is frequent opening/closing of readers and writers.
> I tried to measure performance in such a scenario (for issue 565), but the performance test failed - the indexing process crashed consistently with file "access denied" errors - "cannot create a lock file" in "lockFile.createNewFile()" and "cannot rename file".
> This is related to:
> - issue 516 (a closed issue: "TestFSDirectory fails on Windows") - http://issues.apache.org/jira/browse/LUCENE-516
> - user list questions due to file errors:
>   - http://www.nabble.com/OutOfMemory-and-IOException-Access-Denied-errors-tf1649795.html
>   - http://www.nabble.com/running-a-lucene-indexing-app-as-a-windows-service-on-xp%2C-crashing-tf2053536.html
> - discussion on lock-less commits http://www.nabble.com/Lock-less-commits-tf2126935.html
> My test setup is: XP (SP1), Java 1.5 - both Sun and IBM SDKs.
> I noticed that the problem is more frequent when locks are created on one disk and the index on another. Both are NTFS with the Windows indexing service enabled. I suspect this indexing service might be related - keeping files busy for a while - but I don't know for sure.
> After experimenting with it I conclude that these problems - at least in my scenario - are due to a temporary situation: the FS, or the OS, is *temporarily* holding references to files or folders, which prevents renaming them, deleting them, or creating new files in certain directories.
> So I added retry logic to FSDirectory for cases where the error is related to "Access Denied". This is the same approach described in http://www.nabble.com/running-a-lucene-indexing-app-as-a-windows-service-on-xp%2C-crashing-tf2053536.html - there, in addition to the retry, gc() is invoked (I did not call gc()). This is based on the *hope* that an access-denied situation would vanish after a small delay, and the retry would succeed.
> I modified FSDirectory this way for "Access Denied" errors when creating a new file and when renaming a file.
> This worked fine for me. The performance test that failed before now manages to complete. There should be no performance implications from this modification, because only the cases that would otherwise wrongly fail now wait a few extra milliseconds and retry.
> I am attaching a patch - FSDirectory_Retry_Logic.patch - that has these changes to FSDirectory.
> All "ant test" tests pass with this patch.
> I am also attaching a test case that demonstrates the problem - at least on my machine. There are two test cases in that test file - one that works in the system temp directory (like most Lucene tests) and one that creates the index on a different disk. The latter case can only run if the path ("D:" , "tmp") is valid.
> It would be great if people who experienced these problems could try out this patch and comment on whether it made any difference for them.
> If it turns out to be useful for others as well, including this patch in the code might help to relieve some of those "frustration" user cases.
> A comment on the state of the proposed patch:
> - It is not "ready to deploy" code - it has some debug printing, showing the cases where the "retry logic" actually took place.
> - I am not sure if the current 30ms is the right delay... why not 50ms? 10ms? This is currently defined by a constant.
> - Should a call to gc() be added? (I think not.)
> - Should the retry also be attempted on "non access-denied" exceptions? (I think not.)
> - I feel it is somewhat "voodoo programming", but though I don't like it, it seems to work...
> Attached files:
> 1. TestInterleavedAddAndRemoves.java - the LONG test that fails on XP without the patch and passes with the patch.
> 2. FSDirectory_Retry_Logic.patch
> 3. Test_Output.txt - output of the test with the patch, on my XP. Only the createNewFile() case had to be bypassed in this test, but for another program I also saw the renameFile() being bypassed.
> - Doron

--
This message is automatically generated by JIRA.
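
For readers who do not have the attached patch at hand, the retry idea described in the quoted issue can be sketched roughly as follows. This is only an illustration, not the code from FSDirectory_Retry_Logic.patch: the class name RetryingFileOps, the MAX_ATTEMPTS cap, and the message-based "access denied" check are all assumptions made for the example. Only the 30 ms delay comes from the issue text. Note also that File.renameTo() reports failure as a plain false with no reason, so this sketch retries any failed rename whose source file still exists, which is broader than the "access denied only" rule discussed above.

```java
import java.io.File;
import java.io.IOException;
import java.util.Locale;

/** Hypothetical helper for illustration; not taken from the attached patch. */
final class RetryingFileOps {

    /** Delay between attempts; 30 ms is the value mentioned in the issue. */
    static final long RETRY_DELAY_MS = 30L;

    /** Assumed cap on attempts; the issue does not state one. */
    static final int MAX_ATTEMPTS = 5;

    /** Assumption: an "access denied" failure can be recognized from the message. */
    static boolean isAccessDenied(IOException e) {
        String msg = e.getMessage();
        return msg != null && msg.toLowerCase(Locale.ROOT).contains("denied");
    }

    /** Like File.createNewFile(), but retries when the failure looks like "access denied". */
    static boolean createNewFileWithRetry(File file) throws IOException {
        for (int attempt = 1; ; attempt++) {
            try {
                return file.createNewFile();
            } catch (IOException e) {
                // Other errors, or too many attempts: give up and rethrow.
                if (!isAccessDenied(e) || attempt >= MAX_ATTEMPTS) {
                    throw e;
                }
                pause();
            }
        }
    }

    /** Retries File.renameTo(), which only returns false on failure. */
    static void renameWithRetry(File from, File to) throws IOException {
        for (int attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
            if (from.renameTo(to)) {
                return;
            }
            // A missing source file will not appear by waiting.
            if (!from.exists()) {
                break;
            }
            if (attempt < MAX_ATTEMPTS) {
                pause();
            }
        }
        throw new IOException("cannot rename " + from + " to " + to);
    }

    private static void pause() throws IOException {
        try {
            Thread.sleep(RETRY_DELAY_MS);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting to retry");
        }
    }
}
```

With these assumed values, an operation that keeps failing costs at most four pauses of 30 ms (about 120 ms) before the error is reported, and an operation that succeeds on the first attempt is not delayed at all.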