From: Michael McCandless
Subject: Re: Deprecation of flush in IndexWriter
Date: Sat, 12 Apr 2008 04:33:06 -0400
To: java-dev@lucene.apache.org

That's a good point. Lucene does a 2-phase commit under the hood in
the commit() method, but it doesn't expose the two separate phases
(prepare & commit) through the API.

Unfortunately, while the flush() call does some of the work of the
prepare phase, it doesn't do enough. For example, it does not close
the "doc stores", which involves a lot of IO if a compound file needs
to be built. It also does not fsync() all referenced files, which
could fail. Finally, it does not initiate writing the next segments_N
file, which can also fail if you run out of disk space right then.
These are really things that the "prepare" phase should be doing,
because leaving them to commit() adds the risk of an IOException
there.

Note that even though Lucene's API does not explicitly expose the two
separate phases of the commit, you can still involve Lucene in a
transaction with other resources (e.g. a database) that do expose the
two phases: call IndexWriter.commit() after all other resources have
successfully prepared, then roll back all other resources if Lucene
throws an exception, and otherwise commit them.

So I don't think keeping flush(), and advertising it as the
equivalent of the prepare phase of a 2-phase commit protocol, is
enough here -- it would be "false advertising".
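To make that ordering concrete, here is a minimal sketch of it in
plain Java. It does not use Lucene itself: TwoPhaseResource,
OnePhaseResource, LastResourceCoordinator and RecordingResource are
hypothetical names invented for illustration, with OnePhaseResource
standing in for something like IndexWriter.commit(). It also assumes
that commit() on a prepared resource does not fail and that rollback()
is safe to call on a resource whose prepare() failed or never ran.

```java
// Hypothetical stand-in for a resource that exposes both phases,
// for example a database taking part in the transaction.
interface TwoPhaseResource {
    void prepare() throws Exception; // phase 1: all risky work happens here
    void commit();                   // phase 2: assumed to succeed after prepare()
    void rollback();                 // assumed safe even if prepare() failed or never ran
}

// Hypothetical stand-in for a resource with a single commit() call that
// may throw, which is how IndexWriter.commit() is described above.
interface OnePhaseResource {
    void commit() throws Exception;
}

final class LastResourceCoordinator {
    private LastResourceCoordinator() {}

    // Prepares every two-phase resource, then commits the one-phase
    // resource last. Returns true if everything committed, false if
    // the others were rolled back.
    static boolean commitAll(OnePhaseResource last, TwoPhaseResource... others) {
        try {
            for (TwoPhaseResource r : others) {
                r.prepare();
            }
            // The single-phase commit is the decision point.
            last.commit();
        } catch (Exception e) {
            // Either a prepare() or the one-phase commit failed.
            for (TwoPhaseResource r : others) {
                r.rollback();
            }
            return false;
        }
        for (TwoPhaseResource r : others) {
            r.commit();
        }
        return true;
    }
}

// Toy resource that only records which calls it received.
final class RecordingResource implements TwoPhaseResource {
    private final String name;
    private final StringBuilder log;

    RecordingResource(String name, StringBuilder log) {
        this.name = name;
        this.log = log;
    }

    @Override public void prepare()  { log.append(name).append(".prepare "); }
    @Override public void commit()   { log.append(name).append(".commit "); }
    @Override public void rollback() { log.append(name).append(".rollback "); }
}

final class LastResourceCommitDemo {
    public static void main(String[] args) {
        // Case 1: the one-phase commit succeeds.
        StringBuilder ok = new StringBuilder();
        boolean committed = LastResourceCoordinator.commitAll(
                () -> ok.append("index.commit "),
                new RecordingResource("db", ok));
        System.out.println(committed + ": " + ok);

        // Case 2: the one-phase commit throws, so the database is rolled back.
        StringBuilder failed = new StringBuilder();
        boolean committedAfterFailure = LastResourceCoordinator.commitAll(
                () -> { throw new java.io.IOException("simulated disk full"); },
                new RecordingResource("db", failed));
        System.out.println(committedAfterFailure + ": " + failed);
    }
}
```

The one-phase resource goes last because it is the only participant
that cannot be undone once its commit() returns. Everything that can
still be rolled back is prepared first, so a failure in that final
step leaves nothing half-committed.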

But I agree it would be good for Lucene to explicitly expose the
"prepare" phase. Maybe we could add a "prepareCommit()" method that
does the flush, closes the doc stores, syncs, and initiates but does
not complete the writing of the next segments_N file. Having called
prepareCommit(), you would not be allowed to call anything else in
IndexWriter until commit() or abort() is called. Also, if a
concurrent merge completes, it would be blocked from committing its
changes to the index until commit().

We should probably also deprecate abort() and rename it to
rollback().

I'll open an issue for this...

Mike

Shay Banon wrote:

>
> Hi,
>
> I was just looking a bit at the trunk. First, let me say that the
> progress you guys make is amazing! I would still like to ask a quick
> question regarding deprecation of flush in IndexWriter. I think that
> there are cases where flush is needed. For example, in trying to
> create a two phase (or as close as possible to one) commit. The flush
> can be used for the first phase and the close/commit can be used for
> the second one. Does it make sense?
>
> Cheers,
> Shay
> --
> View this message in context:
> http://www.nabble.com/Deprecation-of-flush-in-IndexWriter-tp16627610p16627610.html
> Sent from the Lucene - Java Developer mailing list archive at
> Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org