From: Michael McCandless
Subject: Re: IndexReader deleteDocument
Date: Mon, 17 Mar 2008 06:35:55 -0400
To: java-user@lucene.apache.org

Oh, sorry, no you still must reopen the IndexReader. IndexReader still
searches only a point in time.

Mike

Cam Bazz wrote:

> yes, I meant the same index.
>
> I thought with the new changes - the index reader would see the changes
> without re-opening.
> It would be real real cool to have that.
>
> Best.
>
> -C.B.
>
> On Mon, Mar 17, 2008 at 12:28 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
>
>> I'm not sure what you mean by "same thread". Maybe you meant "same index"?
>>
>> Yes, if the IndexReader reopens.
>>
>> IndexWriter.commit() makes the changes visible to readers, and makes
>> the changes durable to os/computer crash or power outage.
>>
>> Mike
>>
>> Cam Bazz wrote:
>>
>>> Another and last question;
>>>
>>> when the user commits, will an indexreader that is reading the same thread
>>> see the changes made or not?
>>>
>>> I thought something was said about this, if my memory serves me correctly.
>>>
>>> Best.
>>>
>>> On Mon, Mar 17, 2008 at 11:53 AM, Michael McCandless <lucene@mikemccandless.com> wrote:
>>>
>>>> It's a hard drive issue. When you call fsync, the OS asks the hard
>>>> drive to sync.
>>>>
>>>> Mike
>>>>
>>>> Cam Bazz wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I understand the issue. But I have not understood - is this a hardware related
>>>>> issue - i.e. a hard disk? or operating system?
>>>>>
>>>>> If I am using Linux would the OS lie about fsyncing? could I do anything in
>>>>> the kernel to stop it from lying? or is this just a hard drive related
>>>>> issue...
>>>>>
>>>>> Best.
>>>>>
>>>>> On Mon, Mar 17, 2008 at 11:12 AM, Michael McCandless <lucene@mikemccandless.com> wrote:
>>>>>
>>>>>> When you write to a file, modern OSs by default just buffer those
>>>>>> writes in memory rather than actually writing them immediately to
>>>>>> disk. Modern hard drives do the same (so, after the OS flushes to
>>>>>> the hard drive, the hard drive actually just buffers the writes,
>>>>>> too). Then, when it's a good time, these buffered writes are spooled
>>>>>> to disk in the background. They do this to get better performance on
>>>>>> write.
>>>>>>
>>>>>> Then, the fsync() call, which is an OS level call, requests that all
>>>>>> buffered bytes be flushed to the real underlying storage ("stable
>>>>>> storage"). It is not supposed to return until all written bytes are
>>>>>> on stable storage. Lucene relies on this by fsync'ing all referenced
>>>>>> files in the index, before deleting the files referenced by previous
>>>>>> commits.
>>>>>> So, as of 2.4, this ensures the index will remain
>>>>>> consistent even if the OS or computer crashes, or power is cut.
>>>>>>
>>>>>> Unfortunately, there are apparently some devices which even when fsync()
>>>>>> is called, return immediately even though the bytes are not
>>>>>> actually written to stable storage. If you have such a device that
>>>>>> lies then Lucene 2.4 won't be able to guarantee index consistency on
>>>>>> crash/power outage.
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> Cam Bazz wrote:
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> What do you mean by IO system lying on fsync?
>>>>>>>
>>>>>>> Best.
>>>>>>>
>>>>>>> On Mon, Mar 17, 2008 at 10:40 AM, Michael McCandless <lucene@mikemccandless.com> wrote:
>>>>>>>
>>>>>>>> Yes that's already been committed to trunk as well.
>>>>>>>>
>>>>>>>> IndexWriter now has a commit() method which syncs all referenced
>>>>>>>> files in the index to stable storage (assuming your IO system doesn't
>>>>>>>> "lie" on fsync).
>>>>>>>>
>>>>>>>> Mike
>>>>>>>>
>>>>>>>> On Mar 17, 2008, at 4:33 AM, Cam Bazz wrote:
>>>>>>>>
>>>>>>>>> Nice. Thanks.
>>>>>>>>>
>>>>>>>>> will the 2.4 have commit improvements that we previously talked about?
>>>>>>>>>
>>>>>>>>> best regards.
>>>>>>>>>
>>>>>>>>> -C.B.
>>>>>>>>>
>>>>>>>>> On Mon, Mar 17, 2008 at 10:31 AM, Michael McCandless <lucene@mikemccandless.com> wrote:
>>>>>>>>>
>>>>>>>>>> The trunk version of Lucene (eventually 2.4) now has deletion by
>>>>>>>>>> query, in IndexWriter.
>>>>>>>>>>
>>>>>>>>>> Mike
>>>>>>>>>>
>>>>>>>>>> Cam Bazz wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello Erick,
>>>>>>>>>>>
>>>>>>>>>>> Has anyone found a way for deleting a document with a query?
>>>>>>>>>>> I understand it can be deleted via terms, but I need to delete a
>>>>>>>>>>> document with two terms, that is, the only way I can identify my
>>>>>>>>>>> document is by looking at two terms, not one.
>>>>>>>>>>>
>>>>>>>>>>> best.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Mar 14, 2008 at 4:58 PM, Erick Erickson wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Doc IDs are assigned at index time and can change over time. That is,
>>>>>>>>>>>> deleting a document and optimizing (and other operations) can and will
>>>>>>>>>>>> change document IDs. So, yes, you have to do a search (either use a
>>>>>>>>>>>> hits object or one of the HitCollectors) in order to delete by doc ID.
>>>>>>>>>>>>
>>>>>>>>>>>> You can also delete by terms, see the API.
>>>>>>>>>>>>
>>>>>>>>>>>> There are other options, but you haven't explained what you're
>>>>>>>>>>>> trying to accomplish enough to offer any more suggestions.
>>>>>>>>>>>>
>>>>>>>>>>>> Best
>>>>>>>>>>>> Erick
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 12, 2008 at 5:44 PM, varun sood wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> No. I haven't but I will. even though I would like to make my own
>>>>>>>>>>>>> implementation. So any idea of how to get the "doc num"?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for replying.
>>>>>>>>>>>>> Varun
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 12, 2008 at 5:15 PM, Mark Miller wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Have you seen the work that Mark Harwood has done making a GWT
>>>>>>>>>>>>>> version of Luke? I think it's in the latest release.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> varun sood wrote:
>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>> I am trying to delete a document without using the hits object.
>>>>>>>>>>>>>>> What is the unique field in the index that I can use to delete the
>>>>>>>>>>>>>>> document?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am trying to make a web interface where index can be modified, a
>>>>>>>>>>>>>>> smaller subset of what Luke does but using JSPs and Servlet.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> to use deleteDocument(int docNum)
>>>>>>>>>>>>>>> I need docNum how can I get this? or does it have to come only via
>>>>>>>>>>>>>>> Hits?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Varun
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>>>>>>>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
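The buffering and fsync() behaviour described in the thread can be sketched in plain Java. This is only an illustration under stated assumptions, not Lucene's own code: the class name DurableWrite and the method writeDurably are hypothetical, and FileChannel.force(true) stands in for the OS-level fsync() call. As the thread notes, a device that "lies" can still acknowledge the flush before the bytes are really on stable storage, so the sketch shows the request, not a guarantee.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration only; not part of Lucene.
final class DurableWrite {

    // Writes data to path, then asks the OS to flush it to stable storage.
    static void writeDurably(Path path, byte[] data) throws IOException {
        try (FileChannel ch = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                // At this point the bytes may only sit in the OS buffer cache.
                ch.write(buf);
            }
            // Request that content and metadata be forced to the storage device.
            // Whether the drive honours this is outside the JVM's control.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("durable-write", ".txt");
        try {
            writeDurably(tmp, "committed".getBytes(StandardCharsets.UTF_8));
            System.out.println(new String(Files.readAllBytes(tmp), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The ordering is the point of the sketch: the flush is requested only after all writes have been issued, and nothing that depends on the new data (in Lucene's case, deleting files from earlier commits) should happen until force() has returned.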