lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Cam Bazz" <camb...@gmail.com>
Subject Re: IndexReader deleteDocument
Date Mon, 17 Mar 2008 10:31:00 GMT
yes, I meant the same index.

I thought with the new changes - the index reader would see the changes
without re-opening.
It would be real real cool to have that.


Best.

-C.B.

On Mon, Mar 17, 2008 at 12:28 PM, Michael McCandless <
lucene@mikemccandless.com> wrote:

>
> I'm not sure what you mean by "same thread".  Maybe you meant "same
> index"?
>
> Yes, if the IndexReader reopens.
>
> IndexWriter.commit() makes the changes visible to readers, and makes
> the changes durable to os/computer crash or power outage.
>
> Mike
>
> Cam Bazz wrote:
>
> > Another and last question;
> >
> > when the user commits, will an indexreader that is reading the same
> > thread
> > see the changes made or not?
> >
> > I thought something was said about this, if my memory serves me
> > correct.
> >
> > Best.
> >
> > On Mon, Mar 17, 2008 at 11:53 AM, Michael McCandless <
> > lucene@mikemccandless.com> wrote:
> >
> >>
> >> It's a hard drive issue.  When you call fsync, the OS asks the hard
> >> drive to sync.
> >>
> >> Mike
> >>
> >> Cam Bazz wrote:
> >>
> >>> Hello,
> >>>
> >>> I understand the issue. But I have not understood - is this
> >>> hardware related
> >>> issue - i.e a harddisk? or operating system?
> >>>
> >>> If I am using linux would the OS lie about fsyncing? could I do
> >>> anything in
> >>> the kernel to stop it from lying? or is this just a harddrive
> >>> related
> >>> issue...
> >>>
> >>> Best.
> >>>
> >>> On Mon, Mar 17, 2008 at 11:12 AM, Michael McCandless <
> >>> lucene@mikemccandless.com> wrote:
> >>>
> >>>>
> >>>> When you write to a file, modern OSs by default just buffer those
> >>>> writes in memory rather than actually writing them immediately to
> >>>> disk.  Modern hard drives do the same (so, after the OS flushes to
> >>>> the hard drive, the hard drive actually just buffers the writes,
> >>>> too).  Then, when it's a good time, these buffered writes are
> >>>> spooled
> >>>> to disk in the background.  They do this to get better
> >>>> performance on
> >>>> write.
> >>>>
> >>>> Then, the fsync() call, which is an OS level call, requests that
> >>>> all
> >>>> buffered bytes be flushed to the real underlying storage ("stable
> >>>> storage").  It is not supposed to return until all written bytes
> >>>> are
> >>>> on stable storage.  Lucene relies on this by fsync'ing all
> >>>> referenced
> >>>> files in the index, before deleting the files referenced by
> >>>> previous
> >>>> commits.  So, as of 2.4, this ensures the index will remain
> >>>> consistent even if the OS or computer crashes, or power is cut.
> >>>>
> >>>> Unfortunately, there are apparently some devices which even when
> >>>> fsync
> >>>> () is called, return immediately even though the bytes are not
> >>>> actually written to stable storage.  If you have such a device that
> >>>> lies then Lucene 2.4 won't be able to guarantee index
> >>>> consistency on
> >>>> crash/power outage.
> >>>>
> >>>> Mike
> >>>>
> >>>> Cam Bazz wrote:
> >>>>
> >>>>> Hello,
> >>>>>
> >>>>> What do you mean by IO system lying on fsync?
> >>>>>
> >>>>> Best.
> >>>>>
> >>>>> On Mon, Mar 17, 2008 at 10:40 AM, Michael McCandless <
> >>>>> lucene@mikemccandless.com> wrote:
> >>>>>
> >>>>>>
> >>>>>> Yes that's already been committed to trunk as well.
> >>>>>>
> >>>>>> IndexWriter now has a commit() method which syncs all referenced
> >>>>>> files in the index to stable storage (assuming your IO system
> >>>>>> doesn't
> >>>>>> "lie" on fsync).
> >>>>>>
> >>>>>> Mike
> >>>>>>
> >>>>>> On Mar 17, 2008, at 4:33 AM, Cam Bazz wrote:
> >>>>>>
> >>>>>>> Nice. Thanks.
> >>>>>>>
> >>>>>>> will the 2.4 have commit improvements that we previously
talked
> >>>>>>> about?
> >>>>>>>
> >>>>>>> best regards.
> >>>>>>>
> >>>>>>> -C.B.
> >>>>>>>
> >>>>>>> On Mon, Mar 17, 2008 at 10:31 AM, Michael McCandless <
> >>>>>>> lucene@mikemccandless.com> wrote:
> >>>>>>>
> >>>>>>>>
> >>>>>>>> The trunk version of Lucene (eventually 2.4) now has
> >>>>>>>> deletion by
> >>>>>>>> query, in IndexWriter.
> >>>>>>>>
> >>>>>>>> Mike
> >>>>>>>>
> >>>>>>>> Cam Bazz wrote:
> >>>>>>>>
> >>>>>>>>> Hello Erick,
> >>>>>>>>>
> >>>>>>>>> Has anyone found a way for deleting a document with
a query? I
> >>>>>>>>> understand it
> >>>>>>>>> can be deleted via terms, but I need to delete a
document
> >>>>>>>>> with two
> >>>>>>>>> terms,
> >>>>>>>>> that is the only way I can identify my document
is by
> >>>>>>>>> looking at
> >>>>>>>>> two terms
> >>>>>>>>> not one.
> >>>>>>>>>
> >>>>>>>>> best.
> >>>>>>>>>
> >>>>>>>>> On Fri, Mar 14, 2008 at 4:58 PM, Erick Erickson
> >>>>>>>>> <erickerickson@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Doc IDs are assigned at index time and can change
over time
> >>>>>>>>>> That
> >>>>>>>>>> is,
> >>>>>>>>>> deleting
> >>>>>>>>>> a document and optimizing (and other operations)
can and will
> >>>>>>>>>> change
> >>>>>>>>>> document IDs. So, yes, you have to do a search
(either use a
> >>>>>>>>>> hits
> >>>>>>>>>> object
> >>>>>>>>>> or one of the HitCollectors) in order to delete
by doc ID.
> >>>>>>>>>>
> >>>>>>>>>> You can also delete by terms, see the API.
> >>>>>>>>>>
> >>>>>>>>>> There are other options, but you haven't explianed
what
> >>>>>>>>>> you're
> >>>>>>>>>> trying to accomplish enough to offer any more
suggestions.
> >>>>>>>>>>
> >>>>>>>>>> Best
> >>>>>>>>>> Erick
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Mar 12, 2008 at 5:44 PM, varun sood
> >>>>>>>>>> <vsood2@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> No. I haven't but I will. even though I
would like to
> >>>>>>>>>>> make my
> >>>>>>>>>>> own
> >>>>>>>>>>> implementation. So any idea of how to get
the "doc num"?
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for replying.
> >>>>>>>>>>> Varun
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Mar 12, 2008 at 5:15 PM, Mark Miller
> >>>>>>>>>>> <markrmiller@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Have you seen the work that Mark Harwood
has done making a
> >>>>>>>>>>>> GWT
> >>>>>>>>>>>> version
> >>>>>>>>>>>> of Luke? I think its in the latest release.
> >>>>>>>>>>>>
> >>>>>>>>>>>> varun sood wrote:
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>   I am trying to delete a document
without using the hits
> >>>>>>>>>>>>> object.
> >>>>>>>>>>>>> What is the unique field in the
index that I can use to
> >>>>>>>>>>>>> delete the
> >>>>>>>>>>>> document?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I am trying to make a web interface
where index can be
> >>>>>>>>>>>>> modified,
> >>>>>>>>>>> smaller
> >>>>>>>>>>>>> subset of what Luke does but using
JSPs and Servlet.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> to use deleteDocument(int docNum)
> >>>>>>>>>>>>> I need docNum how can I get this?
or does it have to come
> >>>>>>>>>>>>> only vis
> >>>>>>>>>>> Hits?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Varun
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> -----------------------------------------------------------
> >>>>>>>>>>>> --
> >>>>>>>>>>>> --
> >>>>>>>>>>>> --
> >>>>>>>>>>>> --
> >>>>>>>>>>>> --
> >>>>>>>>>>>> To unsubscribe, e-mail: java-user-
> >>>>>>>>>>>> unsubscribe@lucene.apache.org
> >>>>>>>>>>>> For additional commands, e-mail: java-user-
> >>>>>>>>>>>> help@lucene.apache.org
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> ---------------------------------------------------------------
> >>>>>>>> --
> >>>>>>>> --
> >>>>>>>> --
> >>>>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>>>>>>> For additional commands, e-mail: java-user-
> >>>>>>>> help@lucene.apache.org
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>> -----------------------------------------------------------------
> >>>>>> --
> >>>>>> --
> >>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>> -------------------------------------------------------------------
> >>>> --
> >>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>>
> >>>>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message