lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From software visualization <>
Subject Re: Using Lucene to search live, being-edited documents
Date Fri, 21 Jan 2011 19:59:24 GMT
Hi sorry for the long delay.

The idea is that a single user is editing a single document. As they edit,
any indexes built against the document become stale, actually wrong.
Example:  references to specific localities within this document are all
instantly wrong the first time a user types a new beginning  character-
they're all off by one. Deleting  words is of course disastrous etc. etc.
 So our story is- we used to have this document nicely indexed and now we
have nothing useful.

Considering what Lucene does prior to indexing, stemming for instance,  I am
not sure no, I am quite sure I can't  recreate the same powerful indexing

But it seems wrong  to lure our users into opening this document with
promises that this that and the other thing is has been located for them
only to invalidate all that just because they began to edit the document. I
understand why that happens , but my users are perhaps not as tech savvy and
I think it will just feel "wrong" to them.

So I am looking for a way around this.

On Tue, Jan 4, 2011 at 1:25 PM, adasal <> wrote:

> I would think this is more like it.
> But the essential thing, so it seems to me, is whether there is a
> requirement for a serialised index, i.e. a more permanent record, aside
> from
> the saved document.
> Then, if there is a penalty to creating the index compared to regex,
> stringsearch or so, it is justified on other grounds.
> I think it is an interesting q. when does that requirement emerge?
> There is size of document.
> But there would also be field types. I think I have this right. This is
> really a classification system, so more than bare regex.
> There must be other criteria that apply to this use case, too?
> Adam
> p.s. we (in my work project) are just beginning to use Lucene for geometry
> objects and I am looking forward to understanding its use better,
> including,
> possibly, expanding it to other use cases apart from geo objects.
> On 3 January 2011 15:31, Robert Muir <> wrote:
> > On Mon, Jan 3, 2011 at 10:16 AM, Grant Ingersoll <>
> > wrote:
> > > There is also the MemoryIndex, which is in contrib and is designed for
> > one document at a time.  That being said, basic grep/regex is probably
> fast
> > enough.
> > >
> >
> > In cases where you are doing a 'find' in a document similar to what a
> > wordprocessor would do (especially if you want to iterate
> > forwards/backwards through matches etc), you might want to consider
> > something like
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> > For additional commands, e-mail:
> >
> >

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message