lucene-java-user mailing list archives

From Paul Bell <arach...@gmail.com>
Subject Re: Beginner's questions
Date Wed, 27 Mar 2013 12:58:12 GMT
Thanks, Sashidhar.

In response to your "P.S.", you raise a reasonable point. I agree that it
would probably be a vain undertaking to try to keep the search index in
real-time sync with the database. I suppose this relationship is rooted in
something like the application's SLA, e.g., "query results will reflect
last night's indexing run," etc.

As to the ideas raised in the links you pointed me to: the first link shows
the instantiation of a Term object via

   writer.UpdateDocument(new Term("IDField", *id*), doc);

yet in the 4.2.0 docs I see no Term constructor that allows this "id"
field.

But this raises an interesting question: is it possible to tell Lucene that
the Document I've given it to index has a specific identifier? Here's an
example of what I mean. Suppose that the DB in question is a NoSQL type of
the graph flavor. I add a vertex to that graph. The vertex contains some
properties, e.g., name and type, whose values are text strings. I want
Lucene to index these data AND I want to know some kind of identifier for
that vertex Document. I would prefer to give Lucene that ID, though I might
be able to tolerate it giving it to me.

I suppose this is a fundamental question whose flip-side is seen when you
query the index. My concern is with what Lucene returns when I query. Given
the private or proprietary nature of the DB, I need Lucene to give me back
a meaningful identifier, one that I can use to identify the vertex (or
vertices) that matched the query.
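
To make the requirement concrete, here is a minimal sketch of the behavior
described above, written in plain Java and deliberately not using Lucene's
API. The class name VertexIndexSketch and its update, delete, and search
methods are hypothetical, invented only for illustration. The caller
supplies the ID, an update replaces whatever was indexed under that ID, and
a search returns caller-supplied IDs rather than internal document numbers.
Presumably the Lucene equivalent is to put the ID into a field of the
Document and pass a Term on that field to updateDocument, as in the snippet
quoted earlier, but that mapping is an assumption here.

```java
import java.util.*;

// Toy model only: NOT Lucene's API. It keys each indexed "document"
// by a caller-supplied ID so that queries hand that ID back.
class VertexIndexSketch {
    // term -> IDs of the vertices whose text contains that term
    private final Map<String, Set<String>> postings = new HashMap<>();
    // vertex ID -> terms currently indexed for it (needed for deletes)
    private final Map<String, Set<String>> termsById = new HashMap<>();

    // Add or replace: drop whatever was indexed under this ID, then re-add.
    void update(String id, String... propertyValues) {
        delete(id);
        Set<String> terms = new HashSet<>();
        for (String value : propertyValues) {
            // Lowercase and split on non-word characters (crude tokenizer).
            for (String token : value.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    terms.add(token);
                }
            }
        }
        termsById.put(id, terms);
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
    }

    // Remove every posting that points at this ID; no-op if unknown.
    void delete(String id) {
        Set<String> old = termsById.remove(id);
        if (old == null) {
            return;
        }
        for (String term : old) {
            Set<String> ids = postings.get(term);
            ids.remove(id);
            if (ids.isEmpty()) {
                postings.remove(term);
            }
        }
    }

    // Return the caller-supplied IDs of all vertices matching the term.
    Set<String> search(String term) {
        Set<String> ids = postings.get(term.toLowerCase());
        if (ids == null) {
            return new TreeSet<>();
        }
        return new TreeSet<>(ids);
    }
}
```

With this model, indexing a vertex as update("v1", "Alice", "person") and
later calling update("v1", "Alice", "employee") leaves a single entry for
v1, and search("employee") returns the ID v1, which the application can
then use to look up the vertex in its own database.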

Can you shed any light on this issue?

Thanks again.

-Paul


On Tue, Mar 26, 2013 at 11:46 PM, Sashidhar Guntury <sashidhar.moony@gmail.com> wrote:

> hi,
>
> I think this Stack Overflow question might be of some help to you:
> http://stackoverflow.com/questions/2842500/updating-lucene-index
>
> Note that the constructor has changed, and you might have to specify
> the append mode via IndexWriterConfig. Take a look at this:
>
> http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/index/IndexWriterConfig.html
>
> You can then update the document by the term according to this function -
>
> http://lucene.apache.org/core/4_2_0/core/org/apache/lucene/index/IndexWriter.html#updateDocument(org.apache.lucene.index.Term, java.lang.Iterable)
>
> P.S. I'm new to Lucene, and to the whole search engine world, so I
> might be wrong, but does it make sense to use Lucene for a DB that changes
> so often? The whole premise of using a search engine is to pre-process a
> huge chunk of data (that changes very infrequently) and then search for
> things in it...
>
> sashidhar
>
> On Tue, Mar 26, 2013 at 11:24 PM, Paul <arachweb@gmail.com> wrote:
>
> > Hi All,
> >
> > I've just begun to get my feet wet with Lucene and have a few simple
> > questions:
> >
> > 1. Must the index writer read and index files on disk, or can I create
> > documents in memory and ask the writer to index them?
> >
> > 2. I think I've seen examples of the behavior I asked about in (1). In
> > these examples the addDocument method's Document argument is created on
> > the fly (new Document(); document.add(String), etc.). In such cases is it
> > possible to continually update the index, e.g., because the index source
> > changes, I want to incorporate that change into the index? Let me try to
> > give a practical example of what I'm trying to say. The application has
> > access to a proprietary database. This DB is presently only in memory,
> > not on disk. The application knows how to read parts of that DB and can
> > present them to Lucene for indexing. As the DB changes, I'd like to
> > reflect those changes in the index built by Lucene.
> >
> > 3. I guess there are 3 kinds of DB changes to consider: additions,
> > deletions, and updates. So, when a new element is added to the DB, I want
> > to extract certain fields therefrom and have them indexed. When a row is
> > deleted, I'd want to tell the index that text it had indexed is no longer
> > valid, at least not at a certain location. And when a row is updated, I'd
> > want to replace a previously indexed value with its new value.
> >
> > Forgive me if these questions are put a bit awkwardly. But, knowing
> > little about Lucene, they're about as coherent as I can make them.
> >
> > Thank you.
> >
> > -Paul
> >
