lucene-general mailing list archives

From "Naveen kumar srikakolanu" <naveen.srikakol...@exensys.com>
Subject How can we search the database records using Lucene.net
Date Tue, 22 Aug 2006 05:37:07 GMT
Hi all,

How can we search database records using Lucene.net?

Please help me; I am in urgent need of searching database records.

Thanks in advance,

S. Naveen Kumar
Software Engineer
Exensys Software Solutions Ltd. 
Phone: +91-40-23392440 / 1

Fax:  +91-40-23391105 

Mobile: 9949519255

E-Mail: naveen.srikakolanu@exensys.com

Website: www.exensys.com  

 

---------------------Legal Disclaimer--------------------------------- 

Confidential Information:   The information contained in this mail is
confidential and protected from general disclosure. If the recipient or the
reader of this e-mail is not the intended recipient, or person responsible
to receive this e-mail, you are requested to delete this mail immediately
and do not disseminate or distribute or copy. If you have received this
e-mail by mistake, please notify us immediately by replying to the message
so that we can take appropriate action immediately and see to it that this
mistake is rectified. Any statement made in the mail is solely from the
organizational perspective. However, any statement made from a personal
perspective is not endorsed by the organization. 

-----Original Message-----
From: Leimbach, Johannes [mailto:JLeimbach@CONET.DE] 
Sent: Wednesday, August 09, 2006 11:52 AM
To: general@lucene.apache.org
Subject: Re: Need advice for doing incremental Index updates

Good morning Chris,

Thank you for your answer. 

I have thought about using an external file table to solve my problem, but I
don't like this idea very much either.

The problem is that the Lucene index might very easily get corrupted and out
of sync. Imagine the external list gets lost, or writing to it is aborted
while the index is still being updated. It seems like inconsistencies could
creep in very easily. Couldn't they?

Still, this is probably the way I'm going to go... touching everything in
the index might be truly atomic, but it is too slow.
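
For reference, the "touch everything" version-stamp scheme from my original
mail can be sketched like this. This is a minimal in-memory simulation using
plain java.util (the class and method names are made up for illustration); a
real implementation would store the version number in a Lucene document field
and delete stale documents from the index instead of a map:

```java
import java.util.HashSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Simulation of the "index version" scheme: each indexing run stamps every
// file still present on disk with the current run number; anything left with
// an older stamp was deleted on disk and can be purged from the index.
public class VersionStampIndex {
    private final Map<String, Integer> versionByPath = new HashMap<>();
    private int currentVersion = 0;

    // Start a new indexing run: bump the run number.
    public void beginRun() {
        currentVersion++;
    }

    // Called for every file found on disk (cases 1 and 2): stamp it with the
    // current run number. This is the "touch every document" cost.
    public void touch(String path) {
        versionByPath.put(path, currentVersion);
    }

    // Case 3: purge every entry whose stamp is older than the current run,
    // and return the purged paths.
    public Set<String> purgeStale() {
        Set<String> stale = new HashSet<>();
        versionByPath.entrySet().removeIf(e -> {
            if (e.getValue() < currentVersion) {
                stale.add(e.getKey());
                return true;
            }
            return false;
        });
        return stale;
    }
}
```

The purge itself is a single cheap pass, but every run rewrites the stamp of
every live file, which is exactly the cost objected to above.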

To John:
I don't understand your question; could you post it again?

Bye,
Johannes

-----Original Message-----
From: Chris Hostetter [mailto:hossman_lucene@fucit.org] 
Sent: Tuesday, 8 August 2006 23:32
To: general@lucene.apache.org
Subject: Re: Need advice for doing incremental Index updates


i would solve your problem external to the index ... every time you run
your incremental process, as you walk your directory tree of files (adding
the new ones, deleting/re-adding the modified ones), record every file and
save that list somewhere.  when you are all done, compare the list from this
run with the list from the last run -- any file in the old list and not in
the new list is a document to be deleted.
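
in list-diff form, that comparison is just a set difference. a minimal
sketch using plain java.util (hypothetical names, nothing Lucene-specific):

```java
import java.util.HashSet;
import java.util.Set;

// Compare the file list recorded by the previous run with the one from the
// current run: anything in the old list but not in the new one was deleted
// on disk and should also be deleted from the index.
public class DeletedFileDetector {
    public static Set<String> findDeleted(Set<String> previousRun,
                                          Set<String> currentRun) {
        Set<String> deleted = new HashSet<>(previousRun);
        deleted.removeAll(currentRun);
        return deleted;
    }
}
```

the returned paths are then the documents to delete from the index; the
per-run file list is the only extra state you have to persist.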


: Date: Tue, 8 Aug 2006 15:48:16 +0200
: From: "Leimbach, Johannes" <JLeimbach@CONET.DE>
: Reply-To: general@lucene.apache.org
: To: general@lucene.apache.org
: Subject: Need advice for doing incremental Index updates
:
: Hello,
:
:
:
: I need some advice regarding incremental index updates.
:
:
:
: There are three cases I need to handle when iterating over the
: sourcefiles (files that need to be indexed):
:
: 1.	A file did not change since the last update
: 2.	A file did change since the last update
: 3.	A file was removed since the last update
:
:
:
: Case 1 is easy...
:
: Case 2 as well... just remove the old document and add the new one.
:
: Case 3 is bugging me...
:
:
:
: How can I find out whether a file that is listed in the index no longer
: exists on disk?
:
:
:
: The blunt solution would be to retrieve *all* file paths from the index
: and check whether each one still exists. If it does, go on; if the file
: does not exist on disk, remove it from the index. The problem I have with
: this is that I am possibly pulling a lot of data from the Lucene index,
: and I will also do a lot of local filesystem checks. Sloooow?!
:
:
:
: Another idea I had is to introduce an "index version" integer. This
: number will be unique for each start of the parsing process. So each
: time my indexer program is started, a new "index version" is created. Now
: each file which exists in the index and gets processed will have the
: "index version" number stored as a document field.
:
: This way all newly added and modified documents will have an up-to-date
: "index version" flag after indexing is complete.
:
: Now, to remove all physically deleted files from the index, I would
: select all documents which have an old "index version" flag stored
: inside them. Every document with such an old number can be safely
: removed.
:
: The problem with this solution is that *every* document in the index will
: get updated: first the old index version field is removed, then the new
: field is added.
:
: On the plus side, removing deleted files will be very fast.
:
:
:
:
:
: What would you recommend for keeping the index incrementally up to date?
:
: I fear the first version will be utterly slow for small updates whereas
: the second version will be a lot faster - though adding stuff is slower
: because of the additional field update for every document.
:
:
:
: Thanks for your advice,
:
: Johannes :-)



-Hoss





