lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From petite_abeille <petite_abei...@mac.com>
Subject Re: rc4 and FileNotFoundException: an update
Date Sun, 28 Apr 2002 05:57:51 GMT
Hi Steven,

> Sounds like a pretty nasty situation.

It is...

> This makes sense - any effort to solve the problem will first involve 
> isolating the
> bug, and that's a task you're best suited for, since you know your
> system best.

Ok... From what I understand, this situation arise depending on my 
"usage pattern" of Lucene. For example, if I use it in "batch" mode (eg, 
through some tools to stress test my app by loading a couple of millions 
of objects), everything works perfectly fine. However, when running my 
app in a more "interactive" mode (eg, with user interaction, object 
indexing, writing and searching at the same time) I run into this 
exception very quickly. The problem, seems to have something to do with 
Searcher and/or how I'm using them. I need to investigate in that 
direction... Also, what it the "magic" formula for minimizing 
RandomAccessFile usage in Lucene to a very strict minimum? Is 
IndexWriter.mergeFactor the only parameter I can play with, or am I 
missing some other configuration that might help?

> Then post your code and ask if some of the more lucene-knowledgeable 
> can take a look.

Unfortunately, it's not that straightforward as I'm using Lucene as part 
of some sort of custom built oodbms and this behavior seems to be usage 
related... You can check the app at http://homepage.mac.com/zoe_info/ if 
that helps.

>  Re: index integrity, I agree that it would be really, really nice to 
> have some sort of "sanity" check.

I'm not familiar with Lucene internals, but is it conceivable to have 
some sort of checksum per document and/or index that will help to 
identify "corrupted" data?

> As for repairing an index, I think that's working sort of against the 
> grain of Lucene.

:-(

> In your case, it sounds like rebuilding the index is important, because 
> you're using Lucene as a data store.

Well, not exactly. I'm just using Lucene to index my data store (with a 
bunch of Field.Keyword and Field.Unstored). The actual object storage is 
handled externally to Lucene. However, I need a consistent index as I'm 
using it as part of my object tree.

> Maybe it'd be a better idea to figure out some way to use Lucene as the 
> indexing
> technology in a data store, the way traditional RDBMSes use indexes,
> for speeding access.

I agree. It's how I'm using it more or less. Nevertheless, for the sake 
of reliability, I need to have some level of confidence that the 
underlying indexes are "sane"... And a way to correct the problem if 
they are not. In my case, I will happily trade speed for reliability as 
I cannot afford to have inconsistent indexes. A corrupted index is of 
not use to me.

> Or possibly you should look at Xindice (http://xml.apache.org/xindice/) 
> which is an XML database.

I'm familiar with Xindice and other related toolboxes. However, I have 
some "peculiar" requirements, so I decided to custom made my own 
persistency layer. Works fine so far. Just this very annoying exception. 
Also this situation seems to arise on UNIX systems only as I never heard 
anybody complaining about it on any Windows type platforms... Very odd 
in any case...

Thanks for your help in any case.

PA


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message