lucene-java-user mailing list archives

From Erick Erickson <>
Subject Re: asking about index verification tools
Date Wed, 17 Nov 2010 20:39:10 GMT
How could there be such a tool? Consider the number of ways that a
given input stream can be transformed during analysis: WordDelimiter,
stopwords, synonyms, etc. Eventually, you'd reconstruct all of the logic
embedded in the analysis process in your checking program. Then you'd
wonder whether the checking program itself was correct.

There's quite a bit of testing done automatically, but that doesn't
really answer your question. You can easily create a special-purpose
tool that would spin through the terms for a given field and check
that they were what you expected, but then you have to define the
expected terms.
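
A minimal sketch of the comparison half of such a special-purpose tool,
assuming you have already pulled the terms for one field out of the index
(for example through Lucene's term enumeration) and can hand them over as
plain strings. The class and method names here are hypothetical and not
part of Lucene, and the expected set still has to come from you:

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene. It only compares two
// collections of terms; reading the terms out of the index is left
// to the caller.
final class TermCrossCheck {
    private TermCrossCheck() {}

    /** Terms you expected for the field that the index does not contain. */
    static Set<String> missingTerms(Set<String> expected, Iterable<String> indexed) {
        Set<String> missing = new TreeSet<>(expected);
        for (String term : indexed) {
            missing.remove(term);
        }
        return missing;
    }

    /** Terms the index contains for the field that you did not expect. */
    static Set<String> unexpectedTerms(Set<String> expected, Iterable<String> indexed) {
        Set<String> unexpected = new TreeSet<>();
        for (String term : indexed) {
            if (!expected.contains(term)) {
                unexpected.add(term);
            }
        }
        return unexpected;
    }
}
```

If the expected set is built from the raw source words, a stemmed or
lowercased term shows up in both result sets at once (the raw form as
missing, the analyzed form as unexpected), which is exactly the problem
of having to define the expected terms.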

So I think you're stuck with either assuming it works or writing a
custom tool.


On Wed, Nov 17, 2010 at 5:17 AM, Yakob <> wrote:

> Yes, you're correct, but I was just wondering about my chances here.
> Are there any tools that do this cross-checking of the index? Or, when
> you build a search engine, do you just feel complacent about it and
> decide that cross-checking the index isn't really necessary? What do
> you do in this situation? :-)
> On 11/17/10, Anshum <> wrote:
> > Lance, CheckIndex would only check for the sanity of the index and not
> > really if all words from the source got added into the index or not.
> > CheckIndex would only check for corrupt indexes and in the process also
> > take a lot of time.
> > Perhaps what Yakob wanted here is just a cross check between the index
> > and the source.
