lucene-java-user mailing list archives

From Daniel Shane <sha...@LEXUM.UMontreal.CA>
Subject Is there a way to check for field "uniqueness" when indexing?
Date Thu, 13 Aug 2009 14:33:45 GMT
Hi all!

I'm currently running a big Lucene index, and one of my main concerns is
the integrity of the data entered. A few things come to mind, like
enforcing that certain fields be non-blank, forcing certain formats, etc.

All these validations are easy to do with Lucene, since I can validate
the document before it is indexed or when it is retrieved.

The thing I have a hard time with, however, is field uniqueness.

Let's say I have a field and I really want it to be unique. I can't seem
to find out how to enforce that during indexing, since nothing that is
added to the index is readable by an index reader until the index is
closed.

Add to that the fact that items can be deleted from the index during
indexing, and the only way I have to verify uniqueness is to check every
value of the unique field using a TermEnum and looking at its docFreq.
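
The after-the-fact check described above can be sketched roughly as
follows. This is only an illustration in plain Java and does not use the
Lucene API: a sorted map stands in for the term dictionary that a
TermEnum would walk, and the stored count stands in for docFreq. The
class and method names (UniqueFieldCheck, countValues, findConflicts) are
hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for the post-indexing uniqueness check; not Lucene API.
final class UniqueFieldCheck {

    // Counts how many live documents carry each value of the "unique" field.
    // In a real index this would come from walking the field's terms and
    // reading each term's document frequency.
    static SortedMap<String, Integer> countValues(List<String> liveValues) {
        SortedMap<String, Integer> freq = new TreeMap<>();
        for (String value : liveValues) {
            freq.merge(value, 1, Integer::sum);
        }
        return freq;
    }

    // Returns, in sorted order, every value that occurs in more than one document.
    static List<String> findConflicts(SortedMap<String, Integer> freq) {
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : freq.entrySet()) {
            if (entry.getValue() > 1) {
                conflicts.add(entry.getKey());
            }
        }
        return conflicts;
    }

    public static void main(String[] args) {
        // Field values of the documents that remain after all adds and deletes.
        List<String> values = Arrays.asList("doc-1", "doc-2", "doc-1", "doc-3");
        System.out.println(findConflicts(countValues(values)));
    }
}
```

Note that the scan can only run over values that are already visible,
which is exactly why a conflict is reported late rather than at the
moment the offending document is added.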

This has the major disadvantage that I cannot inform people who are using
the library of a uniqueness conflict when it happens, only when the index
is closed.

Does anyone have an idea of how I could check an index that is still
being built (things added, things deleted) for the uniqueness of a given
field *at the time I index a document*?

Daniel Shane
