lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject commit with only commitData
Date Sat, 24 Nov 2012 05:08:40 GMT
Hi

Today, calling IW.commit(userData) a second time has no effect if nothing
else has changed, even if the userData is non-null or differs between the
two calls.
Is there any particular reason why we prevent someone from doing that?

For instance, when working with a search index and a taxonomy index, we
found it useful to store some commit data in both indexes to keep them in
sync, so that, e.g., when you reopen both, you can make sure the two
actually match.
However, in some indexing sessions no new categories are added to the
taxonomy index, so any commit called on TaxoWriter is silently ignored,
even if commitData is passed.
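
To illustrate the mismatch described above, here is a minimal sketch of
such a sync check. It works on plain maps rather than on Lucene classes;
the class name CommitSyncCheck and the key "indexing.session.id" are
hypothetical and chosen only for this example.

```java
import java.util.Map;

// Hypothetical helper, not part of Lucene: compares the commit data
// read back from the search index and from the taxonomy index.
final class CommitSyncCheck {
    // Application-chosen key; the name is an assumption for this sketch.
    static final String SESSION_KEY = "indexing.session.id";

    // The two indexes count as in sync only if both commits carry
    // the same, non-null session id.
    static boolean inSync(Map<String, String> searchCommitData,
                          Map<String, String> taxoCommitData) {
        String searchId = searchCommitData.get(SESSION_KEY);
        String taxoId = taxoCommitData.get(SESSION_KEY);
        return searchId != null && searchId.equals(taxoId);
    }

    public static void main(String[] args) {
        // Session 2 added documents but no new categories, so the
        // taxonomy commit was skipped and still carries session 1's data.
        Map<String, String> search = Map.of(SESSION_KEY, "session-2");
        Map<String, String> taxo = Map.of(SESSION_KEY, "session-1");
        System.out.println(inSync(search, taxo));
    }
}
```

With the current behavior, the check fails after such a session even
though the two indexes are logically consistent.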

I've asked around and found that other people have the same need: storing
some application-global information which, e.g., denotes the state of this
index in the overall app. Because commitData cannot be used like that, they
add a dummy document with that info to the index, which they always update,
and they also make sure to filter it out during search.

I don't think that adding dummy documents to the index is good, especially
not if you need to ensure they're filtered
out. Also, it's currently not possible to add dummy documents to the
taxonomy index, but let's leave that aside for now.

So, why shouldn't we let someone commit by only changing userData? What
would be the harm? I can see two ways to allow that:

1) If commit() is called and nothing has changed, don't create a new commit
point; create one only if commit(userData) is called.

2) Alternatively, remove userData from the commit() API (that will simplify
the prepareCommit API too!), and replace it with an
IndexWriter.setCommitData() API, which will also mark that IW has
pending changes and therefore must commit.

Maybe option #2 will make it clear to both users of IW and us developers
that the application is requesting a transaction on this IW instance. It
also removes the duplicate commit and prepareCommit overloads.
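
The behavior proposed in option #2 can be sketched with a toy model.
ToyWriter below is a hypothetical stand-in, not Lucene's IndexWriter; only
the name setCommitData comes from the proposal, and the generation counter
and the boolean return value of commit() are assumptions made for
illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the proposed semantics; NOT Lucene's IndexWriter.
final class ToyWriter {
    private Map<String, String> pendingCommitData = new HashMap<>();
    private Map<String, String> committedData = new HashMap<>();
    private boolean hasPendingChanges = false;
    private int commitGeneration = 0;

    // Any document change marks the writer as dirty.
    void addDocument(String doc) {
        hasPendingChanges = true;
    }

    // Proposed API: setting commit data also counts as a pending change.
    void setCommitData(Map<String, String> data) {
        pendingCommitData = new HashMap<>(data);
        hasPendingChanges = true;
    }

    // Creates a new commit point only if something is pending.
    // Returns true if a commit point was written (an assumption of this
    // sketch, used to make the effect visible).
    boolean commit() {
        if (!hasPendingChanges) {
            return false;
        }
        committedData = new HashMap<>(pendingCommitData);
        commitGeneration++;
        hasPendingChanges = false;
        return true;
    }

    Map<String, String> getCommittedData() {
        return committedData;
    }

    int getCommitGeneration() {
        return commitGeneration;
    }
}
```

In this model a session that changes only the commit data still produces a
new commit point, while a bare commit() with nothing pending stays a no-op.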

Thoughts?

Shai
