lucene-dev mailing list archives

From "Trejkaz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-458) Merging may create duplicates if the JVM crashes half way through
Date Wed, 26 Oct 2005 23:51:55 GMT
    [ http://issues.apache.org/jira/browse/LUCENE-458?page=comments#action_12356029 ] 

Trejkaz commented on LUCENE-458:
--------------------------------

I was thinking more along the lines of...

1. open a reader, writer
2. read the document
3. write a marker recording that the next document is a moved copy of another one
4. write the document
5. delete the original document
6. delete the marker
7. close the reader, writer

Then later on, when the reader opens an index and finds a marker, it checks the location
the marker points at, and if the original document is still there, it resumes from step 5.
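
The steps above can be sketched roughly as follows. This is only a toy model, not Lucene code: the class MarkerMoveSketch, the ToyIndex maps, and the crashAfterStep parameter are all made up for illustration, and each map update stands in for a write that is assumed to be atomic. One case goes beyond the steps as written: if the crash happens after step 3 but before step 4, the sketch redoes the copy before deleting the original, since resuming straight at step 5 would lose the document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: plain maps stand in for the index; no Lucene classes are used.
class MarkerMoveSketch {

    /** Hypothetical in-memory "index": slot number -> document text. */
    static final class ToyIndex {
        final Map<Integer, String> docs = new LinkedHashMap<>();
        /** Markers: destination slot -> slot of the original document. */
        final Map<Integer, Integer> markers = new LinkedHashMap<>();
    }

    /**
     * Steps 2 to 6 of the proposal. crashAfterStep simulates the JVM dying
     * right after that step; pass 0 to run to completion.
     */
    static void move(ToyIndex index, int from, int to, int crashAfterStep) {
        String doc = index.docs.get(from);           // step 2: read the document
        if (doc == null) {
            throw new IllegalArgumentException("no document in slot " + from);
        }
        if (crashAfterStep == 2) return;
        index.markers.put(to, from);                 // step 3: write the marker
        if (crashAfterStep == 3) return;
        index.docs.put(to, doc);                     // step 4: write the document
        if (crashAfterStep == 4) return;
        index.docs.remove(from);                     // step 5: delete the original
        if (crashAfterStep == 5) return;
        index.markers.remove(to);                    // step 6: delete the marker
    }

    /** Run when the index is opened: finish any move whose marker is still present. */
    static void recover(ToyIndex index) {
        // Copy the keys so markers can be removed while looping.
        for (Integer to : new ArrayList<>(index.markers.keySet())) {
            int from = index.markers.get(to);
            if (index.docs.containsKey(from)) {
                // Assumption beyond the original steps: redo step 4 if the copy is missing.
                if (!index.docs.containsKey(to)) {
                    index.docs.put(to, index.docs.get(from));
                }
                index.docs.remove(from);             // resume at step 5
            }
            index.markers.remove(to);                // step 6
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.docs.put(1, "hello");
        move(index, 1, 2, 4);                        // crash after the copy, before the delete
        System.out.println("after crash:    " + index.docs + " markers=" + index.markers);
        recover(index);
        System.out.println("after recovery: " + index.docs + " markers=" + index.markers);
    }
}
```

In this model a crash at any point leaves either no marker (nothing to do) or a marker that says exactly which original to delete, so the duplicate never survives the next open. Whether real index writes can be ordered and made durable this way is a separate question.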

> Merging may create duplicates if the JVM crashes half way through
> -----------------------------------------------------------------
>
>          Key: LUCENE-458
>          URL: http://issues.apache.org/jira/browse/LUCENE-458
>      Project: Lucene - Java
>         Type: Bug
>     Versions: 1.4
>  Environment: Windows XP SP2, JDK 1.5.0_04 (the crash occurred in this version.  We have since
>               updated to 1.5.0_05, but discovered this issue in an older text index.)
>     Reporter: Trejkaz

>
> In the past, our indexing process crashed due to a Hotspot compiler bug on SMP systems
> (although it could happen with any bad native code).  Everything picked up and appeared to
> work, but now, a month later, I've discovered an oddity in the text index.
> We have two documents which are identical in the text index.  I know we only stored it
> once for two reasons.  First, we store the MD5 of every document into the hash and the MD5s
> were the same.  Second, we store a GUID into each document which is generated uniquely for
> each document.  The GUID and the MD5 hash on these two documents, as well as all other fields,
> are exactly the same.
> My conclusion is that a merge was occurring at the point the JVM crashed, which is consistent
> with the time the process crashed.  Is it possible that Lucene copied this document
> to the new location but didn't get to delete the original?
> If so, I guess this issue should be prevented somehow.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

