jackrabbit-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benjamin Papez (JIRA)" <j...@apache.org>
Subject [jira] Updated: (JCR-2357) Duplicate entries in the index
Date Fri, 16 Oct 2009 08:46:31 GMT

     [ https://issues.apache.org/jira/browse/JCR-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Benjamin Papez updated JCR-2357:
--------------------------------

    Description: 
We are very frequently having problems with queries, because the results are always duplicated.
The reason for it has been identified to be in the indexing, because when we get the problem,
we now look at the index with Luke, and we can see that documents with the same UUID are there
twice. This happens with Jackrabbit 1.5.0, 1.6.0 and also with 2.0-alpha10. 

I am able to recreate the problem quite regular. We have some event listeners, which themselves
call the node.save or session.save method. I made a breakpoint in NodeIndexer.createDoc()
and stop when there is a specific name in the fullpath property. I can see that two listeners
we wrote ( FieldReferenceListener and RulesListener) lead to the NodeIndexer.createDoc() -
below are the stacktraces (based on Jackrabbit 2.0-alpha10)

 NodeIndexer.createDoc() line: 247 
 SearchIndex.createDocument(NodeState, NamespaceMappings, IndexFormatVersion) line: 1122 
 SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 635 
 SearchManager.onEvent(EventIterator) line: 459 
 EventConsumer.consumeEvents(EventStateCollection) line: 244 
 ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
 EventStateCollection.dispatch() line: 455 
 SharedItemStateManager$Update.end() line: 756 
 SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
 XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
 XAItemStateManager.update(ChangeLog) line: 354 
 XAItemStateManager(LocalItemStateManager).update() line: 326 
 SessionItemStateManager.update() line: 326 
 NodeImpl(ItemImpl).save() line: 1098 
 FieldReferenceListener.updateFullPath(Node) line: 193 
 FieldReferenceListener.onEvent(EventIterator) line: 101 
 EventConsumer.consumeEvents(EventStateCollection) line: 244 
 ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
 EventStateCollection.dispatch() line: 455 
 SharedItemStateManager$Update.end() line: 756 
 SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
 XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
 XAItemStateManager.update(ChangeLog) line: 354 
 XAItemStateManager(LocalItemStateManager).update() line: 326 
 SessionItemStateManager.update() line: 326 
 NodeImpl(ItemImpl).save() line: 1098 


 NodeIndexer.createDoc() line: 247 
 SearchIndex.createDocument(NodeState, NamespaceMappings, IndexFormatVersion) line: 1122 
 SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 635 
 SearchManager.onEvent(EventIterator) line: 459 
 EventConsumer.consumeEvents(EventStateCollection) line: 244 
 ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
 EventStateCollection.dispatch() line: 455 
 SharedItemStateManager$Update.end() line: 756 
 SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
 XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
 XAItemStateManager.update(ChangeLog) line: 354 
 XAItemStateManager(LocalItemStateManager).update() line: 326 
 SessionItemStateManager.update() line: 326 
 NodeImpl(ItemImpl).save() line: 1098 
 XASessionImpl(SessionImpl).save() line: 928 
 RulesListener.onEvent(EventIterator) line: 376 
 EventConsumer.consumeEvents(EventStateCollection) line: 244 
 ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
 EventStateCollection.dispatch() line: 455 
 SharedItemStateManager$Update.end() line: 756 
 SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
 XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
 XAItemStateManager.update(ChangeLog) line: 354 
 XAItemStateManager(LocalItemStateManager).update() line: 326 
 SessionItemStateManager.update() line: 326 
 NodeImpl(ItemImpl).save() line: 1098 


Now when continuing to debug after the second NodeIndexer.createDoc() I see that the node
with the watched UUID is also passed within the remove collection into  SearchIndex.updateNodes(Iterator<NodeId>,
Iterator<NodeState>) , but when looking at the index with Luke before and after finishing
this SearchIndex.updateNodes, I see that the first document with the UUID was NOT deleted
and then we always have two documents with the same UUID.

Now it would be interesting, why the deletion failed. When debugging DeleteNode.exceute and
stopping at the watched UUID, I come down the following methods:

CommittableIndexReader(IndexReader).deleteDocuments(Term) line: 873	
PersistentIndex(AbstractIndex).removeDocument(Term) line: 225	
PersistentIndex.removeDocument(Term) line: 90	
MultiIndex$DeleteNode.execute(MultiIndex) line: 1951	
MultiIndex.executeAndLog(MultiIndex$Action) line: 1084	
MultiIndex.update(Collection<NodeId>, Collection<Document>) line: 442	
SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 645	

What I can see in IndexReader.deleteDocuments(Term) is that docs.next() returns true, but
docs.doc() then returns 0 as the doc-id, which I think is the cause that the right doc-id
is not deleted. 

Any clues what goes wrong here or hints what else I could watch for on debugging ?

  was:
We are very frequently having problems with queries, because the results are always duplicated.
The reason for it has been identified to be in the indexing, because when we get the problem,
we now look at the index with Luke, and we can see that documents with the same UUID are there
twice. This happens with Jackrabbit 1.5.0, 1.6.0 and also with 2.0-alpha10. 

I am able to recreate the problem quite regular. We have some event listeners, which themselves
call the node.save or session.save method. I made a breakpoint in NodeIndexer.createDoc()
and stop when there is a specific name in the fullpath property. I can see that two listeners
we wrote ( FieldReferenceListener and RulesListener) lead to the NodeIndexer.createDoc() -
below are the stacktraces (based on Jackrabbit 2.0-alpha10)

 NodeIndexer.createDoc() line: 247 
 SearchIndex.createDocument(NodeState, NamespaceMappings, IndexFormatVersion) line: 1122 
 SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 635 
 SearchManager.onEvent(EventIterator) line: 459 
 EventConsumer.consumeEvents(EventStateCollection) line: 244 
 ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
 EventStateCollection.dispatch() line: 455 
 SharedItemStateManager$Update.end() line: 756 
 SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
 XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
 XAItemStateManager.update(ChangeLog) line: 354 
 XAItemStateManager(LocalItemStateManager).update() line: 326 
 SessionItemStateManager.update() line: 326 
 NodeImpl(ItemImpl).save() line: 1098 
 FieldReferenceListener.updateFullPath(Node) line: 193 
 FieldReferenceListener.onEvent(EventIterator) line: 101 
 EventConsumer.consumeEvents(EventStateCollection) line: 244 
 ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
 EventStateCollection.dispatch() line: 455 
 SharedItemStateManager$Update.end() line: 756 
 SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
 XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
 XAItemStateManager.update(ChangeLog) line: 354 
 XAItemStateManager(LocalItemStateManager).update() line: 326 
 SessionItemStateManager.update() line: 326 
 NodeImpl(ItemImpl).save() line: 1098 


 NodeIndexer.createDoc() line: 247 
 SearchIndex.createDocument(NodeState, NamespaceMappings, IndexFormatVersion) line: 1122 
 SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 635 
 SearchManager.onEvent(EventIterator) line: 459 
 EventConsumer.consumeEvents(EventStateCollection) line: 244 
 ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
 EventStateCollection.dispatch() line: 455 
 SharedItemStateManager$Update.end() line: 756 
 SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
 XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
 XAItemStateManager.update(ChangeLog) line: 354 
 XAItemStateManager(LocalItemStateManager).update() line: 326 
 SessionItemStateManager.update() line: 326 
 NodeImpl(ItemImpl).save() line: 1098 
 XASessionImpl(SessionImpl).save() line: 928 
 RulesListener.onEvent(EventIterator) line: 376 
 EventConsumer.consumeEvents(EventStateCollection) line: 244 
 ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
 EventStateCollection.dispatch() line: 455 
 SharedItemStateManager$Update.end() line: 756 
 SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
 XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
 XAItemStateManager.update(ChangeLog) line: 354 
 XAItemStateManager(LocalItemStateManager).update() line: 326 
 SessionItemStateManager.update() line: 326 
 NodeImpl(ItemImpl).save() line: 1098 


Now when continuing to debug after the second NodeIndexer.createDoc() I see that the node
with the watched UUID is also passed within the remove collection into  SearchIndex.updateNodes(Iterator<NodeId>,
Iterator<NodeState>) , but when looking at the index with Luke before and after finishing
this SearchIndex.updateNodes, I see that the first document with the UUID was NOT deleted
and then we always have two documents with the same UUID.

Now it would be interesting, why the deletion failed. When debugging DeleteNode.exceute and
stopping at the watched UUID, I come down the following methods:

CommittableIndexReader(IndexReader).deleteDocuments(Term) line: 873	
PersistentIndex(AbstractIndex).removeDocument(Term) line: 225	
PersistentIndex.removeDocument(Term) line: 90	
MultiIndex$DeleteNode.execute(MultiIndex) line: 1951	
MultiIndex.executeAndLog(MultiIndex$Action) line: 1084	
MultiIndex.update(Collection<NodeId>, Collection<Document>) line: 442	
SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 645	

What I can in IndexReader.deleteDocuments(Term) is that docs.next() returns true, but docs.doc()
then returns 0 as the doc-id, which I think is the cause that the right doc-id is not deleted.


Any clues what goes wrong here or hints what else I could watch for on debugging ?


> Duplicate entries in the index
> ------------------------------
>
>                 Key: JCR-2357
>                 URL: https://issues.apache.org/jira/browse/JCR-2357
>             Project: Jackrabbit Content Repository
>          Issue Type: Bug
>          Components: indexing
>    Affects Versions: 1.5.0, 1.6.0, 2.0-alpha9
>            Reporter: Benjamin Papez
>            Priority: Critical
>
> We are very frequently having problems with queries, because the results are always duplicated.
The reason for it has been identified to be in the indexing, because when we get the problem,
we now look at the index with Luke, and we can see that documents with the same UUID are there
twice. This happens with Jackrabbit 1.5.0, 1.6.0 and also with 2.0-alpha10. 
> I am able to recreate the problem quite regular. We have some event listeners, which
themselves call the node.save or session.save method. I made a breakpoint in NodeIndexer.createDoc()
and stop when there is a specific name in the fullpath property. I can see that two listeners
we wrote ( FieldReferenceListener and RulesListener) lead to the NodeIndexer.createDoc() -
below are the stacktraces (based on Jackrabbit 2.0-alpha10)
>  NodeIndexer.createDoc() line: 247 
>  SearchIndex.createDocument(NodeState, NamespaceMappings, IndexFormatVersion) line: 1122

>  SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 635

>  SearchManager.onEvent(EventIterator) line: 459 
>  EventConsumer.consumeEvents(EventStateCollection) line: 244 
>  ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
>  EventStateCollection.dispatch() line: 455 
>  SharedItemStateManager$Update.end() line: 756 
>  SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
>  XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
>  XAItemStateManager.update(ChangeLog) line: 354 
>  XAItemStateManager(LocalItemStateManager).update() line: 326 
>  SessionItemStateManager.update() line: 326 
>  NodeImpl(ItemImpl).save() line: 1098 
>  FieldReferenceListener.updateFullPath(Node) line: 193 
>  FieldReferenceListener.onEvent(EventIterator) line: 101 
>  EventConsumer.consumeEvents(EventStateCollection) line: 244 
>  ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
>  EventStateCollection.dispatch() line: 455 
>  SharedItemStateManager$Update.end() line: 756 
>  SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
>  XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
>  XAItemStateManager.update(ChangeLog) line: 354 
>  XAItemStateManager(LocalItemStateManager).update() line: 326 
>  SessionItemStateManager.update() line: 326 
>  NodeImpl(ItemImpl).save() line: 1098 
>  NodeIndexer.createDoc() line: 247 
>  SearchIndex.createDocument(NodeState, NamespaceMappings, IndexFormatVersion) line: 1122

>  SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 635

>  SearchManager.onEvent(EventIterator) line: 459 
>  EventConsumer.consumeEvents(EventStateCollection) line: 244 
>  ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
>  EventStateCollection.dispatch() line: 455 
>  SharedItemStateManager$Update.end() line: 756 
>  SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
>  XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
>  XAItemStateManager.update(ChangeLog) line: 354 
>  XAItemStateManager(LocalItemStateManager).update() line: 326 
>  SessionItemStateManager.update() line: 326 
>  NodeImpl(ItemImpl).save() line: 1098 
>  XASessionImpl(SessionImpl).save() line: 928 
>  RulesListener.onEvent(EventIterator) line: 376 
>  EventConsumer.consumeEvents(EventStateCollection) line: 244 
>  ObservationDispatcher.dispatchEvents(EventStateCollection) line: 198 
>  EventStateCollection.dispatch() line: 455 
>  SharedItemStateManager$Update.end() line: 756 
>  SharedItemStateManager.update(ChangeLog, EventStateCollectionFactory) line: 1102 
>  XAItemStateManager(LocalItemStateManager).update(ChangeLog) line: 351 
>  XAItemStateManager.update(ChangeLog) line: 354 
>  XAItemStateManager(LocalItemStateManager).update() line: 326 
>  SessionItemStateManager.update() line: 326 
>  NodeImpl(ItemImpl).save() line: 1098 
> Now when continuing to debug after the second NodeIndexer.createDoc() I see that the
node with the watched UUID is also passed within the remove collection into  SearchIndex.updateNodes(Iterator<NodeId>,
Iterator<NodeState>) , but when looking at the index with Luke before and after finishing
this SearchIndex.updateNodes, I see that the first document with the UUID was NOT deleted
and then we always have two documents with the same UUID.
> Now it would be interesting, why the deletion failed. When debugging DeleteNode.exceute
and stopping at the watched UUID, I come down the following methods:
> CommittableIndexReader(IndexReader).deleteDocuments(Term) line: 873	
> PersistentIndex(AbstractIndex).removeDocument(Term) line: 225	
> PersistentIndex.removeDocument(Term) line: 90	
> MultiIndex$DeleteNode.execute(MultiIndex) line: 1951	
> MultiIndex.executeAndLog(MultiIndex$Action) line: 1084	
> MultiIndex.update(Collection<NodeId>, Collection<Document>) line: 442	
> SearchIndex.updateNodes(Iterator<NodeId>, Iterator<NodeState>) line: 645

> What I can see in IndexReader.deleteDocuments(Term) is that docs.next() returns true,
but docs.doc() then returns 0 as the doc-id, which I think is the cause that the right doc-id
is not deleted. 
> Any clues what goes wrong here or hints what else I could watch for on debugging ?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message