lucene-dev mailing list archives

From "Robert Zotter (JIRA)" <j...@apache.org>
Subject [jira] Updated: (SOLR-1927) DocBuilder Inefficiency
Date Tue, 25 May 2010 19:14:24 GMT

     [ https://issues.apache.org/jira/browse/SOLR-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Zotter updated SOLR-1927:
--------------------------------

    Description: 
While looking into the collectDelta method in DocBuilder.java, I noticed
that to determine the deltaRemoveSet it currently loops through the whole
deltaSet for each deleted row, so the work grows with the product of the
two set sizes.

Does anyone else agree that this is quite inefficient?

For delta-imports with a large deltaSet and deletedSet, I found a
considerable improvement in speed if we first save all deleted keys in a
set. Then we only have to loop through the deltaSet once, removing each row
whose key is contained in the deleted key set.
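
The two approaches described above can be sketched roughly as follows. This
is not the actual DocBuilder code: the class name DeltaRemoveSketch, both
method names, and the pk parameter are hypothetical, and rows are assumed
to be maps from column name to value.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the Solr source: rows are assumed to be
// Map<String, Object> and "pk" is the name of the primary key column.
final class DeltaRemoveSketch {

    // Nested-loop variant: scans the whole deltaSet once per deleted row,
    // so the cost is proportional to |deletedSet| * |deltaSet|.
    static Set<Map<String, Object>> removeNested(Set<Map<String, Object>> deltaSet,
                                                 Set<Map<String, Object>> deletedSet,
                                                 String pk) {
        Set<Map<String, Object>> removed = new LinkedHashSet<>();
        for (Map<String, Object> deletedRow : deletedSet) {
            Object deletedKey = deletedRow.get(pk);
            if (deletedKey == null) {
                continue; // rows without a key cannot match anything
            }
            for (Map<String, Object> deltaRow : deltaSet) {
                if (deletedKey.equals(deltaRow.get(pk))) {
                    removed.add(deltaRow);
                }
            }
        }
        deltaSet.removeAll(removed);
        return removed;
    }

    // Key-set variant: collect the deleted keys once, then make a single
    // pass over deltaSet with a hash lookup per row.
    static Set<Map<String, Object>> removeWithKeySet(Set<Map<String, Object>> deltaSet,
                                                     Set<Map<String, Object>> deletedSet,
                                                     String pk) {
        Set<Object> deletedKeys = new HashSet<>();
        for (Map<String, Object> deletedRow : deletedSet) {
            Object key = deletedRow.get(pk);
            if (key != null) {
                deletedKeys.add(key);
            }
        }
        Set<Map<String, Object>> removed = new LinkedHashSet<>();
        Iterator<Map<String, Object>> it = deltaSet.iterator();
        while (it.hasNext()) {
            Map<String, Object> row = it.next();
            if (deletedKeys.contains(row.get(pk))) {
                it.remove(); // safe removal while iterating
                removed.add(row);
            }
        }
        return removed;
    }
}
```

Both variants compare keys with equals(), so they should agree as long as
key values have consistent types (an Integer 1 and a Long 1 would not
match in either version).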


  was:
I am looking into collectDelta method in DocBuilder.java and I noticed that
to determine the deltaRemoveSet it currently loops through the whole
deltaSet for each deleted row. (Version 1.4.0 line 641)

Does anyone else agree with the fact that this is quite inefficient?

For delta-imports with a large deltaSet and deletedSet I found a
considerable improvement in speed if we just save all deleted keys in a set.
Then we just have to loop through the deltaSet once to determine which rows
should be removed by checking if the deleted key set contains the delta row
key.



> DocBuilder Inefficiency
> -----------------------
>
>                 Key: SOLR-1927
>                 URL: https://issues.apache.org/jira/browse/SOLR-1927
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 1.4
>            Reporter: Robert Zotter
>            Priority: Trivial

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

