lucene-dev mailing list archives

From "Simon Willnauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-2571) Indexing performance tests with realtime branch
Date Thu, 14 Apr 2011 16:23:05 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019890#comment-13019890 ]

Simon Willnauer commented on LUCENE-2571:
-----------------------------------------

I ran batch indexing benchmarks comparing trunk vs. the realtime branch, once with addDocument and once with updateDocument.


For addDocument, I indexed 10M Wikipedia docs onto a spinning disk, reading the source documents from a separate SSD.

Here is the realtime graph:
!wikimedium.realtime.Standard.nd10M_dps_addDocuments.png!

vs. trunk:
!wikimedium.trunk.Standard.nd10M_dps_addDocuments.png!

This graph shows how DWPT is flushing to disk over time:

!wikimedium.realtime.Standard.nd10M_dps_addDocuments_flush.png!

For updateDocument, I built a 10M-doc Wikipedia index and then indexed the exact same documents again with updateDocument. Here are the results:
Realtime Branch:
!wikimedium.realtime.Standard.nd10M_dps.png!

trunk:
!wikimedium.trunk.Standard.nd10M_dps.png!



> Indexing performance tests with realtime branch
> -----------------------------------------------
>
>                 Key: LUCENE-2571
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2571
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: Index
>            Reporter: Michael Busch
>            Priority: Minor
>             Fix For: Realtime Branch
>
>         Attachments: wikimedium.realtime.Standard.nd10M_dps.png, wikimedium.realtime.Standard.nd10M_dps_addDocuments.png, wikimedium.realtime.Standard.nd10M_dps_addDocuments_flush.png, wikimedium.trunk.Standard.nd10M_dps.png, wikimedium.trunk.Standard.nd10M_dps_addDocuments.png
>
>
> We should run indexing performance tests with the DWPT changes and compare to trunk.
> We need to test both single-threaded and multi-threaded performance.
> NOTE:  flush by RAM isn't implemented just yet, so either we wait with the tests or flush by doc count.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

