lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Commented: (LUCENE-1539) Improve Benchmark
Date Thu, 09 Apr 2009 10:18:12 GMT


Michael McCandless commented on LUCENE-1539:

This patch still has some noise, e.g. the unused *Property additions to PerfRunData and the
nocommit "first" logic in ReadTask.

On DeleteByPercentTask: should it delete a percentage of the undeleted (numDocs()) docs or of
the total (maxDoc()) doc space?  Right now its implementation is dangerous, e.g., if I delete
5% of the index and then 10%, that 10% delete will do nothing, since the docs it deletes will
fall onto the exact docs that the 5% had deleted.

> It seems a bit awkward that DeleteByPercentTask needs to call
> IR.undeleteAll before executing the deletes.

Oh, I see.  I don't think it should do that?  I think it should mean "delete XXX% of the remaining
undeleted docs"?
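
As a rough illustration of the "delete XXX% of the remaining undeleted docs" semantics, here is a
minimal sketch that models the index's deletions with a plain java.util.BitSet rather than a real
IndexReader. The class name, the method name and the choice to walk doc IDs in order are all
assumptions made for the example; this is not the code in the patch.

```java
import java.util.BitSet;

public class DeleteByPercentSketch {

    /**
     * Marks pct percent of the currently undeleted docs as deleted and
     * returns how many docs were newly deleted. The BitSet stands in for
     * the reader's deleted-docs state (hypothetical model, not Lucene code).
     */
    public static int deletePercentOfLive(BitSet deleted, int maxDoc, double pct) {
        // Live doc count: plays the role of numDocs(), not maxDoc().
        int numDocs = maxDoc - deleted.cardinality();
        int target = (int) (numDocs * pct / 100.0);
        int done = 0;
        for (int doc = 0; doc < maxDoc && done < target; doc++) {
            // Skip docs that an earlier call already deleted, so every
            // call removes fresh docs instead of landing on old ones.
            if (!deleted.get(doc)) {
                deleted.set(doc);
                done++;
            }
        }
        return done;
    }

    public static void main(String[] args) {
        int maxDoc = 1000;
        BitSet deleted = new BitSet(maxDoc);
        int first = deletePercentOfLive(deleted, maxDoc, 5.0);   // 5% of 1000 live docs
        int second = deletePercentOfLive(deleted, maxDoc, 10.0); // 10% of the 950 left
        System.out.println(first + " " + second + " " + (maxDoc - deleted.cardinality()));
        // prints "50 95 855"
    }
}
```

With these semantics the 10% delete after the 5% delete still removes 95 docs, because the
percentage is taken from the live doc count and already-deleted docs are skipped.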

> Also that subsequent delete by percent calls in deletepercent.alg need to
> open the latest version of the index rather than the original
> (which does not have deletes)

This seems correct?  Ie the purpose of this task is "open the latest commit on the index,
delete XXX% of its undeleted docs".

> This is due to
> DirectoryIndexReader.acquireWriteLock checking to ensure the
> latest version of the index is locked. Perhaps we can relax
> this? I would rather be able to open a commit point and delete
> from the reader, then flush as the latest version.

I don't think we can relax that.  This (a single transaction (writer) open at once) is a core
assumption in Lucene.

> Improve Benchmark
> -----------------
>                 Key: LUCENE-1539
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: contrib/benchmark
>    Affects Versions: 2.4
>            Reporter: Jason Rutherglen
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 2.9
>         Attachments: LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch, LUCENE-1539.patch,,
>   Original Estimate: 336h
>  Remaining Estimate: 336h
> Benchmark can be improved by incorporating recent suggestions posted
> on java-dev. M. McCandless' Python scripts that execute multiple
> rounds of tests can either be incorporated into the codebase or
> converted to Java.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.