lucene-dev mailing list archives

From "Steven Parkes" <steven_par...@esseff.org>
Subject RE: [jira] Commented: (LUCENE-847) Factor merge policy out of IndexWriter
Date Sun, 25 Mar 2007 17:31:17 GMT
Yes, I'll separate out issues related to the basic refactor before
submitting a candidate patch. I actually thought it might be helpful to
keep it in the rough version to see context. But I can do that at any
time ...

With the factored merge policy, it's easy enough to create a merge
policy on size parallel to the one on docs. Hmmm ... I suppose one could
derive one from the other, or both from a common base, given the
appropriate factoring of "size" in the algorithm.
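
To make the "size" factoring concrete, here is a rough Java sketch of what
a common base might look like. It is not taken from the LUCENE-847 patch:
every name in it (SegmentStat, LevelMergePolicySketch, findMergeStart, the
two subclasses) is hypothetical, and the level rule is a simplified
logarithmic scheme assumed for illustration only.

```java
import java.util.List;

/** Hypothetical per-segment statistics; a stand-in, not a Lucene class. */
final class SegmentStat {
    final int docCount;
    final long sizeInBytes;

    SegmentStat(int docCount, long sizeInBytes) {
        this.docCount = docCount;
        this.sizeInBytes = sizeInBytes;
    }
}

/** Hypothetical base class: the level logic is written once against an abstract "size". */
abstract class LevelMergePolicySketch {
    private final int mergeFactor;
    private final long minSize;

    protected LevelMergePolicySketch(int mergeFactor, long minSize) {
        if (mergeFactor < 2 || minSize < 1) {
            throw new IllegalArgumentException("mergeFactor must be >= 2 and minSize >= 1");
        }
        this.mergeFactor = mergeFactor;
        this.minSize = minSize;
    }

    /** The only thing a concrete policy decides: what "size" means for a segment. */
    protected abstract long size(SegmentStat segment);

    /** Number of integer divisions by mergeFactor needed to bring size down to minSize or less. */
    final int level(SegmentStat segment) {
        long remaining = size(segment);
        int level = 0;
        while (remaining > minSize) {
            remaining /= mergeFactor;
            level++;
        }
        return level;
    }

    /** Start index of the first run of mergeFactor adjacent same-level segments, or -1 if none. */
    final int findMergeStart(List<SegmentStat> segments) {
        int runStart = 0;
        for (int i = 0; i < segments.size(); i++) {
            if (level(segments.get(i)) != level(segments.get(runStart))) {
                runStart = i; // level changed, so a new run starts here
            }
            if (i - runStart + 1 == mergeFactor) {
                return runStart;
            }
        }
        return -1;
    }
}

/** Hypothetical policy that measures segments by document count. */
final class DocCountMergePolicySketch extends LevelMergePolicySketch {
    DocCountMergePolicySketch(int mergeFactor, long minDocs) {
        super(mergeFactor, minDocs);
    }

    @Override
    protected long size(SegmentStat segment) {
        return segment.docCount;
    }
}

/** Hypothetical policy that measures segments by size in bytes. */
final class ByteSizeMergePolicySketch extends LevelMergePolicySketch {
    ByteSizeMergePolicySketch(int mergeFactor, long minBytes) {
        super(mergeFactor, minBytes);
    }

    @Override
    protected long size(SegmentStat segment) {
        return segment.sizeInBytes;
    }
}
```

With this shape, the two policies differ only in the one overridden
method, so the same list of segments can yield different merge candidates
depending on whether doc counts or byte sizes drive the levels.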

I really want to do some larger tests of this. I've downloaded Wikipedia
and plan to add support for it in the benchmarker (if anyone else is
doing this, can you stop me now?). I figure I'll try it on my main
machine and my laptop. My main machine has a lot of RAM and I wonder how
big the corpus has to get before you see significant changes.

-----Original Message-----
From: Michael McCandless (JIRA) [mailto:jira@apache.org] 
Sent: Sunday, March 25, 2007 5:47 AM
To: java-dev@lucene.apache.org
Subject: [jira] Commented: (LUCENE-847) Factor merge policy out of
IndexWriter


    [ https://issues.apache.org/jira/browse/LUCENE-847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12483929 ]

Michael McCandless commented on LUCENE-847:
-------------------------------------------

Steven, I looked through the patch quickly.  It looks great!  First
some general comments and then I'll add more specifics as
separate comments.

Can you open separate issues for the other new and interesting merge
policies here?  I think the refactoring of merge policy plus creation
of the default policy that is identical to today's merge policy, which
should be a fairly quick and low-risk operation, would then remain
under this issue?

Then, iterating / vetting / debugging the new interesting merge
policies can take longer under their own separate issues and time
frame.

On staging, I think we could first do this issue (decouple MergePolicy
from writer), then LUCENE-845 (which would then be fixing
LogarithmicMergePolicy to use segment sizes instead of doc counts as
the basis for determining levels) because it blocks LUCENE-843, and
then LUCENE-843 (performance improvements for how writer uses RAM)?




> Factor merge policy out of IndexWriter
> --------------------------------------
>
>                 Key: LUCENE-847
>                 URL: https://issues.apache.org/jira/browse/LUCENE-847
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Steven Parkes
>         Assigned To: Steven Parkes
>         Attachments: LUCENE-847.txt
>
>
> If we factor the merge policy out of IndexWriter, we can make it
> pluggable, making it possible for apps to choose a custom merge policy
> and for easier experimenting with merge policy variants.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

