lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Rutherglen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1750) Create a MergePolicy that limits the maximum size of it's segments
Date Tue, 21 Jul 2009 17:40:14 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733740#action_12733740
] 

Jason Rutherglen commented on LUCENE-1750:
------------------------------------------

{quote}We cannot merge A w/ D, because the doc IDs need to be in
increasing order and retain the order they were added to the
index?{quote}

The segments are merged in order because they may be sharing doc
stores. I think we can refine this to only merge contiguous
segments that are sharing doc stores, otherwise we can merge
non-contiguous segments which continues with LUCENE-1076? 

When the shards are in their own directories (which is how Katta
works), the building process is somewhat easier as we're dealing
with a separate segmentInfos for each shard. I am not sure how
Solr would handle an index sharded into multiple directories. 

> Create a MergePolicy that limits the maximum size of it's segments
> ------------------------------------------------------------------
>
>                 Key: LUCENE-1750
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1750
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.4.1
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-1750.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> Basically I'm trying to create largish 2-4GB shards using
> LogByteSizeMergePolicy, however I've found in the attached unit
> test segments that exceed maxMergeMB.
> The goal is for segments to be merged up to 2GB, then all
> merging to that segment stops, and then another 2GB segment is
> created. This helps when replicating in Solr where if a single
> optimized 60GB segment is created, the machine stops working due
> to IO and CPU starvation. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message