From: "Shai Erera (JIRA)"
To: dev@lucene.apache.org
Date: Thu, 11 Nov 2010 08:46:14 -0500 (EST)
Message-ID: <14053481.25971289483174131.JavaMail.jira@thor>
Subject: [jira] Created: (LUCENE-2755) Some improvements to CMS

Some improvements to CMS
------------------------

                 Key: LUCENE-2755
                 URL: https://issues.apache.org/jira/browse/LUCENE-2755
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Index
            Reporter: Shai Erera
            Assignee: Shai Erera
            Priority: Minor
             Fix For: 3.1, 4.0

While running optimize on a large index, I've noticed several things that got me to read the CMS code more
carefully, and found these issues:

* CMS may hold onto a merge if maxMergeCount is hit. As a result, the MergeThreads keep taking merges from the IndexWriter until they are exhausted, and only then does the blocked merge run. I don't think that merge needs to be blocked.
* CMS sorts merges by segment size, measured in docs rather than bytes. Since the default MP is LogByteSizeMP, and I doubt people still care about doc-based segment sizes, I think we should switch the default impl to byte-based. There are two ways to make it extensible, if we want:
** Have an overridable member/method in CMS that you can extend and override. Easy.
** Have OneMerge be comparable and let the MP determine the order (e.g. by bytes, docs, calibrate deletes etc.). Better, but it needs to tap into several places in the code, so it is riskier and more complicated.

While I'm at it, I'd like to add some documentation to CMS, since it's not very easy to read and follow.

I'll work on a patch.
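
To illustrate the doc-based vs. bytes-based ordering point, here is a minimal standalone sketch. It is not the CMS code and not the patch: PendingMerge, BY_DOCS, BY_BYTES and order() are hypothetical names made up for this example, and sorting smallest-first is an assumption. It only shows that the two size measures can rank the same merges in opposite order, and that passing the ordering in as a Comparator is one simple way to keep it pluggable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class MergeOrderSketch {

    /** Hypothetical stand-in for a pending merge; this is NOT Lucene's OneMerge. */
    static final class PendingMerge {
        final String name;
        final int docCount;      // total docs across the segments being merged
        final long sizeInBytes;  // total on-disk size of those segments

        PendingMerge(String name, int docCount, long sizeInBytes) {
            this.name = name;
            this.docCount = docCount;
            this.sizeInBytes = sizeInBytes;
        }
    }

    // Doc-based ordering, smallest first (direction is an assumption).
    static final Comparator<PendingMerge> BY_DOCS =
            Comparator.comparingInt(m -> m.docCount);

    // Bytes-based ordering, smallest first.
    static final Comparator<PendingMerge> BY_BYTES =
            Comparator.comparingLong(m -> m.sizeInBytes);

    /** Returns the merge names in the order the given comparator would schedule them. */
    static List<String> order(List<PendingMerge> merges, Comparator<PendingMerge> cmp) {
        List<PendingMerge> sorted = new ArrayList<>(merges); // leave the input untouched
        sorted.sort(cmp);
        List<String> names = new ArrayList<>();
        for (PendingMerge m : sorted) {
            names.add(m.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<PendingMerge> merges = Arrays.asList(
                new PendingMerge("manySmallDocs", 1_000_000, 50L * 1024 * 1024),
                new PendingMerge("medium", 100_000, 300L * 1024 * 1024),
                new PendingMerge("fewLargeDocs", 10_000, 2L * 1024 * 1024 * 1024));

        System.out.println("by docs:  " + order(merges, BY_DOCS));
        System.out.println("by bytes: " + order(merges, BY_BYTES));
    }
}
```

With these made-up numbers, the merge with the fewest docs is also the largest on disk, so the doc-based order treats it as the smallest merge while the bytes-based order treats it as the largest.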