From: Shai Erera
To: dev@lucene.apache.org
Date: Thu, 2 Dec 2010 13:19:05 +0200
Subject: Re: Consolidate MP and LMP

> You can't remove it on 3x, it's used by a host of deprecated methods
> that access LMP's settings through IW.

Remove means deprecate in 3x and remove in trunk. Should have been more
clear about that.

> For LMP it
> just returns the value of getUseCompoundFile (that is, until Mike's
> patch that switches off compounding for large segments).

As far as I can tell, getUseCompoundFile returns the same in trunk too.
The noCFS setting is not applied there.

Shai

On Thu, Dec 2, 2010 at 1:14 PM, Michael McCandless
<lucene@mikemccandless.com> wrote:

> On Thu, Dec 2, 2010 at 4:43 AM, Simon Willnauer
> <simon.willnauer@googlemail.com> wrote:
>
> > During the work on Column Stride Fields I was actually thinking that
> > Compound vs. Non-Compound should not be a global decision since we now
> > have codecs and each codec should use its own way of writing files.
> > Maybe it would make things way easier if we expose CFS to codecs and
> > let them decide what to do. I can imagine that I want to use CFS for
> > some of the codecs like Column Stride or fields that are not used for
> > searches but keep individual files per codec. Just an idea....
>
> +1!
>
> This would be a nice simplification.
>
> EG, it's bizarre today that on flushing a new segment, which has
> nothing to do with merging, we consult the MP to decide if we need CFS
> or not.
>
> Also, it's awkward we have getCF and also getCFDocStore. In the
> future (docvalues) we may also want to separately build CFS for those
> files, or not.
>
> Making all these decisions private to the codec makes great sense.
> It's then free to CFS however it wants to. But, the codec would need
> wider context, I think the full SegmentInfos, to base its decision on.
> EG, LMP now conditionally builds CFS only if the segment is
> "smallish" relative to total index size.
>
> Mike
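The size-relative rule Mike mentions for LMP (build CFS only when a segment is "smallish" compared with the whole index) can be sketched roughly as below. This is only an illustration under stated assumptions, not Lucene's code: the class `CfsDecisionSketch`, the method `useCompoundFile`, and the `noCfsRatio` threshold are hypothetical names, and the segment sizes stand in for whatever a codec would read from the full SegmentInfos.

```java
/** Hypothetical sketch of a size-relative CFS decision; not Lucene's actual API. */
final class CfsDecisionSketch {

    /**
     * Decide whether a segment should be written as a compound file.
     *
     * @param segmentBytes    size of the segment being written
     * @param allSegmentBytes sizes of every segment in the index, including this one
     * @param noCfsRatio      assumed threshold in [0, 1]; a segment larger than this
     *                        fraction of the total index is left as individual files
     */
    static boolean useCompoundFile(long segmentBytes, long[] allSegmentBytes, double noCfsRatio) {
        long totalBytes = 0;
        for (long size : allSegmentBytes) {
            totalBytes += size;
        }
        if (totalBytes == 0) {
            // Empty index: nothing to compare against, so default to CFS.
            return true;
        }
        // "Smallish" means at most noCfsRatio of the total index size.
        return segmentBytes <= noCfsRatio * totalBytes;
    }

    public static void main(String[] args) {
        long[] sizes = {10L, 40L, 950L};
        // Small segment relative to the index: compound file.
        System.out.println(useCompoundFile(10L, sizes, 0.1));
        // Segment that dominates the index: individual files.
        System.out.println(useCompoundFile(950L, sizes, 0.1));
    }
}
```

The point of passing every segment size in is the one raised in the thread: whoever makes the decision, merge policy or codec, needs index-wide context rather than just the segment being written.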