From: Robert Muir
Date: Sun, 22 Nov 2009 13:56:57 -0500
Message-ID: <8f0ad1f30911221056s1b0a9b5kbded8553aca13d99@mail.gmail.com>
In-Reply-To: <9ac0c6aa0911221033l3d7e1246o69e6a27048c563dc@mail.gmail.com>
Subject: Re: svn commit: r883088 - in /lucene/java/branches/flex_1458/src/java/org/apache/lucene/index: TermRef.java codecs/standard/StandardTermsDictReader.java
To: java-dev@lucene.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0016e64d93c01c9e4a0478fa47fa

--0016e64d93c01c9e4a0478fa47fa
Content-Type: text/plain; charset=UTF-8

OK, I only ask because some rework of this enum could be necessary to take
advantage of the new API.

Examples include changing it to use char[] (easy) to avoid a lot of String
creation, which was unavoidable with TermEnum since it is based on String.

I will never mention this again, but it could also run on byte[] pretty
easily. However, I think high-level processing like this should operate on
UTF-16, as Java intended, even though I'm pretty positive the byte[] version
would be extremely fast.

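To make the char[] point concrete, here is a minimal sketch. It is not the LUCENE-1606 enum and does not use any Lucene API: the class and method names (CharSliceMatcher, startsWith, countWithPrefix) and the layout of terms packed back to back in one char[] are assumptions made up purely for illustration. It only shows the general idea of matching terms as (char[], offset, length) slices so that no String has to be created per term.

```java
public class CharSliceMatcher {

    /**
     * Hypothetical per-term check: returns true if the slice
     * term[off .. off+len) starts with the given prefix.
     */
    public static boolean startsWith(char[] term, int off, int len, char[] prefix) {
        if (len < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (term[off + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    /**
     * Counts the terms that start with the prefix. Terms are assumed to be
     * packed back to back in one char[] slab; ends[i] is the exclusive end
     * offset of term i. The loop never builds a String for a term.
     */
    public static int countWithPrefix(char[] slab, int[] ends, char[] prefix) {
        int count = 0;
        int start = 0;
        for (int end : ends) {
            if (startsWith(slab, start, end - start, prefix)) {
                count++;
            }
            start = end;
        }
        return count;
    }

    public static void main(String[] args) {
        // Three terms: "lucene", "lucid", "flex".
        char[] slab = "lucenelucidflex".toCharArray();
        int[] ends = {6, 11, 15};
        System.out.println(countWithPrefix(slab, ends, "luc".toCharArray()));
    }
}
```

The same slice-based shape would carry over to byte[] if the terms were kept as UTF-8, at the cost of doing the matching on bytes rather than UTF-16 code units.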
On Sun, Nov 22, 2009 at 1:33 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> I think you should keep doing all LUCENE-1606 work (and, any other
> issues) on trunk, and then we merge down to flex branch once it's
> committed?
>
> We shouldn't hold up any trunk features because flex is
> coming... merging down every so often seems manageable so far (Mark?).
>
> I'm hoping to finish flex soonish -- largely what remains (I think!)
> is better testing (correctness & performance) of the 4-way
> combinations. I think the codecs approach is generally working
> well.. the fact that we have initial Pulsing & PforDelta codecs
> working is great.
>
> Mike
>
> On Sun, Nov 22, 2009 at 1:11 PM, Robert Muir <rcmuir@gmail.com> wrote:
> > Mike, I guess what I am implying is should i even bother with lucene-1606
> > and trunk?
> >
> > or instead, should i be helping you, looking at TermsEnum, and working on
> > integrating it into flex?
> >
> > On Sun, Nov 22, 2009 at 1:05 PM, Michael McCandless
> > <lucene@mikemccandless.com> wrote:
> >>
> >> On Sun, Nov 22, 2009 at 11:31 AM, Robert Muir <rcmuir@gmail.com> wrote:
> >>
> >> >> No, not really... just an optimization I found when hunting ;)
> >> >>
> >> >> I'm working now on an AutomatonTermsEnum that uses the flex API
> >> >> directly, to test that performance.
> >> >>
> >> >
> >> > I didn't mean to 'bail out' on this
> >>
> >> You didn't 'bail out'; I 'bailed in' ;) This is the joy of open
> >> source... great big noisy Bazaar.
> >>
> >> > but I could not tell if TermsEnum was close to stabilized
> >>
> >> I think it's close; we need to do this port anyway, once automaton is
> >> committed to trunk, so really I saved Mark some work ;)
> >>
> >> > and it might be significant work to convert it?
> >>
> >> It wasn't too bad, but maybe you can look it over once I post patch
> >> and see if I messed anything up :)
> >>
> >> > Maybe benching numeric range would be easier and accomplish the same
> >> > thing?
> >>
> >> Yeah benching NRQ would be good too... many benchmarks still to run.
> >>
> >> Mike
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>
> >
> > --
> > Robert Muir
> > rcmuir@gmail.com
> >
>

--
Robert Muir
rcmuir@gmail.com

--0016e64d93c01c9e4a0478fa47fa--