Message-ID: <4A36B8A8.9070103@gmail.com>
Date: Mon, 15 Jun 2009 17:10:00 -0400
From: Mark Miller
To: java-dev@lucene.apache.org
Subject: Re: New Token API was Re: Payloads and TrieRangeQuery
In-Reply-To: <8f0ad1f30906151345l79e45792u87078a79f59b4224@mail.gmail.com>

I may do the Highlighter. It's annoying though - I'll have to break back
compat because Token is part of the public API (Fragmenter, etc).

Robert Muir wrote:
> Michael, OK, I plan on adding some tests for the analyzers that don't have any.
>
> I didn't try to migrate things such as highlighter, which are
> definitely just as important, only because I'm not familiar with that
> territory.
>
> But I think I can figure out what the various language analyzers are
> trying to do and add tests / convert the remaining ones.
>
> On Mon, Jun 15, 2009 at 4:42 PM, Michael Busch wrote:
>
>> I agree. It's my fault, the task of changing the contribs (LUCENE-1460) has
>> been assigned to me for a while now - I just haven't found the time to do it yet.
>>
>> It's great that you started the work on that! I'll try to review the patch
>> in the next couple of days and help with fixing the remaining ones. I'd like
>> to get the AttributeSource improvements patch out first. I'll try that
>> tonight.
>>
>> Michael
>>
>> On 6/15/09 1:35 PM, Robert Muir wrote:
>>
>> Michael, again I am terrible with such things myself...
>>
>> Personally I am impressed that you have the back compat, even if you
>> don't change any code at all I think some reformatting of javadocs
>> might make the situation a lot friendlier. I just listed everything
>> that came to my mind immediately.
>>
>> I guess I will also mention that one of the reasons I might not use
>> the new API is that since all filters, etc on the same chain must use
>> the same API, it's discouraging if all the contrib stuff doesn't
>> support the new API, it makes me want to just stick with the old so
>> everything will work. So I think contribs being on the new API is
>> really important otherwise no one will want to use it.
>>
>> On Mon, Jun 15, 2009 at 4:21 PM, Michael Busch wrote:
>>
>> This is excellent feedback, Robert!
>>
>> I agree this is confusing; especially having a deprecated API and only an
>> experimental one that replaces the old one. We need to change that.
>> And I don't like the *useNewAPI*() methods either. I spent a lot of time
>> thinking about backwards compatibility for this API. It's tricky to do
>> without sacrificing performance. In API patches I find myself spending more
>> time for backwards-compatibility than for the actual new feature!
>> :(
>>
>> I'll try to think about how to simplify this confusing old/new API mix.
>>
>> However, we need to make the decisions:
>> a) if we want to release this new API with 2.9,
>> b) if yes to a), if we want to remove the old API in 3.0?
>>
>> If yes to a) and no to b), then we'll have to support both APIs for a
>> presumably very long time, so we then need to have a better solution for the
>> backwards-compatibility here.
>>
>> -Michael
>>
>> On 6/15/09 1:09 PM, Robert Muir wrote:
>>
>> let me try some slightly more constructive feedback:
>>
>> new user looks at TokenStream javadocs:
>> http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc/org/apache/lucene/analysis/TokenStream.html
>> immediately they see deprecated, text in red with the words
>> "experimental", warnings in bold, the whole thing is scary!
>> due to the use of 'e.g.' the javadoc for .incrementToken() is cut off
>> in a bad way, and it's probably the most important method to a new
>> user!
>> there's also a stray bold tag gone haywire somewhere, possibly
>> .incrementToken()
>>
>> from a technical perspective, the documentation is excellent! but for
>> a new user unfamiliar with lucene, it's unclear exactly what steps to
>> take: use the scary red experimental api or the old deprecated one?
>>
>> there's also some fairly advanced stuff such as .captureState and
>> .restoreState that might be better in a different place.
>>
>> finally, the .setUseNewAPI() and .setUseNewAPIDefault() are confusing
>> [one is static, one is not], especially because it states all streams
>> and filters in one chain must use the same API, is there a way to
>> simplify this?
>>
>> i'm really terrible with javadocs myself, but perhaps we can come up
>> with a way to improve the presentation... maybe that will make the
>> difference.
>>
>> On Mon, Jun 15, 2009 at 3:45 PM, Robert Muir wrote:
>>
>> Mark, I'll see if I can get tests produced for some of those analyzers.
>>
>> as a new user of the new api myself, I think I can safely say the most
>> confusing thing about it is having the old deprecated API mixed in the
>> javadocs with it :)
>>
>> On Mon, Jun 15, 2009 at 2:53 PM, Mark Miller wrote:
>>
>> Robert Muir wrote:
>>
>> Mark, I created an issue for this.
>>
>> Thanks Robert, great idea.
>>
>> I just think, you know, converting an analyzer to the new api is really
>> not that bad.
>>
>> I don't either. I'm really just complaining about the initial readability.
>> Once you know what's up, it's not too much different. I just have found myself
>> having to refigure out what's up (a short task to be sure) over again after I
>> leave it for a while. With the old one, everything was just kind of
>> immediately self-evident.
>>
>> That makes me think new users might be a little more confused when they
>> first meet it. I'm not a new user though, so it's only a guess really.
>>
>> reverse engineering what one of them does is not necessarily obvious,
>> and is completely unrelated but necessary if they are to be migrated.
>>
>> I'd be willing to assist with some of this but I don't want to really
>> work the issue if it's gonna be a waste of time at the end of the
>> day...
>>
>> The chances of this issue being fully reverted are so remote that I really
>> wouldn't let that stop you ...
>>
>> On Mon, Jun 15, 2009 at 1:55 PM, Mark Miller wrote:
>>
>> Robert Muir wrote:
>>
>> As Lucene's contrib hasn't been fully converted either (and it's been quite
>> some time now), someone has probably heard that groan before.
>>
>> hope this doesn't sound like a complaint,
>>
>> Complaints are fine in any case. Every now and then, it might cause a little
>> rant from me or something, but please don't let that dissuade you :)
>> Who doesn't like to rant and rave now and then?
>> As long as thoughts and opinions are coming out in a non-negative way
>> (which certainly includes complaints), I think it's all good.
>>
>> but in my opinion this is
>> because many do not have any tests.
>> I converted a few of these and it's just grunt work but if there are no
>> tests, it's impossible to verify the conversion is correct.
>>
>> Thanks for pointing that out. We probably get lazy with tests, especially
>> in contrib, and this brings up a good point - we should probably push
>> for tests or write them before committing more often. Sometimes I'm sure
>> it just comes down to a tradeoff though - no resources at the time,
>> the class looked clear cut, and it was just contrib anyway. But then here
>> we are ... a healthy dose of grunt work is bad enough when you have tests to
>> check it.
>>
>> --
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>
>> --
>> Robert Muir
>> rcmuir@gmail.com
>

--
- Mark

http://www.lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org