Date: Fri, 11 May 2007 21:03:55 -0400
From: "Yonik Seeley"
To: java-dev@lucene.apache.org
Subject: Re: Token/Payload API

On 5/11/07, Grant Ingersoll wrote:
> On May 11, 2007, at 4:31 PM, Yonik Seeley wrote:
>
> > I hadn't kept up with the payload discussion/patch, and just got
> > around to looking at Token.
> >
> > public class Token implements Cloneable {
> >   String termText;        // the text of the term
> >   int startOffset;        // start in source text
> >   int endOffset;          // end in source text
> >   String type = "word";   // lexical type
> >
> >   Payload payload;
> >
> >
> > It almost feels like we are going down the road of Field, adding more
> > and more to the base class instead of using some other mechanism like
> > inheritance.
>
> So PayloadToken would be more inline with what you are thinking?
> Then there becomes the need to do instanceof to determine when you
> have payloads?

I don't have a good answer for that one... a real inheritance solution
would be invasive to the indexing code and probably not worth it at
this point. There is also the problem of mixing different (future)
token properties... what you really want are mixins or something.
At this point, just forget I brought it up ;-)

> > A bigger problem, however, is that payloads will be lost by filters
> > that aren't payload aware, and create new Tokens. We had the same
> > problem with position increments being lost.
> >
> > For this latter problem, I think the answer is to *not* create new
> > tokens, and make all the properties of Token settable.
>
> This seems reasonable. I never quite understood the need to create
> new tokens. The other option may be to use a copy constructor, but
> again, that seems wasteful.

We have clone() when new tokens need to be created (that's needed when
filters create more tokens, like synonym injection, etc). Since Token
could be subclassed, that's probably the right approach.

-Yonik
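
A minimal sketch of the two cases discussed in this message: a filter
that modifies a token in place, so properties it does not know about
(payload, position increment) survive, and a synonym-style filter that
uses clone() when it really needs an extra token. Everything here is
hypothetical and for illustration only: SketchToken, its setters,
lowerCase() and injectSynonym() are stand-ins, not the actual Lucene
Token or TokenFilter API, and a byte[] stands in for Payload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TokenSketch {

    /** Hypothetical stand-in for Token; NOT the real Lucene class. */
    static class SketchToken implements Cloneable {
        String termText;            // the text of the term
        int startOffset;            // start in source text
        int endOffset;              // end in source text
        String type = "word";       // lexical type
        int positionIncrement = 1;  // assumed default of one position per token
        byte[] payload;             // assumption: byte[] stands in for Payload

        SketchToken(String termText, int startOffset, int endOffset) {
            this.termText = termText;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        void setTermText(String termText) { this.termText = termText; }

        void setPositionIncrement(int inc) { this.positionIncrement = inc; }

        @Override
        public SketchToken clone() {
            try {
                // Shallow copy: fields added by subclasses are copied too,
                // and the payload array is shared with the original.
                return (SketchToken) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e); // cannot happen: we are Cloneable
            }
        }
    }

    /** In-place filter: only the term text changes; nothing else is lost. */
    static SketchToken lowerCase(SketchToken t) {
        t.setTermText(t.termText.toLowerCase(Locale.ROOT));
        return t;
    }

    /** Synonym injection: the extra token is a clone of the original. */
    static List<SketchToken> injectSynonym(SketchToken t, String synonym) {
        List<SketchToken> out = new ArrayList<>();
        out.add(t);
        SketchToken syn = t.clone();
        syn.setTermText(synonym);
        syn.setPositionIncrement(0); // same position as the original token
        out.add(syn);
        return out;
    }

    public static void main(String[] args) {
        SketchToken t = new SketchToken("Fast", 0, 4);
        t.payload = new byte[] {42};
        lowerCase(t);
        for (SketchToken tok : injectSynonym(t, "quick")) {
            System.out.println(tok.termText
                + " inc=" + tok.positionIncrement
                + " payload=" + tok.payload[0]);
        }
    }
}
```

Because clone() goes through Object.clone(), a subclass that adds its
own fields gets them copied without the filter knowing about them,
which a copy constructor on the base class would not do.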