From: Tricia Williams
Date: Sat, 17 Nov 2007 12:00:02 -0700
To: java-user@lucene.apache.org
Subject: Re: Payloads, Tokenizers, and Filters. Oh My!

Hi Grant,

Thanks for your response! On closer inspection, the TokenFilters that cause my problem with the Payload are all from org.apache.solr.analysis rather than org.apache.lucene.analysis. I had originally thought that all the TokenFilters available through Solr's TokenFilterFactory classes were part of Lucene. But I guess there are TokenFilters specific to Solr, such as the WordDelimiterFilter, that aren't aware of Payloads. Thanks for saying exactly the right thing to make me realize that.

> I guess you just want to be careful about how big your payloads get.

Erik Hatcher suggested storing the bulky XPath strings in a table of contents field and storing only a smaller representation of the information at each token, with the intention of doing a lookup at query time to get the bulky strings.

> One of the original use cases for payloads was for doing XPath queries.

Has anyone actually completed anything with XPath queries and Payloads?

> Also, the only thing experimental about Payloads is the actual
> signature of the methods, not the need for them. If anything, I think
> you will see an expansion of payload capability in the future. Also
> note, that you will probably be interested in adding more Payload
> querying capability. And also note, I am in the process of adding the
> ability to get payloads from Spans, but I am not sure if this gets
> into 2.3 or not.

I look forward to seeing more of Payloads! I can already see how they can be extremely useful.

Thanks,
Tricia