From: Lance Norskog
Date: Sun, 23 Dec 2012 20:02:41 -0800
To: java-user@lucene.apache.org
Subject: Re: how to implement a TokenFilter?

Use an IDE. Find the Attribute type and show all of its subclasses. This
turns up a lot of rarely used ones and a few that are used all the time.
Then look at the source code of various TokenFilters and search for other
uses of the Attributes you found. This is generally how I figured it out.

Also, after the full Analyzer stack has been called, the caller saves the
output (I guess to codecs?). You can look at which Attributes it saves.

On 12/23/2012 06:30 PM, Xi Shen wrote:
> thanks a lot :)
>
>
> On Mon, Dec 24, 2012 at 10:22 AM, feng lu wrote:
>
>> hi Shen
>>
>> Maybe you can look at some source code in the org.apache.lucene.analysis
>> package, such as LowerCaseFilter.java, StopFilter.java and so on.
>>
>> Some common attributes include:
>>
>> offsetAtt = addAttribute(OffsetAttribute.class);
>> termAtt = addAttribute(CharTermAttribute.class);
>> typeAtt = addAttribute(TypeAttribute.class);
>>
>> Regards
>>
>>
>> On Sun, Dec 23, 2012 at 4:01 PM, Rafał Kuć wrote:
>>
>>> Hello!
>>>
>>> The simplest way is to look at the Lucene javadoc and see what
>>> implementations of the Attribute interface there are:
>>> http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/util/Attribute.html
>>>
>>> --
>>> Regards,
>>> Rafał Kuć
>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>>
>>>> Thanks, I read this already. It is useful, but it is too 'small'...
>>>> e.g. for this.charTermAttr = addAttribute(CharTermAttribute.class);
>>>> I want to know what other attributes I need in order to implement
>>>> my function. Where can I find a reference for these attributes? I tried
>>>> the Lucene & Solr wiki, but all I found is a list of the names of these
>>>> attributes, nothing about what they are capable of...
>>>
>>>
>>>> On Sat, Dec 22, 2012 at 10:37 PM, Rafał Kuć wrote:
>>>>> Hello!
>>>>>
>>>>> A small example with some explanation can be found here:
>>>>> http://solr.pl/en/2012/05/14/developing-your-own-solr-filter/
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Rafał Kuć
>>>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>>>>
>>>>>> Hi,
>>>>>> I need a guide to implement my own TokenFilter. I checked the wiki,
>>>>>> but I could not find any useful guide :(
>>>>>
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>>
>>
>> --
>> Don't Grow Old, Grow Up... :-)
>>
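
For anyone reading this thread later, the pattern behind the addAttribute(...)
lines quoted above can be sketched roughly as follows. This is a toy model in
plain Java, not Lucene code: TermAttr, SketchStream, WhitespaceSource and
LowerCaseSketchFilter are hypothetical stand-ins for CharTermAttribute,
TokenStream, a Tokenizer and a TokenFilter. It only illustrates the idea that
every stage in the chain shares one attribute instance, and that a filter's
incrementToken() first advances its input and then edits the shared attribute
in place. A real filter would extend TokenFilter and obtain its attributes via
addAttribute(...) as shown in the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these types are Lucene classes.
final class AttributeSketch {

    /** Stand-in for CharTermAttribute: one mutable term shared by the whole chain. */
    static final class TermAttr {
        final StringBuilder term = new StringBuilder();
    }

    /** Stand-in for TokenStream: advance to the next token, or return false when exhausted. */
    static abstract class SketchStream {
        final TermAttr termAtt;
        SketchStream(TermAttr termAtt) { this.termAtt = termAtt; }
        abstract boolean incrementToken();
    }

    /** Stand-in for a Tokenizer: writes each whitespace-separated word into the shared attribute. */
    static final class WhitespaceSource extends SketchStream {
        private final String[] words;
        private int next = 0;

        WhitespaceSource(String text) {
            super(new TermAttr());
            String trimmed = text.trim();
            this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        @Override
        boolean incrementToken() {
            if (next >= words.length) return false;
            termAtt.term.setLength(0);          // clear the previous token
            termAtt.term.append(words[next++]); // publish the next one
            return true;
        }
    }

    /** Stand-in for a TokenFilter: pulls from its input, then edits the shared attribute in place. */
    static final class LowerCaseSketchFilter extends SketchStream {
        private final SketchStream input;

        LowerCaseSketchFilter(SketchStream input) {
            super(input.termAtt); // same attribute instance as the upstream stage
            this.input = input;
        }

        @Override
        boolean incrementToken() {
            if (!input.incrementToken()) return false; // upstream is exhausted
            for (int i = 0; i < termAtt.term.length(); i++) {
                termAtt.term.setCharAt(i, Character.toLowerCase(termAtt.term.charAt(i)));
            }
            return true;
        }
    }

    /** Consumer: drains the chain and copies each term out of the shared attribute. */
    static List<String> run(String text) {
        SketchStream chain = new LowerCaseSketchFilter(new WhitespaceSource(text));
        List<String> out = new ArrayList<>();
        while (chain.incrementToken()) {
            out.add(chain.termAtt.term.toString());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run("Hello Lucene World"));
    }
}
```

The consumer has to copy the term out on every call because the next
incrementToken() overwrites the shared attribute. Offsets and types
(OffsetAttribute, TypeAttribute in the thread) would travel the same way, as
additional shared objects alongside the term.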