From: Rohit Banga
Date: Sat, 13 Feb 2010 11:33:01 +0530
Subject: Re: read more tokens during analysis
To: java-user@lucene.apache.org

Thanks, will try the code and get back if I have any problems.

Rohit Banga

On Fri, Feb 12, 2010 at 10:38 PM, Ahmet Arslan wrote:

> > i want to consider the current word & the next as a single term.
> >
> > when analyzing "Arun Kumar"
> >
> > i want my analyzer to consider "Arun", "Arun Kumar" as synonyms.
> >
> > in the tokenstream method, how do we read the next token "Kumar"
> > i am going through the setPositionIncrements method for considering
> > them as synonyms, but i don't understand how to implement look ahead
> > in the analyzer.
>
> Can we say that you want to implement a synonym filter that takes a list
> of custom synonyms?
> If yes why not use Solr's SynonymFilterFactory[1] that does this
> automatically? It can handle multi-words synonym like "Arun", "Arun Kumar"
> I can share the code to integrate it into Lucene if you want.
>
> [1]
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
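
For readers following the thread, the look-ahead asked about in the quoted question can be sketched roughly as below. This is only an illustration in plain Java, not Lucene code: `LookAheadSketch`, `Tok`, and `withNextWord` are hypothetical names made up for this sketch, and the input is assumed to be already split into words. It shows the idea of peeking at the next word and emitting the two-word phrase with a position increment of 0, so that it is stacked on the same position as the current word, which is how "Arun" and "Arun Kumar" would end up treated like synonyms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java model of look-ahead; these types are hypothetical, not Lucene classes.
final class LookAheadSketch {

    /** A term plus its position increment (0 = same position as the previous token). */
    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /**
     * Emits every word at its own position and, when a next word exists,
     * the two-word phrase stacked at the same position (increment 0).
     */
    static List<Tok> withNextWord(List<String> words) {
        List<Tok> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            String current = words.get(i);
            out.add(new Tok(current, 1));
            // Look ahead: peek at the following word without consuming it.
            if (i + 1 < words.size()) {
                out.add(new Tok(current + " " + words.get(i + 1), 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [Arun(+1), Arun Kumar(+0), Kumar(+1)]
        System.out.println(withNextWord(Arrays.asList("Arun", "Kumar")));
    }
}
```

In an actual token filter the same idea would presumably mean buffering the upcoming token before returning the current one, then setting the position increment of the combined term to 0; the details depend on the Lucene version in use.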