From: "Rafael C. de Almeida"
To: java-user@lucene.apache.org
Subject: Unique tokens analyzer
Date: Tue, 14 Oct 2008 13:38:47 -0200

Hello,

Is there an analyzer that tokenizes a stream so that it contains no repeated tokens? I have a keyword field on my documents, so if a keyword already appears in the list there is no point in having it appear again.

Does such an analyzer make sense? Or does indexing repeated keywords in the same field not hurt performance or search quality in any way?
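
To make the behaviour I am asking about concrete, here is a minimal sketch in plain Java. It does not use any Lucene API: the class name UniqueTokens, the method unique, and the whitespace splitting are all assumptions made only for illustration. An actual analyzer would have to do the equivalent inside its token stream.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class UniqueTokens {
    // Hypothetical helper, not part of Lucene: splits the text on
    // whitespace and keeps only the first occurrence of each token,
    // preserving the original order.
    public static List<String> unique(String text) {
        Set<String> seen = new LinkedHashSet<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                seen.add(token); // a repeated token is ignored by the set
            }
        }
        return new ArrayList<>(seen);
    }
}
```

With this sketch, a keyword field containing "lucene search lucene index search" would be reduced to the tokens lucene, search, index.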