From: SBS
To: java-user@lucene.apache.org
Date: Mon, 19 Sep 2011 13:27:08 -0700 (PDT)
Subject: Enabling indexing of hyphenated terms sans the hyphen

We use StandardTokenizer, and this works well, but we also need to include terms in our index which consist of hyphenated terms
with the hyphen removed. For example, if the text being indexed contains "self-induced", we need the terms "self", "induced", and "selfinduced" to be indexed.

How would I go about implementing this? We use Lucene Java 3.2.

Thanks,
-sbs
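
To make the requirement concrete, here is a minimal sketch in plain Java (no Lucene dependency) of the term expansion described above: each hyphenated word yields its parts plus the parts joined without the hyphen. The class name HyphenTermExpander and the method expand are hypothetical, chosen only for illustration, and the regex-based word splitting is a stand-in, not StandardTokenizer's actual behavior. In Lucene this logic would presumably sit in a custom TokenFilter in the analysis chain; that placement is an assumption, not something confirmed in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper illustrating the desired output terms; not a Lucene API.
public class HyphenTermExpander {

    // A run of letters/digits, optionally followed by one or more "-word" parts.
    private static final Pattern WORD =
            Pattern.compile("[\\p{L}\\p{N}]+(?:-[\\p{L}\\p{N}]+)*");

    // Returns the terms to index: every part of each word, plus the
    // hyphen-free concatenation when the word contained a hyphen.
    public static List<String> expand(String text) {
        List<String> terms = new ArrayList<String>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            String token = m.group();
            String[] parts = token.split("-");
            for (String part : parts) {
                terms.add(part);
            }
            if (parts.length > 1) {
                terms.add(token.replace("-", ""));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // prints [a, self, induced, selfinduced, error]
        System.out.println(expand("a self-induced error"));
    }
}
```

The sketch ignores token positions and offsets. In a real analyzer the joined term would likely need a position chosen so that phrase queries still behave sensibly, which is a design decision left open here.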