Message-ID: <718970.14228.qm@web26207.mail.ukl.yahoo.com>
Date: Thu, 17 Jun 2010 13:31:34 +0000 (GMT)
From: Anna Hunecke
Subject: Strange behaviour of StandardTokenizer
To: java-user@lucene.apache.org

Hi!

I ran into a strange behaviour of the StandardTokenizer. Terms containing a '-' are tokenized differently depending on the context.
For example, the term 'nl-lt' is split into 'nl' and 'lt'. The term 'nl-lt0' is tokenized into 'nl-lt0'.
Is this a bug or a feature? Can I avoid it somehow?
I'm using Lucene 3.0.0.

Best,
Anna
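
The two observations above can be written down as a small self-contained check. This is only a sketch: `HyphenTokenSketch`, `tokenize` and `hasDigit` are hypothetical names, and the code is a stand-in that mimics the reported outputs, not Lucene's StandardTokenizer. The rule it encodes (a hyphenated term stays whole when the part after the hyphen contains a digit) is an assumption inferred from the two examples and has not been checked against the Lucene 3.0.0 source.

```java
import java.util.Arrays;
import java.util.List;

public class HyphenTokenSketch {

    // Returns true if the segment contains at least one digit.
    static boolean hasDigit(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isDigit(s.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Hypothetical stand-in, NOT Lucene's StandardTokenizer.
    // Assumed rule (inferred from the two reported examples only):
    // "left-right" stays one token when "right" contains a digit,
    // otherwise it is split at the hyphen.
    // Only terms with a single interior hyphen are modelled;
    // anything else is returned unchanged.
    static List<String> tokenize(String term) {
        int dash = term.indexOf('-');
        boolean singleInteriorHyphen = dash > 0
                && dash < term.length() - 1
                && term.indexOf('-', dash + 1) < 0;
        if (!singleInteriorHyphen) {
            return Arrays.asList(term);
        }
        String left = term.substring(0, dash);
        String right = term.substring(dash + 1);
        if (hasDigit(right)) {
            return Arrays.asList(term);
        }
        return Arrays.asList(left, right);
    }

    public static void main(String[] args) {
        System.out.println(tokenize("nl-lt"));   // prints [nl, lt]
        System.out.println(tokenize("nl-lt0"));  // prints [nl-lt0]
    }
}
```

To confirm the behaviour against the real library, the same two inputs would have to be fed to StandardTokenizer itself with Lucene 3.0.0 on the classpath; the sketch above only pins down the expected outputs for comparison.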