From: Leonard Gestrin
To: java-user@lucene.apache.org
Date: Sun, 2 Aug 2009 19:49:27 -0700
Subject: question about indexing/searching using standardanalyzer for KEYWORD field that contains alphanumeric data

Hello,

I have a question about the KEYWORD field type and searching/updating. I am seeing strange behavior that I can't quite explain.

My index is created with StandardAnalyzer, which is used for both writing and searching. It has three fields:

userpin - alphanumeric field, stored as TEXT
documentkey - alphanumeric field, stored as TEXT
contents - text of the document, stored as TEXT

When I update a document, I create a Term to find the document by documentKey and call

org.apache.lucene.index.IndexWriter.updateDocument(term, pDocument);

to do the update. Lucene fails to find the document by that term, and I end up with duplicate documents in the index.

When I changed the index to define documentKey as KEYWORD, the updates started to work fine. However, searching for documentKey using StandardAnalyzer stopped working.

It appears that Lucene is using KeywordAnalyzer to look up the term during the update, even though the indexer is opened with StandardAnalyzer.

Sample values stored in documentKey are: "L2222FAHBHMF", "L2222FAHBHAS".

I noticed that if documentKey is a numeric value, both KeywordAnalyzer and StandardAnalyzer can find documents by it, so the reader can find and the indexer can update without any problems. With alphanumeric values I can't get both to work.

Any help is appreciated.

Thanks,
Leonard
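The symptoms above are consistent with the following toy model. It is a sketch in plain Java that uses no Lucene classes, and it rests on two assumptions that are not verified against the Lucene source here: that a TEXT field indexed through StandardAnalyzer stores a lowercased token, and that the Term passed to updateDocument is compared verbatim, without being run through the analyzer. The class and method names (TermMatchSketch, analyzed, verbatim, update, analyzedSearch) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Toy model only: a List<String> stands in for the indexed terms of one field.
class TermMatchSketch {

    // Assumption: an analyzed (TEXT) field stores a lowercased token.
    static String analyzed(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Assumption: a KEYWORD field stores the value unchanged.
    static String verbatim(String value) {
        return value;
    }

    // Mimics update-by-term: drop entries equal to the raw term, then add the
    // new entry. Returns how many entries the index holds afterwards.
    static int update(List<String> index, String rawTerm, String newIndexedValue) {
        index.removeIf(t -> t.equals(rawTerm));
        index.add(newIndexedValue);
        return index.size();
    }

    // Mimics a search whose query text passes through the lowercasing analyzer.
    static boolean analyzedSearch(List<String> index, String queryText) {
        return index.contains(analyzed(queryText));
    }

    public static void main(String[] args) {
        for (String key : new String[] {"L2222FAHBHMF", "12345"}) {
            // documentKey indexed as TEXT (analyzed).
            List<String> textIndex = new ArrayList<>();
            textIndex.add(analyzed(key));
            int textCount = update(textIndex, key, analyzed(key));

            // documentKey indexed as KEYWORD (stored verbatim).
            List<String> keywordIndex = new ArrayList<>();
            keywordIndex.add(verbatim(key));
            int keywordCount = update(keywordIndex, key, verbatim(key));

            System.out.println(key
                + " TEXT: entries=" + textCount
                + " searchable=" + analyzedSearch(textIndex, key)
                + " | KEYWORD: entries=" + keywordCount
                + " searchable=" + analyzedSearch(keywordIndex, key));
        }
    }
}
```

In this model the alphanumeric key ends up with two entries when indexed as TEXT (the duplicate) and is not found by the analyzed search when indexed as KEYWORD, while the purely numeric key behaves the same either way because lowercasing leaves it unchanged. Whether real Lucene behaves this way for a given version should be checked against its documentation.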