From: Mark Miller
To: java-user@lucene.apache.org
Date: Mon, 21 Aug 2006 14:09:55 -0400
Subject: Re: Searching a untokenized field using SnowballAnalyzer

My guess? When you store those fields untokenized, they are indexed exactly
as given. When you use the SnowballAnalyzer with the query parser and search
those untokenized fields, your query is still run through the analyzer. As
you can imagine, an analyzed query term will not match an untokenized field
if the analyzer changes the term.

Why does this not happen with StandardAnalyzer? Most likely because
StandardAnalyzer does not modify ferrari during its processing (in fact I
know it does not), while SnowballAnalyzer probably does modify
ferrari... perhaps to ferrar.

The results:

search query: ferrari
query parser / SnowballAnalyzer: ferrar
query parser / StandardAnalyzer: ferrari

- Mark

On 8/21/06, Lorenzo Di Gaetano wrote:
>
> Hi all,
>
> I have the following problem. I use SnowballAnalyzer to index Documents
> containing tokenized and untokenized fields. But when I try to search a
> document using one of the untokenized fields (usually keywords and
> unique identifiers) it doesn't find anything...
>
> Simple example of code:
>
> doc.add(new Field("car", "ferrari", Field.Store.NO, Field.Index.UN_TOKENIZED));
>
> when I try to search it using the following search strings:
>
> car:ferrari
>
> or
>
> car:"ferrari"
>
> it finds nothing.
>
> If I use StandardAnalyzer instead of SnowballAnalyzer it finds the
> Document correctly!!! Even though the field name and the field value are
> lowercase, it seems that there is a problem with querying untokenized
> fields using SnowballAnalyzer...
> The only way I have to find my "car"
> field is using TermQueries... But I absolutely need to make complex
> queries on multiple field values at once.
>
> Please help me! Thank you in advance.
>
> Lorenzo
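
A minimal, self-contained Java sketch of the mismatch described in the reply. It is a toy model, not Lucene code: the class AnalyzerMismatchDemo, the TOY_STEMMER rule (lowercase, then drop one trailing vowel) and the KEYWORD pass-through are hypothetical names invented for illustration, and the real Snowball stemmer may treat "ferrari" differently. It only shows why a term indexed verbatim is missed when the query side rewrites the term, and found when the query side leaves it alone.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Toy model only: none of these types are Lucene classes.
class AnalyzerMismatchDemo {

    // Hypothetical stand-in for a stemming analyzer:
    // lowercases the term and drops one trailing vowel (a, e, i, o).
    static final UnaryOperator<String> TOY_STEMMER = term -> {
        String t = term.toLowerCase();
        boolean endsInVowel =
                !t.isEmpty() && "aeio".indexOf(t.charAt(t.length() - 1)) >= 0;
        return endsInVowel ? t.substring(0, t.length() - 1) : t;
    };

    // Stand-in for "do not analyze": the term passes through unchanged.
    static final UnaryOperator<String> KEYWORD = term -> term;

    // field name -> terms stored exactly as they were indexed
    private final Map<String, Set<String>> index = new HashMap<>();

    // Mimics an untokenized field: the whole value becomes a single term.
    void addUntokenized(String field, String value) {
        index.computeIfAbsent(field, f -> new HashSet<>()).add(value);
    }

    // Mimics a query parser: the query text goes through the given
    // analyzer first, then the resulting term is looked up verbatim.
    boolean search(String field, String queryText, UnaryOperator<String> queryAnalyzer) {
        String term = queryAnalyzer.apply(queryText);
        return index.getOrDefault(field, Collections.emptySet()).contains(term);
    }

    public static void main(String[] args) {
        AnalyzerMismatchDemo demo = new AnalyzerMismatchDemo();
        demo.addUntokenized("car", "ferrari");

        // Query term becomes "ferrar", index holds "ferrari": no hit.
        System.out.println(demo.search("car", "ferrari", TOY_STEMMER)); // false

        // Query term stays "ferrari": hit.
        System.out.println(demo.search("car", "ferrari", KEYWORD)); // true
    }
}
```

In Lucene itself, the usual way to get the second behavior through the query parser is, as far as I know, to wrap the analyzer in a PerFieldAnalyzerWrapper that assigns a KeywordAnalyzer to the untokenized fields. Check that both classes are available in your version before relying on this.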