MIME-Version: 1.0
In-Reply-To: <4FEA300D.7000204@devhost.se>
References: <4FEA300D.7000204@devhost.se>
From: Rob Cecil
Date: Tue, 26 Jun 2012 16:22:36 -0600
Subject: Re: SPAM-HIGH: Disparity between API usage and Luke
To: lucene-net-user@lucene.apache.org
Content-Type: multipart/alternative; boundary=047d7b10c8a75bc42a04c3678a25

--047d7b10c8a75bc42a04c3678a25
Content-Type: text/plain; charset=ISO-8859-1

So if you want to search a non-analyzed (non-tokenized) field, you should
not use StandardAnalyzer, but something like KeywordAnalyzer?

On Tue, Jun 26, 2012 at 3:56 PM, Simon Svensson wrote:

> Luke defaults to KeywordAnalyzer which won't change your term in any way.
> The QueryParser will still break up your query, so "Name:Jack Bauer" would
> become (Name:Jack DefaultField:Bauer). I believe you can have per-field
> analyzers (KeywordAnalyzer for Id, StandardAnalyzer for everything else)
> using a PerFieldAnalyzerWrapper.
>
>
> On 2012-06-26 23:06, Lingam, ChandraMohan J wrote:
>
>> QueryParser has no knowledge of how data was indexed. For your scenario,
>> I don't believe you would be able to use Query Parser with standard
>> analyzer when data was originally indexed with Field.Index.NOT_ANALYZED
>> option.
>>
>> Interesting question is why is Luke working/finding the match? I would
>> have expected Luke to not find any matches.
>>
>>
>> -----Original Message-----
>> From: Rob Cecil [mailto:rob.cecil@gmail.com]
>> Sent: Tuesday, June 26, 2012 12:54 PM
>> To: lucene-net-user@lucene.apache.org
>> Subject: Re: SPAM-HIGH: Disparity between API usage and Luke
>>
>> I can definitely try that. I just expected QueryParser would respect the
>> case of the source string.
>> I was hoping to avoid using the Query API
>> per se, and just let the parser do the work for me.
>>
>> On Tue, Jun 26, 2012 at 1:19 PM, Lingam, ChandraMohan J
>> <chandramohan.j.lingam@intel.com> wrote:
>>
>> var query = _parser.Parse("Id:BAUER*");
>>>>>
>>> In your code, most likely, the value got converted to lower case (i.e.
>>> bauer*) by the parse statement.
>>> Whereas indexed value is in upper case as it is not analyzed (from
>>> screen shot).
>>>
>>> Can you explicitly try using prefix query?
>>>
>>>
>>>
>>>> Same results, apparently, when I use Luke 1.0.1.
>>>>
>>>> When I search for "Id:BAUER*" I get 15 hits in Luke, but in my
>>>> custom app, zero.
>>>>
>>>> On Tue, Jun 26, 2012 at 12:31 PM, Rob Vesse wrote:
>>>>
>>>>> You appear to be using Luke 3.5 which per the information on the
>>>>> Luke homepage (http://code.google.com/p/luke/)
>>>>> uses Lucene 3.5
>>>>>
>>>>> Since Lucene.Net is currently on 2.9.4 I wouldn't be surprised to
>>>>> see different behavior between the API and executing in Luke.
>>>>>
>>>>> If you use a version of Luke which more closely aligns with the
>>>>> version of
>>>>> Lucene.Net (Luke 1.0.1 uses Lucene 3.0.1 which should be close
>>>>> enough since the 2.9.x releases were previews of the 3.0.x
>>>>> releases as I understood it) what behavior do you see?
>>>>>
>>>>> Hope this helps,
>>>>>
>>>>> Rob
>>>>>
>>>>> On 6/26/12 10:50 AM, "Rob Cecil" wrote:
>>>>>
>>>>>> If I run a query against my index using QueryParser to query a field:
>>>>>>
>>>>>> var query = _parser.Parse("Id:BAUER*");
>>>>>> var topDocs = searcher.Search(query, 10);
>>>>>> Assert.AreEqual(count, topDocs.TotalHits);
>>>>>>
>>>>>> I get 0 for my TotalHits, yet in Luke, the same query phrase yields
>>>>>> 15 results, what am I doing wrong? I use the StandardAnalyzer
>>>>>> both to create the index and to query.
>>>>>>
>>>>>> The field is defined as:
>>>>>>
>>>>>> new Field("Id", myObject.Id, Field.Store.YES,
>>>>>> Field.Index.NOT_ANALYZED)
>>>>>>
>>>>>> and is a string field. The result set back from Luke looks like
>>>>>> (screencap):
>>>>>>
>>>>>> http://screencast.com/t/NooMK2Rf
>>>>>>
>>>>>> Thanks!
>>>>>>

--047d7b10c8a75bc42a04c3678a25--