Date: Thu, 20 Aug 2009 11:29:27 -0700
Subject: WordDelimiterFilter to QueryParser to MultiPhraseQuery?
From: jOhn
To: solr-user@lucene.apache.org

If you have several tokens, for example after a WordDelimiterFilter, there is almost no way NOT to trigger a MultiPhraseQuery when you have catenateWords="1" or catenateAll="1".

For example, take the title: Jokers Wild

In the index it is: jokers wild, jokers, wild, jokerswild.

When you query "jOkerswild", it becomes these tokens after the WordDelimiterFilter/LowercaseFilter:

j(0,1,positionInc=1), okerswild(1,10,positionInc=1), jokerswild(0,10,positionInc=0)

In the QueryParser, it's:

j=positionCount(1), okerswild=positionCount(2), jokerswild=positionCount(2)

Thus there is no way for jokerswild to match, because positionCount > 1 and the QueryParser will turn that into a MultiPhraseQuery instead of a BooleanQuery. This happens even though severalTokensAtSamePosition=true (because j=startOffset(0) and jokerswild=startOffset(0)).

Isn't this a bug? How could 2 tokens at the same position be treated as a MultiPhraseQuery?

-nc
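
To make the decision described above easier to follow, here is a small standalone Python sketch of it. It is not the Lucene source: the function names (classify_query, terms_by_position) are hypothetical, and the branch logic is only a model built from the positionCount and severalTokensAtSamePosition behavior described in this post. The token list is the one quoted above, reduced to (term, position increment) pairs.

```python
from typing import Dict, List, Tuple

# (term text, position increment), as quoted in the post
Token = Tuple[str, int]


def classify_query(tokens: List[Token]) -> str:
    """Model of the query-type decision described in the post (an assumption,
    not the actual QueryParser code)."""
    position_count = 0
    several_tokens_at_same_position = False
    for _term, pos_inc in tokens:
        if pos_inc != 0:
            position_count += pos_inc
        else:
            # An increment of 0 stacks this token on the previous position.
            several_tokens_at_same_position = True

    if not tokens:
        return "none"
    if len(tokens) == 1:
        return "TermQuery"
    if several_tokens_at_same_position:
        # Only one position in total: plain OR of the stacked terms.
        # More than one position: a phrase with alternatives.
        return "BooleanQuery" if position_count == 1 else "MultiPhraseQuery"
    return "PhraseQuery"


def terms_by_position(tokens: List[Token]) -> Dict[int, List[str]]:
    """Group terms by the running position that the increments produce."""
    positions: Dict[int, List[str]] = {}
    position = 0
    for term, pos_inc in tokens:
        position += pos_inc
        positions.setdefault(position, []).append(term)
    return positions


# Tokens for the query "jOkerswild" as listed in the post.
query_tokens = [("j", 1), ("okerswild", 1), ("jokerswild", 0)]

print(classify_query(query_tokens))
print(terms_by_position(query_tokens))
```

Under this model, the increments place jokerswild at the same position as okerswild (position 2), not at the position of j, so two positions are counted and the phrase branch is taken. Whether the start offsets should influence that grouping is exactly the question raised above; the sketch only uses position increments.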