Message-ID: <498354A6.1080804@gmail.com>
Date: Fri, 30 Jan 2009 14:27:34 -0500
From: Mark Miller
To: solr-user@lucene.apache.org
Subject: Re: query with stemming, prefix and fuzzy?
In-Reply-To: <49835313.8060601@netcologne.de>

Gert Brinkmann wrote:
> 57971

That's a lot for a small index. The fuzzy query will enumerate all of those terms and calculate an edit distance for each one. It's not an insane amount of work, but it jibes with the slowness you see: doing that roughly 60,000 times for a query is not fast. Unfortunately, without the prefix setting, FuzzyQueries are slow with that many unique terms. Solr should definitely allow the prefix to be set.

There was talk a couple of years back about changing the default prefix value in Lucene because it's so slow, but it didn't happen. The developers decided that you could tweak it yourself if you needed to be able to scale (if you add a prefix length, the characters up to that length won't be fuzzy). Unfortunately, Solr hasn't yet exposed this option, to my knowledge.

- Mark
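
To illustrate why the prefix matters, here is a rough Python sketch of the idea, not Lucene's actual implementation. The names `edit_distance` and `fuzzy_candidates` are made up for this example, and `max_edits` stands in for Lucene's similarity threshold as a simplifying assumption. It only counts how many edit-distance computations are needed with and without a required exact prefix.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance, computed with two rolling rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_candidates(terms, query, max_edits=2, prefix_length=0):
    """Return (matches, compared) for a fuzzy lookup over `terms`.

    Hypothetical helper: terms that do not share the first `prefix_length`
    characters of `query` are skipped before any edit distance is computed.
    `compared` counts the edit-distance computations actually performed.
    """
    prefix = query[:prefix_length]
    matches = []
    compared = 0
    for term in terms:
        if not term.startswith(prefix):
            continue  # the prefix must match exactly, so no fuzzy work here
        compared += 1
        if edit_distance(term, query) <= max_edits:
            matches.append(term)
    return matches, compared


if __name__ == "__main__":
    terms = ["solr", "solar", "sole", "color", "lucene", "polar"]
    print(fuzzy_candidates(terms, "solr", prefix_length=0))
    print(fuzzy_candidates(terms, "solr", prefix_length=2))
```

With `prefix_length=0` every term in the list gets an edit-distance computation, which is the situation described above for roughly 58,000 unique terms. With a non-zero prefix only the terms sharing that prefix are compared, at the cost of missing matches that differ inside the prefix (here "color" and "polar"). A real term dictionary is sorted, so an index could seek straight to the prefix range instead of scanning linearly as this sketch does.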