From: Arun Kumar K
Date: Fri, 29 Mar 2013 15:07:58 +0530
Subject: Wild Card Query Performance
To: java-user

Hi Guys,

I have been testing the search-time improvement from Lucene 3.0.2 to Lucene 4.0 for wildcard queries (with at least, say, 2 characters before the wildcard, e.g. ar*).

For a 2 GB index with 4,000,000 docs, I observed the following:
Around 3X improvement, both with and without a STRING sort on a sortable field.

I guess this improvement comes from the AutomatonQuery work by Robert, which is used in wildcard queries.
As per Mike's blog, FuzzyQueries are 100X faster in 4.0, but wildcard queries have not sped up by nearly that much.
I have used the default codec and postings format.

Did I miss something, or is this the maximum improvement we can currently expect for wildcard queries?

Arun
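For reference, a minimal sketch of how a figure like "around 3X" could be measured. It assumes a small hand-rolled timing helper: `QueryTiming`, `medianNanos` and `speedup` are hypothetical names and not part of Lucene, and the warm-up and run counts are arbitrary. The task passed in would wrap the actual search call on each Lucene version.

```java
import java.util.Arrays;

// Hypothetical timing helper for comparing query latency between two versions.
// Nothing here is Lucene API; the Runnable is expected to wrap the real search call.
final class QueryTiming {
    private QueryTiming() {}

    /**
     * Runs the task `warmup` times untimed (to let the JIT and OS caches settle),
     * then `runs` times timed, and returns the (upper) median duration in nanoseconds.
     */
    static long medianNanos(Runnable task, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** Speedup factor: 3.0 means the new version needs a third of the old time. */
    static double speedup(double oldTime, double newTime) {
        if (newTime <= 0) {
            throw new IllegalArgumentException("newTime must be positive");
        }
        return oldTime / newTime;
    }
}
```

With this, the same wildcard search (e.g. ar*) would be timed once against the 3.0.2 index and once against the 4.0 index, and the two medians passed to `speedup`. Using the median rather than the mean keeps a single slow run (GC pause, cold cache) from skewing the ratio.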