From: Yang
Date: Fri, 25 May 2012 17:47:51 -0700
Subject: Re: lucene (search) performance tuning
To: java-user@lucene.apache.org

thanks a lot guys

On Tue, May 22, 2012 at 1:34 AM, Ian Lea wrote:

> Lots of good tips in
> http://wiki.apache.org/lucene-java/ImproveSearchingSpeed, linked from
> the FAQ.
>
> --
> Ian.
>
> On Tue, May 22, 2012 at 2:08 AM, Li Li wrote:
> > something wrong when writing in my android client.
> > if RAMDirectory does not help, i think the bottleneck is cpu. you may try
> > to tune the jvm but i do not expect much improvement.
> > the best one is splitting your index into 2 or more smaller ones.
> > you can then use Solr's distributed searching.
> > if the cpu is not fully used, you can do this in one physical machine
> >
> > On 2012-5-22 at 8:50 AM, "Li Li" wrote:
> >>
> >> On 2012-5-22 at 4:59 AM, "Yang" wrote:
> >> >
> >> > I'm trying to make my search faster. right now a query like
> >> >
> >> > name:Joe Moe Pizza address:77 main street city:San Francisco
> >> >is this a conjunction query or a disjunction query?
> >>
> >> > in an index with 20mil such short business descriptions (total size
> >> > about 3GB) takes about 100--200ms.
> >> >20m is not a small size, how many results for a query on average?
> >>
> >> > I profiled the query, most time is spent in TermScorer.score(), as is
> >> > shown by the attached YourKit screenshot.
> >> >that's true, for a query, matching and scoring is very time consuming
> >> >and cpu intensive. another one is io for reading postings.
> >>
> >> >
> >> > I tried loading the index onto tmpfs (in-memory filesystem), and also
> >> > tried RAMDirectory, neither helps much.
> >> >if that is true. it seems that io is not the
> >> > I am reading
> >> > http://www.cnlp.org/presentations/slides/AdvancedLuceneEU.pdf
> >> > it mentions
> >> > Size
> >> > – Stopword removal
> >> > – Stemming
> >> >   • Lucene has a number of stemmers available
> >> >   • Light versus Aggressive
> >> >   • May prevent fine-grained matches in some cases
> >> > – Not a linear factor (usually) due to index compression
> >> >
> >> > so for "stopword removal", I'm already using the standard analyzer, so
> >> > stop word removal is already included, right?
> >> >
> >> > also generally any other tricks to try for reducing the search latency?
> >> >
> >> > Thanks!
> >> > Yang
> >> >
> >> > ---------------------------------------------------------------------
> >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> > For additional commands, e-mail: java-user-help@lucene.apache.org
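
On the conjunction-versus-disjunction question quoted above: in a straightforward implementation, a disjunctive (OR) query has to visit and score every document that contains any of the terms, while a conjunctive (AND) query only scores documents that contain all of them. That would be consistent with most of the time showing up in scoring. Below is a rough sketch in plain Java using a toy in-memory inverted index. It is not Lucene's API or its postings format; the class name ToyPostings, the helper methods, and the sample documents are all made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ToyPostings {
    // Toy inverted index: term -> sorted set of ids of documents containing it.
    // Hypothetical structure for illustration only, not how Lucene stores postings.
    public static Map<String, TreeSet<Integer>> buildIndex(List<String> docs) {
        Map<String, TreeSet<Integer>> index = new HashMap<>();
        for (int docId = 0; docId < docs.size(); docId++) {
            for (String term : docs.get(docId).toLowerCase().split("\\s+")) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
        return index;
    }

    // Disjunction (OR): the union of all postings lists. Every document that
    // contains at least one query term becomes a candidate that must be scored.
    public static Set<Integer> matchAny(Map<String, TreeSet<Integer>> index, List<String> terms) {
        Set<Integer> result = new TreeSet<>();
        for (String term : terms) {
            result.addAll(index.getOrDefault(term, new TreeSet<>()));
        }
        return result;
    }

    // Conjunction (AND): the intersection of all postings lists. Only documents
    // that contain every query term remain as candidates.
    public static Set<Integer> matchAll(Map<String, TreeSet<Integer>> index, List<String> terms) {
        Set<Integer> result = null;
        for (String term : terms) {
            TreeSet<Integer> postings = index.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(postings);
            } else {
                result.retainAll(postings);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        // Made-up sample data in the spirit of the business descriptions in the thread.
        List<String> docs = Arrays.asList(
            "joe moe pizza 77 main street san francisco",
            "joe coffee 12 main street oakland",
            "pizza palace 5 market street san francisco",
            "moe books 77 elm avenue san jose");
        Map<String, TreeSet<Integer>> index = buildIndex(docs);
        List<String> query = Arrays.asList("joe", "moe", "pizza");
        System.out.println("OR  candidates: " + matchAny(index, query));
        System.out.println("AND candidates: " + matchAll(index, query));
    }
}
```

With this sample data the OR form yields all four documents as candidates and the AND form yields only document 0. On a 20-million-document index, common terms such as "main", "street" or "san" would make the OR candidate set very large, so requiring terms (where the use case allows it) may be worth measuring.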