Date: Wed, 2 Oct 2013 09:05:28 -0500
From: Desidero
To: java-user@lucene.apache.org
Subject: Re: Query performance in Lucene 4.x
References: <012901cebef1$6ce42590$46ac70b0$@thetaphi.de>

I extended the IndexSearcher last night and set it up so it would make one
task per IndexReader instead of one per AtomicReaderContext. Performance was
pretty bad just like before, so it looks like I'm stuck merging everything
into one big segment.

I went through the documentation for the various merge policies and tried a
few different configurations, but couldn't find one that naturally caps the
number of segments at 1. The most promising options either had undocumented
limits in their setters or they didn't behave quite like I expected. I'll
spend some more time playing with it tonight, but in the meantime I don't
suppose anyone else knows a way to accomplish what I'm trying to do without
using forceMerge(1)?

On Tue, Oct 1, 2013 at 6:10 PM, Desidero wrote:

> Uwe,
>
> I was using a bounded thread pool.
>
> I don't know if the problem was the task overload or something about the
> actual efficiency of searching a single segment rather than iterating over
> multiple AtomicReaderContexts, but I'd lean toward task overload. I will do
> some testing tonight to find out for sure.
>
> Matt
> Hi,
>
> use a bounded thread pool.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Desidero [mailto:desidero@gmail.com]
> > Sent: Tuesday, October 01, 2013 11:37 PM
> > To: java-user@lucene.apache.org
> > Subject: Re: Query performance in Lucene 4.x
> >
> > For anyone who was wondering, this was actually resolved in a different
> > thread today. I misread the information in the
> > IndexSearcher(IndexReader,ExecutorService) constructor documentation - I
> > was under the impression that it was submitting a thread for each index
> > shard (MultiReader wraps 20 shards, so 20 tasks) but it was really
> > submitting a task for each segment within each shard (20 shards * ~10
> > segments = ~200 tasks) which is horrible. Since my index changes
> > infrequently, I'm using forceMerge(1) before sending out updated indexes
> > to the slave servers. Without any extra tuning (threads, # of shards,
> > etc) I've gone from ~2900 requests per minute to ~10k requests per
> > minute.
> >
> > Thanks to Adrien and Mike for the clarification and Benson for bringing
> > up the question that led to my answer.
> >
> > I'm still pretty new to Lucene so I have a lot of poking around to do,
> > but I'm going to try to implement the "virtual segment" concept that
> > Mike mentioned. It'll be really helpful for those of us who want
> > parallelism within queries and don't want to forceMerge.
> >
> >
> > On Fri, Sep 27, 2013 at 9:55 AM, Desidero wrote:
> >
> > > Erick,
> > >
> > > Thank you for responding.
> > >
> > > I ran tests using both compressed fields and uncompressed fields, and
> > > it was significantly slower with uncompressed fields. I looked into
> > > the lazy field loading per your suggestion, but we don't get any
> > > values from the returned Documents until the result set has been
> > > appropriately reduced.
> > > Since we only store one retrievable field and we always need to get
> > > it, it doesn't save any time loading it lazily.
> > >
> > > I'll try running a test without loading any fields just to see how it
> > > affects performance and let you know how that goes.
> > >
> > > Regards,
> > > Matt
> > >
> > >
> > > On Fri, Sep 27, 2013 at 8:01 AM, Erick Erickson wrote:
> > >
> > >> Hmmm, since 4.1, fields have been stored compressed by default.
> > >> I suppose it's possible that this is a result of
> > >> compressing/uncompressing.
> > >>
> > >> What happens if
> > >> 1> you enable lazy field loading
> > >> 2> don't load any fields?
> > >>
> > >> FWIW,
> > >> Erick
> > >>
> > >> On Thu, Sep 26, 2013 at 10:55 AM, Desidero wrote:
> > >> > A quick update:
> > >> >
> > >> > In order to confirm that none of the standard migration changes had
> > >> > a negative effect on performance, I ported my Lucene 4.x version
> > >> > back to Lucene 3.6.2 and kept the newer API rather than using the
> > >> > custom ParallelMultiSearcher and other deprecated methods/classes.
> > >> >
> > >> > Performance in 3.6.2 is even faster than before (~2900 requests/min
> > >> > with 4.x vs ~6200 requests/min with 3.6.2), so none of my code
> > >> > changes should be causing the difference. It seems to be something
> > >> > Lucene is doing under the covers.
> > >> >
> > >> > Again, if there's any other information I can provide to help
> > >> > determine what's going on, please let me know.
> > >> >
> > >> > Thanks,
> > >> > Matt
> > >> >
> > >> >
> > >> >
> > >> > ---------------------------------------------------------------------
> > >> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >> > For additional commands, e-mail: java-user-help@lucene.apache.org
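
P.S. For anyone following the task arithmetic in this thread (20 shards *
~10 segments = ~200 tasks per query, versus one task per shard), here is a
rough sketch of the two fan-out strategies in plain Java. It does not use
Lucene at all: TaskFanOut, uniformShards, and the one-hit-per-segment numbers
are made up for illustration, and a "segment" is just an integer hit count.
It only shows how the number of tasks handed to a bounded thread pool changes
while the combined result stays the same.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical model of the fan-out discussed in the thread; not Lucene code.
// Each shard is a list of segments, and each segment is just a hit count.
class TaskFanOut {

    // One task per segment: 20 shards * 10 segments gives 200 tasks.
    static List<Callable<Integer>> perSegmentTasks(List<List<Integer>> shards) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (List<Integer> shard : shards) {
            for (Integer hits : shard) {
                tasks.add(() -> hits);
            }
        }
        return tasks;
    }

    // One task per shard: each task walks its shard's segments sequentially.
    static List<Callable<Integer>> perShardTasks(List<List<Integer>> shards) {
        List<Callable<Integer>> tasks = new ArrayList<>();
        for (List<Integer> shard : shards) {
            tasks.add(() -> {
                int total = 0;
                for (int hits : shard) {
                    total += hits;
                }
                return total;
            });
        }
        return tasks;
    }

    // Runs all tasks on a bounded (fixed-size) pool and sums their results.
    static int runAll(List<Callable<Integer>> tasks, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int total = 0;
            for (Future<Integer> future : pool.invokeAll(tasks)) {
                total += future.get();
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    // Builds shardCount shards, each with segmentsPerShard segments of one hit.
    static List<List<Integer>> uniformShards(int shardCount, int segmentsPerShard) {
        List<List<Integer>> shards = new ArrayList<>();
        for (int s = 0; s < shardCount; s++) {
            List<Integer> segments = new ArrayList<>();
            for (int i = 0; i < segmentsPerShard; i++) {
                segments.add(1);
            }
            shards.add(segments);
        }
        return shards;
    }

    public static void main(String[] args) throws Exception {
        List<List<Integer>> shards = uniformShards(20, 10);
        System.out.println("per-segment tasks: " + perSegmentTasks(shards).size());
        System.out.println("per-shard tasks: " + perShardTasks(shards).size());
        System.out.println("per-segment total: " + runAll(perSegmentTasks(shards), 4));
        System.out.println("per-shard total: " + runAll(perShardTasks(shards), 4));
    }
}
```

This says nothing about why the per-reader variant was still slow in practice;
it is only meant to make the task counts concrete.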