Subject: Re: Optimization of memory usage in PriorityQueue
From: Shai Erera
To: java-dev@lucene.apache.org
Date: Mon, 22 Jun 2009 16:40:24 +0300
Message-ID: <786fde50906220640s1d91e819y9605999b1a50f9e1@mail.gmail.com>
In-Reply-To: <9ac0c6aa0906220229t1f9b0d26n53c154f719cc727d@mail.gmail.com>
References: <786fde50906220025u6756b56cnf56a4b48b403e345@mail.gmail.com> <9ac0c6aa0906220229t1f9b0d26n53c154f719cc727d@mail.gmail.com>
Reply-To: java-dev@lucene.apache.org
Mailing-List: contact java-dev-help@lucene.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=0016364c76ff27e374046cf0042d

--0016364c76ff27e374046cf0042d
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

>
> Though, are Lucene's core collectors reusable? If you did really want
> say 100K results out of each search (very unusual), it'd be nice to
> not have to throw away the Collector/PQ each time.
>

PQ has clear(), but it does not really allow you to reuse it, it just removes all the elements. So if we want to reuse a PQ, then we need to add a reset() (with a default impl of calling clear()) to both TSDC and PQ. TSDC will delegate the call to PQ, which is actually HitQueue, which will iterate on all the elements and reset them to sentinel values.

TopFieldCollector's reset() will do the same, delegating the call to its own PQ (FieldValueHitQueue), which will do nothing (call clear()).
Do you think it's worth it? If so, should we also add reset() to Collector?

Shai

On Mon, Jun 22, 2009 at 12:29 PM, Michael McCandless <lucene@mikemccandless.com> wrote:

> On Mon, Jun 22, 2009 at 3:25 AM, Shai Erera wrote:
>
> > Or ... we can do nothing, and say that one can write his own Collector, and
> > use Sun's PQ (or any other), if one has a case like "I need 10K results, but
> > I don't know how many are there, and specifically I want to optimize for the
> > case of 1 result".
>
> +1
>
> I think Lucene's current PQ is optimized for the [very] common case,
> and if someone would like to eg swap to Sun's PQ impl, the custom
> Collector API is the best route. (And, I'd love to hear back on how
> the performance compares! If Sun has a faster PQ than Lucene, we
> should fix that ;) ).
>
> Though, are Lucene's core collectors reusable? If you did really want
> say 100K results out of each search (very unusual), it'd be nice to
> not have to throw away the Collector/PQ each time.
>
> Mike
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>

--0016364c76ff27e374046cf0042d
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

> Though, are Lucene's core collectors reusable? If you did really want
> say 100K results out of each search (very unusual), it'd be nice to
> not have to throw away the Collector/PQ each time.

PQ has clear(), but it does not really allow you to reuse it, it just removes all the elements. So if we want to reuse a PQ, then we need to add a reset() (with a default impl of calling clear()) to both TSDC and PQ. TSDC will delegate the call to PQ, which is actually HitQueue, which will iterate on all the elements and reset them to sentinel values.

TopFieldCollector's reset() will do the same, delegating the call to its own PQ (FieldValueHitQueue), which will do nothing (call clear()).
Do you think it's worth it? If so, should we also add reset() to Collector?
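
To make the reset() idea concrete, here is a minimal, self-contained sketch. It is not Lucene's HitQueue or TSDC code: the class names Hit and SentinelHitQueue, the offer() and hitCount() methods, and the sentinel values chosen are all assumptions made for illustration. It only shows how a queue pre-filled with sentinels could be reused by overwriting its slots in place instead of allocating a new queue per search.

```java
/** Hypothetical stand-in for a scored hit; not Lucene's ScoreDoc. */
final class Hit {
    int doc;
    float score;

    Hit(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

/** Simplified fixed-size min-heap whose slots are pre-filled with sentinels. */
final class SentinelHitQueue {
    // Assumed sentinel values: lowest possible score, highest doc id.
    private static final int SENTINEL_DOC = Integer.MAX_VALUE;
    private static final float SENTINEL_SCORE = Float.NEGATIVE_INFINITY;

    private final Hit[] heap; // 1-based; heap[1] is always the weakest entry
    private final int size;

    SentinelHitQueue(int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        this.size = size;
        this.heap = new Hit[size + 1];
        for (int i = 1; i <= size; i++) {
            heap[i] = new Hit(SENTINEL_DOC, SENTINEL_SCORE);
        }
    }

    /** Makes the queue reusable: overwrites every slot in place, allocates nothing. */
    void reset() {
        for (int i = 1; i <= size; i++) {
            heap[i].doc = SENTINEL_DOC;
            heap[i].score = SENTINEL_SCORE;
        }
        // All entries are equal now, so the heap invariant holds trivially.
    }

    /** Returns the weakest entry currently in the queue. */
    Hit top() {
        return heap[1];
    }

    /** Overwrites the weakest entry if the new hit beats it; returns true if it was kept. */
    boolean offer(int doc, float score) {
        Hit weakest = heap[1];
        if (score <= weakest.score) {
            return false; // not competitive
        }
        weakest.doc = doc;
        weakest.score = score;
        downHeap();
        return true;
    }

    /** Counts the slots that hold real hits rather than sentinels. */
    int hitCount() {
        int count = 0;
        for (int i = 1; i <= size; i++) {
            if (heap[i].score != SENTINEL_SCORE) {
                count++;
            }
        }
        return count;
    }

    // Lower score is weaker; on equal scores the higher doc id is weaker.
    private static boolean lessThan(Hit a, Hit b) {
        if (a.score != b.score) {
            return a.score < b.score;
        }
        return a.doc > b.doc;
    }

    // Sifts the root down until both of its children are at least as strong.
    private void downHeap() {
        int i = 1;
        Hit node = heap[i];
        int j = i << 1;
        int k = j + 1;
        if (k <= size && lessThan(heap[k], heap[j])) {
            j = k;
        }
        while (j <= size && lessThan(heap[j], node)) {
            heap[i] = heap[j];
            i = j;
            j = i << 1;
            k = j + 1;
            if (k <= size && lessThan(heap[k], heap[j])) {
                j = k;
            }
        }
        heap[i] = node;
    }
}
```

In this sketch a collector would call reset() on the same queue before each new search. The cost of reset() is linear in the queue size, the same as the initial pre-fill, but it creates no new objects.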

Shai

On Mon, Jun 22, 2009 at 12:29 PM, Michael McCandless <lucene@mikemccandless.com> wrote:
On Mon, Jun 22, 2009 at 3:25 AM, Shai Erera<serera@gmail.com> wrote:

> Or ... we can do nothing, and say that one can write his own Collector, and
> use Sun's PQ (or any other), if one has a case like "I need 10K results, but
> I don't know how many are there, and specifically I want to optimize for the
> case of 1 result".

+1

I think Lucene's current PQ is optimized for the [very] common case,
and if someone would like to eg swap to Sun's PQ impl, the custom
Collector API is the best route. (And, I'd love to hear back on how
the performance compares! If Sun has a faster PQ than Lucene, we
should fix that ;) ).

Though, are Lucene's core collectors reusable? If you did really want
say 100K results out of each search (very unusual), it'd be nice to
not have to throw away the Collector/PQ each time.

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


--0016364c76ff27e374046cf0042d--