Date: Mon, 6 Dec 2010 16:29:49 -0800
Subject: Re: FieldCache usage for custom field collapse in solr 1.4
From: "Adam H."
To: dev@lucene.apache.org, yonik@lucidimagination.com

One more comment/question -

Having looked at the Solr stats panel, I do not see detailed memory usage
for the field I'm collapsing on in the Lucene FieldCache entries listing.

As I understand it (after reading through this ticket:
https://issues.apache.org/jira/browse/SOLR-1292 ), this means that it's
not an 'insanity' instance, so I am not actually using double the memory;
the field is in the FieldCache only at the whole-index level.

This got me thinking: if I'm not using any segment-level field caching for
this field, there's no reason not to use an index-wide one, as long as I
can guarantee that's the only use of this field in the FieldCache. Is this
correct?

Thanks again for helping me out with this delicate subject :)

Adam

On Mon, Dec 6, 2010 at 3:21 PM, Adam H. wrote:

> Ah! Just so I can get cracking on this, can you be a little more
> specific? E.g.:
>
> In my component implementation, which runs in the request handling after
> the normal QueryComponent, how would I access the specific field value
> for the documents that were retrieved?
>
> I.e., how would it fit into code like this, if at all:
>
>     // docList is the matching documents for given offset/rows/query
>     DocIterator it = docList.iterator();
>
>     while (it.hasNext()) {
>         docId = it.next();
>         score = it.score();
>
>         // this would've worked if this was a stored field:
>         // reader.document(docId).get(fieldName)
>         ??
>     }
>
> On Mon, Dec 6, 2010 at 2:57 PM, Yonik Seeley wrote:
>
>> On Mon, Dec 6, 2010 at 5:48 PM, Adam H. wrote:
>> > In other words, using a per-segment fieldcache collection as a
>> > post-processing step (e.g. after QueryComponent did its collection)
>> > does not seem at all trivial, if at all possible (is it possible?)
>>
>> Sure, it's possible, and not too hard (as long as no sort field involves
>> score).
>> Just instruct the QueryComponent to retrieve the set of all matching
>> documents, then you can use that to run them through whatever
>> collectors you want again. I've been meaning to implement this
>> optimization to field collapsing...
>>
>> Depending on the details, either replacing the QueryComponent with
>> your custom one, or inserting an additional component after the query
>> component could make sense.
>>
>> -Yonik
>> http://www.lucidimagination.com
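A note for readers following the thread: the per-segment lookup being asked about comes down to mapping each top-level document ID from the DocList to a segment and a segment-local ID, because a top-level ID equals the segment's doc base plus the local ID. The sketch below is not from the thread and does not use any Solr or Lucene API. Plain arrays stand in for per-segment FieldCache entries, and every name in it (SegmentFieldLookup, segmentOf, valueOf) is hypothetical. It assumes every segment holds at least one document.

```java
import java.util.Arrays;

/**
 * Illustration only: plain String arrays stand in for per-segment
 * FieldCache entries. All names are hypothetical, not Solr/Lucene APIs.
 */
final class SegmentFieldLookup {
    // First top-level doc ID of each segment, in ascending order.
    private final int[] docBases;
    // segmentValues[i][localDoc] is the field value of that document.
    private final String[][] segmentValues;
    // Total number of documents across all segments.
    private final int maxDoc;

    SegmentFieldLookup(String[][] segmentValues) {
        this.segmentValues = segmentValues;
        this.docBases = new int[segmentValues.length];
        int base = 0;
        for (int i = 0; i < segmentValues.length; i++) {
            docBases[i] = base;
            base += segmentValues[i].length; // assumed to be > 0
        }
        this.maxDoc = base;
    }

    /** Index of the segment that contains the given top-level doc ID. */
    int segmentOf(int docId) {
        if (docId < 0 || docId >= maxDoc) {
            throw new IndexOutOfBoundsException("docId out of range: " + docId);
        }
        int idx = Arrays.binarySearch(docBases, docId);
        // On a miss binarySearch returns -(insertionPoint) - 1, and the
        // owning segment is the one just before the insertion point.
        return idx >= 0 ? idx : -idx - 2;
    }

    /** Field value for a top-level doc ID, read from its segment's array. */
    String valueOf(int docId) {
        int seg = segmentOf(docId);
        return segmentValues[seg][docId - docBases[seg]];
    }

    public static void main(String[] args) {
        SegmentFieldLookup lookup = new SegmentFieldLookup(new String[][] {
            {"a", "b", "a"}, // segment 0: top-level docs 0..2
            {"c"},           // segment 1: top-level doc 3
            {"b", "c"}       // segment 2: top-level docs 4..5
        });
        // Stand-in for iterating a DocList of matching top-level doc IDs.
        for (int docId : new int[] {0, 2, 3, 5}) {
            System.out.println(docId + " -> " + lookup.valueOf(docId));
        }
    }
}
```

The index-wide alternative raised in the message corresponds to a single array indexed directly by the top-level ID, which skips the binary search but holds values for the whole index in one entry.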