lucene-java-user mailing list archives

From "Dragan Jotanovic" <Dragan.Jotano...@DIOSPHERE.com>
Subject RE: Sorting in lucene through Document boosting
Date Thu, 11 Sep 2008 16:53:47 GMT
Thanks, Mark, for the quick response,
but the problem is that I need to make incremental additions to this
index, which means that I cannot keep the documents in sorted order.
A full reindex is a costly process, and I cannot run it often.
That's why I need to find some other solution.
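
To make the constraint concrete: sorting by index order (Mark's suggestion, quoted below) only works while document IDs follow the sort field. The sketch below is plain Java, not Lucene code. It models an index as an append-only list in which a document's ID is its insertion position; the class and method names are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of an append-only index: a document's ID is its position in the list.
// Illustration only; this does not use or imitate the Lucene API.
class IndexOrderSketch {
    private final List<Integer> sortValues = new ArrayList<>();

    // Appends a document and returns its ID (its insertion position).
    int add(int sortValue) {
        sortValues.add(sortValue);
        return sortValues.size() - 1;
    }

    // Returns the sort values in document-ID order, which is what an
    // index-order "sort" hands back without any per-query sorting.
    List<Integer> inIndexOrder() {
        return new ArrayList<>(sortValues);
    }

    // True if reading documents by ascending ID yields ascending sort values.
    boolean indexOrderIsSorted() {
        for (int i = 1; i < sortValues.size(); i++) {
            if (sortValues.get(i - 1) > sortValues.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        IndexOrderSketch index = new IndexOrderSketch();

        // Initial bulk load, added in ascending order of the sort field.
        for (int v : new int[] {1, 7, 42, 100}) {
            index.add(v);
        }
        System.out.println("after bulk load: sorted = " + index.indexOrderIsSorted());

        // Incremental addition: the new document gets the next ID regardless of its value.
        index.add(5);
        System.out.println("after adding 5:  sorted = " + index.indexOrderIsSorted());
        System.out.println("index order: " + index.inIndexOrder());
    }
}
```

After the bulk load the index order matches the sort order; after the single incremental addition it no longer does, which is why only a full rebuild in sorted order would restore it.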


-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Thursday, September 11, 2008 5:38 PM
To: Dragan Jotanovic
Subject: Re: Sorting in lucene through Document boosting

You can sort by index order after adding the docs in the sorted order to
the index.

- Mark

Dragan Jotanovic wrote:
> Hi there.
>
> I am trying to implement sorting in a large index (3 million documents).
> My sort field is a simple integer with values between 1 and 100.
>
> With IndexSearcher's search(Query, Sort) everything works fine, but the
> problem is resource consumption. I need to make it use less memory
> and CPU.
>
> I thought of setting a boost value for each document at index time, equal
> to the value of my sort field, and then writing a custom Similarity class
> that would disregard Lucene scoring and take only this document boost
> into account.
>
> Has anyone tried this? Is it even possible? Do you think I would gain
> anything from it in terms of resource consumption?
>
> Regards,
>
> Dragan
>   
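
For a sense of scale on the memory side of the question, here is a back-of-the-envelope calculation in Java. It assumes the sort keeps one in-memory value per document, which is an assumption about the implementation and is not stated anywhere in this thread. The figures exclude array headers and any other per-searcher overhead, and the class name is hypothetical.

```java
// Rough sizes for holding one sort value per document in memory.
// The one-value-per-document layout is an assumption made for illustration.
class SortMemorySketch {
    // Total bytes for docCount values of bytesPerValue bytes each.
    static long bytesFor(long docCount, int bytesPerValue) {
        return docCount * bytesPerValue;
    }

    public static void main(String[] args) {
        long docs = 3_000_000L;                         // index size from the question
        long asInts = bytesFor(docs, Integer.BYTES);    // 4 bytes per value
        long asBytes = bytesFor(docs, Byte.BYTES);      // values 1..100 fit in one byte
        System.out.println("as int values:  " + asInts + " bytes");
        System.out.println("as byte values: " + asBytes + " bytes");
    }
}
```

Under that assumption, the values occupy about 12 MB as 4-byte integers and about 3 MB if each value in the range 1 to 100 is stored in a single byte.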


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

