lucene-dev mailing list archives

From "Karsten Konrad" <Karsten.Kon...@xtramind.com>
Subject AW: AW: Better way to Sort by Date
Date Sun, 12 Oct 2003 08:43:20 GMT

Hi again,

You only have to rebuild if the document you are adding is older
than the last (most recent) one already in the index. For
some applications, such as newswire documents or email
indexing, the documents arrive in (almost) the right order
anyway. That could hold for your filesystem as well, but this
method can only handle one date.
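
A minimal sketch of the ordering scheme discussed in this thread (a Comparable sorter object per document, collected in a TreeSet, then indexed oldest first). The names DatedDoc, DateOrderedIndexing, insertionOrder and needsRebuild are hypothetical and not part of Lucene. The actual indexing call is left out, and the tie-break on the path is an added assumption so that documents with identical dates are not lost.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Hypothetical helper: pairs a document location with the date used for ordering. */
final class DatedDoc implements Comparable<DatedDoc> {
    final String path;      // link to the document (filename, URL, ...)
    final long dateMillis;  // date taken from wherever it is available

    DatedDoc(String path, long dateMillis) {
        this.path = path;
        this.dateMillis = dateMillis;
    }

    @Override
    public int compareTo(DatedDoc other) {
        int byDate = Long.compare(dateMillis, other.dateMillis);
        // Tie-break on path: a TreeSet drops elements that compare as equal,
        // so two documents with the same date must not compare as 0.
        return byDate != 0 ? byDate : path.compareTo(other.path);
    }
}

final class DateOrderedIndexing {
    /** Returns the document paths oldest first, i.e. the order in which to index them. */
    static List<String> insertionOrder(Iterable<DatedDoc> docs) {
        TreeSet<DatedDoc> sorted = new TreeSet<>();
        for (DatedDoc d : docs) {
            sorted.add(d);
        }
        List<String> order = new ArrayList<>();
        for (DatedDoc d : sorted) {
            order.add(d.path);
        }
        return order;
    }

    /** A new document can be appended unless it is older than the newest indexed one. */
    static boolean needsRebuild(long newestIndexedMillis, long newDocMillis) {
        return newDocMillis < newestIndexedMillis;
    }
}
```

Each path returned by insertionOrder would then be added to a fresh index in that order. needsRebuild expresses the rule above: appending is safe as long as the new document is not older than the newest one already indexed.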

Date sorting in Lucene is inefficient in general: you have
to access a date field of each document (which also loads
the rest of the document!) and then sort the search result
using this information. With, e.g., 1000 documents in your
search result, such an operation can easily take several
seconds.
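
For contrast, with an index built in date order, the sort needs only the document numbers of the hits, and no stored fields are loaded. A minimal sketch, assuming the hit document numbers have already been collected into an int array and that a larger document number means a later date. DocNumberSort and newestFirst are hypothetical names, not Lucene API.

```java
import java.util.Arrays;

final class DocNumberSort {
    /**
     * Returns the hit document numbers newest first, assuming the index was
     * built oldest first so that a larger document number means a later date.
     */
    static int[] newestFirst(int[] hitDocNumbers) {
        int[] sorted = hitDocNumbers.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);                  // ascending = oldest first
        // Reverse in place to get newest first.
        for (int i = 0, j = sorted.length - 1; i < j; i++, j--) {
            int tmp = sorted[i];
            sorted[i] = sorted[j];
            sorted[j] = tmp;
        }
        return sorted;
    }
}
```

Sorting 1000 ints this way is cheap compared with loading 1000 documents to read a date field, but it only holds while the insertion order really matches the date order.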

What works far better is to restrict the search result
by using a DateRestriction, although this may still take
quite long on a large index. I believe that sorting by date
is still a somewhat unsolved problem in Lucene, unless one
of our Lucene gurus has some trick up his sleeve that we do
not know about yet :)

Kind regards from Saarbrücken

--

Dr.-Ing. Karsten Konrad
Head of Artificial Intelligence Lab

XtraMind Technologies GmbH
Stuhlsatzenhausweg 3
D-66123 Saarbrücken
Phone: +49 (681) 3025113
Fax: +49 (681) 3025109
konrad@xtramind.com
www.xtramind.com

Visit us at SYSTEMS!
20-24 October 2003, Neue Messe München
Hall A1, Booth 518



-----Original Message-----
From: none none [mailto:korfut@lycos.com]
Sent: Sunday, 12 October 2003 09:30
To: Lucene Developers List
Subject: Re: AW: Better way to Sort by Date


Hi,
I don't have a DB as a source; my documents are on the filesystem. Also, does what you are saying
mean I have to rebuild the index every time I add a document? I think the idea of writing a
sorted index is kind of hard. Also, what if I have 2 dates and a file size as well? I am looking
for a more standard way to do that.
Thanks

--

--------- Original Message ---------

DATE: Fri, 10 Oct 2003 16:47:34
From: "Karsten Konrad" <Karsten.Konrad@xtramind.com>
To: "Lucene Developers List" <lucene-dev@jakarta.apache.org>,<korfut@lycos.com>
Cc: 

>
>Hello,
>
>>>
>ok, good idea, but how can i do that?
>>>
>
>That really depends on where you get your documents from. Sorry, as 
>this is quite an unspecific problem, I cannot give specific code.
>
>If, for instance, the documents come from a database, you could
>use an SQL query to compute a list of document IDs
>sorted by date. Then you can create a new index and 
>insert the documents one by one in the order given by the list.
>
>In general, if no such external sorting mechanism exists, I would
>make a Java sorter class that holds some link to the document (such as 
>its filename or URL or whatever) and its date (from wherever you have 
>that info). Make sure that the class implements Comparable and write
>the compareTo method such that it compares the dates appropriately. 
>Then for each document you have, create a sorter object and put it 
>into a TreeSet. After you have added all sorter objects, you have a 
>sorted collection of the documents and can insert them in order into a 
>Lucene index.
>
>Clear?
>
>Karsten
>
>
>
>-----Original Message-----
>From: none none [mailto:korfut@lycos.com]
>Sent: Friday, 10 October 2003 16:21
>To: Lucene Developers List
>Subject: Re: Better way to Sort by Date
>
>
>ok, good idea, but how can i do that?
>any examples?
>thank you,
>
>--
>
>--------- Original Message ---------
>
>DATE: Fri, 10 Oct 2003 09:18:33
>From: "Karsten Konrad" <Karsten.Konrad@xtramind.com>
>To: "Lucene Developers List" 
><lucene-dev@jakarta.apache.org>,<korfut@lycos.com>
>Cc: 
>
>>
>>Hi,
>>
>>the fastest way would be to build your index such that the documents
>>are inserted in the order of their date. You can then sort a search 
>>result very quickly by date by sorting the document numbers in the 
>>result.
>>
>>Kind regards from Saarbrücken
>>
>>--
>>
>>Dr.-Ing. Karsten Konrad
>>Head of Artificial Intelligence Lab
>>
>>XtraMind Technologies GmbH
>>Stuhlsatzenhausweg 3
>>D-66123 Saarbrücken
>>Phone: +49 (681) 3025113
>>Fax: +49 (681) 3025109
>>konrad@xtramind.com
>>www.xtramind.com
>>
>>Visit us at SYSTEMS!
>>20-24 October 2003, Neue Messe München
>>Hall A1, Booth 518
>>
>>
>>
>>
>>-----Original Message-----
>>From: none none [mailto:korfut@lycos.com]
>>Sent: Friday, 10 October 2003 06:50
>>To: lucene-dev@jakarta.apache.org
>>Subject: Better way to Sort by Date
>>
>>
>>hi all,
>>what is the fastest way to sort results by date?
>>anybody implemented it yet? any good performance?
>>
>>thank you,
>>Korfut.
>>
>>
>>
>>____________________________________________________________
>>Get advanced SPAM filtering on Webmail or POP Mail ... Get Lycos Mail!
>>http://login.mail.lycos.com/r/referral?aid=27005
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>>


