lucene-java-user mailing list archives

From Sam Jiang <sam.ji...@karoshealth.com>
Subject Re: searching / sorting on timestamp and update efficiency
Date Thu, 22 Sep 2011 18:27:39 GMT
Hi Jason

Thank you for the quick reply. This is exactly what I was looking for =D
One more question, though: the NumericRangeQuery documentation says the class
is functionally equivalent to NumericRangeFilter. Is there any reason to
prefer one over the other?

thanks

On Thu, Sep 22, 2011 at 10:29 AM, Sendros, Jason <
Jason.Sendros@verizonwireless.com> wrote:

> Storing the date as a long and then searching with NumericRangeQuery will
> provide you with exactly what you're looking for. This is an efficient
> search solution for numeric data.
>
> Optimize() will reduce the size of your index and improve search time at
> the cost of a large burst of overhead. Unless your searches are getting
> noticeably slower or your index is expanding rapidly, you're better off
> using IndexReader.reopen() for regular updates and optimize() occasionally.
>
> Note that when using IndexReader.reopen() you should close the original
> IndexReader if it is still open to avoid memory leaks.
>
> Jason
>
> -----Original Message-----
> From: Sam Jiang [mailto:sam.jiang@karoshealth.com]
> Sent: Thursday, September 22, 2011 10:18 AM
> To: java-user@lucene.apache.org
> Subject: searching / sorting on timestamp and update efficiency
>
> Hi all
>
> I have some questions about how I should store timestamps.
>
> From my readings, I can see two ways of indexing timestamps:
> DateTools (which uses formatted timestamp strings) and
> NumericUtils (which uses a long?).
>
> I'm not sure which one performs better in my scenario:
> Each of my documents needs an indexed, millisecond-resolution timestamp.
> Almost all searches would be invoked with a range filter
> (searching at hour resolution is sufficient).
> There are usually 2-4 updates to this timestamp field for recently indexed
> documents. And afterwards, updates to this field or any other fields are
> rare.
>
> It would be great if somebody could advise me on which format I should use.
> P.S. Should I be calling optimize() often, given my frequent updates?
>
> thanks
>
> --
> Sam Jiang | karoshealth
> (っ゚Д゚;)っ hidden cat here
> 7 Father David Bauer Drive, Suite 201
> Waterloo, ON, N2L 0A2, Canada
> www.karoshealth.com
>



-- 
Sam Jiang | karoshealth
(っ゚Д゚;)っ hidden cat here
7 Father David Bauer Drive, Suite 201
Waterloo, ON, N2L 0A2, Canada
www.karoshealth.com
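
The thread settles on storing each timestamp as a long (epoch milliseconds) and querying it by range, with hour resolution being enough for searches. A minimal sketch of how the hour-aligned range bounds might be computed follows. It uses only the Java standard library; the class HourRange and its methods are hypothetical helpers for illustration, not part of Lucene, and the hours are assumed to be UTC-aligned.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not a Lucene class): hour-aligned bounds for epoch-millisecond timestamps. */
final class HourRange {
    /** Number of milliseconds in one hour (3,600,000). */
    static final long HOUR_MS = TimeUnit.HOURS.toMillis(1);

    private HourRange() {}

    /** Rounds an epoch-millisecond timestamp down to the start of its UTC hour. */
    static long floorToHour(long epochMillis) {
        // floorDiv rounds toward negative infinity, so pre-1970 values also round down.
        return Math.floorDiv(epochMillis, HOUR_MS) * HOUR_MS;
    }

    /**
     * Returns {lowerInclusive, upperExclusive} covering every whole hour
     * touched by the closed interval [fromMillis, toMillis].
     */
    static long[] bounds(long fromMillis, long toMillis) {
        if (fromMillis > toMillis) {
            throw new IllegalArgumentException("fromMillis must not exceed toMillis");
        }
        return new long[] { floorToHour(fromMillis), floorToHour(toMillis) + HOUR_MS };
    }
}
```

The two values returned by bounds() are meant to be passed as an inclusive minimum and an exclusive maximum to a long range query over the timestamp field, for example the NumericRangeQuery approach suggested earlier in the thread. The stored field itself can keep full millisecond precision; only the query bounds are rounded.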
