lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From James <ajd_...@yahoo.com>
Subject Re: sorting on "dates" a little fuzzy...
Date Fri, 22 Apr 2005 04:26:11 GMT

Hi Che-

The presort method was our first approach but this doesn't work
in practice because we update the index incrementally and insertion order
doesn't match date ordering as we add updates.

I don't think sorting top hits only will deliver what the user is
expecting -- that is, results listed in most-recent-first order.  
Is there a better way to do it?

BTW, we're not seeing a ridiculous performance degredation arising
out of sorts on large result sets.  But on the other hand, sort
doesn't seem to be working very well, so far...

Regards,
James 

--- Che Dong <chedong@chedong.com> wrote:
> Just like Google said: full text search service is not traditional 
> database application. Lucene is not a database too: if you wanna sort on 
> some fields, you'd better pre-sort it before it indexed: like date. then 
> get results by doc id.
> 
> For lucene you can only sort results in top hits. if you sort 400k 
> result hits by date: you lost the speed of Lucene.
> 
> 
> Thanks
> 
> Che Dong
> http://www.chedong.com/
> 
> Erik Hatcher 写道:
> > 
> > On Apr 21, 2005, at 5:22 PM, James Levine wrote:
> > 
> >> I have an index of around 3 million records, and typical queries
> >> can result in result sets of between 1 and 400,000 results.
> >>
> >> We have indexed "dateTime" fields in the form 20050415142, that is, to
> >> 10-minute precision.
> >>
> >> When I try to sort queries I get something back that is roughly sorted
> >> on index, but not quite. Stuff is out of order just a bit. The
> >> size of the result set does not seem to be related occurance of
> >> this problem.
> >>
> >> We've tried lucene 1.4-final and1.4.3.
> >>
> >> my code looks like this
> >>
> >> s = new Sort( new SortField[] { new SortField( "dateTime", 
> >> SortField.STRING,
> >> true ), SortField.FIELD_SCORE } );
> >>
> >> ...
> >>
> >> hits = searcher.search( qry, s );
> >>
> >>
> >> Any help is appreciated, I'm so far baffled by this problem.
> > 
> > 
> > I don't have a solution, but rather some questions to check.... are all 
> > dateTime's the same width, zero padded on the right?  Does every 
> > document have a dateTime field?
> > 
> > I recommend you sort with type INT instead of STRING if it fits, or 
> > maybe LONG.  STRING will use the most resources for sorting.
> > 
> >     Erik
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> > 
> > 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message