lucene-java-user mailing list archives

From karl wettin <ka...@snigel.net>
Subject Re: Removing search results that fall within a time range
Date Tue, 23 May 2006 22:44:04 GMT
On Tue, 2006-05-23 at 17:38 -0400, Benjamin Stein wrote:
> I have a requirement to only return one result for all documents whose
> timestamps fall within N seconds of one another. (where timestamp is a
> field and N is an integer).
> 
> For example, Document A is timestamped "12:00:00" and Document B has
> timestamp "12:00:30", Document B should be discarded.  On the other
> hand, if Document B has timestamp "12:01:00" then I should return both
> (assuming 30 < N < 59 seconds).  
> 
> Similarly, if Documents A, B, and C have timestamps "12:00:00",
> "12:00:30", and "12:01:00" respectively, only Document A should be
> returned (because B is close to A, and C is close to B).
> 
> If it helps to simplify things, we can assume results are sorted by
> time.  Also, I can apply logic at index time or at search time.  
> 
> Any suggestions?  This is a pretty tough concept to search the
> archives for... 

How big is the corpus, and how many hits do you expect a typical search
to return? Could you simply accept the cost of iterating over the hits
and filtering them yourself?
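
If iterating is affordable, something along these lines might do. This is
only a rough sketch, not tied to any Lucene API: it assumes you have
already pulled the timestamp of every hit into an array of seconds, sorted
ascending as you said you can assume. The class name TimeWindowDedup and
the method collapse are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class TimeWindowDedup {

    /**
     * Returns the indices of the hits to keep.
     *
     * Assumption: timestampsSeconds is sorted ascending. A hit is dropped
     * when it lies within n seconds of the hit immediately before it, so a
     * chain A, B, C with small gaps collapses to A alone, as in the example
     * from the original question.
     */
    static List<Integer> collapse(long[] timestampsSeconds, long n) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < timestampsSeconds.length; i++) {
            // Compare against the previous hit, not the last kept hit,
            // so that chains of close hits are suppressed as a whole.
            if (i == 0 || timestampsSeconds[i] - timestampsSeconds[i - 1] > n) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // 12:00:00, 12:00:30, 12:01:00 and 12:03:20 as seconds since midnight
        long[] ts = {43200, 43230, 43260, 43400};
        System.out.println(collapse(ts, 45)); // prints [0, 3]
    }
}
```

Note that comparing against the previous hit rather than the last kept hit
is what makes C disappear in your A, B, C example. If you wanted C to
survive because it is more than N seconds from A, you would compare against
the last kept timestamp instead.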



