lucene-java-user mailing list archives

From Daniel Noll <>
Subject Re:
Date Wed, 17 Sep 2008 03:13:33 GMT
Chris Hostetter wrote:
> : Or do we make a replacement for TopDocCollector which doesn't have this
> : drawback, and uses an alternative for PriorityQueue which allows its
> : array to grow?
>
> I don't see that as being much better -- you still wouldn't want to
> pass MAX_INT because what if there really are MAX_INT-42 results?  Do
> you want the array to grow that big?  If you are prepared to deal
> with that many results, you might as well preallocate the array so
> that you don't wind up with two ginormous arrays during the "grow"
> steps.
>
> I suppose there's some "middle" ground though ... a collector where
> you say "I expect to have fewer than N results, so allocate a priority
> queue that big and start with that, but I'm willing to accept (and
> want) up to the first M results, so grow the queue to M if needed"
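
For what it's worth, the "middle ground" above could look roughly like
the sketch below. It is not Lucene code: GrowableTopN and its methods are
made-up names, it tracks only float scores rather than ScoreDocs, and it
assumes doubling as the growth policy. The backing array starts at the
expected size N and grows only until it reaches the cap M; after that the
lowest score is evicted, as in a fixed-size queue.

```java
import java.util.Arrays;

/** Hypothetical sketch: a min-heap of scores that starts small and grows up to a hard cap. */
final class GrowableTopN {
    private float[] heap;      // heap[0] is the smallest retained score
    private int size;          // number of scores currently held
    private final int maxSize; // M: the most results we are willing to keep

    GrowableTopN(int expected, int maxSize) {
        if (expected < 1 || maxSize < expected) {
            throw new IllegalArgumentException("need 1 <= expected <= maxSize");
        }
        this.heap = new float[expected]; // N: initial allocation
        this.maxSize = maxSize;
    }

    /** Offers a score; returns true if it was retained. */
    boolean insert(float score) {
        if (size < maxSize) {
            if (size == heap.length) {
                // Grow by doubling, but never past the cap.
                int newCapacity = (int) Math.min((long) heap.length * 2, maxSize);
                heap = Arrays.copyOf(heap, newCapacity);
            }
            heap[size] = score;
            siftUp(size++);
            return true;
        }
        if (score <= heap[0]) {
            return false; // not better than the worst retained score
        }
        heap[0] = score;  // evict the current minimum
        siftDown(0);
        return true;
    }

    int size() { return size; }

    int capacity() { return heap.length; }

    /** Smallest retained score. */
    float least() {
        if (size == 0) {
            throw new IllegalStateException("queue is empty");
        }
        return heap[0];
    }

    private void siftUp(int i) {
        float v = heap[i];
        while (i > 0) {
            int parent = (i - 1) >>> 1;
            if (heap[parent] <= v) break;
            heap[i] = heap[parent];
            i = parent;
        }
        heap[i] = v;
    }

    private void siftDown(int i) {
        float v = heap[i];
        int half = size >>> 1; // indices >= half are leaves
        while (i < half) {
            int child = 2 * i + 1;
            int right = child + 1;
            if (right < size && heap[right] < heap[child]) child = right;
            if (v <= heap[child]) break;
            heap[i] = heap[child];
            i = child;
        }
        heap[i] = v;
    }
}
```

Note that Arrays.copyOf briefly holds both the old and the new array, which
is exactly the "two ginormous arrays" cost mentioned above; the cap M only
bounds how large that final copy can get.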

Well, it turns out the theoretical maximum for the Swing case is not
Integer.MAX_VALUE, but is actually somewhere around 200 million due to
limitations in Swing itself (a JTable inside a JScrollPane breaks if row
height times row count exceeds Integer.MAX_VALUE).
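
As a rough check of that figure, the limit is just Integer.MAX_VALUE
divided by the row height in pixels. The sketch below is plain arithmetic,
not Swing code; SwingRowLimit and maxRows are made-up names and the row
heights are only illustrative (16 is, as far as I know, JTable's default).

```java
/** Hypothetical helper: how many table rows fit before the pixel height overflows an int. */
final class SwingRowLimit {

    /** Largest row count whose total pixel height is still <= Integer.MAX_VALUE. */
    static int maxRows(int rowHeightPixels) {
        if (rowHeightPixels <= 0) {
            throw new IllegalArgumentException("row height must be positive");
        }
        return Integer.MAX_VALUE / rowHeightPixels;
    }

    public static void main(String[] args) {
        // Illustrative row heights only; real tables may use other values.
        for (int height : new int[] {10, 16, 20}) {
            System.out.println(height + " px rows: at most " + maxRows(height) + " rows");
        }
    }
}
```

Rows 10 pixels tall give a ceiling of 214,748,364 rows, and rows 16 pixels
tall give 134,217,727, so "around 200 million" is the right order of
magnitude.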

In reality memory runs out sooner, which is why I was considering 
on-disk storage of the topdocs.

But even if the user doesn't want to *see* the items which come back,
quite often they do want to Select All and perform some operation (e.g.
tag them, or copy the data somewhere).  So regardless of the page size
or the amount of data visible at any point in time, there is always the
need to get the "set of every match" for any given search eventually.
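
If only the set of matches is needed, and not the scores or the ordering,
one cheap in-memory representation is a bit per document. The sketch below
assumes that simplification; AllMatches is a made-up class, not a Lucene
collector, although its collect(int) method is meant to be the sort of
thing a collector callback could feed with document ids.

```java
import java.util.BitSet;

/** Hypothetical sketch: records every matching document id, one bit per document. */
final class AllMatches {
    private final BitSet docs;

    /** maxDoc is only a sizing hint; BitSet grows on demand if it is exceeded. */
    AllMatches(int maxDoc) {
        this.docs = new BitSet(maxDoc);
    }

    /** Records one matching document id. */
    void collect(int doc) {
        docs.set(doc);
    }

    boolean contains(int doc) {
        return docs.get(doc);
    }

    /** Number of distinct documents recorded so far. */
    int count() {
        return docs.cardinality();
    }

    /** Next matching id at or after fromDoc, or -1 if there is none. */
    int nextMatch(int fromDoc) {
        return docs.nextSetBit(fromDoc);
    }
}
```

At one bit per document this costs about 12 MB for 100 million documents,
but it throws away the ranking, so it only helps with the Select All case
and not with displaying sorted results.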

Maybe others have different opinions as they are working on webapps, 
where the user is already expecting paging before they even see the 
results page.


Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
                                                and eDiscovery software
