lucy-dev mailing list archives

From Nathan Kurz
Subject Re: SortCache on a 32-bit OS
Date Sat, 30 Jan 2010 20:11:41 GMT
On Sat, Jan 30, 2010 at 10:11 AM, Marvin Humphrey wrote:
> To solve this problem, I think we ought to mmap the ords file only and use
> sequential reads to recover values -- the same way we do with our lexicons.
> The price will be slightly increased CPU costs under a couple of
> circumstances:

In practice, this is probably fine, but I'm not convinced it's a good
path to follow.

I think you are better off declaring that a 64-bit OS is the target
platform for large indexes, and making optimizations for this platform
even if it reduces utility for those who choose to use a 32-bit
system.   I think you'll end up with simpler code, and better

The window where this choice is beneficial is small:  something like
32-bit systems using 2-4 Gig indexes with multiple sortable fields
with unique values.   Unless this is the use case that Eventful needs,
I don't think it's worth optimizing.    Would more than a handful of
people benefit from this?  Is this worth the significantly reduced
performance on a truly gigantic index?

> The increased CPU costs come from extra seeks, memory maps, and memory copies.

In general, burning CPU instructions is no problem, but consuming
excess memory IO bandwidth should be stringently avoided.  Look to the
multicore future:  you're going to have more and more processors
fighting for access to the same pool of RAM.  Reduce this contention
however you can, target the systems people will be running in 3-5
years, and you'll fly.

> I believe that with this plan we can push the index size at which address
> space runs out beyond the practical size for a single machine -- even when
> you're doing something silly like running a 32-bit OS on a box with 16 gigs of
> RAM.

Sure, these systems will exist, but solve the problem in a way that
benefits everyone:  shard it!  Instead of trying to cram a 16 GB index
into a 3 GB process address space through inefficient tricks, run eight 2 GB
shards on the same machine, or better yet across 2 machines.  Then
there is no hard limit at max RAM, instead you just add another

32-bit machines will run more small shards (varying in size based on
the address space required), 64-bit machines will run fewer large ones
(limited by system RAM and number of cores).  But once you solve this
problem efficiently, you scale to the moon.
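
A minimal sketch of the shard arithmetic described above, assuming the
only constraint is a per-shard size cap chosen to fit the process
address space.  The names shard_plan and shard_for_doc are hypothetical
and are not part of Lucy; routing documents by doc id modulo the shard
count is likewise just one simple assumption about how to split an index.

```python
GIB = 1024 ** 3


def shard_plan(index_bytes, max_shard_bytes):
    """Return (shard_count, bytes_per_shard) for an index of index_bytes.

    max_shard_bytes is the largest shard one process can comfortably map,
    e.g. about 2 GiB on a 32-bit OS (an assumption, taken from the example
    in the message).
    """
    if index_bytes <= 0 or max_shard_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division: smallest shard count that respects the cap.
    shard_count = -(-index_bytes // max_shard_bytes)
    # Spread the index evenly over that many shards (rounded up).
    bytes_per_shard = -(-index_bytes // shard_count)
    return shard_count, bytes_per_shard


def shard_for_doc(doc_id, shard_count):
    """Pick a shard for a document by simple modulo routing (hypothetical)."""
    return doc_id % shard_count


if __name__ == "__main__":
    count, size = shard_plan(16 * GIB, 2 * GIB)
    print(f"{count} shards of {size / GIB:.1f} GiB each")
```

With a 16 GiB index and a 2 GiB cap this yields the eight 2 GB shards
from the example; raising the cap on a 64-bit machine gives fewer,
larger shards from the same function.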

Nathan Kurz
