From: Marvin Humphrey
Date: Sat, 30 Jan 2010 13:15:29 -0800
To: lucy-dev@lucene.apache.org
Subject: Re: SortCache on a 32-bit OS
Message-ID: <20100130211529.GB9977@rectangular.com>
In-Reply-To: <65d3176c1001301211g33559858oa35e18680cb2f80a@mail.gmail.com>
References: <20100130181122.GA9529@rectangular.com> <65d3176c1001301211g33559858oa35e18680cb2f80a@mail.gmail.com>

On Sat, Jan 30, 2010 at 12:11:41PM -0800, Nathan Kurz wrote:

> The window where this choice is beneficial is small: something like 32-bit
> systems using 2-4 Gig indexes with multiple sortable fields with unique
> values. Unless this is the use case that Eventful needs,

Well, actually... yes, it is.

However, I don't think that's a bad thing in this case. Everyone at Eventful
involved with Lucy and KinoSearch agrees that we should not hammer
misfeatures into open source code. That's a position of enlightened
self-interest. If we were to do such a thing, and I ever left Eventful, all
of a sudden Eventful would be counting on misfeatures without having any sway
over a project leader willing to abuse his position to keep those misfeatures
jammed in there. Eventful is much more interested in sponsoring *good* work
that they can count on regardless of whether I continue to be employed by
them.

Eventful's need for fast index reopens was what drove the accelerated
development of mmap'd sort caches in the first place. It got a priority bump
because of who pays my salary, but it's a great feature regardless. What I'm
proposing now is also useful for Lucy at large. It's not a hack. If I'm gonna
write hacks for Eventful, they'll stay in private code.

Indexes can actually grow larger than 2-4 GB on such systems and still
maintain top performance. Because a 32-bit operating system can still exploit
all the RAM on the machine for system IO cache, you can have indexes over
4 GB that stay fully RAM-resident.

> Would more than a handful of people benefit from this?
Yes, I believe so -- but more to the point, the people who benefit from this
benefit greatly. The problem with running out of address space is that
there's no warning before catastrophic failure, and then no possibility of
recovery short of rearchitecting your search infrastructure or installing a
new operating system. It's a really serious glitch to hit. It would suck if
Eventful hit it, but I really don't want anybody else to hit it either.

When I worked out the windowing system that gives Lucy its 32-bit
compatibility, I thought I had solved this problem and that no one would ever
hit the out-of-address-space limitation. It was only when we actually built
some gigantic indexes -- which happen to perform great because they never
need to iterate over large posting lists -- that I realized the current
implementation of SortCache might pose a problem.

> > The increased CPU costs come from extra seeks, memory maps, and memory
> > copies.
>
> In general, burning CPU instructions is no problem, but consuming excess
> memory IO bandwidth should be stringently avoided.

I should specify that the extra calls to mmap() and munmap() occur on 32-bit
systems only. On 64-bit systems, we mmap() the whole compound file the
instant it gets opened, and InStream_Buf() is just a thin wrapper around some
pointer math. The seeks, likewise, are not system calls on 64-bit systems --
they're just a method call and pointer math. (There's a rough sketch of the
two paths in the P.S. below.)

The only real downside is the cost of copying text data rather than copying
pointers. But even then, we have to read over all the character data on each
call to TextSortCache_Value() anyway, because we have to perform a UTF-8
sanity check. Plus, the cost scales with the number of segments rather than
the number of documents. Plus, CharBufs are easier to deal with than
ViewCharBufs, because you don't have to worry about the parent object
disappearing.

> > I believe that with this plan we can push the index size at which address
> > space runs out beyond the practical size for a single machine -- even
> > when you're doing something silly like running a 32-bit OS on a box with
> > 16 gigs of RAM.
>
> Sure, these systems will exist, but solve the problem in a way that
> benefits everyone: shard it!

Well, that sort of sharding is not within the scope of Lucy itself. It's a
Solr-level solution.

> Instead of trying to cram a 16 GB index into a 3 GB process address space
> through inefficient tricks, run 8 2 GB shards on the same machine, or
> better yet across 2 machines. Then there is no hard limit at max RAM;
> instead you just add another machine.

Heh. I couldn't agree more. Believe me.

Marvin Humphrey
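
P.S. For anyone who wants to see the difference concretely, here's a minimal
sketch of the two read paths described above. The struct and function names
are hypothetical -- this isn't Lucy's actual InStream code -- but the shape
is the same: on 64-bit the whole compound file gets mapped once at open time
and a read is nothing but pointer math, while on 32-bit each read maps a
fresh page-aligned window and tears down the previous one, which is where
the extra mmap()/munmap() calls come from.

    /* Hypothetical sketch, not Lucy's InStream internals. Assumes POSIX
     * mmap() and, on 32-bit, large-file support (_FILE_OFFSET_BITS=64) so
     * that off_t can address files larger than 2 GB. */
    #include <stdint.h>
    #include <sys/mman.h>
    #include <unistd.h>

    typedef struct {
        int      fd;       /* compound file descriptor                 */
        char    *map;      /* current mapping (whole file on 64-bit)   */
        size_t   map_len;  /* length of the current mapping            */
        uint64_t map_off;  /* file offset where the mapping begins     */
    } Window;

    /* Return a pointer to `len` bytes starting at file offset `offset`. */
    static const char*
    window_buf(Window *w, uint64_t offset, size_t len) {
    #if UINTPTR_MAX > 0xFFFFFFFFu
        /* 64-bit: the whole file was mapped when it was opened, so a
         * "seek" plus a "read" is just pointer math -- no system calls. */
        (void)len;
        return w->map + offset;
    #else
        /* 32-bit: remap a page-aligned window around the requested range,
         * unmapping the old window first. This is the source of the extra
         * mmap()/munmap() traffic on 32-bit systems. */
        long     page  = sysconf(_SC_PAGESIZE);
        uint64_t start = offset - (offset % (uint64_t)page);
        size_t   need  = (size_t)(offset - start) + len;
        if (w->map != NULL) { munmap(w->map, w->map_len); }
        w->map = mmap(NULL, need, PROT_READ, MAP_SHARED, w->fd,
                      (off_t)start);
        if (w->map == MAP_FAILED) { w->map = NULL; return NULL; }
        w->map_len = need;
        w->map_off = start;
        return w->map + (size_t)(offset - start);
    #endif
    }

The 64-bit path is why the extra seeks cost essentially nothing there; the
only cost that survives on both paths is copying the text data itself.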