lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: parallel index building & searching multiple indexes
Date Mon, 11 Aug 2003 14:58:15 GMT
You didn't provide details about your setup, but my first guess is that
all your processes are writing their indices to the same disk, and that
disk is your bottleneck.

Otis
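
If a single disk is indeed the bottleneck, one option is to cap how many index builds write at the same time instead of starting all ten at once. The following is a minimal sketch using only the Java standard library, not code from this thread. The names `BoundedIndexBuild`, `buildIndex` and `buildAll` are hypothetical, and `buildIndex` is a stand-in that only writes a placeholder file where a real build would open a Lucene IndexWriter on the directory and add documents.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedIndexBuild {

    // Hypothetical stand-in for the real work: a real build would open a
    // Lucene IndexWriter on indexDir and add documents to it.
    public static Path buildIndex(Path indexDir) throws Exception {
        Files.createDirectories(indexDir);
        Files.write(indexDir.resolve("placeholder.txt"),
                "index data".getBytes(StandardCharsets.UTF_8));
        return indexDir;
    }

    // Builds every index, but lets at most maxConcurrent builds run at once.
    public static List<Path> buildAll(List<Path> indexDirs, int maxConcurrent)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(maxConcurrent);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (Path dir : indexDirs) {
                pending.add(pool.submit(() -> buildIndex(dir)));
            }
            List<Path> done = new ArrayList<>();
            for (Future<Path> f : pending) {
                done.add(f.get()); // rethrows any build failure
            }
            return done;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path root = Files.createTempDirectory("indexes");
        List<Path> dirs = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            dirs.add(root.resolve("index-" + i));
        }
        // maxConcurrent = 1 serializes the writes; raise it only if the
        // disk (or a second disk) can keep up.
        List<Path> built = buildAll(dirs, 1);
        System.out.println("built " + built.size() + " indexes under " + root);
    }
}
```

Timing the run with `maxConcurrent` set to 1, 2 and 10 is a cheap way to check whether the disk really is the limiting factor on a given machine.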


--- "Killeen, Tom" <tom.killeen@thomson.com> wrote:
> I am attempting to create approx 10 different Lucene indexes.  I'm trying to
> create them at the same time by running multiple processes and each index is
> written to a new directory.  Once I create more than one process, the
> performance is very, very slow.
> 
> Any sample code out there showing an efficient way to create multiple
> indexes?
> 
> Also, any sample code out there to search the multiple indexes?
> 
> thanks, 
> Tom
> 
> -----Original Message-----
> From: Tatu Saloranta [mailto:tatu@hypermall.net]
> Sent: Monday, August 11, 2003 8:41 AM
> To: Lucene Users List
> Subject: Re: 2,147,483,647 max documents?
> 
> 
> On Monday 11 August 2003 01:07, Kevin A. Burton wrote:
> > Why was an int chosen to represent document handles?  Is there a reason
> > for this?  Why wasn't a long chosen to represent document handles?  64
> > bits seems like the obvious choice here except for a potentially bloated
> > datastore.... (32 extra bits)
> 
> I can't speak for the actual reasons (not being a core Lucene developer),
> but the general benefits of 32-bit ints vs. longs are:
> 
> - Better performance on pretty much any current architecture (even so-called
>   64-bit CPUs often prefer 32-bit data access, and 64-bit representations
>   are more important for addressing).
>   Also, a smaller data set size is usually good for performance (caching).
> - Atomicity of access (read access can often be done without synchronizing);
>   plain (non-volatile) longs are not guaranteed to be accessed atomically
>   in Java.
> 
> Another question is whether the limited address space presents a real
> problem.  Since Lucene can reuse doc ids (or rather, there is no persistent
> id per se?  doc id is just an index, and holes left by removed docs can be
> reused?), perhaps this is usually not much of an issue?
> 
> -+ Tatu +-
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
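
On the quoted point about atomic access: the Java Language Specification (section 17.7) lets a JVM treat a write to a non-volatile long as two separate 32-bit writes, while int accesses, and accesses to a long declared volatile, are always atomic. The sketch below illustrates what a 64-bit id would need in order to be shared safely between threads. It is an illustration only, not Lucene code; `DocIdWidth`, `allocateIds`, `nextId` and `lastIssued` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

public class DocIdWidth {
    // Hypothetical 64-bit id counter; AtomicLong makes each increment atomic.
    private static final AtomicLong nextId = new AtomicLong();

    // A volatile long is always read and written atomically; a plain long
    // field carries no such guarantee (JLS 17.7).
    private static volatile long lastIssued;

    // Hands out ids from several threads and returns how many were issued.
    public static long allocateIds(int threads, int idsPerThread)
            throws InterruptedException {
        nextId.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    lastIssued = nextId.getAndIncrement();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return nextId.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // The limit from the subject line is Integer.MAX_VALUE.
        System.out.println("int limit:  " + Integer.MAX_VALUE);  // 2147483647
        System.out.println("long limit: " + Long.MAX_VALUE);
        System.out.println("ids issued: " + allocateIds(4, 1000)); // 4000
    }
}
```

The extra machinery (volatile or AtomicLong) is the cost the quoted message alludes to: with an int id, plain reads are already atomic.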
