lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ivan Provalov <iprov...@yahoo.com>
Subject Re: Lucene Partition Size
Date Mon, 12 Apr 2010 14:43:41 GMT
Thank you, Karl!

--- On Fri, 4/9/10, Karl Wettin <karl.wettin@gmail.com> wrote:

> From: Karl Wettin <karl.wettin@gmail.com>
> Subject: Re: Lucene Partition Size
> To: java-user@lucene.apache.org
> Date: Friday, April 9, 2010, 9:39 AM
> It's hard for me to say why this is
> slow.
> 
> Here are a few more questions whose anwers might provide
> further clues:
> 
> What was the reasons that led you to partition the index
> this way?
> What does the searcher implementation look like?
> What would a typcial query sent to that searcher look
> like?
> 
> I would start by loading the full 400Gb as a single index
> on local disc.
> 
> 
>     karl
> 
> 8 apr 2010 kl. 22.07 skrev Ivan Provalov:
> 
> > Karl,
> >
> > We have not done the same scale local-disk test. 
> Our network  
> > parameters are
> >
> > -  Network speed:  1gb
> > -  3 partitions per volume
> > -  The volumes are accessed via NFS to EMC Celera
> devices. (NFS 3)
> > -  The drives are 300 gb fiber attached with
> 10,000 rpm.
> >
> > Thanks,
> >
> > Ivan
> >
> > --- On Thu, 4/8/10, Karl Wettin <karl.wettin@gmail.com>
> wrote:
> >
> >> From: Karl Wettin <karl.wettin@gmail.com>
> >> Subject: Re: Lucene Partition Size
> >> To: java-user@lucene.apache.org
> >> Date: Thursday, April 8, 2010, 2:44 PM
> >>
> >> 8 apr 2010 kl. 20.05 skrev Ivan Provalov:
> >>
> >>> We are using Lucene for searching of 200+ mln
> >> documents (periodical publications).  Is
> there any
> >> limitation on the size of the Lucene index (file
> size,
> >> number of docs, etc...)?
> >>
> >> The only such limitation in Lucene I'm aware of
> is
> >> Integer.MAX_VALUE documents. This might also be
> true for
> >> number of terms.
> >>
> >>> We are partitioning the indexes at about 10
> mln
> >> documents per partition (each partition is on a
> separate
> >> box, some mounted volumes are shared among
> three-four
> >> partitions).  We have index size around 10%
> of the
> >> content size.  The total content is around
> 4Tb and the
> >> index around 400Gb.  This content is planned
> to be
> >> split into 20 partitions (10mln docs, 200Gb
> content size,
> >> 20Gb index size).  We are using a memory
> mapped index
> >> directory implementation.  Our testing is
> done with 600
> >> concurrent users.
> >>>
> >>> We are seeing consistently high response times
> from
> >> the partitions (4-5 seconds).  Is there a
> number of
> >> documents per partition limitation in Lucene for
> this
> >> particular scenario?
> >>
> >> I'm not sure if I got this right but it sounds
> like your
> >> index is mounted over network? Can you tell us
> some more
> >> details about that? What speeds do you see if you
> put the
> >> index on local disc?
> >>
> >>
> >>
> >>     karl
> >>
> >>
> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> >
> >
> >
> >
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 


      

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message