lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject RE: Hardware Question
Date Wed, 27 Jul 2005 22:44:39 GMT
Ah - my brain was off. :)
In the Lucene book we refer to that index format as "compound index
format", while the original format we call "multifile index format"

  http://www.lucenebook.com/search?query=compound+index
  http://www.lucenebook.com/search?query=multifile+index

Yes, the latter will give you a bit more juice.

Otis


--- Mark Bennett <mbennett@ideaeng.com> wrote:

> My apologies Otis, I should have spelled that out.
> 
> I'm going to take a stab at answering this.  But please, others on
> the list,
> chime in with corrections / clarifications.
> 
> CFS = "compact file system" or "consolidate file system" or something
> like
> that.
> 
> Essentially, each Lucene index segment is actually a set of files;
> files for
> a segment have a common file name and then a set of extensions; OR a
> segment
> is just stored as ONE file, with a .cfs extension.
> 
> CFS means that the multiple files for that segment have been joined
> together
> into one physical file; inside there is actually the original set of
> logical
> files, but on your disk it's just one file and one set of file
> handles to
> open that segmgent.
> 
> If you do a DIR / ls on your indexes, if you see a bunch of .cfs
> files, then
> you're using CFS.  The default for the past version or so is that you
> DO get
> CFS files unless you say otherwise.
> 
> I think the idea is that, generally, having fewer physical files is
> better,
> in terms of file handles, etc.  But for search performance, I'm not
> sure if
> that's always the best case; certainly for indexing it takes more
> work to
> create CFS files.
> 
> 
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com] 
> Sent: Wednesday, July 27, 2005 3:20 PM
> To: java-user@lucene.apache.org; mbennett@ideaeng.com
> Subject: RE: Hardware Question
> 
> What's CFS?  Cryptographic File System?  I'm not being sarcastic
> here,
> I'm really curious about what you referring to.
> 
> Otis
> 
> --- Mark Bennett <mbennett@ideaeng.com> wrote:
> 
> > Also, non-hardware, have you considered turning off CFS?
> > 
> > Our client told us this sped up their system.
> > 
> > -----Original Message-----
> > From: Chris Lamprecht [mailto:clamprecht@gmail.com] 
> > Sent: Wednesday, July 27, 2005 11:52 AM
> > To: java-user@lucene.apache.org
> > Subject: Re: Hardware Question
> > 
> > It depends on your usage.   When you search, does your code also
> > retrieve the docs (using Searcher.document(n), for instance).  If
> > your
> > index is 8GB, part of that is the "indexed" part (searchable), and
> > part is just "stored" document fields.
> > 
> > It may be as simple as adding more RAM (try 4, 6, and 8GB) -- but
> not
> > for your java heap -- instead for the linux filesystem cache.
> > 
> > I suggest first adding some simple timing output to your search. 
> You
> > want to see how much time you are spending in the call to search(),
> > and then how much time you're spending pulling the Documents from
> the
> > index (and how much time you're spending in other parts of your
> > search
> > application).   The call to search() is typically CPU-intensive,
> > while
> > pulling Documents is I/O-bound.  And RAM is about 5 or 6 orders of
> > magnitude faster than disk I/O.
> > 
> > -chris
> > 
> > On 7/27/05, Michael Celona <mcelona@criticalmention.com> wrote:
> > > I am going over ways to increase overall search performance.
> > > 
> > > 
> > > 
> > > Currently, I have a dual zeon with 2G of ram dedicated to java
> > searching
> > an
> > > 8G index on one 7200 rpm drive.
> > > 
> > > 
> > > 
> > > Which will give the greatest payoff?
> > > 
> > > 
> > > 
> > > 1)       Going to 64bit server and giving more memory to java
> with
> > faster
> > > drives
> > > 
> > > 
> > > 
> > > Or
> > > 
> > > 
> > > 
> > > 2)       Staying with 32bit server but going with faster drives
> and
> > > splitting the operating system from the index drive.
> > > 
> > > 
> > > 
> > > 
> > > 
> > > Basically, what are the performance improvements from separating
> > the
> > > operation system form the index drive(s).
> > > 
> > > 
> > > 
> > > 
> > > 
> > > Thanks,
> > > 
> > > Michael
> > > 
> > > 
> > > 
> > > 
> > >
> > 
> >
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> > 
> > 
> > 
> >
> ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> > 
> > 
> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message