lucene-java-user mailing list archives

From "Neelam Bhatnagar" <nbhatna...@sapient.com>
Subject Re: too many files open issue
Date Tue, 23 Nov 2004 10:41:55 GMT
Hi Dmitry,
 
Thank you so much for your reply. 
 
I'd like to answer your specific questions. 
 
>>It also depends on whether you are using "compound files" or not (this
>>is a flag on the IndexWriter). With the compound files flag on,
>>segments have a fixed number of files, regardless of how many fields
>>you use. Without the flag, each field is a separate file.
 
We are using Lucene 1.2 and hence do not have this compound file option
on the IndexWriter class. This means we have a separate file for each
field.
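For reference, later Lucene releases (1.4, for example) expose this as a
setUseCompoundFile flag on IndexWriter. A minimal sketch, assuming one of
those versions; the index path, field name, and class name below are just
placeholders:

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    public class CompoundFileExample {
        public static void main(String[] args) throws Exception {
            // Create a new index; the path and analyzer are placeholders.
            IndexWriter writer = new IndexWriter("/tmp/test-index",
                                                 new StandardAnalyzer(),
                                                 true);
            // One compound (.cfs) file per segment instead of one file
            // per field.
            writer.setUseCompoundFile(true);

            Document doc = new Document();
            doc.add(Field.Text("body", "hello compound files"));
            writer.addDocument(doc);

            writer.optimize();  // merge segments down to one
            writer.close();     // release the file handles
        }
    }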
 
>>By the way, it is usual to have the file descriptors limit set at 9000
>>or so for unix machines running production web applications. By the
>>way 2, on Solaris, you will need to modify a value in /etc/system to
>>get up to this level. Not sure about Linux or other flavors.
 
We are using SunOS 5.8 on a SPARC Sun Fire 280R machine. Running ulimit
-n gives 256.
This is the limit we first tried to reduce to 200 and then bring back up
to 500, without any luck. Ultimately, everything started working again
at the default of 256.
We had tried to alter this number with the ulimit command itself rather
than by changing it in the /etc/system file.
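As a quick sanity check of the limit the JVM actually inherits (which can
differ from what ulimit reports in another shell), a throwaway probe like
the one below can be run under the same user and environment as the
application. It simply opens /dev/null until the OS refuses; the class
name is our own invention:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    public class FdLimitProbe {
        public static void main(String[] args) {
            List streams = new ArrayList();
            int opened = 0;
            try {
                // Keep opening /dev/null until the process runs out of
                // descriptors ("Too many open files").
                while (true) {
                    streams.add(new FileInputStream("/dev/null"));
                    opened++;
                }
            } catch (IOException e) {
                System.out.println("Opened " + opened
                        + " extra descriptors before failing: "
                        + e.getMessage());
            } finally {
                // Close everything so the probe itself does not leak handles.
                for (int i = 0; i < streams.size(); i++) {
                    try {
                        ((FileInputStream) streams.get(i)).close();
                    } catch (IOException ignored) {
                        // best effort
                    }
                }
            }
        }
    }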
 
>>Another suggestion - you may want to look into a tool called "lsof".
>>It is a utility that will show file handles open by a particular
>>process. It could be that some other part of your process (or of the
>>application server, VM, etc) is not closing files. This tool will help
>>you see what files are open and you can validate that all of them
>>really need to be open.
 
The "lsof" tool is available through following path 
ftp://vic.cc.purdue.edu/pub/tools/unix/lsof
which is not accepting anonymous access. Hence we have not been able to
download this tool to figure out what's going on with the processes and
the files being opened by them.
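In the meantime, a rough substitute is to count the entries under
/proc/<pid>/fd for the JVM process (taking the pid from ps). A small
sketch, with the caveat that on Solaris those entries are just descriptor
numbers, so lsof remains the better tool for seeing which files they
refer to:

    import java.io.File;

    public class OpenFdCount {
        public static void main(String[] args) {
            if (args.length == 0) {
                System.err.println("Usage: java OpenFdCount <pid>");
                return;
            }
            // List the per-process file descriptor directory.
            File fdDir = new File("/proc/" + args[0] + "/fd");
            String[] entries = fdDir.list();
            if (entries == null) {
                System.err.println("Cannot read " + fdDir
                        + " (check the pid and permissions)");
                return;
            }
            System.out.println(entries.length
                    + " file descriptors open in process " + args[0]);
        }
    }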
 
 
The most worrying aspect of the whole scenario is that there is no
consistency in the way the system behaves. It works fine with the default
settings, then suddenly it stops working. After changing the settings
several times, it works again and then breaks again. Our worry is that we
may not be going in the right direction with this approach.
 
Kindly advise.
 
Thanks and regards
Neelam Bhatnagar
 
 
-----Original Message-----
From: Dmitry [mailto:dmitrys@earthlink.net] 
Sent: Monday, November 22, 2004 8:46 PM
To: Lucene Users List
Subject: Re: Too many open files issue
 
I'm sorry, I wasn't involved in the original conversation but maybe I 
can jump in with some info that will help.
 
The number of files depends on the merge factor, the number of segments,
and the number of indexed fields in your index. It also depends on
whether you are using "compound files" or not (this is a flag on the
IndexWriter). With the compound files flag on, segments have a fixed
number of files, regardless of how many fields you use. Without the
flag, each field is a separate file.
 
Let's say you have 10 segments (per your merge factor) that are being 
merged into a new segment (via an optimize call or just because you have 
reached the merge factor). This means there are 11 segments open at the 
same time. If you have 20 indexed fields and are not using compound 
files, that's 20 * 11 = 220 files. There are a few other files open as 
well, plus whatever other files and sockets that your JVM process is 
holding open at that time. This would include incoming connections, for 
example, if this is running inside a web server. If you are running in 
an application server, this could include connections and files open by 
other applications in that same app server.
 
So the numbers run up quite a bit.
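Spelled out as a quick back-of-envelope calculation (the numbers are just
the example figures above, not exact per-version counts):

    public class OpenFileEstimate {
        public static void main(String[] args) {
            int mergeFactor   = 10; // segments that accumulate before a merge
            int indexedFields = 20; // example field count from above

            // During a merge (or an optimize), the source segments plus
            // the newly written segment are all open at once.
            int segmentsOpen = mergeFactor + 1;

            // Without compound files, each indexed field contributes
            // roughly one file per segment (bookkeeping files not counted).
            int estimate = segmentsOpen * indexedFields;

            System.out.println("Roughly " + estimate
                    + " index files open during a merge");
        }
    }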
 
By the way, it is usual to have the file descriptors limit set at 9000 
or so for unix machines running production web applications. By the way 
2, on Solaris, you will need to modify a value in /etc/system to get up 
to this level. Not sure about Linux or other flavors.
 
Another suggestion - you may want to look into a tool called "lsof". It 
is a utility that will show file handles open by a particular process. 
It could be that some other part of your process (or of the application 
server, VM, etc) is not closing files. This tool will help you see what 
files are open and you can validate that all of them really need to be
open.
 
Best of luck.
Dmitry.
 
 
Neelam Bhatnagar wrote:
 
>Hi,
> 
>I had requested help on an issue we have been facing with the "Too many
>open files" Exception garbling the search indexes and crashing the
>search on the web site. 
>As a suggestion, you had asked us to look at the articles on O'Reilly
>Network which had specific context around this exact problem. 
>One of the suggestions was to increase the limit on the number of file
>descriptors on the file system. We tried it by first lowering the limit
>to 200 from 256 in order to reproduce the exception. The exception did
>get reproduced but even after increasing the limit to 500, the exception
>kept coming until, after several rounds of trying to rebuild the index,
>we finally got it working at the default file descriptor limit
>of 256.  This makes us wonder if your first suggestion of optimizing
>indexes is a pre-requisite to trying this option. 
> 
>Another piece of relevant information is that we have the default merge
>factor of 10.
> 
>Kindly give us pointers to what it is that we are doing wrong, or
>whether we should be trying something completely different.
> 
>Thanks and regards
>Neelam Bhatnagar
> 
 
 
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
 
 
