lucene-java-user mailing list archives

From "James Tyrrell" <>
Subject Indexing process causes Tomcat to stop working
Date Wed, 27 Oct 2004 10:58:57 GMT

I am a Java/Lucene/Tomcat newbie. I know that does not bode well as the
start of a post, but I really am in dire straits as far as Lucene goes, so
bear with me. I am working on indexing and replacing the search
functionality for a website (about 10 GB in size, although only about 7 GB
is indexed). I presently have a working model based on the luceneweb demo
distributed with Lucene. This has already proven functional when tested on
various sites (admittedly much smaller, 200-400 MB). However, issues occur
when indexing the main site that I haven't found explained on any of the
Lucene forums thus far.

After a successful index and optimisation of the website (takes around 4hrs
40m unoptimised) I can't get to index.jsp or even access Tomcat. My first
thought was to restart Tomcat... No joy and no access. Thinking the larger
index had killed the test server, I accessed Apache on port 80, which
worked perfectly. After a few checks I realised the test server was fine
and Apache was fine, and I used the same application to create an index of
the Tomcat docs, so Java was working. Confused, I went back to the forums,
FAQs and groups to see if anyone had had similar problems, and have come up
with a brief list of what my problem is not:

There are no Lucene write.lock files in either /tmp or opt/tomcat/temp, so
the index is open to be searched. Nor does 'top' reveal anything
overloading the system. Apache is running fine and displays all relevant
pages. Tomcat cannot be reached with a browser (neither the default
congratulations page nor the luceneweb application). Tomcat was a fresh
install, as was Java, and the Tomcat logs show nothing different from
standard startup logs. So I logged the entire indexing process and saw two
errors occurring infrequently.

Parse Aborted: Encountered "\"" at line 6, column 129. //where these values 
Was expecting one of:
   <ArgName> ...
   "=" ...
   <TagEnd> ...

I'm satisfied this is just the HTML parser complaining about some badly
formatted HTML and that it only affects what is indexed, but it's here for
completeness. The other error is more serious: Pipe closed
       at sun.nio.cs.StreamEncoder.write(
       at sun.nio.cs.StreamEncoder.write(

I'm again pretty sure that this is the same error that occurred once before
when I was using maxFieldLength to limit the number of terms recorded. I'm
also confident it's a threading error, and I found a post by Doug Cutting
that seemed to explain it. However, I am only assuming that's what it is,
and I haven't yet attempted to change the threading in the demo due to my
lack of Java knowledge.
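
For reference, here is a minimal sketch of how a "Pipe closed" error can
arise in plain Java, independent of Lucene. It assumes (and this is only an
assumption, suggested by the StreamEncoder frames in the trace) that the
text is being written through an OutputStreamWriter into a pipe, and that
the reading side stops early, as it might once a limit such as
maxFieldLength is reached. The class and method names are made up for
illustration and are not taken from the demo.

```java
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.io.Writer;

public class PipeClosedDemo {

    // Hypothetical helper: writes to a pipe after its reading end has been
    // closed and returns the resulting exception message (null if none).
    static String writeAfterReaderClosed() throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);
        Writer writer = new OutputStreamWriter(out, "UTF-8");

        writer.write("first chunk");
        writer.flush();      // bytes are now sitting in the pipe buffer
        in.read();           // the consumer reads a little ...
        in.close();          // ... then gives up early

        try {
            writer.write("second chunk");
            writer.flush();  // the encoder pushes bytes into the closed pipe
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeAfterReaderClosed());
    }
}
```

If this is what is happening, the exception would be a side effect of the
reader stopping early rather than a sign of a corrupt index, but that is a
guess and not something confirmed for the demo code.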

The strange thing is that after restarting the server all aspects of the
Lucene web application work perfectly: stemming, alphanumeric indexing,
summaries etc. are all as expected. So I am left assuming (because of this,
and because I am running out of options) that Lucene has somehow done
something to Tomcat by building such a large index. Since both run on Java
I guess it's something to do with that, but I have nowhere near enough
experience in Java to work out what.

The system I am currently running on is Java 1.4.2_05, Tomcat 5.0.27,
Lucene 1.4.1, Linux version 2.4.20-8 (gcc version 3.2.2 20030222 (Red Hat
Linux 3.2.2-5)) and Apache 2.0.42. I have not modified the mergeFactor or
MaxMergeDocuments, nor am I using RAMDirectories. The processor is 800 MHz
and there is 128 MB of RAM.

If more info is required on setup, source code etc., or you think this
should be moved to a Tomcat forum, just post.

Best regards and thanks in advance for any advice you can offer,

J Tyrrell
