lucene-java-user mailing list archives

From Leo Galambos <galam...@com-os2.ms.mff.cuni.cz>
Subject Indexing HTML
Date Tue, 03 Dec 2002 19:32:21 GMT
I tried to use IndexHTML (demo) and Lucene 1.2 for indexing *.CZ, but
Lucene often falls into a never-ending loop. I've analyzed my data, so I know
which file(s) cause Lucene to hang. I don't see anything special in the
file(s), so I think they get through the parser into the main Lucene
routines (and then the problem could be in Merger).

Could you help me, please?

One of the problematic files:
http://com-os2.ms.mff.cuni.cz/bugs/f01529.txt
My program (based on Lucene demo): 
http://com-os2.ms.mff.cuni.cz/bugs/IndexHTML.java

Thank you very much.

-g-


