lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Winton Davies <wdav...@overture.com>
Subject Maximum file size problem
Date Fri, 02 Nov 2001 21:21:41 GMT
Hi,

  I ran into a problem earlier this week, where by an index of 8 
million small documents resulted in an index file of 2GB. It turns 
out this is a common file system limit (some say it might be a java 
limit as well).

  Anyway, I have no idea which index file it was, but it seems that I 
need to be able to control the distribution of the index ?

  Each document looks like this:

   exact_keywords: my_phrase_with_underscores , indexed, but not tokenized.
   body: my phrase with underscores (repeated N times for weight) 
<EOM> title words <EOM> body <EOM>
   add_info1:
   add_info2: (just stored, short info)
   add_info3:

   The body is pretty short (say 120 words or so).

 
  As it turns out, the Index I had created was actually usable (thank 
heaven, after 24 hours of indexing :)).

  *** Anyway, is there anyway to control how big the indexes grow ? ****


    Cheers,
    Winton


p.s. A big thank you to Doug -- I was able to deliver my experimental 
results on time :) My boss is V.Happy :)


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message