lucene-java-user mailing list archives

From dan sutton <danbsut...@gmail.com>
Subject Large .frq file
Date Tue, 18 Jan 2011 12:13:05 GMT
Hi,

We're trying to build a large index via Solr for trends, and we notice
that we end up with a large '.frq' file after doing the following:


Make all text fields indexed="true", stored="false",
omitTermFreqAndPositions="true", omitNorms="true", termPositions="false",
termOffsets="false", termVectors="false"

We are using a variation on org.apache.lucene.analysis.cjk and notice
that the .frq file is about 4 times larger than it is with, for example,
the WhitespaceTokenizer.


With omitTermFreqAndPositions="true" set on the text fields, I'd have
expected the .frq file to use the more compact encoding that the file
formats documentation describes: "If omitTf were true it would
be this sequence of VInts instead:"
(http://lucene.apache.org/java/2_9_1/fileformats.html#Frequencies)
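
For reference, here is a rough sketch of the two encodings that section of
the file formats page describes, using its own example of a term that occurs
once in document 7 and three times in document 11. The class and method names
(FrqSketch, encode, vIntBytes) are made up for illustration and are not Lucene
APIs, and the sketch ignores the skip data that the .frq file also holds.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Lucene.
final class FrqSketch {

    // Returns the sequence of VInt values written for one term's postings.
    // docIds must be ascending; freqs[i] is the term frequency in docIds[i].
    static List<Integer> encode(int[] docIds, int[] freqs, boolean omitTf) {
        List<Integer> out = new ArrayList<Integer>();
        int lastDoc = 0;
        for (int i = 0; i < docIds.length; i++) {
            int delta = docIds[i] - lastDoc;
            lastDoc = docIds[i];
            if (omitTf) {
                // Only the document gap is stored.
                out.add(delta);
            } else if (freqs[i] == 1) {
                // Odd value: the gap is value / 2 and the frequency is 1.
                out.add(delta * 2 + 1);
            } else {
                // Even value: the gap is value / 2, frequency follows.
                out.add(delta * 2);
                out.add(freqs[i]);
            }
        }
        return out;
    }

    // Number of bytes a non-negative int takes as a VInt (7 bits per byte).
    static int vIntBytes(int value) {
        int bytes = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            bytes++;
        }
        return bytes;
    }

    public static void main(String[] args) {
        int[] docs = {7, 11};
        int[] freqs = {1, 3};
        System.out.println(encode(docs, freqs, false)); // [15, 8, 3]
        System.out.println(encode(docs, freqs, true));  // [7, 4]
    }
}
```

Even with omitTf, the file still holds one entry per (term, document) pair,
so its size follows the total number of such pairs in the index.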


Can anyone suggest how I can reduce the size of this file?


Many thanks,
Dan

Lucene Specification Version: 2.9.1
Solr Specification Version: 1.4.0.2010.09.10.17.10.36
