lucene-java-user mailing list archives

From Alex vB <m...@avomberg.de>
Subject Re: New codecs keep Freq skip/omit Pos
Date Fri, 22 Apr 2011 16:03:26 GMT
Hello Robert,

thank you for the answers! :)
So far I have used PatchedFrameOfRef and PatchedFrameOfRef2, so both
implementations are PForDelta. Sorry, my mistake.

PatchedFrameOfRef2: PforDelta W/O Freq W/O Pos    1.6 GB
PatchedFrameOfRef:  Pfor W/O Freq W/O Pos         3.1 GB

Here are some numbers:
PatchedFrameOfRef2 w/o POS w/o FREQ
segments.gen   20 bytes
_43.fdt        8.1 MB
_43.fdx        64.4 MB
_43.fnm        20 bytes
_43_0.skp      182.6 MB
_43_0.tib      32.3 MB
_43_0.tiv      1.0 MB
segments_2     268 bytes
_43_0.doc      1.3 GB

PatchedFrameOfRef w/o POS w/o FREQ
segments.gen   20 bytes
_43.fdt        8.1 MB
_43.fdx        64.4 MB
_43.fnm        20 bytes
_43_0.skp      182.6 MB
_43_0.tib      32.3 MB
_43_0.tiv      1.1 MB
segments_2     267 bytes
_43_0.doc      2.8 GB
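
A minimal sketch for cross-checking the two totals quoted earlier (about 1.6 GB
and 3.1 GB) against the per-file sizes in the listings. The class and method
names (IndexSizeCheck, listing, totalMb) are made up for illustration and are
not part of Lucene. It assumes 1 GB = 1000 MB and leaves out the files that are
only a few bytes large.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class IndexSizeCheck {

    // Per-file sizes in MB, copied from the listings. Only the .tiv and .doc
    // sizes differ between the two codecs, so they are passed as parameters.
    public static Map<String, Double> listing(double tivMb, double docGb) {
        Map<String, Double> files = new LinkedHashMap<>();
        files.put("_43.fdt", 8.1);
        files.put("_43.fdx", 64.4);
        files.put("_43_0.skp", 182.6);
        files.put("_43_0.tib", 32.3);
        files.put("_43_0.tiv", tivMb);
        files.put("_43_0.doc", docGb * 1000.0); // assumption: 1 GB = 1000 MB
        return files;
    }

    // Sum of all file sizes, in MB.
    public static double totalMb(Map<String, Double> filesMb) {
        double sum = 0.0;
        for (double mb : filesMb.values()) {
            sum += mb;
        }
        return sum;
    }

    public static void main(String[] args) {
        double pfor2 = totalMb(listing(1.0, 1.3)); // PatchedFrameOfRef2
        double pfor = totalMb(listing(1.1, 2.8));  // PatchedFrameOfRef
        System.out.println(String.format(Locale.ROOT,
                "PatchedFrameOfRef2: %.2f GB", pfor2 / 1000.0));
        System.out.println(String.format(Locale.ROOT,
                "PatchedFrameOfRef:  %.2f GB", pfor / 1000.0));
    }
}
```

In both cases the .doc file accounts for most of the total, so the difference
between the two codecs comes almost entirely from the doc ID postings.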

During indexing I use StandardAnalyzer (StandardFilter, LowerCaseFilter,
StopFilter).
Is there more information on creating a codec somewhere, or is digging
through the code the only option?

My own implementation needs 2.8 GB of space including FREQ but not POS. That
is why I am asking: I want to compare the results somehow. Compared to 20 GB
that is very nice, and compared to 1.6 GB it is very bad ;).

Regards
Alex


--
View this message in context: http://lucene.472066.n3.nabble.com/New-codecs-keep-Freq-skip-omit-Pos-tp2849776p2851809.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


