lucene-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: TermVectorsWriter and DocumentsWriter
Date Fri, 17 Aug 2007 17:26:01 GMT
Michael McCandless wrote:
> One thing I have been wondering is whether it really is necessary to
> sort the term vectors before writing to the index....

Terms in vectors are prefix-compressed: each term is stored as the part 
that differs from the term before it.  So not sorting would make indexes 
bigger, and slower to read and write.
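
To illustrate the size argument, here is a minimal sketch of prefix compression over a term list. It is not Lucene's actual on-disk encoding: the class name, the `prefixLength|suffix` text form, and the sample terms are all assumptions made for illustration. It only shows that neighbouring terms share long prefixes when sorted and almost none when left in arrival order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration class, not part of Lucene.
final class PrefixCompressionSketch {

    /** Number of leading chars that a and b have in common. */
    static int sharedPrefixLength(String a, String b) {
        int max = Math.min(a.length(), b.length());
        int i = 0;
        while (i < max && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    /** Encodes each term as "prefixLength|suffix" relative to the previous term. */
    static List<String> encode(List<String> terms) {
        List<String> out = new ArrayList<>();
        String previous = "";
        for (String term : terms) {
            int shared = sharedPrefixLength(previous, term);
            out.add(shared + "|" + term.substring(shared));
            previous = term;
        }
        return out;
    }

    /** Total number of suffix chars that would have to be stored. */
    static int suffixChars(List<String> terms) {
        int total = 0;
        String previous = "";
        for (String term : terms) {
            total += term.length() - sharedPrefixLength(previous, term);
            previous = term;
        }
        return total;
    }

    public static void main(String[] args) {
        // Terms in the order they might appear in a document (assumed sample data).
        List<String> unsorted =
                Arrays.asList("index", "search", "indexer", "searcher", "indexing");
        List<String> sorted = new ArrayList<>(unsorted);
        Collections.sort(sorted);

        System.out.println("sorted encoding:   " + encode(sorted));
        System.out.println("suffix chars, sorted:   " + suffixChars(sorted));
        System.out.println("suffix chars, unsorted: " + suffixChars(unsorted));
    }
}
```

With these five sample terms, the sorted list needs 18 suffix characters and the unsorted list needs 34, because no adjacent unsorted pair shares a prefix.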

http://lucene.apache.org/java/docs/fileformats.html#Term%20Vectors

Also, having them sorted makes it much easier to do dot products between 
document vectors, a potentially common operation.
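
The dot-product point can be sketched as a single merge pass over two sorted vectors. This is an illustration under assumptions, not code from Lucene: each vector is represented as a sorted `String[]` of terms with a parallel `int[]` of frequencies, and the class and method names are made up for the example.

```java
// Hypothetical illustration class, not part of Lucene.
final class SortedVectorDotProduct {

    /**
     * Dot product of two sparse term vectors. Both term arrays must be
     * sorted in String.compareTo order, with freqs parallel to terms.
     * Runs in O(termsA.length + termsB.length) with no hashing or lookup.
     */
    static long dot(String[] termsA, int[] freqsA, String[] termsB, int[] freqsB) {
        long sum = 0;
        int i = 0;
        int j = 0;
        while (i < termsA.length && j < termsB.length) {
            int cmp = termsA[i].compareTo(termsB[j]);
            if (cmp == 0) {
                // Term occurs in both documents: it contributes to the product.
                sum += (long) freqsA[i] * freqsB[j];
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // term only in A, skip it
            } else {
                j++; // term only in B, skip it
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Assumed sample data: two small documents.
        String[] termsA = {"apache", "index", "lucene"};
        int[] freqsA = {1, 2, 3};
        String[] termsB = {"index", "lucene", "vector"};
        int[] freqsB = {4, 1, 5};

        // Shared terms are "index" (2 * 4) and "lucene" (3 * 1).
        System.out.println(dot(termsA, freqsA, termsB, freqsB)); // prints 11
    }
}
```

If the vectors were unsorted, the same computation would need a hash map or a sort per comparison, which is the extra work that keeping them sorted at write time avoids.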

Doug


