lucene-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Dmitry's Term Vector stuff, plus some
Date Tue, 17 Feb 2004 21:03:35 GMT

Grant Ingersoll wrote:
> Do you see any reason to write position information at all for the term vectors?

It could be useful to some folks.  For example, you might want to 
expand a query only with terms that occur near query terms, as in 
automatic phrase identification.  In general, the vector stuff is just a 
constant-factor improvement over re-tokenizing the text of the document, 
but hopefully a substantial one.  If folks are doing computations that 
require positional information but don't need the actual text (e.g., 
they don't need user-readable fragments), then positions could be handy.
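
To make the proximity idea concrete, here is a minimal sketch of picking expansion candidates from a positional term vector. It does not use the Lucene API: the vector is assumed to be a plain mapping from term to a list of token positions, and the names `nearby_terms`, `vector`, and `window` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def nearby_terms(term_positions, query_terms, window=2):
    """Count occurrences of non-query terms within `window` positions of a query term.

    `term_positions` is an assumed stand-in for a positional term vector:
    a dict mapping each term to the list of token positions where it occurs.
    """
    # Collect every position at which any query term occurs.
    query_positions = sorted(
        pos for term in query_terms for pos in term_positions.get(term, [])
    )
    counts = Counter()
    for term, positions in term_positions.items():
        if term in query_terms:
            continue  # only terms that are not already in the query
        for pos in positions:
            if any(abs(pos - qpos) <= window for qpos in query_positions):
                counts[term] += 1
    return counts


# Hypothetical vector for the text:
# "term vectors store term positions for fast query expansion"
vector = {
    "term": [0, 3],
    "vectors": [1],
    "store": [2],
    "positions": [4],
    "for": [5],
    "fast": [6],
    "query": [7],
    "expansion": [8],
}

# Terms within one position of the query term "vectors".
candidates = nearby_terms(vector, {"vectors"}, window=1)
```

No document text is read here, only terms and their positions, which is the case where stored positions would save re-tokenizing.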

But, certainly, most applications for term vectors do not need 
positions, and I would not be upset if these were left out of the first 
version.

Doug


