lucene-dev mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Whither Query Norm?
Date Sat, 21 Nov 2009 00:51:46 GMT
Okay - my fault - I'm not really talking in terms of Lucene. Though even
there I consider it possible. You'd just have to like, rewrite it :) And
it would likely be pretty slow.


Jake Mannix wrote:
>
>
> On Fri, Nov 20, 2009 at 4:20 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>     Mark Miller wrote:
>     >
>     > it looks expensive to me to do both
>     > of them properly.
>     Okay - I guess that somewhat makes sense - you can calculate the
>     magnitude of the doc vectors at index time. How is that impossible
>     with incremental indexing though? Isn't it just expensive? It seems
>     somewhat expensive in the non-incremental case as well - you're just
>     eating it at index time rather than query time - though the same
>     could be done for incremental? The information is all there in
>     either case.
>
>
> The expense, if you have the idfs of all terms in the vocabulary (keep
> them in the form of idf^2 for efficiency at index time), is pretty
> trivial, isn't it?  If you have a document with 1000 terms, it's maybe
> 3000 floating point operations, all CPU actions, in memory, no disk
> seeks.
>
> What it does require is knowing, even when you have no documents yet
> on disk, what the idfs of the terms in the first few documents are.
> Where do you get this, in Lucene, if you haven't externalized some
> notion of idf?
>
>   -jake
>  
>
>
>
>     ---------------------------------------------------------------------
>     To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>     For additional commands, e-mail: java-dev-help@lucene.apache.org
>
>
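
For illustration, the index-time computation Jake describes could look roughly like the sketch below. It is not Lucene code: the class name, the method docVectorMagnitude, and the externalized idf^2 table are all hypothetical, and it assumes plain tf * idf term weights (Lucene's default similarity dampens tf, so real numbers would differ). Per distinct term it does about three floating point operations (square tf, multiply by idf^2, add), which is in line with the "maybe 3000 for 1000 terms" estimate.

```java
import java.util.HashMap;
import java.util.Map;

public class DocNormSketch {

    // Hypothetical helper, not a Lucene API. Returns the Euclidean length of a
    // document's tf-idf vector: sqrt( sum over terms of tf^2 * idf^2 ).
    // Assumes idf^2 for every term is already known from some external table.
    static double docVectorMagnitude(Map<String, Integer> termFreqs,
                                     Map<String, Double> idfSquared) {
        double sumOfSquares = 0.0;
        for (Map.Entry<String, Integer> entry : termFreqs.entrySet()) {
            Double idf2 = idfSquared.get(entry.getKey());
            if (idf2 == null) {
                // This is the problem raised in the thread: with no externalized
                // idf, the first documents have nothing to look up.
                throw new IllegalArgumentException(
                        "no idf known for term: " + entry.getKey());
            }
            double tf = entry.getValue();
            sumOfSquares += tf * tf * idf2; // (tf * idf)^2 without a sqrt per term
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        // Made-up term frequencies for one document.
        Map<String, Integer> termFreqs = new HashMap<>();
        termFreqs.put("lucene", 2);
        termFreqs.put("norm", 1);

        // Made-up externalized idf^2 values.
        Map<String, Double> idfSquared = new HashMap<>();
        idfSquared.put("lucene", 4.0);
        idfSquared.put("norm", 9.0);

        // 2*2*4.0 + 1*1*9.0 = 25.0, so the magnitude is 5.0
        System.out.println(docVectorMagnitude(termFreqs, idfSquared));
    }
}
```

The exception branch is where the incremental indexing question bites: the arithmetic is cheap in memory, but only once an idf for every term is available from somewhere.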



