lucene-dev mailing list archives

From Doron Cohen <DOR...@il.ibm.com>
Subject Re: Resolving term vector even when not stored?
Date Sat, 17 Mar 2007 07:15:35 GMT
"Mike Klaas" <mike.klaas@gmail.com> wrote on 16/03/2007 14:26:46:

> On 3/15/07, karl wettin <karl.wettin@gmail.com> wrote:
> > I propose a change of the current IndexReader.getTermFreqVector/s-
> > code so that it /always/ returns the vector space model of a document,
> > even when the fields are set as Field.TermVector.NO.
> >
> > Is that crazy? Could be really slow, but except for that.. And if it
> > is cached then that information is known by inspecting the fields.
> > People don't go fetching term vectors without knowing what they are
> > doing, are they?
>
> The highlighting contrib code does this: attempt to retrieve the
> termvector, catch IllegalArgumentException, fall back to re-analysis
> of the data.

This way makes more sense to me.  IndexReader.getTermFreqVector() means "it's
there, just bring it", while the fall-back is more of a
computeTermFreqVector(), which takes much more time.  Users would likely
prefer getting an exception from the get() (oops, term vectors were not
saved..) rather than automatically falling back to an expensive computation.

This fall-back functionality seems appropriate as a reusable utility,
perhaps in contrib?
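
A minimal sketch of what such a utility could look like, to make the get/compute
split concrete.  It deliberately does not use the Lucene API: StoredVectors is a
hypothetical stand-in for the index lookup, all method names are illustrative,
and the "analysis" is a naive lowercase-and-split rather than a real Analyzer.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class TermVectorFallback {

    /** Hypothetical stand-in for the index: returns null when no vector was stored. */
    public interface StoredVectors {
        Map<String, Integer> get(int docId, String field);
    }

    /** Cheap path: return the stored vector, or fail loudly if there is none. */
    public static Map<String, Integer> getTermFreqVector(
            StoredVectors stored, int docId, String field) {
        Map<String, Integer> vector = stored.get(docId, field);
        if (vector == null) {
            throw new IllegalArgumentException(
                "no term vector stored for field '" + field + "' in doc " + docId);
        }
        return vector;
    }

    /** Expensive path: rebuild term frequencies by re-analyzing the field text. */
    public static Map<String, Integer> computeTermFreqVector(String text) {
        Map<String, Integer> freqs = new LinkedHashMap<>();
        // Assumption: naive tokenization (lowercase, split on non-alphanumerics).
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    /** Explicit opt-in: try the stored vector first, fall back to re-analysis. */
    public static Map<String, Integer> getOrComputeTermFreqVector(
            StoredVectors stored, int docId, String field, String text) {
        try {
            return getTermFreqVector(stored, docId, field);
        } catch (IllegalArgumentException e) {
            return computeTermFreqVector(text);
        }
    }
}
```

Callers who only want the cheap lookup keep calling the get and see the
exception; callers who accept the cost ask for the fall-back by name.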

>
> I'm not sure if that is crazy, but that is what is currently implemented.
>
> -Mike

