lucene-java-user mailing list archives

From manjula wijewickrema <manjul...@gmail.com>
Subject Re: Access indexed terms
Date Fri, 14 May 2010 12:24:55 GMT
Hi Andrzej

Thanks for the reply. As you mentioned, building arrays of the indexed
terms seems to be rather difficult. My intention is to find the term
frequencies of the terms in an indexed document. I can already find the
frequency of a particular term (given as a query) if I specify that term in
the code. What I really want is to get the term frequency (or even just the
number of times each term appears in the document) of all indexed terms, or
at least of the high-frequency terms, without naming them in the code. Is
there an alternative way to do that?

Thanks
Manjula


On Fri, May 14, 2010 at 4:00 PM, Andrzej Bialecki <ab@getopt.org> wrote:

>  On 2010-05-14 11:35, manjula wijewickrema wrote:
> > Hi,
> >
> > Is it possible to put the indexed terms into an array in Lucene? For
> > example, imagine I have indexed a single document in Lucene and now I
> > want to access those terms in the index. Is it possible to retrieve
> > (call) those terms as array elements? If it is possible, then how?
>
> In short: unless you stored term vectors (TermFreqVector) when adding the
> document, the answer is "with great difficulty".
>
> For working code that does this, see here:
>
>
> http://code.google.com/p/luke/source/browse/trunk/src/org/getopt/luke/DocReconstructor.java
>
> If you really need this kind of access in your application, then add your
> documents with term vectors with offsets and positions. Even then,
> depending on the Analyzer you used, the process is lossy: some input
> data that was discarded by the Analyzer is simply no longer available.
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
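
To illustrate what the thread is after (every term of one document together with its frequency, without naming any term in the code), here is a minimal sketch that uses only the JDK rather than Lucene itself. It assumes the original text of the document is still available. The class TermFrequencySketch and its two methods are hypothetical names chosen for this example, and the regex tokenizer is only a crude stand-in for whatever Analyzer was used at index time, so the terms it produces will not necessarily match those in a real index. In the Lucene 3.x API, the comparable data for a field indexed with term vectors would come from IndexReader.getTermFreqVector(docId, field).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class TermFrequencySketch {

    /**
     * Counts how often each term occurs in the text.
     * Assumption: lowercasing and splitting on non-alphanumeric characters
     * stands in for a real Analyzer; it does no stop-word removal or stemming.
     */
    public static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> freqs = new TreeMap<>(); // keys kept in alphabetical order
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) { // split can yield an empty leading token
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    /** Returns up to n terms by descending frequency; ties are ordered alphabetically. */
    public static List<String> topTerms(Map<String, Integer> freqs, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freqs.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                .thenComparing(Map.Entry.<String, Integer>comparingByKey()));
        List<String> top = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries.subList(0, Math.min(n, entries.size()))) {
            top.add(e.getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        Map<String, Integer> freqs =
                termFrequencies("Lucene indexes terms; Lucene counts terms, not words.");

        // The terms as an array, as asked in the original question.
        String[] terms = freqs.keySet().toArray(new String[0]);
        for (String term : terms) {
            System.out.println(term + " -> " + freqs.get(term));
        }
        System.out.println("Top terms: " + topTerms(freqs, 2));
    }
}
```

The sorted map keeps the term array in a stable alphabetical order, which is similar in spirit to the parallel term and frequency arrays a stored term vector exposes. As noted in the quoted reply, anything the Analyzer discarded at index time cannot be recovered from the index, which is why this sketch works from the original text instead.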
