lucene-java-user mailing list archives

From "Greg Shackles" <gshack...@gmail.com>
Subject Re: Extract the text that was indexed
Date Tue, 30 Dec 2008 14:40:26 GMT
That is my understanding of it too.  Terms in the index point to the
positions of the tokens they map to.  Since one index term can point at any
number of tokens, the index is a search map, not a sequence map.  If you
still have the text that was indexed, you could run it through an analyzer
and observe the tokens as they come out.
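
To make the "search map" idea concrete, here is a minimal sketch in plain
Java. It is a toy stand-in, not Lucene's API: the class ToyIndexSketch, its
methods, and the stop-word list are all made up for illustration. It runs a
string through a simple analyzer (lowercase, split, drop stop words) and then
builds a term-to-positions map from the surviving tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy stand-in for an analyzer and a postings map; NOT Lucene's API.
class ToyIndexSketch {
    // Hypothetical stop-word list, chosen only for this example.
    static final Set<String> STOP_WORDS = Set.of("the", "a", "is", "of");

    // Lowercase, split on non-alphanumeric characters, drop stop words.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!raw.isEmpty() && !STOP_WORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // Map each term to every token position at which it occurs.
    static Map<String, List<Integer>> postings(List<String> tokens) {
        Map<String, List<Integer>> index = new TreeMap<>();
        for (int pos = 0; pos < tokens.size(); pos++) {
            index.computeIfAbsent(tokens.get(pos), k -> new ArrayList<>()).add(pos);
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> tokens = analyze("The cat is the friend of a cat");
        System.out.println(tokens);           // [cat, friend, cat]
        System.out.println(postings(tokens)); // {cat=[0, 2], friend=[1]}
    }
}
```

In this toy, one term ("cat") points at two tokens, and the stop words never
reach the map at all, so the original sentence is not what the map holds.
With real Lucene you would do the equivalent by asking your analyzer for a
token stream over the original text and printing each token.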

- Greg

On Tue, Dec 30, 2008 at 7:31 AM, Alexander Aristov <
alexander.aristov@gmail.com> wrote:

> I am not sure, but from my understanding, fields that are only indexed and
> not stored do not keep positions. So even if you get back all terms for a
> field for a given document, you won't be able to reconstruct the original
> word sequence.
>
> And remember that not all words are indexed.
>
> Alex
>
> 2008/12/30 Lebiram <lebiram@ymail.com>
>
> > Hi All,
> >
> > Is it possible to extract the text that was indexed but not stored for a
> > field in a document?
> >
> > Right now, reader.document() returns only fields that were stored. However,
> > I'd also like to get the text of the indexed-only field...
> >
> > I'd appreciate your help
> >
>
> --
> Best Regards
> Alexander Aristov
>
