lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Obtaining the (indexed) terms in a field in a particular document
Date Tue, 20 Mar 2007 19:08:00 GMT
Sorry, but you have to have the Lucene document ID, which you
can get either as part of a Hits or HitCollector, or by using
TermDocs/TermEnum on your unique id (my_id in your example).
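
To make the two steps concrete (unique id -> internal document number ->
field contents), here is a minimal sketch in plain Java. It is a toy
in-memory model, not Lucene code: ToyIndex, analyze, docNumberFor,
storedValue and terms are all hypothetical names, the stop list is made up,
and the stand-in analyzer does no stemming. It only illustrates why the
stored value and the indexed terms of a field can differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy in-memory model of an index; none of these names are Lucene API. */
class ToyIndex {
    // Hypothetical stop list, for illustration only.
    private static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("a", "an", "and", "of", "the", "to"));

    // Internal document number -> raw text exactly as it was handed in.
    private final List<String> storedText = new ArrayList<>();
    // Internal document number -> terms produced by the analyzer.
    private final List<List<String>> indexedTerms = new ArrayList<>();
    // Application-level unique id (the "my_id" field) -> internal document number.
    private final Map<String, Integer> idToDocNumber = new HashMap<>();

    /** Stand-in analyzer: lowercases, splits on non-letters, drops stop words. */
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    /** Adds a document and returns its internal document number. */
    int add(String myId, String text) {
        int docNumber = storedText.size();
        storedText.add(text);
        indexedTerms.add(analyze(text));
        idToDocNumber.put(myId, docNumber);
        return docNumber;
    }

    /** Step 1: resolve the unique id to the internal document number, or -1 if absent. */
    int docNumberFor(String myId) {
        Integer docNumber = idToDocNumber.get(myId);
        return docNumber == null ? -1 : docNumber;
    }

    /** Step 2a: the stored value is the original, unanalyzed text. */
    String storedValue(int docNumber) {
        return storedText.get(docNumber);
    }

    /** Step 2b: the indexed terms are what the analyzer emitted. */
    List<String> terms(int docNumber) {
        return indexedTerms.get(docNumber);
    }
}
```

In Lucene itself, step 1 corresponds to the TermDocs lookup on my_id
mentioned above, and reading the stored value corresponds to
document.get(field) on a field indexed with Field.Store.YES.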

Erick

On 3/20/07, Erick Erickson <erickerickson@gmail.com> wrote:
>
> You can do a document.get(field), *assuming* you have stored the data
> (Field.Store.YES) at index time, although you may not get
> stop words.
>
> On 3/20/07, Donna L Gresh <gresh@us.ibm.com> wrote:
> >
> > My apologies if this is a simple question--
> >
> > How can I get all the (stemmed and stop words removed, etc.) terms in a
> > particular field of a particular document?
> >
> > Suppose my documents each consist of two fields, one with the name
> > "my_id" and a unique identifier, and the other being some text string
> > consisting of a number of words.
> > I'd like to get all the terms in the text string given the unique
> > identifier.
> >
> > (My basic reason is to do a sort of document similarity between the text
> > string and some other text string, doing a boolean query with
> > a number of SHOULD clauses, if this makes sense; I welcome
> > suggestions of better ways to do this.)
> >
> > Donna L. Gresh
> >
>
>
