lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Developer Developer" <devquesti...@gmail.com>
Subject Re: Lucene newbee quesiton- Term Positions
Date Sun, 07 Oct 2007 16:48:50 GMT
Hi Eric,

Thanks for the quick reply.    My index does not return any hits when i
search for certain phrases . I am very sure that the indexed documents does
have those phrases in them.

Therefore i want to just list all the terms and their postions for given
document just to make sure that the indexed document does have those terms
indexed in the correct order.

I did check with luke and came up with the following code that does not seem
to be working !!. positions.next()) returns flase !.  Do you see anything
wrong in this code?

Directory dir = FSDirectory.getDirectory(args[0]);
IndexReader reader = IndexReader.open(dir);
TermPositions positions = reader.termPositions();

 while(positions.next())
  {
     positions.nextPosition();

     positions.nextPosition();
     byte b[] = positions.getPayload(null, 0);
     System.out.println(b);
  }





On 10/7/07, Erick Erickson <erickerickson@gmail.com> wrote:
>
> I suspect that this is more work than you think, not to mention
> very slow. This is just due to the nature of an inverted
> index....
>
> To see what I mean, get a copy of Luke and have it
> reconstruct one of your documents and you'll see what the
> performance is like.
>
> I think Luke has all the example code you could ask for, that's
> the place I'd look first. See:
> http://lucene.apache.org/java/docs/contributions.html
>
> Why do you want to do this and is it really necessary? You
> could think about storing the entire document, then when you
> needed to count terms, just using one of the tokenizers and
> counting them yourself....
>
> Best
> Erick
>
> On 10/7/07, Developer Developer <devquestions@gmail.com> wrote:
> >
> > Hello,
> >
> > I have simple lucene 2.2 index created. I want to  list all the terms
> and
> > their positions in a document. how can I do it ?
> >
> > Can you please provide some sample code.
> >
> > Thanks !
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message