lucene-java-user mailing list archives

From Erick Erickson
Subject Re: hit position
Date Tue, 07 Sep 2010 11:34:42 GMT
Line numbers are a completely unknown concept to Lucene, so you have to
figure them out yourself. I've seen at least two ways to make that work:
1> Use payloads. A payload is just a bit of data you attach to each token.
     What you put in there is up to you, so you can encode this kind of
     information however you want. See:

2> You can do a similar sort of thing by recording the relevant information
     as you analyze a document and then including that data in a very special
     (stored-only) field in your document, say the offsets of the beginning
     of each line. This field would never be searched, just used to find out
     what line a hit was on. Then you can find the line numbers once you
     know the hit positions. Stealing from Grant:
I think you can do better than the code in the third reply by using a
TermVectorMapper such that you can process the TermVector as it comes from

Essentially, you need to use a combination of SpanQuery, TermVector and
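
The lookup half of option 2 can be sketched in plain Java, independent of
any Lucene API. This is only a rough illustration under a few assumptions:
the class and method names (LineLookup, lineStarts, encode, decode, lineOf)
are hypothetical, the line-start offsets are stored as a space-separated
string in the stored-only field, and the hit's start offset is a character
offset you have already obtained some other way (for example from term
vector offsets).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for option 2; not part of Lucene. */
final class LineLookup {

    /** Character offset at which each line begins (the first line starts at 0). */
    static int[] lineStarts(String text) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') {
                starts.add(i + 1); // the next line begins right after the newline
            }
        }
        int[] result = new int[starts.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = starts.get(i);
        }
        return result;
    }

    /** Serialize the offsets into a string suitable for a stored-only field. */
    static String encode(int[] starts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < starts.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(starts[i]);
        }
        return sb.toString();
    }

    /** Parse the stored string back into sorted line-start offsets. */
    static int[] decode(String stored) {
        String[] parts = stored.split(" ");
        int[] starts = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            starts[i] = Integer.parseInt(parts[i]);
        }
        return starts;
    }

    /** 1-based line number of the line containing the given non-negative offset. */
    static int lineOf(int[] starts, int offset) {
        int idx = Arrays.binarySearch(starts, offset);
        if (idx < 0) {
            // Not an exact line start: binarySearch returns -(insertionPoint) - 1,
            // and the containing line is the one just before the insertion point.
            idx = -idx - 2;
        }
        return idx + 1;
    }
}
```

At index time you would compute lineStarts(text) and store encode(...) in
the unsearched field; at search time you would decode the stored value for
the matching document and call lineOf(starts, hitStartOffset).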


On Mon, Sep 6, 2010 at 10:36 PM, Lev Bronshtein

> Now that I can index my data, I want to be able to search it and report
> some sort of position information with every hit, such as a line number or a
> byte offset within the stream.  Any idea how I can accomplish this?
