lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <>
Subject Re: Lucene search question
Date Tue, 13 Nov 2007 17:10:05 GMT

On Nov 13, 2007, at 11:59 AM, Steven D. Majewski wrote:
> Lucene is great at finding documents, but not quite as good at finding
> things IN documents. The index contains pointers to the terms, but  
> they are
> pointers to a token in the parsed token stream, so to find a  
> character index
> into a file, you have to (I believe) run the text thru the tokenizer  
> again.
> ( But lucene API gives you access to everything, even if it's not  
> simple or easy.
>  I think there are some new features in the latest version that can  
> make this
>  sort of thing easier, but I haven't yet figured out how to use  
> them. )

You can use Term Vectors to access the offset (and position)  
information for a document.

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message