lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: Return the sentence number in the indexed files
Date Sat, 19 Jul 2008 21:38:38 GMT

On Jul 19, 2008, at 6:00 AM, starz10de wrote:

> Hi All,
>
> I have text files that each contain several sentences, with a space
> between each sentence. When searching the index, I get the path of
> each document that matches the query:
>
> String path = doc.get("path");
>
> Is it possible to get the number of the sentence that matches the query
> inside the matched documents?

Not without some extra work.  This kind of thing requires post- (or
pre-) processing.  You can use a SpanQuery to find out where in a
document you matched, and then do the sentence calculations from those
positions.  Another option is to index each sentence as a separate
document and then post-process the results to combine them.
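
As a rough illustration of the "sentence calculations" step only, here is a minimal sketch in plain Java. It assumes you already have the token position of a match (for example, the start position reported for a span) and that the analyzer counts tokens the same way as a simple whitespace split with no position gaps, which will not hold for every analyzer. The class and method names (SentenceLocator, sentenceStartPositions, sentenceNumber) are hypothetical, and sentence detection uses the JDK's java.text.BreakIterator rather than anything from Lucene.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: maps a token position inside a document to a
// 1-based sentence number. Not part of Lucene.
final class SentenceLocator {

    // Returns, for each sentence, the token position at which it starts.
    // Assumption: tokens are whitespace-separated, mirroring the analyzer.
    static int[] sentenceStartPositions(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ENGLISH);
        it.setText(text);
        List<Integer> starts = new ArrayList<>();
        int tokenCount = 0;
        int begin = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; begin = end, end = it.next()) {
            String sentence = text.substring(begin, end).trim();
            if (sentence.isEmpty()) {
                continue; // skip whitespace-only segments
            }
            starts.add(tokenCount);
            tokenCount += sentence.split("\\s+").length;
        }
        return starts.stream().mapToInt(Integer::intValue).toArray();
    }

    // Returns the 1-based number of the sentence containing tokenPosition.
    static int sentenceNumber(int[] starts, int tokenPosition) {
        if (starts.length == 0 || tokenPosition < 0) {
            throw new IllegalArgumentException("no sentence for position " + tokenPosition);
        }
        int idx = Arrays.binarySearch(starts, tokenPosition);
        // Exact hit: the position is the first token of sentence idx + 1.
        // Otherwise binarySearch returns -(insertionPoint) - 1, and the
        // insertion point equals the 1-based number of the enclosing sentence.
        return idx >= 0 ? idx + 1 : -idx - 1;
    }
}
```

You would compute the start positions once per matched document (from the stored or re-read text) and then call sentenceNumber for each match position. If the analyzer removes stop words or otherwise shifts positions, the counting in sentenceStartPositions would need to mirror that.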

If you search the archives of this list and java-dev, you'll see
several discussions on the topic.  See:
http://lucene.markmail.org/message/we25gm32p6qot32c?q=sentence+detection
and
http://lucene.markmail.org/message/uq6ffx3oqsulgxys?q=sentence

HTH,
Grant


--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ







