lucene-java-user mailing list archives

From starz10de <farag_ah...@yahoo.com>
Subject Re: Return the sentence number in the indexed files
Date Sun, 20 Jul 2008 11:53:16 GMT

Thanks, Grant, for the answer.

As for indexing each sentence as a separate document: I already did this and it
works fine. I indexed more than 93,000 sentences (documents) in approximately 11
minutes. I thought the other option might be more efficient.
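
A minimal sketch of the sentence-per-document approach, for illustration only.
It assumes sentences are separated by blank lines (the thread only says there is
space between them), and the names SentenceSplitter, SentenceRecord, and split
are hypothetical. The Lucene indexing step is left out: each record would become
one document with the path and sentence number stored as fields, so the number
can be read back from a hit the same way the path is.

```java
import java.util.ArrayList;
import java.util.List;

final class SentenceSplitter {

    /** One unit to index: source file path, 1-based sentence number, sentence text. */
    static final class SentenceRecord {
        final String path;
        final int number;
        final String text;

        SentenceRecord(String path, int number, String text) {
            this.path = path;
            this.number = number;
            this.text = text;
        }
    }

    /**
     * Splits file contents into numbered sentences.
     * Assumption: sentences are separated by at least one blank line.
     */
    static List<SentenceRecord> split(String path, String contents) {
        List<SentenceRecord> records = new ArrayList<>();
        int number = 0;
        // A line break, optional whitespace, then another line break marks a boundary.
        for (String piece : contents.split("\\R\\s*\\R")) {
            // Collapse hard-wrapped lines inside a sentence into single spaces.
            String text = piece.trim().replaceAll("\\s+", " ");
            if (text.isEmpty()) {
                continue; // skip whitespace-only pieces without consuming a number
            }
            number++;
            records.add(new SentenceRecord(path, number, text));
        }
        return records;
    }

    public static void main(String[] args) {
        String contents = "First sentence.\n\nSecond sentence.\n\nThird sentence.";
        for (SentenceRecord r : split("docs/example.txt", contents)) {
            // Each record would be indexed as its own document.
            System.out.println(r.path + " #" + r.number + ": " + r.text);
        }
    }
}
```

Keeping the sentence number as a stored field means no post-processing is needed
to recover it; grouping hits back into their source files is done on the path.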

Farag 

Grant Ingersoll-6 wrote:
> 
> 
> On Jul 19, 2008, at 6:00 AM, starz10de wrote:
> 
>>
>> Hi All,
>>
>> I have text files that each contain several sentences, with space between
>> the sentences. When searching the index, I get the path of the documents
>> that match the query:
>>
>> String path = doc.get("path");
>>
>> Is it possible to get the number of the sentence that matches the query
>> inside the matched documents?
> 
> Not without some extra work. This kind of thing requires post (or pre)
> processing. You can use SpanQuery to know where in a document you matched,
> and then do the sentence calculations. Another option is to index each
> sentence as a separate document and then post process to combine.
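
The position-based route can be sketched as follows, under stated assumptions.
It takes a document's sentences, counts whitespace-separated tokens, and maps a
token position (such as the start position a span match reports) to a 1-based
sentence number. It assumes the analyzer used at index time produces exactly one
position per whitespace-separated token; stop-word removal or other filters
would shift the positions. SentenceLocator and sentenceNumberAt are hypothetical
names, and the Lucene query code itself is omitted.

```java
final class SentenceLocator {

    // sentenceEnds[i] is the exclusive end token position of sentence i.
    private final int[] sentenceEnds;

    SentenceLocator(String[] sentences) {
        sentenceEnds = new int[sentences.length];
        int position = 0;
        for (int i = 0; i < sentences.length; i++) {
            String trimmed = sentences[i].trim();
            if (!trimmed.isEmpty()) {
                // Assumption: one token position per whitespace-separated word.
                position += trimmed.split("\\s+").length;
            }
            sentenceEnds[i] = position;
        }
    }

    /** Returns the 1-based number of the sentence holding the token position, or -1. */
    int sentenceNumberAt(int tokenPosition) {
        if (tokenPosition < 0) {
            return -1;
        }
        // Linear scan: the first sentence whose end lies past the position contains it.
        for (int i = 0; i < sentenceEnds.length; i++) {
            if (tokenPosition < sentenceEnds[i]) {
                return i + 1;
            }
        }
        return -1; // position is beyond the last token
    }

    public static void main(String[] args) {
        String[] sentences = {"the quick fox", "jumps over", "the lazy dog"};
        SentenceLocator locator = new SentenceLocator(sentences);
        // A match starting at token position 4 ("over") is looked up here.
        System.out.println(locator.sentenceNumberAt(4));
    }
}
```

The sentence boundaries have to be computed with the same tokenization the index
used, which is the pre- or post-processing cost of this option.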
> 
> If you search the archives on this list and java-dev you'll see several
> discussions on the topic. See:
> http://lucene.markmail.org/message/we25gm32p6qot32c?q=sentence+detection
> and
> http://lucene.markmail.org/message/uq6ffx3oqsulgxys?q=sentence
> 
> HTH,
> Grant
> 
> 
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
> 
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Return-the-sentence-number-in-the-indexed-files-tp18543061p18553514.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

