lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: how to get the word before and the word after the matched Term?
Date Tue, 26 May 2009 11:13:30 GMT

On May 25, 2009, at 4:35 AM, KK wrote:

> One more information I would like to add,
> # I'm building index mostly for non-english texts/documents. and  
> searching
> is done using unicode utf-8 texts[its obivious, right?]
>


Yes, searching should be fine.


> Thanks
> KK
>
> On Mon, May 25, 2009 at 10:58 AM, KK <dioxide.software@gmail.com>  
> wrote:
>
>> Hi All.
>> I want to do the same thing with say a window of 10/15.
>> Can some one give me more details about how to do this i.e getting
>> neighbors[both sides] of size "window", if some examples are there  
>> please
>> point me to them/post in the mail.
>> Also I would like to know about the term query. Is it the case that  
>> the
>> term query has to be only single term , I mean can'nt we do the  
>> same thing
>> where the search query is not just a term but say a phrase[multiple  
>> terms].
>> Now I want to extract neighbors for this matched phrase. I think  
>> this is the
>> generic scenario.
>> So as per the mail I have to make use of SpanQuery, TermVector and
>> TermVectorMapper for these purposes, right?
>> NB:I also want to add hit highlighting after fixing the neighbor  
>> problem.
>>
>> Thanks,
>> KK.
>>
>>
>> On Thu, May 21, 2009 at 4:46 PM, Grant Ingersoll  
>> <gsingers@apache.org>wrote:
>>
>>> See
>>> http://www.lucidimagination.com/search/document/7fe40486bc935ce4/get_term_neighbours

>>>  (although
>>> I think you can do better than the code in the third reply by  
>>> using a
>>> TermVectorMapper such that you can process the TermVector as it  
>>> comes from
>>> disk.)
>>>
>>> Essentially, you need to use a combination of SpanQuery,  
>>> TermVector and
>>> TermVectorMapper.
>>>
>>> HTH,
>>> Grant
>>>
>>> On May 18, 2009, at 9:20 AM, Kamal Najib wrote:
>>>
>>> Hi all,
>>>> I want to  get the word before and the word after  the matched  
>>>> Term.For
>>>> Example if i have the Text " The drug was freshly prepared at 4- 
>>>> hour
>>>> intervals . Eleven courses were administered to seven patients at  
>>>> this dose
>>>> level and no patient experienced nausea or vomiting" and the  
>>>> matched Term
>>>> for example "patient" i want to get the word level and the word
>>>> experienced("and" and "no" are stop words, therefore i d'ont want  
>>>> to get
>>>> them.).I have looked at the Class Termposition but in this Class  
>>>> i can only
>>>> get the position of the matched Term, how can i get the word  
>>>> before and
>>>> after it, any suggestion?.
>>>> Thank you in advance.
>>>> Kamal
>>>> --
>>>>
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>>
>>>
>>> --------------------------
>>> Grant Ingersoll
>>> http://www.lucidimagination.com/
>>>
>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
>>> using
>>> Solr/Lucene:
>>> http://www.lucidimagination.com/search
>>>
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message