lucene-java-user mailing list archives

From Grant Ingersoll <grant.ingers...@gmail.com>
Subject Re: AW: get terms by positions
Date Tue, 03 Oct 2006 14:20:18 GMT
I should note, though, that we do this using the Lucene index, using  
the TermDocs, etc.
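
For illustration, here is a minimal sketch of the kind of windowed
neighbour counting discussed in the quoted messages below. It is not the
implementation referred to above and it does not call Lucene at all. It
assumes each document is already available as an ordered list of terms
(for example, by running the index-time Analyzer over the text again),
and the names NeighbourCounter and countNeighbours are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Lucene: counts the terms that occur
// within `window` positions of `target` in already-tokenized documents.
class NeighbourCounter {

    static Map<String, Integer> countNeighbours(
            List<List<String>> docs, String target, int window) {
        // TreeMap keeps the result sorted by term for stable output.
        Map<String, Integer> counts = new TreeMap<>();
        for (List<String> tokens : docs) {
            for (int pos = 0; pos < tokens.size(); pos++) {
                if (!tokens.get(pos).equals(target)) {
                    continue;
                }
                // Clamp the window to the document boundaries.
                int from = Math.max(0, pos - window);
                int to = Math.min(tokens.size() - 1, pos + window);
                for (int i = from; i <= to; i++) {
                    if (i != pos) {
                        counts.merge(tokens.get(i), 1, Integer::sum);
                    }
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample documents, one token list per document.
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("we", "play", "soccer", "today"),
            Arrays.asList("soccer", "fans", "watch", "soccer"));
        System.out.println(countNeighbours(docs, "soccer", 1));
    }
}
```

With a window of 1 this counts only the immediate left and right
neighbours; a larger window gives broader co-occurrence counts. For the
sample data the program prints {fans=1, play=1, today=1, watch=1}.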

On Oct 3, 2006, at 8:42 AM, Grant Ingersoll wrote:

> We often calculate co-occurrence information as an offline task and
> store it, so that at run time it is just a simple lookup. You just
> have to put together the appropriate loops based on the window size
> that you want for any given term. This is probably not efficient if
> your index is changing a lot.
>
> -Grant
>
> On Oct 3, 2006, at 6:14 AM, Renzo Scheffer wrote:
>
>> I am trying to get a list of all left or right neighbours of a
>> search term. Then I will count them to find out how often a specific
>> word is used as a neighbour of the search term. I know that the
>> results vary according to the Analyzer/Filter used. It's just an
>> experiment, and first I want to find out whether something like that
>> is possible with Lucene.
>>
>> Renzo
>>
>> -----Original Message-----
>> From: Nicolas Lalevée [mailto:nicolas.lalevee@anyware-tech.com]
>> Sent: Tuesday, 3 October 2006 00:04
>> To: java-user@lucene.apache.org
>> Subject: Re: get terms by positions
>>
>> On Monday, 02 October 2006 23:06, Renzo Scheffer wrote:
>>> Hi,
>>>
>>> can anybody be so kind as to tell me if it is possible to look up a
>>> term by its position?
>>>
>>> I search for a term (for example "soccer") and get back the DocIds
>>> and positions as follows:
>>>
>>> import org.apache.lucene.index.Term;
>>> import org.apache.lucene.index.TermPositions;
>>>
>>> TermPositions termPos = reader.termPositions(new Term("contents", "soccer"));
>>> // Step through every document that contains the term.
>>> while (termPos.next()) {
>>>     int docNumber = termPos.doc();
>>>     int freq = termPos.freq();
>>>     // nextPosition() may be called at most freq() times per document.
>>>     for (int i = 0; i < freq; i++) {
>>>         int position = termPos.nextPosition();
>>>         System.out.println("DocId: " + docNumber + "; Pos: " + position);
>>>     }
>>> }
>>>
>>> Output:
>>>
>>> DocId: 0; Pos: 1
>>> DocId: 0; Pos: 4
>>> DocId: 0; Pos: 7
>>> DocId: 1; Pos: 3
>>> DocId: 1; Pos: 7
>>>
>>> Now I am trying to get the terms one position before/after "soccer".
>>> I considered taking the position and incrementing or decrementing it,
>>> but I can't find a way to get back a term for a given position.
>>>
>>> Can anybody help me?
>>>
>>
>> I think it makes no sense to try to find a term this way. In Lucene,
>> you search with a term; you are not trying to retrieve one. Basically,
>> in Lucene you have a list of terms pointing to documents, not the
>> reverse.
>>
>> Maybe if you explain why you are trying to do that, we can find a
>> better way to do it.
>>
>> Nicolas
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
> --------------------------
> Grant Ingersoll
> Sr. Software Engineer
> Center for Natural Language Processing
> Syracuse University
> 335 Hinds Hall
> Syracuse, NY 13244
> http://www.cnlp.org
>
> Voice: 315-443-5484
> Skype: grant_ingersoll
> Fax: 315-443-6886
>

------------------------------------------------------
Grant Ingersoll
http://www.grantingersoll.com/


