lucene-java-user mailing list archives

From "chris.b" <omelhornomedomu...@gmail.com>
Subject Re: Problem with termdocs.freq and other
Date Mon, 10 Dec 2007 11:27:20 GMT

Okay, now I feel really stupid :p
Seeing as that solved all my problems (I think),

thank you very much,
Chris



Doron Cohen wrote:
> 
>>          while (termDocs.next()) {
>>             termDocs.next();
>>          }
> 
> For one, this loop calls next() twice in each iteration,
> so every second document is skipped... ?
> 
> "chris.b" <omelhornomedomundo@gmail.com> wrote on 10/12/2007 12:58:15:
> 
>>
>> Here goes,
>> I'm developing an application using Lucene which will evaluate the
>> representativeness of a list of keywords within a collection of documents.
>> I'm doing this by indexing the documents and then loading the list of
>> keywords and, using the IndexReader class and DefaultSimilarity, retrieving
>> the average tf of each word (where the tf is obtained through
>> TermDocs.freq() and the average is the sum of tf's divided by the number of
>> documents) and the idf for each word, and printing the output in an HTML
>> document, together with the documents in which they appear, and others.
>>
>> At this point, I have found two problems:
>> I have documents in which I know the word appears, but the tf still comes
>> out as '0' (even though the number of documents says 2),
>> and it doesn't print a list of all the documents (i.e. it says there are 2
>> documents which contain the word, but only one of them is printed).
>>
>> I don't know if what I'm doing is correct, but to obtain the tf, I'm doing
>> the following:
>>
>>          while (termDocs.next()) {
>>             listaDocNums.add(termDocs.doc());
>>             tf += termDocs.freq();
>>             termDocs.next();
>>          }
>>
>> where termDocs is an enumeration of the documents which contain the word.
>>
>> and for the document names I'm doing the following:
>>
>>          for (int f = 0; f < listaDocNums.size(); f++) {
>>             outrstream.write(reader.document(listaDocNums.get(f)).get("filename"));
>>          }
>>
>> where listaDocNums is an ArrayList which contains the numbers for the
>> documents.
>> I must also mention that when I try printing the list of numbers, it also
>> doesn't contain all the documents.
>>
>> That's it, I think I wrote all that was needed.
>>
>> Thanks in advance for any help/guidelines :)
>>
>> Chris
>> --
>> View this message in context: http://www.nabble.com/Problem-with-termdocs.freq-and-other-tp14250898p14250898.html
>> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
> 
> 
> 
> 
> 
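For anyone landing on this thread later: the fix Doron points at is to let the while condition be the only place that advances the enumeration. Below is a minimal sketch of that pattern. It does not use Lucene itself; PostingCursor and ArrayCursor are hypothetical stand-ins that only mimic the three TermDocs calls used in the quoted code (next(), doc(), freq()), so the example runs on its own. Treat it as an illustration of the loop shape, not as the actual application code.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: PostingCursor is a hypothetical stand-in for Lucene's TermDocs,
// exposing just next(), doc() and freq() so the loop can run without Lucene.
final class TermDocsLoopSketch {

    /** Cursor over the documents that contain one term. */
    interface PostingCursor {
        boolean next(); // advance to the next document; false when exhausted
        int doc();      // number of the current document
        int freq();     // frequency of the term in the current document
    }

    /** Hypothetical in-memory cursor backed by parallel arrays. */
    static final class ArrayCursor implements PostingCursor {
        private final int[] docs;
        private final int[] freqs;
        private int pos = -1; // positioned before the first entry

        ArrayCursor(int[] docs, int[] freqs) {
            this.docs = docs;
            this.freqs = freqs;
        }

        public boolean next() {
            if (pos + 1 >= docs.length) {
                return false;
            }
            pos++;
            return true;
        }

        public int doc() { return docs[pos]; }

        public int freq() { return freqs[pos]; }
    }

    /** Document numbers seen and the summed term frequency. */
    static final class Tally {
        final List<Integer> docNums = new ArrayList<>();
        int totalFreq;
    }

    static Tally tally(PostingCursor cursor) {
        Tally t = new Tally();
        // next() is called once per iteration, in the loop condition only,
        // so no document is skipped.
        while (cursor.next()) {
            t.docNums.add(cursor.doc());
            t.totalFreq += cursor.freq();
        }
        return t;
    }

    public static void main(String[] args) {
        Tally t = tally(new ArrayCursor(new int[] {3, 7}, new int[] {2, 5}));
        System.out.println(t.docNums + " total tf = " + t.totalFreq);
    }
}
```

With the extra next() call left inside the loop body, the same two-document input would record only the first document, which matches the "says 2 documents, prints one" symptom described in the original message.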

-- 
View this message in context: http://www.nabble.com/Problem-with-termdocs.freq-and-other-tp14250898p14251256.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.

