lucene-general mailing list archives

From Lee <lee...@gmail.com>
Subject Re: sorting frequencies
Date Wed, 06 Mar 2013 14:55:19 GMT
Without knowing anything more, I see your output claims d1.doc contains
the word 'microsoft'. Are you sure you are parsing the document you think
you are parsing?
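
One quick way to check is to compare the terms in the output file against
the words of the text you expected to index. This is only a rough sketch in
Python, not anything Lucene provides: the function name unexpected_terms
and the sample lines are made up for illustration, and it assumes each
output line has the form "file term count".

```python
import re

# Text of d1.doc as quoted in the original message.
EXPECTED_TEXT = """There are many big and small libraries everywhere in our country. They have
millions of books in different languages. You can find there the oldest and
the newest books.
Every school has a library. Pupils come to the library to take books on
different subjects."""


def unexpected_terms(output_lines, expected_text):
    """Return indexed terms that never occur as a word in expected_text.

    Hypothetical helper: assumes each line looks like "file term count".
    """
    expected = set(re.findall(r"[a-z0-9]+", expected_text.lower()))
    extras = []
    for line in output_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip anything that is not "file term count"
        term = parts[1]
        if term not in expected:
            extras.append(term)
    return extras


# A few lines taken from the posted output.
sample = [
    "d1.doc books 3",
    "d1.doc microsoft 2",
    "d1.doc bjbjlulu 1",
    "d1.doc library 2",
]
print(unexpected_terms(sample, EXPECTED_TEXT))
```

Any term this reports did not come from the visible text, which would
suggest the indexer is reading something other than the plain text of the
document.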

The output looks like word stems and frequency to me, at a guess.
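
As for sorting, if each line of the file really is "file term count", the
file can be sorted after the fact without touching the index. A minimal
sketch in Python under that assumption (sort_by_frequency is a made-up
name, and breaking ties alphabetically is my choice, not something from
your setup):

```python
def sort_by_frequency(lines):
    """Parse "file term count" lines and sort them by count, highest first.

    Hypothetical helper: lines that do not match the assumed format are skipped.
    """
    rows = []
    for line in lines:
        parts = line.split()
        if len(parts) != 3 or not parts[2].isdigit():
            continue
        doc, term, count = parts
        rows.append((doc, term, int(count)))
    # Negative count gives descending order; the term breaks ties alphabetically.
    rows.sort(key=lambda row: (-row[2], row[1]))
    return rows


# A few lines taken from the posted output.
sample = [
    "d1.doc books 3",
    "d1.doc p 37",
    "d1.doc library 2",
    "d1.doc n 16",
    "d1.doc o 16",
]
for doc, term, count in sort_by_frequency(sample):
    print(doc, term, count)
```

The counts have to be converted to integers before sorting, otherwise "7"
would sort after "37" as a string.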

On 06/03/2013 13:15, Iraida wrote:
> Hello. I've indexed documents and counted the frequency of terms. The
> output file has the form:
> d1.doc 0 14
> d1.doc 1 7
> d1.doc 2 4
> d1.doc 3 1
> d1.doc 4 4
> d1.doc 5 2
> d1.doc 7 1
> d1.doc 8 5
> d1.doc 9 3
> d1.doc aj 1
> d1.doc ax 1
> d1.doc b 11
> d1.doc big 1
> d1.doc bjbjlulu 1
> d1.doc books 3
> d1.doc c 3
> d1.doc can 1
> d1.doc cj 1
> d1.doc come 1
> d1.doc country 1
> d1.doc cx 1
> d1.doc d 8
> d1.doc different 2
> d1.doc e 7
> d1.doc every 1
> d1.doc everywhere 1
> d1.doc f 7
> d1.doc find 1
> d1.doc g 6
> d1.doc gd 1
> d1.doc h 12
> d1.doc has 1
> d1.doc have 1
> d1.doc hx 1
> d1.doc i 10
> d1.doc j 5
> d1.doc jd 1
> d1.doc k 5
> d1.doc l 11
> d1.doc languages 1
> d1.doc libraries 1
> d1.doc library 2
> d1.doc love 1
> d1.doc m 12
> d1.doc many 1
> d1.doc mh 3
> d1.doc microsoft 2
> d1.doc millions 1
> d1.doc msworddoc 1
> d1.doc n 16
> d1.doc newest 1
> d1.doc normal.dot 1
> d1.doc o 16
> d1.doc office 2
> d1.doc oh 1
> d1.doc oldest 1
> d1.doc our 1
> d1.doc p 37
> d1.doc p9 1
> d1.doc pupils 1
> d1.doc q 2
> d1.doc r 11
> d1.doc r4 1
> d1.doc s 4
> d1.doc school 1
> d1.doc sh 3
> d1.doc small 1
> d1.doc subjects 1
> d1.doc t 9
> d1.doc take 1
> d1.doc th 1
> d1.doc u 5
> d1.doc v 1
> d1.doc w 3
> d1.doc word 2
> d1.doc word.document.8 1
> d1.doc x 5
> d1.doc y 4
> d1.doc you 1
> How can I sort by frequency?
> The other problem is that the output contains words that are not in the
> document (I do not know where the extra numbers and letters come from).
> The document, d1.doc, was the following:
> There are many big and small libraries everywhere in our country. They have
> millions of books in different languages. You can find there the oldest and
> the newest books.
> Every school has a library. Pupils come to the library to take books on
> different subjects.
> Please help me. Sorry for my bad English.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/sorting-frequencies-tp4045197.html
> Sent from the Lucene - General mailing list archive at Nabble.com.
>

