lucene-solr-user mailing list archives

From Nitin Solanki <nitinml...@gmail.com>
Subject Re: Count total frequency of a word in a SOLR index
Date Fri, 23 Jan 2015 08:51:23 GMT
Thanks, Mikhail Khludnev.
I tried this:
http://localhost:8983/solr/collection1/spell?q=gram:%22the%22&rows=1&fl=totaltermfreq(gram,the)
and it worked.
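
For reference, a minimal Python sketch of how that request URL could be
built with proper escaping. The helper name build_ttf_url is hypothetical;
the host, core, handler (/spell), field (gram) and term (the) are taken
from the URL above. It only builds the URL and does not contact Solr.

```python
from urllib.parse import urlencode


def build_ttf_url(base, field, term, rows=1):
    """Build a Solr query URL that asks for totaltermfreq(field, term).

    Hypothetical helper for illustration: it only assembles the URL.
    """
    params = {
        # Match documents containing the term in the given field.
        "q": f'{field}:"{term}"',
        # One row is enough: totaltermfreq is the same for every document.
        "rows": rows,
        # Function query returned as a pseudo-field in the field list.
        "fl": f"totaltermfreq({field},{term})",
    }
    return f"{base}?{urlencode(params)}"


url = build_ttf_url("http://localhost:8983/solr/collection1/spell", "gram", "the")
print(url)
```

urlencode percent-encodes the quotes, parentheses and comma, so the printed
URL looks different from the hand-written one but carries the same
parameters.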
I'd like to go further. Can the same thing (totaltermfreq) be done for
suggestions? I tried "th" and got "the" as a suggestion. I want to retrieve
the term frequency, not the document frequency, for suggestions as well.
Can I do that?

Example suggestion:
<lst>
<str name="word">the</str>
<int name="freq">897</int>
</lst>
Here, freq is the document frequency. I need the term frequency.
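
To make the difference between the two counts concrete, here is a small
Python sketch on made-up documents. The sample texts and the function names
doc_freq and total_term_freq are assumptions for illustration, and plain
whitespace splitting stands in for the real analyzer chain.

```python
# Toy corpus: invented data, not taken from the actual index.
docs = {
    "1": "the cat and the dog",
    "2": "the bird",
    "3": "a fish",
}


def doc_freq(term, docs):
    """Number of documents that contain the term at least once."""
    return sum(1 for text in docs.values() if term in text.split())


def total_term_freq(term, docs):
    """Number of occurrences of the term across all documents."""
    return sum(text.split().count(term) for text in docs.values())


print(doc_freq("the", docs))         # 2 documents contain "the"
print(total_term_freq("the", docs))  # "the" occurs 3 times in total
```

The first number corresponds to what the suggestion's freq reports (a
per-document count), the second to what totaltermfreq returns.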



On Fri, Jan 23, 2015 at 1:53 PM, Mikhail Khludnev <
mkhludnev@griddynamics.com> wrote:

> https://cwiki.apache.org/confluence/display/solr/Function+Queries
> totaltermfreq()
>
> Or do you need to sum the term frequency over the docs in the result set?
>
>
> On Fri, Jan 23, 2015 at 10:56 AM, Nitin Solanki <nitinmlvya@gmail.com>
> wrote:
>
> > I indexed some text files in Solr as they are. I applied
> > "StandardTokenizerFactory" and "ShingleFilterFactory" to the text_file
> > field.
> >
> > Schema.xml configuration:
> > <field name="id" type="string" indexed="true" stored="true"
> >        required="true" multiValued="false" />
> > <field name="text_file" type="textSpell" indexed="true" stored="true"
> >        required="true" multiValued="false"/>
> >
> > <fieldType name="textSpell" class="solr.TextField"
> >            positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
> >             minShingleSize="2" outputUnigrams="true"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.StandardTokenizerFactory"/>
> >     <filter class="solr.ShingleFilterFactory" maxShingleSize="5"
> >             minShingleSize="2" outputUnigrams="true"/>
> >   </analyzer>
> > </fieldType>
> >
> > Stored documents look like:
> > [{"id":"1", "text_file": "text of document"}, {"id":"2",
> > "text_file": "text of document"}, and so on]
> >
> > Problem: If I search for a word in a Solr index, I get a count of the
> > documents that contain this word, but if the word occurs several times
> > in a document, it still counts as 1 for that document. I need each
> > returned document to be counted by the number of times the searched
> > word occurs in the field. Example: I see a "numFound" value of 12, but
> > the word "what" occurs 20 times across all 12 documents. Could you
> > help me find where I'm going wrong, please?
> >
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
> Principal Engineer,
> Grid Dynamics
>
> <http://www.griddynamics.com>
> <mkhludnev@griddynamics.com>
>
