lucene-solr-user mailing list archives

From Nitin Solanki <nitinml...@gmail.com>
Subject Count total frequency of a word in a SOLR index
Date Fri, 23 Jan 2015 07:56:29 GMT
I indexed some text files in Solr as-is, applying
"*StandardTokenizerFactory*" and "*ShingleFilterFactory*" to the text_file field.

*Configuration of schema.xml is below:*

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="text_file" type="textSpell" indexed="true" stored="true" required="true" multiValued="false"/>

<fieldType name="textSpell" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5" minShingleSize="2" outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5" minShingleSize="2" outputUnigrams="true"/>
  </analyzer>
</fieldType>
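
To show what this analyzer chain roughly produces, here is a minimal sketch in plain Python. It is not Solr code: it assumes whitespace splitting as a stand-in for StandardTokenizerFactory, and the `shingles` helper is a hypothetical name that only imitates the minShingleSize, maxShingleSize and outputUnigrams settings above.

```python
def shingles(tokens, min_size=2, max_size=5, output_unigrams=True):
    """Imitate a shingle filter: emit word n-grams of min_size..max_size tokens."""
    out = []
    for start in range(len(tokens)):
        if output_unigrams:
            # Single tokens are kept alongside the shingles.
            out.append(tokens[start])
        for size in range(min_size, max_size + 1):
            if start + size > len(tokens):
                break  # not enough tokens left for a shingle of this size
            out.append(" ".join(tokens[start:start + size]))
    return out


# Assumption: whitespace splitting stands in for the real tokenizer.
tokens = "text of document".split()
terms = shingles(tokens)
print(terms)
```

So with these settings each indexed term is either a single word or a run of 2 to 5 adjacent words, and a single-word query such as "what" should match the unigram terms.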

*Stored documents look like:*
*[{"id":"1", "text_file": "text of document"}, {"id":"2",
"text_file": "text of document"}, and so on]*

*Problem*: If I search for a word in a Solr index, I get a count of the
documents that contain this word, but if the word occurs several times
in a document, that document still adds only 1 to the count. I need each
returned document to be counted as many times as the searched word occurs
in the field. *Example*: I see a "numFound" value of 12, but the word
"what" occurs 20 times in total across those 12 documents. Could you help me find
where I'm going wrong, please?
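
To make the two numbers concrete, here is a minimal sketch in plain Python. It does not call Solr: the sample documents are made up, and whitespace splitting is an assumed stand-in for the real analysis. `num_found` mirrors what "numFound" reports, and `total_occurrences` is the figure I am after.

```python
# Made-up documents standing in for the indexed text_file values.
docs = {
    "1": "what is solr and what is lucene",
    "2": "what to index",
    "3": "no match here",
}
term = "what"

# Occurrences of the term in each document (whitespace tokens only).
per_doc = {doc_id: text.split().count(term) for doc_id, text in docs.items()}

# Number of documents containing the term at least once (like numFound).
num_found = sum(1 for n in per_doc.values() if n > 0)

# Total occurrences of the term across all documents (the number I need).
total_occurrences = sum(per_doc.values())

print(num_found, total_occurrences)
```

Here two documents match, but the word occurs three times in total, which is the same kind of gap as 12 versus 20 in my index.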
