lucene-solr-user mailing list archives

From Emir Arnautović <emir.arnauto...@sematext.com>
Subject Re: Solr Phrase Count : How to get count of a phrase in a text field solr
Date Mon, 26 Feb 2018 08:47:10 GMT
For a start, you don’t have to store the field. Also, is a 10-word shingle really needed?
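
A minimal sketch of both changes, applied to the schema quoted below. It assumes the stored text is not needed in search results, and that short phrases are enough; the value maxShingleSize="3" is only an illustrative assumption, not something taken from this thread.

```xml
<!-- stored="false": the raw text is not kept in the index; only the analyzed terms are. -->
<field name="data_text" type="texto_indexado" indexed="true" stored="false"
       multiValued="false"/>

<fieldType name="texto_indexado" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- maxShingleSize="3" is an assumed value for illustration; set it to the
         length of the longest phrase you actually need to count. -->
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3"
            outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Both changes only affect documents indexed after the schema is updated, so a full reindex would be needed before the disk usage drops.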

Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 24 Feb 2018, at 16:58, aneeshkappu <happyaneesh991@gmail.com> wrote:
> 
> Hi all, I want to get the count of a phrase in a document.
> Currently I'm using ShingleFilterFactory, but it consumes a lot of disk
> space. Are there any alternative approaches, or any way to optimize this?
> Currently it consumes 40 GB for just 46K records.
> 
> My schema settings are given below:
> 
> <field name="data_text" type="texto_indexado" indexed="true" stored="true"
> multiValued="false"/>
> 
> 
> <fieldType name="texto_indexado" class="solr.TextField" omitNorms="false">
>        <analyzer type="index">
>            <tokenizer class="solr.StandardTokenizerFactory"/> 
>            <filter class="solr.LowerCaseFilterFactory"/>   
>            <filter class="solr.ShingleFilterFactory" maxShingleSize="10"
> outputUnigrams="true"/>
>        </analyzer> 
>          <analyzer type="query"> 
>            <tokenizer class="solr.StandardTokenizerFactory"/>
>            <filter class="solr.LowerCaseFilterFactory"/>
>        </analyzer> 
> 
>  </fieldType>  
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html

