lucene-solr-user mailing list archives

From Victor Pascual <vic...@mobilemediacontent.com>
Subject Re: Solr does not recognize language
Date Mon, 05 May 2014 08:03:59 GMT
Thank you very much for your help, Ahmet.

However, the language detection is still not working. :(
My solrconfig.xml didn't contain that lst section inside the update
requestHandler.
That's the content I added:

  <requestHandler name="/update"
                  class="solr.XmlUpdateRequestHandler">
       <lst name="defaults">
         <str name="update.chain">langid</str>
       </lst>
  </requestHandler>


    <updateRequestProcessorChain name="langid">
       <processor
           class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
          <lst name="defaults">
            <str name="langid.fl">text</str>
            <str name="langid.langField">lang</str>
          </lst>
       </processor>
       <processor class="solr.LogUpdateProcessorFactory" />
       <processor class="solr.RunUpdateProcessorFactory" />
    </updateRequestProcessorChain>


Now, your suggested query
http://localhost:8080/solr/update?commit=true&update.chain=langid returns

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">14</int>
  </lst>
</response>

And there is still no lang field in my documents.
Any idea what I am doing wrong?
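
For reference: an update chain only runs on documents as they pass through the update handler at index time, so a commit-only request like the one above carries nothing for the langid processor to work on, and documents that are already indexed stay as they are. Below is a minimal sketch of re-posting one document through the chain. The host and port come from the URL above; the document id, the field values, and the helper names (build_update_url, build_add_xml, post_document) are hypothetical and only for illustration.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: same host/port as in the thread; adjust for your setup.
SOLR_UPDATE_URL = "http://localhost:8080/solr/update"


def build_update_url(base_url, chain="langid", commit=True):
    """Return the update URL with update.chain (and optionally commit) set."""
    params = {"update.chain": chain}
    if commit:
        params["commit"] = "true"
    return base_url + "?" + urllib.parse.urlencode(params)


def build_add_xml(doc):
    """Serialize one document (a dict of field name -> value) as a Solr XML <add> message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        field = ET.SubElement(doc_el, "field", name=name)
        field.text = value
    return ET.tostring(add, encoding="utf-8")


def post_document(doc, base_url=SOLR_UPDATE_URL):
    """POST the document through the chain and return the raw response body."""
    request = urllib.request.Request(
        build_update_url(base_url),
        data=build_add_xml(doc),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

With Solr running, calling post_document({"id": "tweet-1", "text": "This is an English sentence."}) and then querying /solr/select?q=id:tweet-1&fl=id,lang is one way to check whether the lang field gets populated. This assumes the schema defines a lang field and that the text field named in langid.fl is the one that actually holds the tweet content.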



On Tue, Apr 29, 2014 at 5:33 PM, Ahmet Arslan <iorixxx@yahoo.com> wrote:

> Hi,
>
> solr/update should be used, not /solr/select
>
> curl 'http://localhost:8983/solr/update?commit=true&update.chain=langid'
>
> By the way, don't you have the following definition in your solrconfig.xml?
>
>  <requestHandler name="/update" class="solr.UpdateRequestHandler">
>        <lst name="defaults">
>          <str name="update.chain">langid</str>
>        </lst>
>   </requestHandler>
>
>
>
> On Tuesday, April 29, 2014 4:50 PM, Victor Pascual <
> victor@mobilemediacontent.com> wrote:
> Hi Ahmet,
>
> thanks for your reply. Adding &update.chain=langid to my query doesn't
> work: IP:8080/solr/select/?q=*%3A*&update.chain=langid
> Regarding defining the chain in an UpdateRequestHandler... sorry for the
> lame question but shall I paste those three lines to solrconfig.xml, or
> shall I add them somewhere else?
>
> There is no UpdateRequestHandler in my solrconfig.
>
> Thanks!
>
>
>
> On Tue, Apr 29, 2014 at 3:13 PM, Ahmet Arslan <iorixxx@yahoo.com> wrote:
>
> > Hi,
> >
> > Did you attach your chain to an UpdateRequestHandler?
> >
> > You can do it by adding &update.chain=langid to the URL or defining it in
> > a defaults section as follows
> >
> > <lst name="defaults">
> >      <str name="update.chain">langid</str>
> >    </lst>
> >
> >
> >
> > On Tuesday, April 29, 2014 3:18 PM, Victor Pascual <
> > victor@mobilemediacontent.com> wrote:
> > Dear all,
> >
> > I'm a new user of Solr. I've managed to index a bunch of documents (in
> > fact, they are tweets) and everything works quite smoothly.
> >
> > Nevertheless, it looks like Solr neither detects the language of my
> > documents nor removes stopwords accordingly so that I can extract the
> > most frequent terms.
> >
> > I've added this piece of XML to my solrconfig.xml as well as the Tika lib
> > jars.
> >
> >     <updateRequestProcessorChain name="langid">
> >        <processor
> >            class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
> >           <lst name="defaults">
> >             <str name="langid.fl">text</str>
> >             <str name="langid.langField">lang</str>
> >           </lst>
> >         </processor>
> >         <processor class="solr.LogUpdateProcessorFactory" />
> >        <processor class="solr.RunUpdateProcessorFactory" />
> >      </updateRequestProcessorChain>
> >
> > There is no error in the Tomcat log file, so I have no clue why this
> > isn't working.
> > Any hint on how to solve this problem will be much appreciated!
> >
>
>
