lucene-solr-user mailing list archives

From Bernd Fehling <bernd.fehl...@uni-bielefeld.de>
Subject Re: How to use polish stemmer - Stempel - in schema.xml?
Date Tue, 02 Nov 2010 13:11:32 GMT
Hi Jakub,

If you unzip your stempel-1.0.jar, does it contain the required
directory structure and class file?
org/getopt/stempel/lucene/StempelFilter.class
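
One way to run that check without unpacking the archive by hand: a jar is
an ordinary zip file, so a short Python sketch can look for the entry. The
helper names jar_has_entry and list_classes are made up for illustration
and are not part of Solr or Stempel.

```python
import zipfile

# The entry asked about above; jar entries always use forward slashes.
STEMPEL_FILTER = "org/getopt/stempel/lucene/StempelFilter.class"


def jar_has_entry(jar_path, entry=STEMPEL_FILTER):
    """Return True if the jar (a zip archive) at jar_path contains entry."""
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


def list_classes(jar_path, prefix="org/getopt/"):
    """Return the sorted .class entries under prefix, to see what the jar ships."""
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.startswith(prefix) and name.endswith(".class")
        )
```

Calling jar_has_entry on the path from your log
(/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar) should return
True if the class is packaged as expected. list_classes shows which classes
under org/getopt/ the jar actually ships, which also tells you whether it
contains a StempelTokenFilterFactory at all.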

Regards,
Bernd

Am 02.11.2010 13:54, schrieb Jakub Godawa:
> Erick, I had already put the jar file there. I also added the <lib>
> directive and put the file in instanceDir/lib.
> 
> The remaining problem is that, even though the file is loaded:
> 2010-11-02 13:20:48 org.apache.solr.core.SolrResourceLoader replaceClassLoader
> INFO: Adding 'file:/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar'
> to classloader
> 
> I am not able to use the FilterFactory... maybe I am going about it
> the wrong way?
> 
> Cheers,
> Jakub Godawa.
> 
> 2010/11/2 Erick Erickson <erickerickson@gmail.com>:
>> The Polish stemmer jar file needs to be findable by Solr. If you copy
>> it to <solr_home>/lib and restart Solr, you should be set.
>>
>> Alternatively, you can add another <lib> directive to the solrconfig.xml
>> file (there are several examples in that file already).
>>
>> I'm a little confused about not being able to find TokenFilter. Is that
>> still a problem?
>>
>> HTH
>> Erick
>>
>> On Tue, Nov 2, 2010 at 8:07 AM, Jakub Godawa <jakub.godawa@gmail.com> wrote:
>>
>>> Thank you Bernd! I couldn't make it run though. Here is my problem:
>>>
>>> 1. There is a file ~/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar
>>> 2. In ~/apache-solr-1.4.1/ifaq/solr/conf/solrconfig.xml there is a
>>> directive: <lib path="../lib/stempel-1.0.jar" />
>>> 3. In ~/apache-solr-1.4.1/ifaq/solr/conf/schema.xml there is fieldType:
>>>
>>> (...)
>>>  <!-- Polish -->
>>>   <fieldType name="text_pl" class="solr.TextField">
>>>    <analyzer>
>>>       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>      <filter class="solr.LowerCaseFilterFactory"/>
>>>      <filter class="org.getopt.stempel.lucene.StempelFilter" />
>>>      <!-- <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory"
>>>                   protected="protwords.txt" /> -->
>>>    </analyzer>
>>>  </fieldType>
>>> (...)
>>>
>>> 4. The jar file is loaded, but I get an error:
>>> SEVERE: Could not start SOLR. Check solr/home property
>>> java.lang.NoClassDefFoundError: org/apache/lucene/analysis/TokenFilter
>>>      at java.lang.ClassLoader.defineClass1(Native Method)
>>>      at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
>>>      at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>> (...)
>>>
>>> 5. Using a different class gave me this error:
>>> SEVERE: org.apache.solr.common.SolrException: Error loading class
>>> 'org.getopt.solr.analysis.StempelTokenFilterFactory'
>>>      at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
>>>      at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:390)
>>> (...)
>>>
>>> The question is: how do I make <fieldType /> and <filter /> work with
>>> Stempel? :)
>>>
>>> Cheers,
>>> Jakub Godawa.
>>>
>>> 2010/10/29 Bernd Fehling <bernd.fehling@uni-bielefeld.de>:
>>>> Hi Jakub,
>>>>
>>>> I have ported the KStemmer for use with the most recent Solr trunk version.
>>>> My stemmer is located in the lib directory of Solr,
>>>> "solr/lib/KStemmer-2.00.jar", because it belongs to Solr.
>>>>
>>>> Write it as a FilterFactory and use it as a filter like this:
>>>> <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>         protected="protwords.txt" />
>>>>
>>>> This is what my fieldType looks like:
>>>>
>>>>    <fieldType name="text_kstem" class="solr.TextField" positionIncrementGap="100">
>>>>      <analyzer type="index">
>>>>        <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>        <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>                words="stopwords.txt" enablePositionIncrements="false" />
>>>>        <filter class="solr.WordDelimiterFilterFactory"
>>>>                generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>>                catenateNumbers="1" catenateAll="0" splitOnCaseChange="1" />
>>>>        <filter class="solr.LowerCaseFilterFactory" />
>>>>        <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>                protected="protwords.txt" />
>>>>        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>>      </analyzer>
>>>>      <analyzer type="query">
>>>>        <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>        <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>                words="stopwords.txt" />
>>>>        <filter class="solr.WordDelimiterFilterFactory"
>>>>                generateWordParts="1" generateNumberParts="1" catenateWords="0"
>>>>                catenateNumbers="0" catenateAll="0" splitOnCaseChange="1" />
>>>>        <filter class="solr.LowerCaseFilterFactory" />
>>>>        <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>                protected="protwords.txt" />
>>>>        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>>      </analyzer>
>>>>    </fieldType>
>>>>
>>>> Regards,
>>>> Bernd
>>>>
>>>>
>>>>
>>>> Am 28.10.2010 14:56, schrieb Jakub Godawa:
>>>>> Hi!
>>>>> There is a Polish stemmer, http://www.getopt.org/stempel/, and I have
>>>>> problems connecting it with Solr 1.4.1.
>>>>> Questions:
>>>>>
>>>>> 1. Where EXACTLY do I put the "stempel-1.0.jar" file?
>>>>> 2. How do I register the file, so I can build a fieldType like:
>>>>>
>>>>> <fieldType name="text_pl" class="solr.TextField">
>>>>>   <analyzer class="org.getopt.solr.analysis.StempelTokenFilterFactory"/>
>>>>> </fieldType>
>>>>>
>>>>> 3. Is that the right approach to make it work?
>>>>>
>>>>> Thanks in advance for a detailed explanation,
>>>>> Jakub.
>>>>
>>>
>>

-- 
*************************************************************
Bernd Fehling                Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH)                        Universitätsstr. 25
Tel. +49 521 106-4060                   Fax. +49 521 106-4052
bernd.fehling@uni-bielefeld.de                33615 Bielefeld

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************
