lucene-lucene-net-user mailing list archives

From Jokin Cuadrado <joki...@gmail.com>
Subject Re: Snowball Filter and Quotes
Date Tue, 12 May 2009 15:22:29 GMT
I have tested the SnowballFilter with the WhitespaceTokenizer and it
leaves the quotes untouched, so the problem may be inside your
SynonymFilter. You can copy one of the tests from the analysis test
suite and run it against your own analyzer to see the differences
between the actual results and the expected ones.
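
A rough sketch of that kind of check, in VB.NET to match your code. It
assumes the Lucene.Net 2.x token API (TokenStream.Next and
Token.TermText); the TokenDump module and the Terms helper are names I
made up for illustration, and SynonymFilter is your own class, which I
assume takes a single TokenStream as in your snippet.

```vbnet
Imports System.Collections.Generic
Imports System.IO
Imports Lucene.Net.Analysis

' Hypothetical helper module, not part of Lucene.Net.
Module TokenDump

    ' Collects the term text of every token the stream produces.
    ' Assumes the Lucene.Net 2.x API (TokenStream.Next, Token.TermText).
    Public Function Terms(ByVal stream As TokenStream) As List(Of String)
        Dim result As New List(Of String)
        ' Next is a VB keyword, so the member name is escaped with brackets.
        Dim current As Token = stream.[Next]()
        While current IsNot Nothing
            result.Add(current.TermText())
            current = stream.[Next]()
        End While
        stream.Close()
        Return result
    End Function

    ' Runs the same text through each stage of the chain so the
    ' outputs can be compared side by side.
    Public Sub CompareStages(ByVal text As String)
        Dim plain = Terms(New WhitespaceTokenizer(New StringReader(text)))
        Dim withSynonyms = Terms(New SynonymFilter( _
            New WhitespaceTokenizer(New StringReader(text))))

        Console.WriteLine("tokenizer only:  " & String.Join(" | ", plain.ToArray()))
        Console.WriteLine("with synonyms:   " & String.Join(" | ", withSynonyms.ToArray()))
    End Sub

End Module
```

If the quotes are present in the first line of output and missing in the
second, the SynonymFilter is the stage that drops them.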

By the way, there's a "PerFieldAnalyzerWrapper" made just for the case
of using different analyzers depending on the field. It may be a bit
excessive for just 2 cases, but it's a bit more intuitive.
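
Something along these lines, as a sketch only. StemmingAnalyzer and
AnalyzerSetup are hypothetical names, SynonymFilter is your own class,
and the field name "item_code" is taken from your snippet.
PerFieldAnalyzerWrapper, AddAnalyzer and KeywordAnalyzer come from
Lucene.Net.Analysis.

```vbnet
Imports System.IO
Imports Lucene.Net.Analysis
Imports Lucene.Net.Analysis.Snowball

' Hypothetical analyzer for every field except item_code:
' whitespace tokenizing, then your synonym filter, then stemming.
Public Class StemmingAnalyzer
    Inherits Analyzer

    Public Overrides Function TokenStream(ByVal fieldName As String, ByVal reader As TextReader) As TokenStream
        Dim source = New WhitespaceTokenizer(reader)
        Return New SnowballFilter(New SynonymFilter(source), "English")
    End Function
End Class

Module AnalyzerSetup

    ' Builds one analyzer that picks the right chain per field.
    Public Function BuildAnalyzer() As Analyzer
        ' StemmingAnalyzer is the default for any field not listed below.
        Dim wrapper As New PerFieldAnalyzerWrapper(New StemmingAnalyzer())
        ' KeywordAnalyzer emits the whole field value as a single token.
        wrapper.AddAnalyzer("item_code", New KeywordAnalyzer())
        Return wrapper
    End Function

End Module
```

The wrapper is then passed wherever you pass your custom analyzer today,
so the field check no longer lives inside TokenStream.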

Regards
Jokin


On Tue, May 12, 2009 at 4:25 PM, Heath Aldrich <haldrich@aes2.com> wrote:
> Hi,
> I have a custom analyzer
> It will use different tokenizers depending on the field.
> There is a field in my system called Item_Code, that field is using the
> KeywordTokenizer.
> Any other field uses the WhitespaceTokenizer followed up with the
> Snowballfilter
>
>
> <code>
> Public Overloads Overrides Function TokenStream(ByVal fieldName As String, ByVal reader As TextReader) As TokenStream
>
>     If fieldName = "item_code" Then
>         Return New Lucene.Net.Analysis.KeywordTokenizer(reader)
>     Else
>         Dim x = New Lucene.Net.Analysis.WhitespaceTokenizer(reader)
>         Return New Lucene.Net.Analysis.Snowball.SnowballFilter(New SynonymFilter(x), "English")
>     End If
>
> End Function
> </code>
>
>
> -----Original Message-----
> From: Jokin Cuadrado [mailto:jokin.c@gmail.com]
> Sent: Tuesday, May 12, 2009 4:11 AM
> To: lucene-net-user@incubator.apache.org
> Subject: Re: Snowball Filter and Quotes
>
> Could you post the code where you construct your analyzer? you use the
> whitespacefilter, but what tokenizer are you using?
>
>
> On Tue, May 12, 2009 at 2:58 AM, Heath Aldrich <haldrich@aes2.com>
> wrote:
>> Sorry in advance if this should be in the dev list...
>>
>>
>>
>> I have a index generator that uses the Snowball filter.
>>
>> It also uses the Whitespace filter so as to not remove anything but
>> white space.
>>
>>
>>
>> When I look at the raw data in Luke, it seems like all the quotes in
>> my data have been stripped out.
>>
>> Just trying to find out if anyone else has seen this, and if anyone
>> knows if the Snowball filter is responsible.
>>
>>
>>
>> Thanks in advance.
>>
>> Heath
>>
>>
>
>
>
> --
> Jokin
>
>
>



-- 
Jokin
