lucene-java-user mailing list archives

From Naresh <nnar...@gmail.com>
Subject Re: Searching for keywords .net,c#,...
Date Mon, 25 Feb 2013 06:17:49 GMT
Hi,
You can write your own TokenFilter that splits tokens on delimiter characters
(comma, pipe, etc.) and then build an analyzer from WhitespaceTokenizer,
LowerCaseFilter and your custom token filter.

See
http://stackoverflow.com/questions/9015348/lucene-custom-analyzer/9015658#9015658
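
As a rough illustration of the token stream such an analyzer should produce,
here is a plain-Java sketch. It deliberately uses no Lucene classes: the names
KeywordSplitSketch and analyze are hypothetical, and the delimiter set (comma
and pipe) is an assumption taken from the example keywords in the question. In
a real analyzer the third step would live in your custom TokenFilter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, NOT a Lucene API: it only mimics the chain
// WhitespaceTokenizer -> LowerCaseFilter -> custom delimiter-splitting filter.
final class KeywordSplitSketch {

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        // Step 1: split on whitespace, as a whitespace tokenizer would.
        for (String raw : text.trim().split("\\s+")) {
            // Step 2: lowercase each token.
            String lower = raw.toLowerCase(Locale.ROOT);
            // Step 3: split each token on commas and pipes (assumed delimiters),
            // dropping the empty pieces that leading delimiters produce.
            for (String part : lower.split("[,|]")) {
                if (!part.isEmpty()) {
                    tokens.add(part);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [oracle, .net, c#, java, etc.]
        System.out.println(analyze("oracle,.net,C#,java etc."));
    }
}
```

Because '.' and '#' are never treated as delimiters here, ".net" and "c#"
survive as whole tokens, which is the behaviour the query side needs as well.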

On Mon, Feb 25, 2013 at 11:24 AM, kumar <x10179@gmail.com> wrote:

> Hello all
>
> I am a Lucene novice trying to set up Lucene in a .NET app using
> Lucene.Net for searching through documents.
> So far it has been fantastic; however, given that the users' expectations
> are for "Google"-like search,
> I am running into issues searching for .net and c#.
>
> Initially I tried the StandardAnalyzer, which of course does not work for
> searching for .net and c#.
> I changed that to a custom analyzer using WhitespaceTokenizer and
> LowerCaseFilter, and it works;
> however, some of the documents have the keywords as
>
> oracle,.net,C#,java etc. ( i.e. separated by commas without any space )
>
> and this custom analyzer fails here
>
> I'm looking for suggestions on how this might work, as I'm sure it's possible
> considering both Lucene and .net/c# have been around for a long while.
>
> It looks like PatternAnalyzer might be of some use in this case; however,
> I'm not quite sure how to use it and have found scant references to it.
>
>
> Any help is appreciated
>
> Thanks
> kumar
>
>


-- 
Regards
Naresh
