From: "Digy"
Date: Tue, 6 Sep 2011 23:04:09 +0300
Subject: RE: [Lucene.Net] How to index/search a file name
Reply-To: lucene-net-user@lucene.apache.org

This can be a starting point (just play a little with the tokenizers and filters):

    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.Standard;

    public class ModifiedStandardAnalyzer : Analyzer
    {
        public override TokenStream TokenStream(System.String fieldName, System.IO.TextReader reader)
        {
            // Tokenize with StandardTokenizer, then apply StandardFilter
            StandardTokenizer tokenStream = new StandardTokenizer(reader, true);
            TokenStream result = new StandardFilter(tokenStream);
            // Lower-case every token
            result = new LowerCaseFilter(result);
            // Fold accented characters to their ASCII equivalents
            result = new ASCIIFoldingFilter(result);
            return result;
        }
    }

DIGY

-----Original Message-----
From: Gustavo Poll [mailto:gkpoll@gmail.com]
Sent: Tuesday, September 06, 2011 10:06 PM
To: lucene-net-user@lucene.apache.org
Subject: Re: [Lucene.Net] How to index/search a file name

Thanks again... OK, it is not.

StandardAnalyzer:

[name.surname@gmail.com] [123.456] [3,5] [at&t] [güsıöç] [güsiöç] [aß?de?] [??????] [ssß]

UnaccentedWordAnalyzer:

[name] [surname] [gmail] [com] [123] [456] [3] [5] [at] [t] [gusioc] [gusioc] [aß?de?] [??????] [ssss]

StandardAnalyzer would be perfect for my application if it were accent insensitive...
Can anyone please tell me the easiest way to code such an analyzer?
(an accent-insensitive StandardAnalyzer)

I hear it is not a good idea to write a class that inherits from StandardAnalyzer, because StandardAnalyzer should be a final class. Is that right?

I would appreciate any help...
Gustavo Poll

2011/9/6 Digy

> A function is worth a thousand words :)
>
>     void Test()
>     {
>         Analyzer[] analyzers = new Analyzer[] { new StandardAnalyzer(),
>             new Lucene.Net.Analysis.Ext.UnaccentedWordAnalyzer() };
>
>         string input = "Name.Surname@gmail.com 123.456 3,5 AT&T ğüşıöç%ĞÜŞİÖÇ$ΑΒΓΔΕΖ#АБВГДЕ SSß";
>
>         foreach (Analyzer analyzer in analyzers)
>         {
>             TokenStream ts = analyzer.TokenStream("", new StringReader(input));
>             Lucene.Net.Analysis.Token t = ts.Next();
>             while (t != null)
>             {
>                 Console.Write("[" + t.TermText() + "] ");
>                 t = ts.Next();
>             }
>             Console.WriteLine(); Console.WriteLine();
>         }
>     }
>
> DIGY
>
> -----Original Message-----
> From: Gustavo Poll [mailto:gkpoll@gmail.com]
> Sent: Tuesday, September 06, 2011 9:00 PM
> To: lucene-net-user@lucene.apache.org
> Subject: Re: [Lucene.Net] How to index/search a file name
>
> Thanks DIGY, I am interested in that too... Let me see if I understood:
>
> UnaccentedWordAnalyzer is like StandardAnalyzer, but accent insensitive?
>
> Thanks!
> Gustavo Poll
>
> 2011/9/6 digy digy
>
> > That may help:
> >
> > UnaccentedWordAnalyzer @
> > https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/src/contrib/Core/Analysis/Ext/Analysis.Ext.cs
> >
> > DIGY
> >
> > On Tue, Sep 6, 2011 at 12:31 PM, Floyd Wu wrote:
> >
> > > Hi everyone,
> > >
> > > I have a question that has annoyed me many times. I need to index
> > > file names and make them searchable by partial file name.
> > >
> > > Example: 2009&2010Q2_ABCD_Report.xls (the file name)
> > >
> > > When I run queries:
> > >
> > > filename:ABCD           no match returned
> > > filename:2010Q2_ABCD    match
> > > filename:Report*        match
> > >
> > > I'm using StandardAnalyzer, and the Lucene.Net version is 2.9.3.
> > > The filename field is currently set to tokenized/indexed/stored.
> > >
> > > What I want is for Lucene.Net to match when the user types any part
> > > of the file name (a string like 2009 or 2010Q2 or ABCD or Report or
> > > xls or Report.xls).
> > >
> > > Please help with this, or kindly point me to a way to solve it.
> > >
> > > Floyd
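
Floyd's results above (filename:2010Q2_ABCD matches, filename:ABCD does not) suggest that "2010Q2_ABCD" was indexed as a single term. The sketch below shows, in plain .NET and without any Lucene.Net calls, one way the file name could be broken into the parts he lists: strip accents, then split on every character that is not a letter or digit, lower-casing as it goes. The class and method names (FileNameTerms, StripAccents, Split) are hypothetical and exist only for this illustration; they are not part of Lucene.Net. Accent stripping here relies on Unicode decomposition, so characters with no decomposition (for example ß or ı) pass through unchanged, which differs from what ASCIIFoldingFilter does.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration only; not part of Lucene.Net.
public static class FileNameTerms
{
    // Decompose accented letters (FormD) and drop the combining marks.
    public static string StripAccents(string text)
    {
        StringBuilder sb = new StringBuilder(text.Length);
        foreach (char c in text.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    // Split on every character that is not a letter or digit,
    // lower-casing each kept character.
    public static List<string> Split(string fileName)
    {
        List<string> terms = new List<string>();
        StringBuilder current = new StringBuilder();
        foreach (char c in StripAccents(fileName))
        {
            if (char.IsLetterOrDigit(c))
            {
                current.Append(char.ToLowerInvariant(c));
            }
            else if (current.Length > 0)
            {
                terms.Add(current.ToString());
                current.Length = 0; // start the next term
            }
        }
        if (current.Length > 0) terms.Add(current.ToString());
        return terms;
    }

    public static void Main()
    {
        // prints: 2009 2010q2 abcd report xls
        Console.WriteLine(string.Join(" ", Split("2009&2010Q2_ABCD_Report.xls").ToArray()));
        // prints: resume final doc
        Console.WriteLine(string.Join(" ", Split("Résumé_final.DOC").ToArray()));
    }
}
```

With terms like these, a query for ABCD or xls has a term of its own to match. A query such as Report.xls would have to go through the same splitting at search time, where it becomes the two terms report and xls. Inside Lucene.Net the same idea would live in a custom tokenizer or analyzer rather than in a standalone helper.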