lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rohit Banga <iamrohitba...@gmail.com>
Subject Re: Lucene fields not analyzed
Date Tue, 09 Feb 2010 09:16:03 GMT
moreover, search for Mr. Arun Kumar also matches other names because Mr.
matches.
i am ready to use Mr. as a stop word in an analyzer.

Rohit Banga


On Tue, Feb 9, 2010 at 2:42 PM, Rohit Banga <iamrohitbanga@gmail.com> wrote:

> i'll try using Luke.
>
> how i want to use Lucene?
>
> there is a sentence that may contain the names of some people from among
> those in a list. the names may be incomplete or may have spelling mistakes.
>
> so i created a lucene index, with each person as a document.
>
> eg.
>
> Mr. Arun Kumar
>
> with a hit highlighter i get
>
> <B>Mr</B>. <B>Arun</B> <B>Kumar</B>
>
> what i want is
> <B>Mr. Arun Kumar</B>
>
> even when there are spelling mistakes.
>
>
> Rohit Banga
>
>
>
> On Tue, Feb 9, 2010 at 1:57 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
>
>> If you don't get it working that way, then you have to ask you the
>> question: Why do you want it indexed that way? Is it because you don't want
>> to find all people in that field when you add ony "Mr." to a search query?
>> It looks like you use StandardAnalyzer, and in this case, I would add "mr",
>> not "mr!", to the stop word list and index the name field as any other
>> field. Before doing this, it would be good to explain, what you are
>> intending to do/prevent by indexing with NOT_ANALYZED, which is the source
>> of your problem.
>>
>> Uwe
>>
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@thetaphi.de
>>
>>
>> > -----Original Message-----
>> > From: Rohit Banga [mailto:iamrohitbanga@gmail.com]
>> > Sent: Tuesday, February 09, 2010 9:03 AM
>> > To: java-user@lucene.apache.org
>> > Subject: Re: Lucene fields not analyzed
>> >
>> > let us assume this is the only field that is relevant (others are
>> > stored and
>> > not indexed).
>> > i tried termquery and it does not work.
>> > i also tried keyword analyzer and still could not make it work.
>> >
>> > @Mark
>> > i cannot escape the spaces in my query as i am using Lucene to identify
>> > occurences of names among other things in the unstructured sentence.
>> > so while adding names to the index, i used keyword analyzer and changed
>> > the
>> > name to be added to the index to "Mr.\\ Kumar"
>> > but still couldn't get it to work.
>> >
>> >
>> >
>> >
>> >
>> >
>> > Rohit Banga
>> >
>> >
>> > On Tue, Feb 9, 2010 at 1:06 PM, Mark Harwood
>> > <markharw00d@yahoo.co.uk>wrote:
>> >
>> > > I suspect it is because QueryParser uses space characters to separate
>> > > different clauses in a query string while you want the space to
>> > represent
>> > > some content in your "name" field. Try escaping the space character.
>> > >
>> > > Cheers
>> > > Mark
>> > >
>> > >
>> > >
>> > > On 9 Feb 2010, at 07:26, Rohit Banga wrote:
>> > >
>> > > > Hello
>> > > >
>> > > > i have a field that stores names of people. i have used the
>> > NOT_ANALYZED
>> > > > parameter to index the names.
>> > > >
>> > > > this is what happens during indexing
>> > > >
>> > > >    doc.add(new Field("name", "\"" + name + "\"", Field.Store.YES,
>> > > > Field.Index.NOT_ANALYZED));
>> > > >
>> > > >
>> > > >
>> > > > when i search it, i create a query parser using standardanalyzer
>> > and
>> > > append
>> > > > ~0.5 to the search query.
>> > > >
>> > > > the problem is that if the indexed name is "Mr. Kumar", my search
>> > does
>> > > not
>> > > > work for "Mr. Kumar" while it does work for "Mr.Kumar" (without the
>> > > space).
>> > > >
>> > > > // searching code
>> > > >        File index_directory = new File(INDEX_DIR_PATH);
>> > > >        IndexReader reader =
>> > > > IndexReader.open(FSDirectory.open(index_directory), true);
>> > > >        Searcher searcher = new IndexSearcher(reader);
>> > > >
>> > > >        Analyzer analyzer = new
>> > StandardAnalyzer(Version.LUCENE_CURRENT);
>> > > >
>> > > >        QueryParser parser = new QueryParser(Version.LUCENE_CURRENT,
>> > > "name",
>> > > > analyzer);
>> > > >
>> > > >        Query query;
>> > > >        query = parser.parse(text + "~0.5");
>> > > >
>> > > > how to make it work?
>> > > >
>> > > > Rohit Banga
>> > >
>> > >
>> > > ---------------------------------------------------------------------
>> > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > > For additional commands, e-mail: java-user-help@lucene.apache.org
>> > >
>> > >
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message