lucene-java-user mailing list archives

From "Chris Lu" <chris...@gmail.com>
Subject Re: Search Problem
Date Fri, 02 Jan 2009 10:36:43 GMT
Lucene stores the analyzed tokens in the index and matches queries against
those tokens. A TermQuery is not analyzed, so its term has to match an
indexed token exactly.
StandardAnalyzer turns "Amin" into "amin", so you need to search with
new Term("body", "amin") instead of new Term("body", "Amin").
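
To make the mechanism concrete, here is a minimal sketch in plain Java. It does not use Lucene at all: TokenIndexSketch, analyze, add and countExact are hypothetical names made up for illustration, and the analyze step only imitates the lowercasing that StandardAnalyzer applies (the real analyzer does more, such as dropping stop words). It shows why an exact lookup for "Amin" finds nothing while "amin" matches.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for an analyzed field in an index; not Lucene code.
class TokenIndexSketch {
    // term -> ids of the documents that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Simplified analyzer (an assumption): split on anything that is not a
    // letter or digit, then lowercase each piece.
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Index time: only the analyzed tokens are kept as searchable terms.
    void add(int docId, String text) {
        for (String token : analyze(text)) {
            postings.computeIfAbsent(token, k -> new HashSet<>()).add(docId);
        }
    }

    // Behaves like a term lookup: exact match on stored terms, no analysis.
    int countExact(String term) {
        Set<Integer> docs = postings.get(term);
        return docs == null ? 0 : docs.size();
    }

    public static void main(String[] args) {
        TokenIndexSketch index = new TokenIndexSketch();
        index.add(1, "This is a test rtf document that will be indexed.\n\nAmin Mohammed-Coleman");

        System.out.println(index.countExact("Amin")); // prints 0: no such term was stored
        System.out.println(index.countExact("amin")); // prints 1: this is the stored token
        // Running the input through the same analysis first also matches.
        System.out.println(index.countExact(analyze("Amin").get(0))); // prints 1
    }
}
```

The last line of main reflects the general rule: query text should go through the same analysis as the indexed text. In Lucene that usually means building the query with QueryParser and the same analyzer rather than constructing the Term by hand.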

-- 
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
DBSight customer, a shopping comparison site, (anonymous per request) got
2.6 Million Euro funding!

On Thu, Jan 1, 2009 at 11:30 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:

> Hi
>
> Sorry I was using the StandardAnalyzer in this instance.
>
> Cheers
>
>
>
>
> On 2 Jan 2009, at 00:55, Chris Lu wrote:
>
>> You need to let us know the analyzer you are using.
>>
>> -- Chris Lu
>>
>> On Thu, Jan 1, 2009 at 1:11 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:
>>
>>>> Hi
>>>>
>>>> I have created an RTFHandler which takes an RTF file and creates a Lucene
>>>> Document, which is then indexed. The RTFHandler looks something like this:
>>>>
>>>> if (bodyText != null) {
>>>>     Document document = new Document();
>>>>     Field field = new Field(MetaDataEnum.BODY.getDescription(),
>>>>             bodyText.trim(), Field.Store.YES, Field.Index.ANALYZED);
>>>>     document.add(field);
>>>> }
>>>>
>>>> I am using Java's built-in RTF text extraction. When I run my test to
>>>> verify that the document contains the text I expect, it works fine. I get
>>>> the following when I print the document:
>>>>
>>>> Document<stored/uncompressed,indexed,tokenized<body:This is a test rtf
>>>> document that will be indexed.
>>>>
>>>> Amin Mohammed-Coleman>
>>>> stored/uncompressed,indexed<path:rtfDocumentToIndex.rtf>
>>>> stored/uncompressed,indexed<name:rtfDocumentToIndex.rtf>
>>>> stored/uncompressed,indexed<type:RTF_INDEXER>
>>>> stored/uncompressed,indexed<summary:This is a >>
>>>>
>>>>
>>>> The problem is when I use the following to search I get no result:
>>>>
>>>> MultiSearcher multiSearcher =
>>>>         new MultiSearcher(new Searchable[] {rtfIndexSearcher});
>>>> Term t = new Term("body", "Amin");
>>>> TermQuery termQuery = new TermQuery(t);
>>>> TopDocs topDocs = multiSearcher.search(termQuery, 1);
>>>> System.out.println(topDocs.totalHits);
>>>> multiSearcher.close();
>>>>
>>>> rtfIndexSearcher is configured with the directory that holds the RTF
>>>> documents. I have used Luke to look at the document, and the overview tab
>>>> shows the following for it:
>>>>
>>>> 1       body    test
>>>> 1       id      1234
>>>> 1       name    rtfDocumentToIndex.rtf
>>>> 1       path    rtfDocumentToIndex.rtf
>>>> 1       summary This is a
>>>> 1       type    RTF_INDEXER
>>>> 1       body    rtf
>>>>
>>>>
>>>> However, on the Document tab I get the following in the body field:
>>>>
>>>> This is a test rtf document that will be indexed.
>>>>
>>>> Amin Mohammed-Coleman
>>>>
>>>>
>>>> I would expect to get a hit using "Amin" or even "document". I am not
>>>> sure whether the line
>>>>
>>>> TopDocs topDocs = multiSearcher.search(termQuery, 1);
>>>>
>>>> is incorrect, as I am not sure what "Finds the top n hits for query."
>>>> means for search(Query query, int n) in the Javadoc.
>>>>
>>>> I would be grateful if someone could advise on what I may be doing
>>>> wrong. I am using Lucene 2.4.0.
>>>>
>>>>
>>>> Cheers
>>>> Amin
