struts-user mailing list archives

From Amin Mohammed-Coleman <ami...@gmail.com>
Subject Re: Search Problem
Date Thu, 01 Jan 2009 21:10:25 GMT
Oh man! Sorry about that... lack of sleep due to a new baby in the house...




On 1 Jan 2009, at 20:41, Nils-Helge Garli Hegvik wrote:

> Maybe you should try posting to a Lucene mailing list?
>
> Nils-H
>
> On Thu, Jan 1, 2009 at 9:28 PM, Amin Mohammed-Coleman <aminmc@gmail.com> wrote:
>> Hi
>>
>> I have created an RTFHandler which takes an RTF file and creates a Lucene
>> Document, which is then indexed. The RTFHandler looks something like this:
>>
>> if (bodyText != null) {
>>     Document document = new Document();
>>     Field field = new Field(MetaDataEnum.BODY.getDescription(),
>>             bodyText.trim(), Field.Store.YES, Field.Index.ANALYZED);
>>     document.add(field);
>> }
>>
>> I am using Java's built-in RTF text extraction. When I run my test to verify
>> that the document contains the text I expect, it works fine. I get the
>> following when I print the document:
>>
>> Document<stored/uncompressed,indexed,tokenized<body:This is a test rtf document that will be indexed.
>>
>> Amin Mohammed-Coleman>
>> stored/uncompressed,indexed<path:rtfDocumentToIndex.rtf>
>> stored/uncompressed,indexed<name:rtfDocumentToIndex.rtf>
>> stored/uncompressed,indexed<type:RTF_INDEXER>
>> stored/uncompressed,indexed<summary:This is a >>
>>
>>
>> The problem is that when I use the following to search, I get no results:
>>
>> MultiSearcher multiSearcher = new MultiSearcher(
>>         new Searchable[] {rtfIndexSearcher});
>> Term t = new Term("body", "Amin");
>> TermQuery termQuery = new TermQuery(t);
>> TopDocs topDocs = multiSearcher.search(termQuery, 1);
>> System.out.println(topDocs.totalHits);
>> multiSearcher.close();
>>
>> rtfIndexSearcher is configured with the directory that holds the RTF
>> documents. I have used Luke to look at the document, and the overview tab
>> shows the following for it:
>>
>> 1       body    test
>> 1       id      1234
>> 1       name    rtfDocumentToIndex.rtf
>> 1       path    rtfDocumentToIndex.rtf
>> 1       summary This is a
>> 1       type    RTF_INDEXER
>> 1       body    rtf
>>
>>
>> However, on the Document tab I see the following in the body field:
>>
>> This is a test rtf document that will be indexed.
>>
>> Amin Mohammed-Coleman
>>
>>
>> I would expect to get a hit using "Amin" or even "document". I am not sure
>> whether the line
>>
>> TopDocs topDocs = multiSearcher.search(termQuery, 1);
>>
>> is incorrect, as I am not sure what "Finds the top n hits for query." means
>> for search(Query query, int n) in the Javadocs.
>>
>> I would be grateful if someone could advise on what I may be doing
>> wrong. I am using Lucene 2.4.0.
>>
>>
>> Cheers
>> Amin
>>
>>
>>
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@struts.apache.org
> For additional commands, e-mail: user-help@struts.apache.org
>



