lucene-java-user mailing list archives

From Stefan Colella <stefan.cole...@saint-paul.lu>
Subject Re: search result problem
Date Mon, 21 May 2007 09:25:07 GMT
Hello,
Thanks for your reply. I used the explain() method and now understand why 
some documents are returned.

I am using the same Analyzer for indexing and searching.

I tried indexing only the content of the page where that expression can be 
found (instead of the whole document), and then the search works.

Do I have to split my PDF text into more fields? Or what else could be the 
problem?
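
One possible explanation, offered as an assumption and not confirmed anywhere 
in this thread: the index may keep only the first N tokens of a long field, 
so a phrase that appears late in a large PDF is never indexed, while the same 
page indexed on its own is. Lucene releases of this period have a per-field 
term limit on IndexWriter (setMaxFieldLength, with a default of 10,000 terms, 
as far as I know; please check the API docs for your version). The sketch 
below does not use Lucene at all. TruncatingField, filler and the cap value 
are hypothetical names invented only to reproduce the symptom in plain Java.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Toy stand-in for an indexed field. This is NOT Lucene code; it only
// imitates a field that stops accepting tokens after a fixed cap.
class TruncatingField {
    private final List<String> tokens = new ArrayList<>();

    // Lowercase the text, split on anything that is not a letter or digit,
    // and keep at most maxTokens tokens (later tokens are silently dropped).
    TruncatingField(String text, int maxTokens) {
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (t.isEmpty()) continue;
            if (tokens.size() >= maxTokens) break;
            tokens.add(t);
        }
    }

    // True if the words of the phrase occur consecutively among the kept tokens.
    boolean containsPhrase(String phrase) {
        List<String> words = new TruncatingField(phrase, Integer.MAX_VALUE).tokens;
        if (words.isEmpty()) return false;
        return Collections.indexOfSubList(tokens, words) >= 0;
    }

    // Builds n filler words to simulate the pages that precede the phrase.
    static String filler(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) sb.append("word ");
        return sb.toString();
    }

    public static void main(String[] args) {
        int cap = 10000; // assumed cap, chosen for illustration
        String page = "The stock option plan is described here.";
        String wholeDocument = filler(cap) + page;

        // The phrase sits beyond the cap, so it is dropped: prints false
        System.out.println(new TruncatingField(wholeDocument, cap).containsPhrase("stock option"));
        // The single page fits under the cap: prints true
        System.out.println(new TruncatingField(page, cap).containsPhrase("stock option"));
    }
}
```

If this is what is happening, raising the limit on the real IndexWriter, or 
indexing each page as its own document, would be the things to try. Checking 
in Luke whether terms from the last pages of a long PDF are present would 
confirm or rule out this guess.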


Grant Ingersoll wrote:
> Try using the explain() method to see why the documents that were 
> returned scored the way they did.
>
> If I am understanding correctly, you are saying that Luke shows that 
> those words aren't actually in your index?  Can you elaborate on what 
> your analysis process is?  Are you searching using the same Analyzer 
> as you are indexing with?  I would try to isolate the problem down to 
> some unit tests, if possible.
>
> Cheers,
> Grant
>
> On May 18, 2007, at 8:12 AM, Stefan Colella wrote:
>
>> Hello,
>>
>> My application works with PDF files, so I use Lucene with PDFBox 
>> to build a small search engine. I am new to Lucene.
>>
>> All seemed to work fine, but after some tests I saw that some 
>> expressions like "stock option" were never found (or the wrong 
>> documents were returned) even though they exist in my PDF files. I 
>> searched the mail archive and found that I should use the "French 
>> Analyzer", but that didn't work either.
>>
>> I found that there is a tool named Luke for inspecting the Lucene 
>> index. I could see that the original text contains those words, but 
>> they are missing from the tokenized terms.
>>
>> Can anybody help or explain where I should start looking?
>>
>> thanks
>>
>
> --------------------------
> Grant Ingersoll
> Center for Natural Language Processing
> http://www.cnlp.org/tech/lucene.asp
>
> Read the Lucene Java FAQ at 
> http://wiki.apache.org/jakarta-lucene/LuceneFAQ
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
