jackrabbit-users mailing list archives

From Cheng Zhang <zhangyongji...@yahoo.com>
Subject Re: search results
Date Sun, 04 Jan 2009 01:08:44 GMT
It turns out that org.apache.jackrabbit.extractor.HTMLParser eats all digits: in the method
filterAndJoin, all non-letters are removed.
Does anybody have any idea why we do so? IMO, indexing "hf100" makes more sense than indexing
"hf". Or is there any way I can configure Jackrabbit to use my own HTMLParser instead of the default?
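
To illustrate the behaviour, here is a minimal sketch of a letters-only filter next to one that also keeps digits. It is not the actual Jackrabbit source: the class name TokenFilterSketch and the helpers lettersOnly and lettersAndDigits are hypothetical, and this filterAndJoin only imitates the effect described in this message, on the assumption that runs of rejected characters act as word breaks.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch; NOT the org.apache.jackrabbit.extractor.HTMLParser source.
final class TokenFilterSketch {

    // Keeps only the characters accepted by `keep`. Every run of rejected
    // characters between two kept characters becomes a single space.
    static String filterAndJoin(String text, IntPredicate keep) {
        StringBuilder out = new StringBuilder();
        boolean pendingSpace = false;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (keep.test(cp)) {
                if (pendingSpace && out.length() > 0) {
                    out.append(' ');
                }
                out.appendCodePoint(cp);
                pendingSpace = false;
            } else {
                pendingSpace = true;
            }
        }
        return out.toString();
    }

    // Mimics the reported behaviour: digits are dropped along with punctuation.
    static String lettersOnly(String text) {
        return filterAndJoin(text, Character::isLetter);
    }

    // The alternative suggested in this message: digits survive.
    static String lettersAndDigits(String text) {
        return filterAndJoin(text, Character::isLetterOrDigit);
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("Model: HF100"));      // prints "Model HF"
        System.out.println(lettersAndDigits("Model: HF100")); // prints "Model HF100"
    }
}
```

With the letters-only variant the term that reaches the index is "hf", so a query for "hf100" has nothing to match.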

best,
kevin





----- Original Message ----
From: Cheng Zhang <zhangyongjiang@yahoo.com>
To: users@jackrabbit.apache.org
Sent: Saturday, January 3, 2009 3:02:51 PM
Subject: search results

Hi, 

I have an HTML file, shown below, stored in the repository.


<html><body>Manufacture: CANON<br/>
Model: HF100<br/>
Title: Canon VIXIA hf100 Flash Memory High Definition Camcorder with 12x Optical Image Stabilized Zoom<br/>
</body></html>

However, if I search for 'hf100', it returns nothing.

Any suggestions?

Thanks a lot,
Kevin
